BEGINNING PERL PART 5 - MORE ADVANCED TOPICS


We've not actually plumbed the depths of the regular expression language syntax - Perl has a habit of adding wilder and more bizarre features to it on a regular basis. All of the more off-the-wall extensions begin with a question mark in a group - this is supposed to make you stop and ask yourself: 'Do I really want to do this?'


This free tutorial is a sample from the book Beginning Perl.


Some of these extensions are experimental and may change from one version of perl to the next (or disappear altogether), but others are stable and extremely useful, so let's dive in!

Inline Comments

We've already seen how we can use the /x modifier to add comments and whitespace to our regular expressions. We can also do this with the (?#) pattern:

/^Today's (?# This is ignored, by the way)date:/ 

Unfortunately, there's no way to have parentheses inside these comments, since perl closes the comment as soon as it sees a closing parenthesis. If you want to have longer or more detailed comments, you should consider using the /x modifier instead.
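
As a rough sketch of the difference, the short script below matches the same text twice, once with an inline (?#) comment and once with /x. The sample string and the variable names are made up for illustration:

```perl
#!/usr/bin/perl
use warnings;
use strict;

my $line = "Today's date: 4 December";   # hypothetical sample text

# Inline comment: everything from (?# up to the next ) is ignored.
my $inline = $line =~ /^Today's (?# label comes next)date:/ ? 1 : 0;

# The same pattern with the x modifier. Whitespace in the pattern is
# now ignored, so the literal space has to be written as '\ '.
my $extended = $line =~ /
    ^Today's\    # start of line, then the first word and a space
    date:        # the label (a comment here may contain parentheses)
/x ? 1 : 0;

print "inline: $inline, extended: $extended\n";
```

Both tests should succeed; the /x form simply leaves more room for explanation.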

Inline Modifiers

If you are reading patterns from a file or constructing them from inside your code, you have no way of adding a modifier to the end of the regular expression operator. For example:

#!/usr/bin/perl
# inline.plx
use warnings;
use strict;
my $string = "There's more than One Way to do it!";
print "Enter a test expression: ";
my $pat = <STDIN>;
chomp($pat);
if ($string =~ /$pat/) {
    print "Congratulations! '$pat' matches the sample string.\n";
} else {
    print "Sorry. No match found for '$pat'\n";
}

If we run this, having momentarily forgotten how our sample string was capitalized, we might get this:

>perl inline.plx
Enter a test expression: one way to do it!
Sorry. No match found for 'one way to do it!'
>

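So how can we make this case-insensitive? The solution is to use an inline modifier, the syntax for which is (?i). It makes the pattern match case-insensitively from the point where it appears to the end of the enclosing group, or to the end of the whole pattern if it is not inside a group. Placed at the very start, it covers the entire expression. Therefore we have: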

>perl inline.plx
Enter a test expression: (?i)one way to do it!
Congratulations! '(?i)one way to do it!' matches the sample string.
>

If, conversely, you have a modifier in place that you temporarily want to get rid of, you can say, for example, (?-i) to turn it off. If we have this:

/There's More Than ((?-i)One Way) To Do It!/i; 

the words 'One Way' alone are matched case-sensitively.
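
To see the scoping in action, here is a minimal sketch that tries the pattern above against two test strings. The strings and variable names are assumptions chosen for illustration:

```perl
#!/usr/bin/perl
use warnings;
use strict;

# The outer /i applies everywhere except inside the group,
# where (?-i) switches case-insensitivity off again.
my $pat = qr/There's More Than ((?-i)One Way) To Do It!/i;

my @tries = (
    "there's more than One Way to do it!",   # 'One Way' capitalized exactly
    "there's more than one way to do it!",   # 'one way' in lower case
);

# 1 for a match, 0 for no match, in the same order as @tries.
my @results = map { $_ =~ $pat ? 1 : 0 } @tries;

print "'$tries[$_]': ", ($results[$_] ? "match" : "no match"), "\n"
    for 0 .. $#tries;
```

The first string matches because only the words outside the group may differ in case; the second fails on the lower-case 'one way'.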

Note that you can also inline the /m, /s, and /x modifiers in the same way.
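
The following sketch shows (?s) and (?m) at work in patterns held in ordinary strings, as they would be if read from a file. The sample text and the patterns themselves are invented for this example:

```perl
#!/usr/bin/perl
use warnings;
use strict;

my $text = "start\nmiddle\nend";   # hypothetical three-line sample

# Patterns stored as plain strings, so no trailing modifier is possible.
my $plain  = 'start.*end';         # . will not cross a newline
my $dotall = '(?s)start.*end';     # (?s) lets . match newlines too
my $multi  = '(?m)^middle$';       # (?m) lets ^ and $ match at line breaks

my $plain_hit  = $text =~ /$plain/  ? 1 : 0;
my $dotall_hit = $text =~ /$dotall/ ? 1 : 0;
my $multi_hit  = $text =~ /$multi/  ? 1 : 0;

print "plain: $plain_hit, dotall: $dotall_hit, multi: $multi_hit\n";
```

Only the plain pattern should fail here, since without (?s) the dot stops at the first newline.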

Grouping without Backreferences

Parentheses perform two functions: grouping, and populating the backreference variables. If a portion of your pattern is in parentheses, whatever it matches will, if the match succeeds, be placed in one of the numbered variables. However, there may be times when you only want to use parentheses for grouping. For example, you're expecting the first backreference to contain something important, but there may be some preceding text in the way. You could have something like this:

/(X-)?Topic: (\w+)/; 

You can't be certain whether your first defined backreference is going to be $1 or $2 - it depends on whether the 'X-' part is present or not. For example, if we tried to match the string "Topic: the weather", we'd find that $1 was left undefined. If we then tried to do something with its contents, we'd get a warning along these lines:

Use of uninitialized value in concatenation

Now that's not necessarily a problem here. After all, we'll find our word in $2 whether or not there's anything preceding "Topic: ". Surely we can just be careful not to use $1?
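
A small sketch makes the behavior concrete. It runs the pattern against one header with the prefix and one without; the second header string and the @report array are assumptions added for illustration:

```perl
#!/usr/bin/perl
use warnings;
use strict;

my @report;
for my $header ("Topic: the weather", "X-Topic: sport") {
    if ($header =~ /(X-)?Topic: (\w+)/) {
        # $1 is undef when the optional group took no part in the match,
        # so check it with defined() before using it.
        my $prefix = defined $1 ? $1 : 'undef';
        push @report, "\$1=$prefix \$2=$2";
    }
}

print "$_\n" for @report;
```

Checking with defined avoids the warning, but it is extra work for a value we never wanted in the first place.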

But what if there's more than one optional field? Say we had an expression that left all but the 2nd and 6th groups optional. We then have to look in $2 for our first word and $6 for our second, while $1, $3, $4, and $5 are left undefined. This really isn't good programming style and is asking for trouble! We really shouldn't backreference fields if we don't need to.

We can resolve this problem very simply, by adding the characters ?: like this:

/(?:X-)?Topic: (\w+)/; 

This ensures that the first set of parentheses will now group only and not fill a backreference variable. Our word will always be put into $1.
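
As a quick check, the sketch below applies the non-capturing version to headers with and without the prefix. The header strings and the @words array are made up for this example:

```perl
#!/usr/bin/perl
use warnings;
use strict;

my @words;
for my $header ("Topic: the weather", "X-Topic: sport") {
    # (?:X-)? groups the optional prefix without capturing it,
    # so the word after "Topic: " is always in $1.
    push @words, $1 if $header =~ /(?:X-)?Topic: (\w+)/;
}

print "@words\n";
```

Either way the word arrives in $1, so there is no undefined variable to guard against.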

Thursday 4th December 2008  © COPYRIGHT 2008 - VISUALSOFT