© 1997 The McGraw-Hill Companies, Inc. All rights reserved. Any use of this Beta Book is subject to the rules stated in the Terms of Use.
Perl has a rich set of control structures. Control structures are the traffic police of Perl: they tell the process which way to proceed through the code, and they are your main line of attack in creating algorithms and subroutines in Perl.
Unless you plan to use a different control structure on every line, it is probably not worthwhile to memorize them all. This chapter therefore covers only the more common control structures.
Likewise, this chapter does not give a comprehensive list and usage summary of operators. Instead, it relates some of the more common patterns of operators, as well as the common 'gotchas'.
Check out the perlop and perlsyn reference pages that come with Perl if you do want a comprehensive list of operators and control structures.
Perl's control structures and operators are some of the most flexible (but complicated) in computer languages today. Hence, you should probably review the chapter to see if you are missing any pieces of knowledge about them. We have divided the chapter into five sections.
First, we talk about a thorny topic in Perl: how Perl determines if a statement is 'true' or 'false'. Many people don't completely understand this, since the underlying logic that Perl uses is quite complicated.
Second, we talk about the many control structures in Perl: if, while, foreach, for, until, and unless. Each one has its uses, although you need not know them all to be effective.
Third, we talk about how to modify the way these control structures execute. There are three major 'control structure modifiers': 'redo', 'next', and 'last' (there is also a goto, but it will remain undiscussed). In addition, you can tag any piece of code with a label, which we shall discuss as well.
Fourth, we go into Perl's operators. Perl has quite a few operators, and quite a few levels of precedence. We give a few simple rules to help you maneuver around and 'not get lost' when determining operator precedence.
And finally, we discuss 'Common Perl Patterns'. These are snippets of code syntax that are used over and over again when dealing with Perl; we go over seventeen of these patterns and how they are used.
If you are already comfortable with these topics, just take a look at the common patterns towards the end of the chapter. Otherwise, knowing the way Perl operators work is a cornerstone of learning the language.
But before we get into the specific conditionals in Perl, let's go over how Perl decides whether an expression is true or false.
Since Perl only has three datatypes (hashes, arrays, and scalars), it needs to be a little tricky in the way it evaluates expressions to be true or false. In particular, there are four different cases in which a condition evaluates to false: a value of zero, an empty string, an empty list, and an undefined value. We take each in turn.
The following statement will execute ten times, and stop when $counter hits zero:
$counter = 10;
while ($counter--) { print "$counter\n"; }
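This prints '9 8 7 6 5 4 3 2 1 0' (one number per line). The condition tests the value of $counter before the decrement takes effect, so the body runs ten times; when the tested value is zero, the expression evaluates to false and the loop stops.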
Note that Perl has a very special meaning of what zero is in this context. Remember the discussion last chapter of STRING, NUMBER and BOOLEAN contexts? Well:
if (0) { }
evaluates to false, but
if ("0.0000") { }
evaluates to true. Why? Because the second case has "0.0000", i.e.: a string. A string that does not look like a number evaluates numerically to zero, so if Perl translated strings to numbers in this case, it would also translate "a" to zero, which would mean:
if ("a") { }
would also be false. Hence, the only cases where something counts as zero are where it is
a: the string "0".
b: a number (not in quotes) that translates to zero. Hence
if (0.000) { }
evaluates to false.
If you are dealing with strings, the false condition is the empty string '' rather than zero (although a string consisting of just '0' ALSO evaluates to false). The following will print out 'h e l l o' and stop:
@arrayofLetters = ('h','e','l','l','o','', 'a','l','l');
$counter = 0;
while ($letter = $arrayofLetters[$counter++]) # move on to the next letter on each pass
{
    print "$letter ";
}
It does NOT print out the 'all' because after the 'o' in 'hello', $letter becomes '' and the loop terminates.
More subtly, if you are dealing with an array or a hash, and an expression evaluates to the empty list '()', i.e.: a list with zero elements, the loop terminates. Hence, the following:
%hashName = ('This','is','a','hash');
while (($key, $value) = each (%hashName))
{
print "$key $value ";
}
prints out 'This is a hash' (or 'a hash This is') and then stops.
What happens here is that each gets one key-value pair out of %hashName, which is put into the list '($key, $value)'. The loop stops when each has run out of pairs and returns the empty list.
This form of terminating is very helpful when using function calls. Often when a function call like each is performed, it will return a '()' when done.
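The same stopping rule applies to your own subroutines. The sketch below is only an illustration (next_pair and drain_pairs are made-up names, not part of Perl): next_pair returns the empty list when nothing is left, so the list assignment in the while condition sees zero elements and the loop ends. Note that a pair whose value is 0 does not stop the loop, because the test is on the number of elements returned, not on their values.

```perl
use strict;
use warnings;

# Hypothetical helper: hand back the next two items from the queue,
# or the empty list once the queue is exhausted.
sub next_pair {
    my ($queue) = @_;                # reference to the array being consumed
    return () unless @$queue;
    return splice(@$queue, 0, 2);
}

# Hypothetical driver: loop until next_pair() returns the empty list.
sub drain_pairs {
    my @queue = @_;
    my @seen;
    # In the loop test, the list assignment yields the number of
    # elements returned: 2 while pairs remain, 0 at the end.
    while (my ($name, $number) = next_pair(\@queue)) {
        push @seen, "$name=$number";
    }
    return @seen;
}

print join(' ', drain_pairs('one', 1, 'two', 2, 'zero', 0)), "\n";
```

Run as is, this prints 'one=1 two=2 zero=0'.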
The following also terminates after printing 123:
@arrayName = (1,2,3,undef,4,5,6);
$counter = 0;
while ($arrayName[$counter])
{
print $arrayName[$counter];
$counter++;
}
Again, this terminates since the fourth element of @arrayName is undef.
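To check these rules for yourself, a small table-driven script helps. The following is a minimal sketch; truth_of is a hypothetical helper name chosen for this illustration, and the labels are only there to make the output readable.

```perl
use strict;
use warnings;

# Hypothetical helper: report how Perl treats a value in a condition.
sub truth_of {
    my ($value) = @_;
    return $value ? 'true' : 'false';
}

# Each entry is a label plus the value under test.
my @cases = (
    [ 'number 0',        0        ],
    [ 'number 0.000',    0.000    ],
    [ 'string "0"',      "0"      ],
    [ 'string "0.0000"', "0.0000" ],
    [ 'string "a"',      "a"      ],
    [ 'empty string',    ""       ],
    [ 'undef',           undef    ],
);

foreach my $case (@cases) {
    my ($label, $value) = @$case;
    print "$label is ", truth_of($value), "\n";
}
```

Only the strings "0.0000" and "a" come out true; every other entry in the table is false.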
Anyway, remember these rules as we discuss the Perl control structures below. It of course helps to know exactly how you can make a control structure in Perl in order to apply this knowledge! To that end, we next turn to the syntax of Perl's control structures.
The four Perl control structures most useful to know are while, for, foreach, and 'if...elsif...else' (elsif and else are optional here). Further, there are a couple of very convenient keywords for jumping around within control structures: next and last.
In addition, there are a few more esoteric control structures - unless and until - and one more respectable statement to 'jump around' in control structures: redo. (Perl supports a goto but isn't very proud of it.)
These are the most common forms of each of the control loops, the ones that you shall use 90% of the time. Let's now take a look at each of the forms in detail.
while is almost exactly like its equivalent in C. The while loop has a condition, and the loop is executed as long as that condition holds, i.e.: until it evaluates to false. Also, as in C, the body of the while loop is ONLY executed if the while condition is true. In other words, the body is never run once 'for free' before the first test.
Figure 4.1 shows the three forms that the while loop can take, along with the logic which drives them.
Figure 4.1: The Different Forms of while
1) The condition is first evaluated, then checked to see if the condition is true or false.
2) If, and only if, it is true, are the internal loop commands performed.
Then the condition is checked again, and so on. If the condition is false, then the next command in the program is executed.
The first form is the 'regular' while, or 'vanilla' while. Here, the statements look like:
my $xx = 0;
while ($xx++ < 10)
{
print "$xx\n";
}
where $xx is printed as long as the tested value is less than 10. Because the increment happens right after the test, this prints '1 2 3 4 5 6 7 8 9 10' (one number per line), terminating when the condition becomes false.
The second form of while is the 'one line form'. This is a convenient, short hand form, looking like:
print "$xx\n" while ($xx++ < 10);
which prints out the same thing as the 'longhand' form of while above ('1 2 3 4 5 6 7 8 9 10'), provided that $xx starts at 0. If you want, you can do multiple statements in this form of while loop by separating the statements by commas, and putting the whole thing in parentheses:
(print ("$xx\n"), print (FD "$xx\n")) while ($xx++<10);
although you need to be pretty careful about orders of precedence here.
The third form of while has what is called a 'continue' block on it. A continue block is executed after each pass through the loop body and before the next evaluation of the condition. It is most often used with the next keyword, which cuts the current pass short (see the notes on next below).
Therefore,
my $xx = 0;
while ($xx++ < 10)
{
($form = 1, next()) if ($xx == 5);
print "$xx ";
}
continue
{
(print (":IN CONTINUE $xx:"), $form = 0) if ($form == 1);
}
will print out
1 2 3 4 :IN CONTINUE 5:6 7 8 9 10
Each time a pass through the loop ends, Perl drops down to the continue block. However, only if $form is set to one does it print out the ':IN CONTINUE $xx:' string.
Continue blocks are good for times where you want to break out of a loop prematurely, and then do something before going to the next iteration of the loop. For example,
$FALSE = 0; $TRUE = 1; $wanted = $TRUE;
while (defined($line = <FD>))
{
    ($wanted = $FALSE, next()) if ( tooLong($line));
    ($wanted = $FALSE, next()) if ( tooShort($line));
    ($wanted = $FALSE, next()) if ( tooFat($line));
    ($wanted = $FALSE, next()) if ( tooThin($line));
}
continue
{
    process() if ($wanted == $TRUE);
    $wanted = $TRUE; # reset the flag for the next line
}
Here, it makes sense to use continue, since as soon as you find out that the $line is tooLong(), you don't need to check whether it is tooShort(), tooFat() or tooThin(). You can safely skip to the end of the loop, and then go onto the next line.
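A self-contained variant of the same idea is sketched below. It works on an in-memory list rather than a filehandle, and the sub name filter_lines and the length limits (3 to 10 characters) are assumptions made purely for illustration. The point to notice is that the continue block runs after every pass, including the passes that next cuts short.

```perl
use strict;
use warnings;

# Hypothetical filter: keep lines of 3 to 10 characters and count
# every line that was examined.
sub filter_lines {
    my @lines    = @_;
    my @kept;
    my $examined = 0;
    my $wanted   = 1;
    while (defined(my $line = shift @lines)) {
        ($wanted = 0, next) if length($line) > 10;   # too long: skip the other tests
        ($wanted = 0, next) if length($line) < 3;    # too short
    }
    continue {
        push @kept, $line if $wanted;   # $line is still visible here
        $wanted = 1;                    # reset the flag for the next line
        $examined++;
    }
    return ($examined, @kept);
}

my ($count, @good) = filter_lines('ab', 'hello', 'a very long line', 'fine');
print "examined $count, kept: @good\n";
```

With the sample input this reports four lines examined and keeps 'hello' and 'fine'.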
for is also like its C equivalent. The structure has a starting statement, a test condition, and an increment expression that is acted upon each time the loop is executed.
A sample for loop is shown in Figure 4.2, along with its logic:
Figure 4.2: The Different Forms of for
This works exactly like C. Perl sets the initial condition first, and then proceeds to loop through the body of the for loop as long as the end condition evaluates to true. After each for loop, the last statement in the for (in this case '$xx++') is performed, and the loop starts again. Two things to note here:
1. the last statement in the for (xx;yy;zz) is only performed after the loop has executed, and before the next evaluation of the end condition. The following, for example, will never execute any of the for loop:
for ($counter = 0; $counter == 1; $counter++)
{
}
since the test $counter == 1 is performed before $counter++.
2. Each of the three parts of the for (xx;yy;zz) can be any legal Perl expression.
The following is the usual way that one uses the for loop, to loop through the elements of an array:
for ($counter = 0; $counter < @arrayName; $counter++)
{
&do_something($arrayName[$counter]);
}
In other words, $counter is set to zero and tested to see if it is less than @arrayName, which (in this numeric comparison) is the number of elements in the array. If the test passes, the body of the loop is performed, $counter++ increments $counter to one, and THEN the test is made again.
However, this is not the limit of usability of the for loop. Any while loop can be translated into a for loop, although it is not always wise to do so. One of the while loops in the previous section becomes:
for ($line = <FD>; defined $line; $line = <FD>)
{
    do_function($line);  # loop continues until $line is undefined (end of file)
    do_function2($line); # the next line is read after each pass
}
Here we read a line at a time and if it is not defined, terminate, and if it is defined, go on to read the next line.
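The relationship between for and while can be seen by writing the same loop both ways. This is only a sketch; sum_with_for and sum_with_while are hypothetical names. The increment part of the for corresponds to a continue block on the while.

```perl
use strict;
use warnings;

# Add up a list with a C-style for loop.
sub sum_with_for {
    my @numbers = @_;
    my $total   = 0;
    for (my $i = 0; $i < @numbers; $i++) {
        $total += $numbers[$i];
    }
    return $total;
}

# The same loop as a while: initialization before the loop,
# the test in the condition, and the increment in a continue block.
sub sum_with_while {
    my @numbers = @_;
    my $total   = 0;
    my $i       = 0;
    while ($i < @numbers) {
        $total += $numbers[$i];
    }
    continue {
        $i++;
    }
    return $total;
}

print sum_with_for(1, 2, 3, 4), " ", sum_with_while(1, 2, 3, 4), "\n";
```

Both subs return 10 for the list (1, 2, 3, 4), and 0 for an empty list, since neither body runs when the first test fails.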
foreach is very similar to the 'for..in' structure in the Bourne shell. It combines much of the logic of for and while. foreach has a built-in array iterator, which steps through each of the elements of an array or hash, and in the process makes each element writable. foreach is very convenient when modifying several elements of a writable array, or when iterating over an array or hash without having to resort to a counter.
There are four types of foreach loops, shown in Figure 4.3:
Figure 4.3: foreach Forms
Perl begins with the first element in @arrayName and loops through to the end of the array @arrayName.
The tasks requested in the loop are performed in turn on every element from the starting element to the end of the array or hash.
Now, one of the chief uses of foreach is to modify each of the elements in an array. To do this, all you have to do is modify the loop variable, i.e. $element in 'foreach $element (@array)'. Hence
@array = (1,2,3,4); # set the array.
foreach $el (@array)
{
$el++; # increment each of the elements in the array.
}
print "@array\n"; # prints "2 3 4 5"
increments the value of each element in '@array', by making $el each of the elements of array in turn, and then incrementing it. This turns (1,2,3,4) into (2,3,4,5).
You can also use foreach as a type of while loop. The following code:
foreach $key (keys %hash) # for each of the keys in a hash.
{
print $hash{$key}; # print the value associated with that key.
}
goes through each key in the hash, and prints out the value that is associated with that key.
Now, there are a couple of other things to understand about foreach. Although the most common usage is to go through elements as we did above, you can also use any array or list. The following prints out all the letters from A to Z:
foreach $letter ('A'..'Z')
{
print "$letter\n";
}
where '..' is, again, the list construction operator that we encountered last chapter, and the following prints out the return values from a function:
foreach $element (&functionReturningArray)
{
print "$element\n";
}
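However, if you try to modify values that are constants, you will get an error, i.e.:
foreach $element (1..10) { $element++; }
gives an error at run time (it is not a syntax error). Why? Because each of the values in '(1..10)' is read-only. Therefore, they cannot be altered in the same way that 'foreach $element (@array)' can modify @array. (Some later versions of Perl may hand out modifiable temporary values for a range such as 1..10; a literal list such as (1,2,3) still triggers the error.) You will get an error: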
Modification of read-only value attempted at script.p line 1
which refers to the fact that the digits 1..10 cannot be modified since they are constants.
Note that if you want to do something such as go through only a select number of elements in a foreach statement, you can use the following:
foreach $element (@array[1,2,3])
{
}
which uses slicing to access array elements 1, 2, and 3 out of @array. We touched a bit on slicing in the chapter on variables, and will do so again in the section on contexts. 'slicing' is a very important concept in how Perl operates.
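Because the loop variable is an alias for each element, a foreach over a slice can change just the selected elements of the original array. The sketch below assumes a hypothetical helper, scale_slice, which takes a factor, a reference to a list of indexes, and the values themselves.

```perl
use strict;
use warnings;

# Hypothetical helper: multiply only the elements at the given indexes.
sub scale_slice {
    my ($factor, $indexes, @values) = @_;
    foreach my $element (@values[@$indexes]) {
        $element *= $factor;    # changes @values through the alias
    }
    return @values;
}

print join(',', scale_slice(10, [1, 2, 3], 1, 2, 3, 4, 5)), "\n";
```

This prints '1,20,30,40,5': elements 1, 2 and 3 were scaled, while the first and last elements were left alone.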
foreach, like while, also has a continue form, although I've never used it.
The 'if..elsif..else' control structure is similar to C's, except that C has 'else if' instead of Perl's elsif (only Larry Wall knows the reason why). Anyway, the if..elsif..else statement is Perl's way of deciding between different courses of action, and can be used much like C's switch or Pascal's case statement.
Formally, the if..elsif..else syntax looks like Figure 4.4:
Figure 4.4: if..else..elsif
Here, condition and next_condition may be any Perl expression. The if interprets them as 'true/false' conditions, and then executes the corresponding block for the first one it finds that is true (for how Perl evaluates truth or falsity, see either the description in this chapter, or the one in the chapter on 'Perl variables'). Hence the following:
if ($string1 gt $string2 && $number > $number2)
{
doSomething();
}
elsif ($string1 lt $string2)
{
doSomethingElse();
}
else
{
doADefaultSomething();
}
works as follows. The first condition ($string1 gt $string2 && $number > $number2) is evaluated to be either true or false, and the code associated with it is executed if it is true. If the first condition evaluates to false, the second condition ($string1 lt $string2) is evaluated next. Finally, the default case is executed if both of the first two conditions evaluate to false.
Note, again, that only the block belonging to the first condition that evaluates to true is executed, and that control then skips to the first statement AFTER the whole if structure. Consider:
if (5 > 4)
{
print "FIVE IS GREATER THAN FOUR";
}
elsif (3 > 2)
{
print "THREE IS GREATER THAN TWO";
}
This will print only 'FIVE IS GREATER THAN FOUR', even though 3 > 2 is also true. Since order matters in an if..elsif..else block, think carefully about the order in which you write the conditions. For example, the following code to compare dates will not work if we switch the 'year' and the 'month' conditions, because the month field is less important than the year in determining which date is earlier.
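if ($a{'year'} > $b{'year'})
{
    print "date a happened later than date b";
}
elsif ($a{'year'} < $b{'year'}) # an earlier year settles it; don't look at the month
{
    print "date a happened earlier than date b";
}
elsif ($a{'month'} > $b{'month'})
{
    print "date a happened later than date b";
}
elsif ($a{'month'} < $b{'month'})
{
    print "date a happened earlier than date b";
}
elsif ($a{'day'} > $b{'day'})
{
    print "date a happened later than date b";
}
else
{
    print 'date a happened either on the same day or earlier than date b';
}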
As stated, order is important here. After performing the first block of code within the control structure (that happens to be true) the processing resumes at the next line after the structure.
Perl also has a short form of if. As an alternative to the above, you could say:
(print("date a happened later than date b\n"), $gt = 1)
    if ($a{'year'} > $b{'year'});
(print("date a happened later than date b\n"), $gt = 1)
    if ($a{'year'} == $b{'year'} && $a{'month'} > $b{'month'} && $gt != 1);
(print("date a happened later than date b\n"), $gt = 1)
    if ($a{'year'} == $b{'year'} && $a{'month'} == $b{'month'}
        && $a{'day'} > $b{'day'} && $gt != 1);
print("date a happened either on the same day or earlier than date b") if (!$gt);
where you use $gt as a tag which tells you whether or not you have found that date a is in fact greater than date b.
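The ordering of the conditions is easier to test when the comparison lives in a subroutine. The following is a sketch under a few assumptions: date_order is a hypothetical name, each date is passed as a reference to a hash with 'year', 'month' and 'day' keys, and the sub returns a word rather than printing.

```perl
use strict;
use warnings;

# Hypothetical helper: say whether the first date is 'later', 'earlier'
# or the 'same' as the second. The most significant field is tested first.
sub date_order {
    my ($first, $second) = @_;
    if ($first->{'year'} != $second->{'year'}) {
        return $first->{'year'} > $second->{'year'} ? 'later' : 'earlier';
    }
    elsif ($first->{'month'} != $second->{'month'}) {
        return $first->{'month'} > $second->{'month'} ? 'later' : 'earlier';
    }
    elsif ($first->{'day'} != $second->{'day'}) {
        return $first->{'day'} > $second->{'day'} ? 'later' : 'earlier';
    }
    else {
        return 'same';
    }
}

my %new_year = ('year' => 1997, 'month' => 1,  'day' => 1);
my %old_year = ('year' => 1996, 'month' => 12, 'day' => 31);
print "date a is ", date_order(\%new_year, \%old_year), " than date b\n";
```

Here the year decides the matter ('later'), even though the month and day of the second date are larger.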
Control of Control Structures
There are often cases where you want to nest control structures and then break out of them, as opposed to completing a certain set of iterations. Suppose that we want to break out of a loop, if a condition holds true. Something like:
foreach $line (@lines)
{
# how do we get out, without iterating through all the lines?
}
In this case, Perl provides three ways of moving around in control structures: next, last, and redo. next stops the current iteration of the loop and starts the next one. last, on the other hand, breaks out of the current loop altogether, and returns to any loop that it was nested in. Finally, redo is a specifically Perlish keyword, which tells the loop to run the current iteration again without re-evaluating the condition.
We shall take each of these in turn.
next
Formally, the syntax of next is simple. Put it in control structures at any point that you want to skip to the next evaluation of the loop, i.e.:
Figure 4.6: 'next' used in flow control.
We have already seen an example of next up above, when dealing with the while loop and continue. Here's another example of how next works, this time in nested loops:
for ($xx = 1; $xx < 4; $xx++)
{
foreach $value ( 1,2,3,4,5,1,2,3) # point A
{
if ($value > 3)
{
            next; # goes to point A. 'Short circuits' the loop, to
                  # go to the next evaluation of the loop.
        }         # if there were a continue block here, control would go there
print "$value ";
}
print "\n";
}
print "DONE";
This prints out:
1 2 3 1 2 3
1 2 3 1 2 3
1 2 3 1 2 3
DONE
In each loop, the values 4 and 5 are skipped because the next routes back to A, and hence skips the print. next then passes control to the foreach, which then evaluates the next loop, in this case, picking the next number.
The following prints out 'next finishes the loop':
foreach $word ('next','finishes','skip','skip','the','loop')
{
next if ($word eq 'skip');
print "$word ";
}
Note that it doesn't print the word 'skip' but still continues till the end of the loop.
next is very handy for ignoring items inside a given control structure. For example, the following ignores any line that has a comment in it:
foreach $line (@lines)
{
next if ($line =~ m"#"); # ignores lines with # in them.
&doSomethingWithNonCommentedLines();
}
and the following ignores the first 100 lines of a file:
my $lineNo = 0;
while ($line = <FD>)
{
next if ($lineNo++ < 100);
}
As we shall see, next is often used in conjunction with last.
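Several next tests can be stacked at the top of a loop so that only the interesting items reach the bottom. The following sketch uses a hypothetical sub, code_lines, and treats a line as a comment only when '#' is its first non-blank character; both choices are assumptions made for the example.

```perl
use strict;
use warnings;

# Hypothetical filter: return the lines that are neither blank
# nor pure comments.
sub code_lines {
    my @kept;
    foreach my $line (@_) {
        next if $line =~ /^\s*$/;    # skip blank lines
        next if $line =~ /^\s*#/;    # skip lines that are only a comment
        push @kept, $line;
    }
    return @kept;
}

my @source = ('# setup', '$xx = 1;', '', '   # more notes', 'print $xx;');
print join("\n", code_lines(@source)), "\n";
```

Only the two statements survive; the comments and the blank line are skipped without ending the loop.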
last
last is just as easy to use. Its default behavior is to short-circuit the loop totally, turning control over to the first statement following the loop. Formally, the syntax is simple (just put it in where you need to 'break out' of a loop), something like Figure 4.7:
Figure 4.7: last used in flow control.
Note that last does not totally break out of every loop. It simply cuts to the next controlling block, which may or may not be inside another loop.
Hence, the following:
foreach $value (1,2,3,4,5,1,2,3)
{
    if ($value > 3)
    {
        last; # jumps to point A.
    }
    print "$value ";
}
# point A
print " DONE";
prints out '1 2 3 DONE'. However:
for ($xx = 1; $xx < 4; $xx++)
{
    foreach $value (1,2,3,4,5,2,3)
    {
        last if ($value > 3); # goes to point B
        print "$value ";
    }
    # point B
    print "\n";
}
This prints out:
1 2 3
1 2 3
1 2 3
i.e., it short circuits the inner loop, going to point B every time $value becomes greater than 3.
last is good for exiting on an error, or after a particular piece of data has been found. The following example processes only the first thousand lines of a file:
my $lineCount = 0;
while ($line = <FD>)
{
    last if ($lineCount++ >= 1000); # the 1001st line read ends the loop
    # ... process $line here
}
The following pseudocode returns an error status if an error occurs while evaluating through the elements (the error is flagged by the subroutine 'errorInLine($line)' returning true, and the subroutine 'doSomething' is called if an error occurred):
foreach $line (@lines)
{
    if (errorInLine($line))
    {
        $error = 1;
        last; # breaks out of loop to point A
    }
}
# point A
if ($error == 1)
{
    doSomething();
}
Likewise, suppose you wanted to check if a given array had an element that was greater than 10. last would be helpful here, too:
@bigArray = (1,2,11); # ... imagine thousands of elements here
foreach $element (@bigArray)
{
    if ($element > 10)
    {
        $largeElementFound = 1;
        last;
    }
}
After the first element greater than 10 is found, we don't have to go through the rest of the array. All that matters is that we found one.
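Wrapped in a subroutine, the same search can report where the element was found. This is a minimal sketch; first_index_over is a hypothetical name, and returning -1 for 'not found' is simply a convention chosen for the example.

```perl
use strict;
use warnings;

# Hypothetical helper: index of the first value greater than $limit,
# or -1 if there is none.
sub first_index_over {
    my ($limit, @values) = @_;
    my $found = -1;
    for (my $i = 0; $i < @values; $i++) {
        if ($values[$i] > $limit) {
            $found = $i;
            last;            # no need to look at the rest of the array
        }
    }
    return $found;
}

print first_index_over(10, 1, 2, 11, 50, 3), "\n";
```

This prints 2, the position of the 11; the 50 further along is never examined.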
Finally, last is very helpful when you want to force a user to enter data in the correct format before continuing, with a 'retry' if they don't enter the correct data:
my ($input1, $input2);
while (1)
{
    print "Please enter two values.\n";
    chomp($input1 = <STDIN>);
    chomp($input2 = <STDIN>);
    last if (($input1 =~ m"^\d+$") && ($input2 =~ m"^\d+$")); # both are numbers: done
    print
        "Please enter numbers for input1 and input2! you said $input1 and $input2\n";
}
This will force the user to keep entering text until both $input1 and $input2 are integers.
redo
We touch briefly on the redo keyword, which is a bit of an oddity. It 'stalls' a loop: it goes back to the top of the loop body like next does, but without moving on to the next element or re-testing the condition. The best way to understand redo is in action. The following, for example:
my $xx = 1;
foreach $element (1,2,3,4,5)
{
    print "$element ";
    redo if ($xx++ % 2 == 1); # the counter must change on every pass, or the redo never stops
}
prints out '1 1 2 2 3 3 4 4 5 5 ', in effect doubling the array. In short, the redo statement looks at a counter, $xx, and whenever the remainder of that counter divided by 2 equals one (1 % 2 is 1, for example), re-does the current pass without moving on to the next element, so each element is handled twice. Because redo never advances the loop by itself, the following is an infinite loop:
foreach $element (0) { redo }
redo is fairly uncommon, but there are a few places where you shall find it useful. For example, the following ensures that each question in a foreach loop gets an answer that has a 'y' in it:
foreach $question (@questions)
{
    print $question;
    redo if (($answer = <STDIN>) !~ m"y");
    push(@answers, $answer);
}
(Not that useful an example, perhaps, but good as a template!)
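The retry pattern can be tried out without a keyboard by feeding the loop canned replies. The sketch below rests on assumptions: collect_yes_answers is a hypothetical name, the questions and replies are passed as array references, and the loop gives up with last if the replies run out.

```perl
use strict;
use warnings;

# Hypothetical helper: ask each question until a reply containing 'y'
# arrives, taking replies from a prepared list instead of STDIN.
sub collect_yes_answers {
    my ($questions, $replies) = @_;
    my @replies = @$replies;
    my @answers;
    foreach my $question (@$questions) {
        my $answer = shift @replies;
        last unless defined $answer;    # ran out of canned replies
        redo if $answer !~ /y/;         # same question again, next reply
        push @answers, "$question $answer";
    }
    return @answers;
}

my @result = collect_yes_answers(
    [ 'Ready?', 'Sure?' ],
    [ 'no', 'yes', 'nope', 'n', 'yep' ],
);
print join("\n", @result), "\n";
```

The first question consumes two replies and the second consumes three, so the result is 'Ready? yes' and 'Sure? yep'.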
Labeling your Control Structures
For those of you who don't think that next, last, and redo are powerful enough in their native form, there are labels you can put on them. This works much like a label for a goto, but is better because you are limited to the loops that enclose the next or last. This is helpful for doing algorithms in which you have deeply nested constructs:
Figure 4.8: Labels breaking out of a loop.
As you can see, the label makes the next, last or redo affect the labeled loop rather than the loop that contains the next, last or redo.
For example, the following
LABEL: foreach $valueout (1,2,3) # LABEL: this is where the next goes.
{
    foreach $valuein (1,2,3,4,5,1,2,3)
    {
        if ($valuein > 2)
        {
            next LABEL; # Goes to LABEL (the $valueout loop)
                        # *instead* of going to foreach $valuein.
        }
        print "VALUEOUT: $valueout VALUEIN: $valuein\n";
    }
}
print "DONE\n"; # reached only after the outer loop has finished
prints out
VALUEOUT: 1 VALUEIN: 1
VALUEOUT: 1 VALUEIN: 2
VALUEOUT: 2 VALUEIN: 1
VALUEOUT: 2 VALUEIN: 2
VALUEOUT: 3 VALUEIN: 1
VALUEOUT: 3 VALUEIN: 2
DONE
What's happening here is that the next is applied not to the innermost loop (the $valuein loop, which is the default), but to the loop tagged LABEL. Hence, the next here short circuits the loop it is in ($valuein) completely, and then causes the labeled $valueout loop to move on to its next value.
If last were used here instead of next, the printout would be:
VALUEOUT: 1 VALUEIN: 1
VALUEOUT: 1 VALUEIN: 2
DONE
because last is now short circuiting both loops. The process flow hits a value greater than two in $valuein, goes to the label LABEL, immediately kills both iterators, and then jumps to the print of 'DONE'.
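A typical use of a labeled last is a search through nested data, where one hit should end every loop at once. In the following sketch, find_in_grid and the ROW label are hypothetical names, and the grid is passed as a list of array references (one per row).

```perl
use strict;
use warnings;

# Hypothetical helper: return (row, column) of the first cell equal to
# $target, or the empty list if the value is not in the grid.
sub find_in_grid {
    my ($target, @rows) = @_;
    my @position;
    ROW: for (my $r = 0; $r < @rows; $r++) {
        for (my $c = 0; $c < @{ $rows[$r] }; $c++) {
            if ($rows[$r][$c] == $target) {
                @position = ($r, $c);
                last ROW;    # leave the inner AND the outer loop
            }
        }
    }
    return @position;
}

my @where = find_in_grid(5, [1, 2, 3], [4, 5, 6], [5, 0, 0]);
print "found at row $where[0], column $where[1]\n";
```

The search stops at row 1, column 1; the second 5 in the last row is never visited. A plain last would only have ended the column loop, and the row loop would have carried on.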
Last Word on Perl Control Structures
This is really all you need to know about Perl control structures. You could even forget about LABEL if you really wanted to. In some instances it helps make code really clean, as when looping through multiple arrays and deep nested structures. There are other looping structures in Perl which are like while, for, foreach, and if with a twist.
until is like while, but instead negates the expression. In other words, while (!$expr) and until ($expr) both mean the same thing.
until ($a == $b) # OR while ($a != $b)
{
doSomething(); # does stuff until $a equals $b
$a++;
}
do..while is a control structure like while, but it always performs the first pass before the condition is evaluated. Since the while comes at the end of the loop, the block is executed once before the condition is tested.
do { print "HERE!\n"; } while ( 1 == 0);
always prints out 'HERE', whereas:
while(1 == 0) { print "HERE\n"; }
never prints out 'HERE'.
The unless control structure is like if, except that it negates the condition: if (!$expr) and unless ($expr) mean the same thing. And finally, do..until does exactly the same thing as do..while but with the opposite test, i.e.: it repeats the block until the condition becomes true (rather than until it becomes false, as in do..while).
Again, these structures can be useful in certain situations, but they can also clutter up your code. Use sparingly but well.
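The equivalences above can be checked side by side. The subs below are hypothetical and exist only for this comparison; the cap of three passes in the last two is an assumption added so that a true condition cannot loop forever.

```perl
use strict;
use warnings;

# Count the steps from $from up to $to, written with while ...
sub steps_with_while {
    my ($from, $to) = @_;
    my $steps = 0;
    while ($from != $to) { $from++; $steps++; }
    return $steps;
}

# ... and the same loop written with until.
sub steps_with_until {
    my ($from, $to) = @_;
    my $steps = 0;
    until ($from == $to) { $from++; $steps++; }
    return $steps;
}

# do..while runs its block once before the first test.
sub passes_of_do_while {
    my ($condition) = @_;
    my $passes = 0;
    do { $passes++; } while ($condition && $passes < 3);
    return $passes;
}

# A plain while tests first, so the block may never run.
sub passes_of_while {
    my ($condition) = @_;
    my $passes = 0;
    while ($condition && $passes < 3) { $passes++; }
    return $passes;
}

print steps_with_while(2, 5), " ", steps_with_until(2, 5), "\n";
print passes_of_do_while(0), " ", passes_of_while(0), "\n";
```

The first line of output is '3 3' and the second is '1 0': with a false condition, do..while still makes one pass while the plain while makes none.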
Introduction to Perl Operators
Perl has quite a few operators. Worse yet, the precedence of Perl operators is quite complicated. Fortunately (at least for people who know C/C++) all the operators that Perl shares with C/C++ have the same precedence as they do in those languages.
These operators can also make for very unclear code, and are responsible for many of the JAPH (Just Another Perl Hacker) scripts out there, such as the one provided by Abigail (abigail@fnx.com):
perl5.004 -wMMath::BigInt -e'$^V=new Math::BigInt+qq;$^F$^W783$[$%9889$^F47$|88768$^W596577669$%$^W5$^F3364$[$^W$^F$|838747$[8889739$%$|$^F673$%$^W98$^F76777$=56;;$^U=$]*(q.25..($^W=@^V))=>do{print+chr$^V%$^U;$^V/=$^U}while$^V!=$^W'
Wouldn't it be easier to write this as the following?
print 'Just Another Perl Hacker';
Of course this takes all the fun out of it, but it makes a point. It is very easy to get very cryptic with Perlish syntax. Keep this in mind as we discuss the operators in the next section.
Perl Operator Precedence
Following is a list of all the Perl operators and their order of precedence. This list is taken straight out of the perlop man page. The operators are listed in order of precedence, highest precedence to lowest precedence. Following this list are sections that describe the most important set of operators and give plenty of examples on their usage. Table 4.1 gives the operator precedence list:
Table 4.1 Operator Precedence:
ASSOC   OPERATOR                    NAME AND DEFINITION
left    list operators              includes functions, variables, items in parentheses
left    ->                          dereferencing operator (see chapter on references)
NonA    ++, --                      increment and decrement; ++$aa adds one to $aa
right   **                          exponentiation; $a**$b raises $a to the $b power
right   !, ~, \, unary +, unary -   not, bit negation, reference op, unary +/- (i.e.: '-4')
left    =~, !~                      matching operators with regular expressions
left    *, /, %, x                  times, divided, modulus, string and list multiplier
left    +, -, . (one dot)           plus, minus, string operator
left    <<, >>                      binary shift left operator, binary shift right operator
NonA    named unary ops             functions that take one argument; filetest operators
                                    (-f and the other -X tests, for example; see perlfunc for more info)
NonA    <, >, <=, >=                numeric less than, greater than, less than or equal to,
                                    or greater than or equal to
        lt, gt, le, ge              string less than, greater than, less than or equal to,
                                    or greater than or equal to
NonA    ==, !=, <=>                 numeric equal to, not equal to, comparison operator
        eq, ne, cmp                 string equal to, not equal to, comparison operator
left    &                           binary and: does an 'and' bit match on each bit in strings
left    |, ^                        binary or: does an 'or' bit match on each bit in strings;
                                    binary xor: does an 'xor' bit match on each bit in strings
left    &&                          'and' operator; evaluates to true if both are true
left    ||                          'or' operator; evaluates to true if either is true
NonA    ..                          list (range) operator, as in (1..50), i.e. (1,2,3,...,50)
right   ? :                         conditional operator, as in $a = ($b == 1) ? '0' : '1'
right   =, +=, -=, *=,              equals, plus equals, etc.
        **=, &=, <<=, &&=
left    , and =>                    list separator, another list separator as well
right   not                         synonym for ! except lower precedence
left    and                         synonym for && except lower precedence
left    or                          synonym for || except lower precedence
        xor                         exclusive or; evaluates to true if just one arg is true
Quite a few levels! We next turn to how to deal with this complexity, which is both a blessing and a curse.
Techniques to Clarify Perl Expressions
The complexity of this table can cause real hangovers. One of the more frustrating aspects of Perl is that the code can be full of ambiguity if you are not careful.
In other words, what task will Perl do first? In situations in which the code seems to be ambiguous, you have three possible options to make it clear:
1. Split up the offending statement into multiple sub-statements.
2. Put parentheses around a given ambiguity.
3. Use the above Operator Precedence Table to determine precedence.
These three tactics are given in the order which you should prefer them. In other words, splitting up a complicated statement into sub statements should be preferred over parenthesizing, and so on. We discuss each below.
Splitting up Perl Statements
This is the simplest technique, and is the one that should be preferred. One of the bad things about Perl is the power it gives for programmers to 'ramble at the mouth' too long. You can make horrifically complicated sentences in Perl. You can often get a burst of clarity by splitting Perl sentences up. The statement
$num = $sub**$power + log($sub2);
can become the following two statements:
$num = $sub**$power;
$num+= log($sub2);
This not only helps you with your coding, it helps others to read your code. You get fewer bugs, and when it comes to other people maintaining your code, they will thank you. Likewise,
print log $sqrt, " " x length($sqrt);
becomes
$log = log($sqrt);
$length = " " x length($sqrt);
print $log . $length;
Of course, it is up to you to decide 'how long is too long'. People just starting with Perl might find the above comforting - whereas experienced Perl programmers may wonder why there are three lines, and not one.
Parenthesize it
Perl also provides parentheses '()' as an easy mechanism for disambiguating Perl operators.
In general, if you have a question about an operator's precedence, simply add parentheses around the expression concerned, and voila! Your precedence question is solved.
***Begin Note***
Using parentheses to disambiguate operators does not always work in some pathological cases. For example:
&function('a', exit());
will call exit (and perform an exit) before the subroutine is called. Hence, you should say something like:
function('a'), exit();
instead, i.e.: you need to cut the offending statement up into several lines!
***End Note***
However, it is also very easy to take the parenthesis idea to extremes. Consider the following Perl statement:
print (($a)+($b)); # Exhibit A: parenthesisitis.
This is 'parenthesis-itis', the practice of always putting parentheses around things.
There's a fine line between the above and:
print $a+$b; # Exhibit B: minimalismitis
or
print ($a+$b); # Exhibit C:
Which of the above is the easiest to read? We vote for Exhibit C. 'print ($a + $b);' emphasizes the functional part of the statement. After all, you are printing the sum $a+$b, so they should be grouped together.
Exhibit A is too paranoid. Exhibit B can be rather freeing once you get used to it, but it also raises the question of how print gathers up its arguments. In other words, does Exhibit B evaluate to:
(print ($a))+$b;
or
print ($a+$b);
In this case, the code does evaluate to the second statement - print ($a + $b) - but the lack of parentheses will bite you someday if you aren't careful. If you say something like:
print 'done','printing','arguments',exit();
What does this do? In fact, it exits first without printing anything. Perl evaluates this to
print ('done','printing','arguments',exit());
which then evaluates the exit before passing it to 'print'. Therefore, the program exits. In this case:
print ('done','printing','arguments'), exit();
is not only cleaner, but actually does what you meant.
The point of all this is to make code as clear as possible without being verbose. Remember, the person who maintains that code may be you!
Using the Precedence Table
Actually, this should be a last resort, since most of your code should be clean enough that you shouldn't have to deal with precedence rules. The classic example is:
$yy = ++$xx%2;
In other words, does the '%' (modulus) operator go before the '++'? Is
$xx = 4;
$yy = ++$xx%2;
equal to 5%2 or 4%2? Now, we can use the rule table above to figure out that this is actually equal to
$yy = (++$xx)%2
and the '++' goes before the '%'. But why not say so in the first place?
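If you would rather check than look it up, a few lines of Perl settle the question. This is only a small sketch using the same variable names as above:

```perl
use strict;
use warnings;

my $xx = 4;
my $yy = ++$xx % 2;    # parsed as (++$xx) % 2, that is 5 % 2

print "xx=$xx yy=$yy\n";    # prints "xx=5 yy=1"
```

Since $yy comes out as 1 rather than 0, the increment clearly happened before the modulus was taken.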
Anyway, if you opt for tactic #3 above, there are two rules to remember when looking at the operator precedence table:
Rule #1. Items at the top of the table are of higher precedence than items at the lower level.
Consider the following statement.
if ('1' > '0' or '2' > '1' and '3' < '4') # if 1 is greater than 0 OR 2
# is greater than 1 AND
# three is less than 4.
This statement could be ambiguous. To see how it would be executed, we can look at the Operator Precedence table. Since > and < are of higher precedence than 'or' and 'and', we can put a set of parentheses around each comparison. The statement becomes:
if (('1' > '0') or ('2' > '1') and ('3' < '4'))
And since and is of higher precedence than or, we can similarly put parentheses around the and expressions:
if (('1' > '0') or (('2'>'1') and ('3' < '4')))
This is how Perl interprets this statement. For human eyes, this is coming pretty close to 'parenthesis-itis', so we just might consider taking the parentheses off the '>' signs to have the expression read:
if (1>0 or ( 2>1 and 3<4))
This seems like the right balance between ambiguity and verbosity. Again, parentheses are around the point where it makes a logical difference in how the statement is executed. After all, the construct:
if((1>0 or 2>1) and 3<4)
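groups the clauses the other way around, and can give a different answer: if the first comparison is true and the last one is false, the first form is true and this one is false.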
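The comparisons used above all happen to be true, so both groupings give the same answer for them. A sketch with made-up truth values (the names $p, $q and $r are ours, not part of the example above) shows where the grouping starts to matter:

```perl
use strict;
use warnings;

my ($p, $q, $r) = (1, 0, 0);    # illustrative truth values

# No parentheses: 'and' binds tighter than 'or'.
my $default  = ($p or $q and $r)   ? 'true' : 'false';

# The grouping Perl chooses, written out explicitly.
my $explicit = ($p or ($q and $r)) ? 'true' : 'false';

# The other grouping, which gives a different answer here.
my $other    = (($p or $q) and $r) ? 'true' : 'false';

print "default=$default explicit=$explicit other=$other\n";
# prints "default=true explicit=true other=false"
```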
Rule #2: If confronted with an expression with more than one of the same precedence rules, those rules are executed in the order given by the 'ORDER' column in the Operator Precedence table.
This rarely matters in practice. Most of the time, as in the case of + and *, it does not matter in which order the statement is evaluated. In the cases where it does matter, it's better to put parentheses around the offending expression for clarity, or split it up. However, we can for fun interpret
$a = 2**3**$exponent;
to be equal to '$a = 2**(3**$exponent);' since '**' is right associative, and hence the rightmost part of the expression is evaluated first. And:
$a += $b *= 5;
is equal to '$a += ($b *= 5);' since it, too, is evaluated right to left. Likewise, since subtraction is sensitive to grouping, and '-' and '+' are left associative:
$a = 10 - 1 + 1;
becomes:
$a = (10 -1) + 1;
Hopefully, you get the idea. Such exercises will help you get used to Perl's precedence levels. However, I would stay away from the precedence table, instead favoring splitting up long statements and parenthesizing statements first.
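A quick way to convince yourself of the associativity rules is to compute both groupings and compare. The following sketch uses small constants of our own choosing:

```perl
use strict;
use warnings;

# '**' is right associative: 2 ** 3 ** 2 means 2 ** (3 ** 2).
my $right = 2 ** 3 ** 2;      # 2 ** 9 = 512
my $left  = (2 ** 3) ** 2;    # 8 ** 2 = 64

# '-' and '+' are left associative: 10 - 1 + 1 means (10 - 1) + 1.
my $as_parsed = 10 - 1 + 1;      # 10
my $regrouped = 10 - (1 + 1);    # 8

# Assignment operators are right associative as well.
my ($total, $factor) = (1, 2);
$total += $factor *= 5;          # $factor becomes 10, then $total becomes 11

print "$right $left $as_parsed $regrouped $total $factor\n";
# prints "512 64 10 8 11 10"
```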
Common Operators in Perl
As you can see from the precedence table, not only are there a lot of levels to consider when dealing with Perl, there are a lot of operators to consider.
Fortunately, there are two points that will keep you sane here.
First, there is a high amount of overlap in this area between C and Perl. Second, although the number of operators in Perl is large, only a fraction are frequently used.
However, C and Perl aren't exactly the same as far as operators go, and sometimes you do want to use the infrequently used operators.
The following is an introduction to Perl operators, intended to give you enough to get by for a long time to come. Those who want the complete reference should turn to the perlop man page.
Arithmetic and Increment Operators in Perl
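Perl has all the arithmetic operators that C does, and for the most part they work the same as their equivalents in C. One difference to watch for is that '/' does floating-point division in Perl (7/2 is 3.5, not 3), and Perl adds '**' for exponentiation. Just make sure that you are using 'numeric strings' when dealing with them. After all, 'a' + 'b' is interpreted as 0 + 0 in Perl.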
***Begin Note***
If you are doing complicated math processing, you are better off making a C module and linking it in with Perl. Perl is simply too slow for calculations.
We briefly touch on this in the last chapter, Perl interfaces. But your best source of information for this is the perlxs manpage, and the package swig included on the CD.
You may also want to check out Math::ematica, an interface to Stephen Wolfram's Mathematica, and everything on CPAN under the category 'Math::'.
***End Note***
$a = $b + $c;
$a = 'MISTAKE' + 20; # ERROR ('MISTAKE' will be treated as '0', and
# $a becomes 20). (The -w warning flag will catch this.)
$a = (@arguments + @array2) * 20; # sets $a equal to number of elements in
# @arguments plus number of elements in
# @array2, times 20.
$speed = (1/2) * $acceleration * ($time **2);
$a = 10; $a++; # incrementor. $a becomes 11;
$a = 10; $a--; # decrementor. $a becomes 9;
There is actually an exception to the string-to-number conversion rule when dealing with the increment operator, '++'. There is some special magic associated with this operator that allows you to do:
$aa = 'AA';$aa++; # $aa becomes 'AB';
$aa = 'zz'; $aa++; # $aa becomes 'aaa';
$aa = '01'; $aa++; # $aa becomes '02';
$aa = '09'; $aa++; # $aa becomes '10';
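This happens for strings made up of alphanumeric characters (a-z, A-Z, 0-9), with any letters coming before any digits. It does NOT happen for strings containing other characters. Hence:
$a = '+'; $a++;
does not work: '+' is treated as the number 0, so $a simply becomes 1.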
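The magic increment is easy to experiment with. The sketch below runs a handful of starting strings (our own picks) through '++' and records what comes out, then tries the '+' case:

```perl
use strict;
use warnings;

my @results;
for my $start ('AA', 'Az', 'zz', 'a9', '09') {
    my $value = $start;    # work on a copy; the loop list holds constants
    $value++;
    push @results, "$start -> $value";
}
print "$_\n" for @results;
# AA -> AB
# Az -> Ba
# zz -> aaa
# a9 -> b0
# 09 -> 10

# A string with other characters gets no magic: it is treated as a number.
my $plus = '+';
{
    no warnings 'numeric';    # '+' is not numeric; keep the demo quiet
    $plus++;
}
print "plus is now $plus\n";    # prints "plus is now 1"
```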
Perl Conditional Operators
Note that there are two sets of conditional operators in Perl:
the set dealing with numerics (==, !=, >=, <=, >, <)
the set dealing with strings (eq, ne, ge, le, gt, lt)
And likewise there are two new operators:
$a <=> $b which is a comparison function for numbers. <=> returns -1, 0, or 1 depending on whether $a is numerically less than, equal to, or greater than $b.
$a cmp $b which is a comparison function for strings. cmp returns -1, 0, or 1 depending on whether $a is stringwise less than, equal to, or greater than $b. These comparison operators will be most helpful when we get into sorting (see next chapter, Special Functions).
Perl converts a value to either a string or a number based on the comparison operator used, which is in the same vein as the arithmetic operators. You should not be doing such gymnastics as:
if (9 gt 10) # evaluates to true but does not do what you would want.
# The statement does so since '9' is greater than '10' lexically.
if ('string' == 'strung') # also evaluates to true but does not do what you would
# want. Two strings are *always* equal
# numerically since they both
# evaluate to zero in numeric context.
whereas, what you probably want is:
if ('aaa' gt 'b') # works, evaluates to false.
if (10 > 5) # works too, evaluates to true.
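The difference between the two families shows up most clearly in a sort. Here is a minimal sketch, using a made-up list of values, that sorts the same data with '<=>' and with 'cmp':

```perl
use strict;
use warnings;

my @values = (10, 9, 100);

my @numeric = sort { $a <=> $b } @values;    # compares as numbers
my @string  = sort { $a cmp $b } @values;    # compares as strings

print "numeric: @numeric\n";    # prints "numeric: 9 10 100"
print "string:  @string\n";     # prints "string:  10 100 9"

my $num_cmp = (9 <=> 10);    # -1, since 9 is numerically less than 10
my $str_cmp = (9 cmp 10);    #  1, since '9' sorts after '10' as a string
```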
Perl Logical Operators:
Perl provides logical operators for use in statements. They are:
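'&&' logical and (1 && 1 == 1)
'and' synonym for &&, with lower precedence (1 and 1 == 1)
'||' logical or (1 || 0 == 1)
'or' synonym for ||, with lower precedence (1 or 0 == 1)
'xor' exclusive or (1 xor 1 == 0) (1 xor 0 == 1)
These logical operators behave like their C equivalents in many cases. One difference is that '&&', '||', 'and' and 'or' return the last value they evaluated, not just true ('1') or false (''); the short-circuiting idioms below depend on this. 'xor' always evaluates to either true ('1') or false ('').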
if (10 > 9 && 20 > 10) # evaluates to 1 (true)
# since both subcases are true.
if (10 == 10 || 14 < 10) # evaluates to 1 (true) since
# one of the two subcases is true.
Short Circuiting
'||' and '&&', 'or' and 'and' have some extra functionality that is very handy, and makes for very readable code:
functionReturningScalar() || warn "Expression was false!\n";
# Example of short circuiting. Tries to do
# 'functionReturningScalar()'. If this returns
# '' or 0 (false) , prints 'expression was false'.
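(@array = functionReturningArray()) || (@array = @otherArray);
# sets @array equal to the array returned
# by functionReturningArray(), or
# if this is empty, @otherArray.
# ('@array = functionReturningArray() || @otherArray'
# would not work: '||' puts its left side in scalar context.)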
printItWorkedIfTrue() && print "IT WORKED!\n";
# opposite of '||'. If 'printItWorkedIfTrue()'
# returns a non-zero (true) value,
# then (and only then) print 'IT WORKED!'
These examples showed a technique called Short Circuiting. Short circuiting is a handy way to make your code more readable and be less verbose at the same time.
In the case of a '||', to evaluate an expression to true (anything other than 0, '0', '', or undef), all Perl really needs to do is find the first sub-expression which is true. Hence, Perl stops evaluating an expression as soon as a true value is found. In other words:
0 || 0 || 0 || 1 || 1; # evaluates first four expressions
# stops at first '1'.
# Does not evaluate last expression..
Likewise, in the case of a '&&', to evaluate an expression to true, Perl needs to have every sub-expression true. In other words, as soon as a false expression is reached, Perl stops:
1 && 1 && 1 && 0 && 0; # evaluates first four expressions
# stops at first '0'. Does not evaluate last expression.
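You can watch short circuiting happen by giving the sub-expressions a side effect. In this sketch, truthy() and falsy() are hypothetical helper subroutines of ours that record each call before returning a true or false value:

```perl
use strict;
use warnings;

my @called;    # log of which helper ran, in order

sub truthy { push @called, 'truthy'; return 'value' }
sub falsy  { push @called, 'falsy';  return 0 }

# '||' stops at the first true value and returns that value.
my $or_result = falsy() || truthy() || falsy();    # third call never happens
my @or_calls  = @called;

@called = ();

# '&&' stops at the first false value and returns that value.
my $and_result = truthy() && falsy() && truthy();  # third call never happens
my @and_calls  = @called;

print "or:  $or_result (@or_calls)\n";     # prints "or:  value (falsy truthy)"
print "and: $and_result (@and_calls)\n";   # prints "and: 0 (truthy falsy)"
```

Note that the result of the '||' chain is the string 'value' itself, not just 1.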
In addition, the synonyms for these operators, namely or and and, do the same thing as their '||' and '&&' counterparts. They short circuit as well, but do it in a very handy way. Since they are so low on the precedence scale (in fact, at the very bottom) you can say
open FD, "filename" or die;
which will open a filehandle or die if it cannot open it. The thing to note here is the lack of parentheses. I prefer to put parentheses around this simply because I like having function calls always denoted by (), but in some ways this is clearer (even the fact that it is in English, rather than '||', makes it clearer!). Do what feels appropriate to you.
The Conditional Operator:
Perl borrows from C the very handy conditional operator, which has the form "expression ? trueCase : falseCase". It works like C's, in that if you say the following:
$length = (@array > 2)? 'more than two elements' : 'two or less';
This acts exactly like:
if (@array > 2)
{
$length = 'more than two elements';
}
else
{
$length = 'two or less';
}
In other words, the one-line form (condition) ? trueCase : falseCase; can take the place of eight lines of code. This is extremely handy for shortening code which has lots of separate 'if then else' clauses. Here's a switch statement:
$value = ( $value eq 'Mon')? 'Monday' :
( $value eq 'Tue')? 'Tuesday' :
( $value eq 'Wed')? 'Wednesday' :
( $value eq 'Thu')? 'Thursday' :
( $value eq 'Fri')? 'Friday' :
( $value eq 'Sat')? 'Saturday' :
( $value eq 'Sun')? 'Sunday' :
"Not a day of the week!";
This basically expands the days of the week into their longer forms. It works because the 'false' case itself is a conditional, which in turn has a 'false' case of its own, and so on. You may, however, want to write this as:
%days = ('Mon' => 'Monday', 'Tue' => 'Tuesday', 'Wed' => 'Wednesday',
'Thu' => 'Thursday', 'Fri' => 'Friday', 'Sat' => 'Saturday', 'Sun' => 'Sunday');
$value = ($days{$value})? $days{$value} :
"Not a day of the week!";
where the hash takes the place of the bulk of the switch statement, and the only case left for '? :' is the one where '$value' is not a day of the week.
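Wrapped up as a subroutine, the hash version might look like the sketch below. The name long_day_name() is our own invention, and we use 'exists' for the test so that the lookup does not depend on the stored value being true:

```perl
use strict;
use warnings;

my %days = (
    Mon => 'Monday',   Tue => 'Tuesday', Wed => 'Wednesday',
    Thu => 'Thursday', Fri => 'Friday',  Sat => 'Saturday',
    Sun => 'Sunday',
);

# Hypothetical helper: expand a three-letter day abbreviation.
sub long_day_name {
    my ($abbr) = @_;
    return exists $days{$abbr} ? $days{$abbr} : 'Not a day of the week!';
}

print long_day_name('Tue'), "\n";    # prints "Tuesday"
print long_day_name('Xyz'), "\n";    # prints "Not a day of the week!"
```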
Perl File and Command Operators
The usage of File Operators and Command Operators is dealt with in some detail in both the chapter 'Variables', and the chapter 'special Perl functions'. However, here we introduce the concept of file and command operators, as well as their syntax.
Perl has a built-in interface into each operating system that it has been ported to. This interface allows Perl to interact with files on disk and also to execute a shell command directly. This is a very important concept for Perl. It is one of the reasons that Perl is so portable and powerful. We take a look at file handling and shell command execution below.
The <FILEHANDLE> File Operator
'<FD>' reads from the filehandle FD into a scalar you specify on the left hand side. If you want to read one line from the file called "fileName", just do this:
open(FD, "fileName") or die "Cannot open fileName: $!\n"; # open syntax to process a file.
$line = <FD>; # reads a line out of the file.
close(FD); # closes the file.
Reading a line at a time matters. After all, you won't have the luxury of having Terabytes of RAM! Many times the important data is on disk (or tape or CD-ROM, what have you), and Perl's job is to make reading that data as easy as possible.
The ` ` Backticks Operator
The backticks operator takes a string, interpolates it, and then executes the command as a shell command. The following puts the output from a find command into the variables @lines and $line respectively:
my @lines = `find . -print`; # executes a find operation, puts each line of output into @lines.
my $line = `find . -print`; # executes a find operation, puts the whole output into $line as one string.
Beware non-portability here, however! The command find is not available on all systems, and backticks (` `) should be avoided whenever portability is important. Use Perl builtins instead. We will discuss these builtins in chapter 11.
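As a taste of the portable route, here is a rough sketch that uses the File::Find module shipped with Perl in place of the external find command. The subroutine name list_tree() is our own, and the result is only approximately what 'find . -print' would give you:

```perl
use strict;
use warnings;
use File::Find;

# Hypothetical helper: return every file and directory under $dir,
# roughly what `find $dir -print` would list.
sub list_tree {
    my ($dir) = @_;
    my @paths;
    # File::Find calls the anonymous sub once per entry and sets
    # $File::Find::name to the entry's path.
    find(sub { push @paths, $File::Find::name }, $dir);
    return @paths;
}

print "$_\n" for list_tree('.');
```

Because nothing is handed to a shell, this does not depend on a find command being installed.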
Summary of Perl Control Structures and Operators
The above control structures and operators make Perl an extremely 'freeing' language to program in. Compared to languages like C++ or C, Perl becomes much more like a 'natural' language. In fact, you can say a statement like:
sleep until $sun eq 'up';
which in fact parses in Perl. Note however, that too much freedom can be a 'bad' thing - although they are cool, JAPH scripts are probably NOT your best examples of readable code.
Hence, the purpose of the next part of the chapter is to show some common templates of expressions in Perl, and when they are used. They should be able to handle the vast majority of your programming needs.
Examples: Common Expression Patterns in Perl
As we have said before, the good thing about Perl code is that it is infinitely flexible. And the bad thing about Perl code is that it is infinitely flexible. Hence, this section is devoted to showing some of the cleaner, more common expression patterns for Perl.
Out of the HUGE number of possible Perl expressions, there are really only a few that you should actually be using.
In fact, I would go as far as to say that the fewer expression patterns you use, the better. The fewer types you have, the easier it will be for others to understand your code, and the easier it will be to maintain.
Likewise, a limited number of expression patterns opens up the possibility for you to build tools to help manage your complexity. There is nothing cooler than writing your own tools that actually debug your programs for you, or give you a 'road map' into what is going on. If you keep your syntax minimal you'll also find it a lot easier when it comes time to build C/C++ extensions.
In other words, let your object-oriented or modular syntax do the work, not the tricks of the interpreter. We discuss object-oriented Perl techniques in the second part of this book.
Here, we examine the more commonly found expressions and our preferred method for parsing them. We call these structures patterns because they are extremely common, and you can almost use them as 'cookie cutters' to create your own, specific solutions.
Pattern 1: Arithmetic Expressions
With arithmetic expressions, use as many parentheses as you need to make the meaning clear. In practice, there are fewer arithmetic expressions in Perl than one might think, since it is not the speediest in this area!
Use the precedence table as your guideline. The following statements avoid parenthesitis by, again, functionally separating items that go together by parentheses:
$val = $variable**$exponent + log($sum - $var);
$val = ($var + $val) ** 2;
These two statements seem pretty clear, since intuitively, '**' is much more binding than '+', and the function call 'log' is much more binding than '-'. Hence there is no need for extra parentheses in the first statement, while the second needs them to force the addition to happen before the '**'. However, if you feel uncomfortable with this, you could say:
$val = ($variable**$exponent) + (log($sum-$var));
instead, although that seems overkill. But, again, it is better to have more parentheses and get the answer right than a minimal amount and get the answer wrong.
Pattern 2: 'if' Patterns with Multiple 'and'/'or' Clauses
These patterns are where one has an if condition with multiple 'and/or' clauses. Here, && and || (or their cousins 'and/or') are the focus points for the expression.
Here, you probably want to limit yourself to putting parentheses around the statements that are logically tied together. For example
if (($scalar > $scalar2 && $scalar2 > $scalar3) || $scalar3 > $scalar4)
ties the $scalar > $scalar2 and $scalar2 > $scalar3 group together. Since '&&' binds more tightly than '||', Perl would group the statement this way even without the parentheses, but they make the grouping obvious to the reader. But
if ((($scalar > $scalar2) && ($scalar2 > $scalar3)) || $scalar3 > $scalar4)
doesn't use 'economy of parentheses' and therefore becomes cluttered.
Pattern 3: Expressions in a Condition.
'Expressions in a Condition' are Perl statements inside a compare clause: '>','<','==', etc. In these cases, parentheses force precedence and add to readability; as in the following example:
if (($scalar1+$scalar2) > 5)
In this, the parentheses around $scalar1 and $scalar2 aren't strictly necessary. Since the '+' is higher in precedence than the > sign, the expression means the same without them, but still 'looks' like it might be wrong.
Hence, the parentheses are added for clarity. If the expression becomes too complicated, you can always split it out to enforce readability, as in:
my $var = $scalar1 + $scalar2;
if ($var > 5)
Pattern 4: Functions Without Any Arguments
This expression pattern is an easy one. Perl has several forms for functions, and even more for functions without arguments. All of these:
function;
function();
&function;
&function();
are legal (the bare 'function;' form only if the subroutine has been declared beforehand), but you should not use all of these forms.
Parentheses make for clarity here, too, and the '&' seems redundant. Hence,
function();
is the correct way to go here.
Pattern 5: Functions with Regular Arguments
Functions with arguments should be focal points in your code. As such, even though you don't need to put parentheses around the arguments in the function call:
function $a, $b, $c;
this doesn't 'shout' that it is a function call, either (it could even be a syntax error if the function has not been declared yet). And what happens if you say:
print function $a, $b, $c;
Does this say 'print out to the screen the evaluation of function with the arguments $a, $b, and $c', or 'print to the filehandle function $a, $b, and $c'?
Again, the parentheses here on the end of the function calls add to readability, as well as work to disambiguate. It is worthwhile to train your mind to expect that when you see the pattern 'word ()' you are seeing a function call.
function($a, $b, $c);
internal_function($a, $b, $c);
Functions such as print are so common that sometimes it is OK to drop the parentheses.
print STDERR "HERE!\n"; # printing to STDERR
However, each one of these cases (like print) should be thought out carefully. And if you decide to drop the parentheses, force yourself to always do so (except on the rare occasion when leaving the parentheses off makes the expression wrong, as in print("HERE"), exit();). Force of habit will make your code easier to read.
Pattern 6: Functions Inside a Function Call.
What about functions that are called inside of other functions? Something like:
function($arg1, internalFunction1 $arg2, internalFunction2 $arg3)
In this case, it could be ambiguous. Is the function 'internalFunction2' inside 'internalFunction1' (an argument to it) or is it a separate argument to &function? Is it:
function($arg1, internalFunction1($arg2, internalFunction2($arg3)));
or
function($arg1, internalFunction1($arg2), internalFunction2($arg3));
Hence, you are better off putting parentheses around every function call. As in:
internalFunction1($arg1, $arg2, internal_function2($arg3, $arg4));
push(@args, extract_array_values());
This will prevent many precedence mistakes.
Pattern 7: Expressions Inside a Function Call
In the case of an expression within a function call, you can drop the parentheses around the individual expressions. This is because the commas in an argument list separate the expressions just as well as parentheses would, since the comma is extremely low in precedence. Hence:
print ($scalar+length($a), $scalar * $length);
is equal to:
print (($scalar+length($a)), ($scalar * $length));
but the first form seems a bit cleaner.
Pattern 8: Expressions Which are Evaluated Inside a Function Call
There are times when an expression is evaluated from inside the function call. In other words, the internal expression is evaluated first, then the result is sent as an argument to the function. This is the case with Perl built-in functions. Something like:
chop($line = <FD>); # retrieves a line from a file descriptor FD, then
# chops the last character off of it.
could be cut up into two lines ( $line = <FD>; chop($line);) because chop modifies $line. However, it is common practice to use this form, as it saves typing and is fairly maintainable.
Doing two steps in one (or more) is a fairly common pattern in Perl; in fact, there is a term for it in Perl called chaining (which we will get to in the chapter on 'contexts'.) The important thing here is to always know what you are doing when you chain and to always use parentheses around the logically joinable parts.
Pattern 9: Temporary Copies of a Variable, With the Temporary Variable Being Manipulated.
This pattern is really a spiced up example of Pattern 8. This pattern happens so frequently that it isn't worth it to split the code up into two lines:
($tmp = $line) =~ tr{A-Z}{a-z};
# copies $line into $tmp, and then (in $tmp)
# 'translates' all A-Z chars to a-z chars
# (lower case) without touching $line.
($tmp = $number)++; # copies $number into $tmp, and then increments $tmp.
($tmp = $line) =~ s{\bWORD\b}{word}g;
# copies $line into $tmp and then (in $tmp)
# substitutes 'word' for every instance of WORD.
Here, again, we use parentheses around the parts that we are going to evaluate first. $tmp = $line happens, and then the operation (++, tr, s///) happens next.
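The point of the pattern is that the original variable is left alone. A short sketch, with sample data of our own, makes that visible:

```perl
use strict;
use warnings;

my $line   = 'Hello WORD, hello WORD';    # sample text for illustration
my $number = 41;

(my $lower = $line) =~ tr{A-Z}{a-z};          # lower-cased copy
(my $fixed = $line) =~ s{\bWORD\b}{word}g;    # copy with WORD replaced
(my $next  = $number)++;                      # incremented copy

print "$lower\n";    # prints "hello word, hello word"
print "$fixed\n";    # prints "Hello word, hello word"
print "$next\n";     # prints "42"
print "$line\n";     # prints "Hello WORD, hello WORD" (untouched)
```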
Pattern 10: Getting the Results of a Pattern Match, Function, or translate, and sticking them in a variable.
This pattern is the flip side of the last pattern. In this case, the variable gets assigned to after a manipulation is made. For example:
$count = ($line =~ tr{A-Z}{a-z}); # gets a count of the upper case characters in $line,
# and sticks it into $count
is an example of the translate operator in action. It replaces the capital letters in $line with lower case ones, doing so first, and as a side effect returns the number of characters it translated (the upper case characters in $line), which is stuck into $count. Likewise
($user, $password, $uid) = ($line =~ m{^([^:]*):([^:]*):([^:]*)});
# gets the results of a pattern match (see section
# 'regular expressions') and puts that result into
# the list ($user, $password, $uid).
is an example of the match operator, which basically matches a pattern inside $line. The match is done first, and only after the match is done are the results of that match put into the variables $user, $password, and $uid. (Each field is matched as '[^:]*', a run of non-colon characters.) Note that in each case we have taken the concept that we were dealing with and compressed it down into one statement. We could have said:
$line =~ m{^([^:]*):([^:]*):([^:]*)};
$user = $1;
$password = $2;
$uid = $3;
but why bother? It is longer and, as we shall see later on, less precise and more prone to error. Sometimes long expressions make sense in Perl, sometimes they don't.
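Here is a small runnable sketch of both halves of this pattern. The sample strings are made up, and the field pattern '[^:]*' (a run of non-colon characters) is our choice for splitting a colon-separated record:

```perl
use strict;
use warnings;

# tr returns the number of characters it translated.
my $text  = 'Hello World';
my $count = ($text =~ tr{A-Z}{a-z});    # $text is now 'hello world'

# In list context a match returns its captured groups...
my $line   = 'alice:secret:1001:rest';  # made-up passwd-style record
my @fields = ($line =~ m{^([^:]*):([^:]*):([^:]*)});

# ...and the empty list if the match fails, so nothing stale is picked up.
my @none = ('no colons here' =~ m{^([^:]*):([^:]*):([^:]*)});

print "$count upper case letters\n";       # prints "2 upper case letters"
print "fields: @fields\n";                 # prints "fields: alice secret 1001"
print "failed match gave ", scalar(@none), " values\n";    # prints "... 0 values"
```

The last line hints at why the one-statement form is less error prone: $1, $2 and $3 keep whatever an earlier successful match left in them, while the list assignment simply gets nothing.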
Pattern 11: Short Circuiting in Executing a Command.
As we saw, the following statement:
open(FD, 'filename') || die();
means 'open the file filename, and tie it to the filehandle FD. If unsuccessful, die.' This is an example of short circuiting.
In this case, parentheses are necessary around "open(FD,..)" since the '||' operator is of higher precedence than the ',' operator. If you said:
open FD, 'filename' || die();
This is equivalent to
open FD, ('filename' || die());
which is not what you want. However, this is what the or and and operators are for. They have extremely low precedence, so:
open FD, "filename" or die;
is a perfectly valid Perl sentence.
Nevertheless, it's a good idea to put parentheses around functions anyway, since they logically bind the arguments of the function to that function.
open(FD, "filename") or die "Unable to open filename!\n";
# opens a file descriptor. If it cannot, dies
even though not strictly needed, since here, the main focus of the statement is the function 'open', not the short circuiting.
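The difference between '||' and 'or' after a parenthesis-free open can be demonstrated safely inside eval blocks. This sketch assumes that the path in $missing does not exist on your system; the path itself is made up:

```perl
use strict;
use warnings;

my $missing = '/no/such/directory/no-such-file';    # assumed not to exist

# '||' binds tighter than the comma, so die is attached to the file name
# (a true string) and never runs, even though the open fails.
my $survived = eval {
    open my $fh, '<', $missing || die "never reached\n";
    1;
};

# 'or' binds looser than the comma, so it tests the result of open.
my $reported = eval {
    open my $fh, '<', $missing or die "cannot open $missing\n";
    1;
};
my $error = $@;

print "with ||: ", ($survived ? "failure went unnoticed" : "died"), "\n";
print "with or: ", ($reported ? "failure went unnoticed" : "died: $error");
```

On a system where the file is indeed missing, the first form carries on silently and only the second one dies.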
Pattern 12: Use of Conditional Operator in Assignment
The expression:
if ($condition1)
{
$value = $value1;
}
elsif ($condition2)
{
$value = $value2;
}
else
{
$value = $value3;
}
is unnecessarily verbose, and you can use the conditional operator instead:
$value = ($condition1)? $value1 :
($condition2)? $value2 :
$value3;
Here, the parentheses are not strictly necessary around the condition since the '?' is of low priority. However, in cases such as the following:
$value = ($string gt $otherString) ? $string : $otherString;
# sets $value to the highest
# lexical valued string.
@value = (@array1 > @array2) ? @array1 : @array2; # sets @value to the array with
# the greatest number of elements.
having the parentheses makes sense since again it groups what is logically associated together.
Pattern 13: Assignment with Short Circuiting
This pattern is a good way of setting a variable to the first usable value out of several candidates, all on the same line:
$variable = $ENV{'LOGDIR'} || getlog() || 'DEFAULT'; # tries $ENV{'LOGDIR'} first
# function getlog() second,
# and if both blank, sets
# to default.
It has the same effect as Pattern 12, taking the place of needless 'if then' clauses.
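One thing to keep in mind with this pattern is that '||' tests for truth, not for definedness, so a legitimate value of '0' or '' is skipped as well. A sketch, in which getlog() is a stand-in that always comes back empty and pick_logdir() is a hypothetical wrapper of ours:

```perl
use strict;
use warnings;

sub getlog { return '' }    # stand-in: pretend no log directory is configured

# Hypothetical wrapper around the defaulting chain.
sub pick_logdir {
    my ($from_env) = @_;
    return $from_env || getlog() || 'DEFAULT';
}

my $explicit = pick_logdir('/var/log/app');    # first candidate is true
my $unset    = pick_logdir(undef);             # falls through to the default
my $zero     = pick_logdir('0');               # '0' is false, so it is skipped too

print "$explicit $unset $zero\n";    # prints "/var/log/app DEFAULT DEFAULT"
```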
Pattern 14: Assignment in 'if then, while, or foreach' Constructs
Pattern 14 takes advantage of the fact that any statement inside a conditional is evaluated before the conditional actually evaluates to be true or false.
Hence, there are two steps to this pattern: 1) evaluating the statement inside the conditional, and 2) using the results of this evaluation to decide whether or not the condition is true or false.
This pattern is very common:
if (@files = getFileNames()) # get files from a function.
{ # evaluates to false if getFileNames
print "@files\n"; # returns the empty list ().
}
elsif (outOfFilenames())
{
}
Here, if 'getFileNames()' returns one or more strings to the array @files, the if will be evaluated as 'true'. If 'getFileNames' returns the empty list '()', the if will be evaluated as false. (Beware of a function that does 'return undef;': in list context that is a one-element list, which counts as true.) Either way, @files will be set.
Likewise
while(($key, $value) = each (%hash)) # assigns a $key, $value pair
{ # each time thru the expression.
} # evaluates to false if each returns
# ().
iterates through a hash, calling 'each' before evaluating whether or not each returned a (). And:
foreach $file (@files = <FD>) # get files from a file descriptor
{ # <FD>, and iterate through them, while setting '@files'.
}
both makes a list of files (in @files) and iterates through them afterwards, setting $file to each element in @files. The split in:
foreach $word (split(/,/, $list)) # split up the list by commas,
{ # use each piece of this as an element
} # in the array, and set $word to it.
likewise evaluates first, making an array of words out of a comma separated list, something like $list being equal to '1,2,3,4,5'. And finally:
while ($line = getLine()) # get a line from the function 'getLine'
{ # iterate through this function
} # until getLine() returns a false value.
repeatedly calls the function getLine(), setting the value in $line until getLine() returns a false value (undef, '', or 0).
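What the condition actually tests in the array case is the number of elements on the right-hand side of the assignment. One subtlety worth checking for yourself is a function that returns undef rather than an empty list. The three subroutines in this sketch are hypothetical stand-ins:

```perl
use strict;
use warnings;

sub no_files   { return }                       # empty list in list context
sub undef_file { return undef }                 # one-element list: (undef)
sub two_files  { return ('a.txt', 'b.txt') }

my @files;
my $empty_is_true = (@files = no_files())   ? 1 : 0;    # 0
my $undef_is_true = (@files = undef_file()) ? 1 : 0;    # 1: one element was assigned
my $two_is_true   = (@files = two_files())  ? 1 : 0;    # 1

print "$empty_is_true $undef_is_true $two_is_true\n";   # prints "0 1 1"

# The same idea with each(): the loop ends when each returns ().
my %hash  = (one => 1, two => 2);
my $total = 0;
while (my ($key, $value) = each %hash) {
    $total += $value;
}
print "$total\n";    # prints "3"
```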
Pattern 15: Using Functions and Operators to Assign Values to Internal Perl Variables
Internal Perl variables, like $_ and $@, are variables that Perl uses by default.
As such, Pattern 15 is usually just shorthand for Pattern 14, assignment inside a conditional. We tend to think that it makes things unclear, but it is also very common:
while (<FD>) # sets $_ to a line from the filehandle FD
{ # goes through each line in filehandle FD.
chop(); # chops that variable (ie: takes off
} # the last character).
This is equivalent to:
while (defined($_ = <FD>))
{
chop($_);
}
Likewise, the following will iterate through each @ARGV
for (@ARGV) # again, sets $_, iterates through
{ # each argument.
}
As it iterates, '$_' is set to each argument in turn, which you can then access via special functions or by name.
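To try the '$_' idiom without needing a file on disk, you can open a filehandle on a string. This sketch does so with sample text of our own, and uses chomp (which removes only a trailing newline) where the example above uses chop:

```perl
use strict;
use warnings;

my $text = "first\nsecond\nthird\n";    # sample data standing in for a file

open(my $fh, '<', \$text) or die "cannot open in-memory file: $!";

my @lines;
while (<$fh>) {         # shorthand for: while (defined($_ = <$fh>))
    chomp;              # with no argument, works on $_
    push @lines, $_;
}
close($fh);

my @shouted;
for (@lines) {          # $_ is set to each element in turn
    push @shouted, uc($_);
}

print "@lines\n";      # prints "first second third"
print "@shouted\n";    # prints "FIRST SECOND THIRD"
```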
Pattern 16: Iterating Through a Regular Expression
We devote a whole chapter to regular expressions. Think of this as a preview of what is to come.
In this pattern we use a regular expression as an iterator. In the following example, $line is not actually changed. Instead, each number in the scalar $line is picked out and assigned to $1 one by one:
while ($line =~ m/([0-9]+)/g) # simple regular expression that picks out
{ # all of the integer numbers in
print "$1\n"; # $line. (see section
# 'regular expressions')
}
Right now, simply note the 'g' on the end. This stands for 'global' which means in this case 'match as many times as you can'. If you had:
$line = '1 x 2 y 343 z';
as a string, this snippet of code would print out:
1
2
343
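The same loop can collect the numbers instead of printing them, and in list context the match hands back all of them at once. A brief sketch using the string from above:

```perl
use strict;
use warnings;

my $line = '1 x 2 y 343 z';

# Scalar context: each pass through the loop finds the next match.
my @found;
while ($line =~ m/([0-9]+)/g) {
    push @found, $1;
}

# List context: one match with /g returns every captured value.
my @all = ($line =~ m/([0-9]+)/g);

print "@found\n";    # prints "1 2 343"
print "@all\n";      # prints "1 2 343"
```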
Pattern 17: Filehandle Joins as an Argument to a Subroutine or Function List
The following assignment operator makes @lines all the lines in files FD, FD2, and FD3:
@lines = (<FD>, <FD2>, <FD3>);
which works by evaluating each of the filehandles in the '()', and then joining these files into one humongous list.
Summary of Perl Expression Patterns
There is nothing magic about these expression patterns. We present them here in order to help you create the most clear, maintainable code possible. Mixing and matching these patterns should give you a framework for the majority of your programming needs.
Chapter Summary
Of all the things in Perl, the precedence table which defines the order in which symbols are evaluated, is by far the most complicated. It has to be, in order to be 'natural'. If you think about it, English is full of complicated (and contradictory) rules, special cases, and so forth to which we seem to have adapted.
Perl's idiom set is extremely rich, but on the downside, it takes a while to get used to. Almost every other language has about half the precedence rules that Perl does. (At least Perl doesn't have contradictory rules like natural languages!)
So how do you learn and work with Perl's idiom set? Well, there are two things you can do:
1) Have the precedence table in front of you the whole time you are programming. Likewise, look at the perlop man page for more examples. This will get you used to the syntax in a hurry.
2) One of the best ways to learn is by imitation. You may also want to have the 'common Perl patterns' in front of you, just to get the hang of Perl syntax.
Once you get the hang of it, Perl's complexity becomes a real blessing. You find solutions to problems that would take ten times longer to write in other languages. You will have fun finding little nooks and crannies of the language, little inventive patterns that are both functional, and somewhat surprising. (I've been programming in the language for 5 years, and I still find these things!)
Anyway, you'll start to have fun. The next chapters deal with the heart of the language, and use these common expressive patterns quite frequently.
HTML conversions by Mega Space.
This page updated on October 14, 1997 by Webmaster.
Computing McGraw-Hill is an imprint of the McGraw-Hill Professional Book Group.