© 1997 The McGraw-Hill Companies, Inc. All rights reserved.
Any use of this Beta Book is subject to the rules stated in the Terms of Use.

Chapter 5: Functions and Scope

Perl is very much like other computer languages in that it has larger building blocks than the expression. In other words, you can build larger structures that can be reused in other programs. Perl supports functions, and collections of functions in packages or libraries. Perl also supports two types of scope, that is, ways of segmenting variables so that they are only visible to certain parts of the program.

Those of you who are already familiar with these concepts should keep in mind that, just like everything else in the language, Perl looks at functions and scope in a non-traditional way.

The purpose of this chapter is to provide the basic syntax, pitfalls, and principles behind effective use of functions and scope, as well as give several examples of their use inside actual code. Take a look at the perlsub and perlmod man pages for further information...

Chapter Overview

This chapter goes over the first of the two major ways that perl allows you to scale up your code - the combination of the 'function' and the 'scope'. The other way, dealing with packages, libraries, and classes will be discussed in the next part of the book. But for now, this chapter is divided into three main sections.

First, we talk about functions and how they operate; the syntax of functions, how you pass and manipulate arguments to them, how you get values out of them, common subroutine errors, and tips to avoid them, recursion, wantarray, and more.

Then we talk about the other half of the puzzle: scope. Scope is the method used by computer languages so that variables don't 'collide' - proper use of scoping methods means you need not worry about having ten different variables, all with the same name of 'get' in a single file. Perl has two different policies for scope, called lexical and dynamic, and we go over these in detail. We consider their definition in detail, and the places in your programs where each one is used.

Finally, we will go over some examples; programming recursion in perl, using references in subroutines (helps if you've read chapters 7 and 8) and finally using perl itself to settle issues of scope.

Functions

Every computer language has functions, even the simplest of the simple. Perl is no exception.

A function returns a set of outputs given a certain set of inputs. Since functions are, by nature, repeatable and reusable, they are very handy in cutting down the programming effort: the code to perform a certain task can be used in many different places in a program, but only has to be written once.

Functions are often called subroutines, because they do their work out of the main line of program logic. Since they are out of the way, functions allow you to make the main logic flow of a program more clear.

Perl has some special ways of working with functions. For example, Perl has no strict typing which forces you to write a function a certain way. There is no structure in Perl like this C function:

int functionName(int argument1, char *argument2);

In other words, Perl does not require 'types' or 'named parameters' for the function. Instead, a Perl function often looks like:

sub functionName

{

my ($argument1, $argument2) = @_;

# ... do stuff.

return($function);

}

In other words, the function is said to be freely defined or free flowing. As such, the return value for a function can be interpreted in many different ways depending on the context in which that function is called.

Perl also has built-in functions which are functions that are compiled with the language. These are described in Chapter 12, Built-in Functions and Variables in Perl.

Syntax

The general form of a Perl function is:

sub subroutineName

{

my (@argsToSubroutine) = @_; # Not essential, shows

# the way that arguments are passed to subroutines.

 

&doStuff; # Here, do subroutine. Can be a list of commands.

return(@returnValues); # again, the return isn't

# necessary. You simply can, by default, return

# the value of the last expression evaluated.

}

There are six ways to call a function:

$return = &subroutineName(@args_to_subroutine); # Usage 1;

$return = subroutineName(@args_to_subroutine); # Usage 2;

@return = subroutineName(); # Usage 3;

@return = &subroutineName(); # Usage 4;

@return = subroutineName; # Usage 5;

@return = subroutineName @args_to_subroutine; # Usage 6;

Usages 2 and 3 should be preferred since they are the most explicit, yet the least cluttered. Usages 5 and 6 omit both the '&' and the '()', so they only work when the subroutine has already been defined or declared before the call.

The '&' or '()' tells Perl that the name is a subroutine call. Anything that can go in a list can be passed to a subroutine, and passing '()' is equivalent to passing an empty list. '$return' and '@return' receive the return values from the subroutine. Figure 5.1 shows how the two items are related:

Figure 5.1: Relationship between calling a subroutine and its return value.

The arguments to the subroutine are passed into the subroutine by the incoming array '@args_to_subroutine'. The results are passed back to the left hand side of the assignment by 'return(@returnValues);'.
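As a minimal sketch of this round trip (the subroutine name double_all and its data are made up for illustration), the following passes a list in through @_ and receives a list back on the left hand side:

```perl
use strict;
use warnings;

# Hypothetical helper: doubles every argument it receives.
sub double_all
{
    my (@args) = @_;                  # copy the incoming argument list
    return(map { $_ * 2 } @args);     # hand a list back to the caller
}

my @numbers = (1, 2, 3);
my @doubled = double_all(@numbers);   # the array receives the whole list
my ($first) = double_all(@numbers);   # the parentheses keep only element 0

print "@doubled\n";                   # prints "2 4 6"
print "$first\n";                     # prints "2"
```

Whatever sits on the left of the assignment decides how much of the returned list is kept.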

Let's take a closer look at these two elements, the argument stack, and the return stack.

The Argument Stack

The argument stack is denoted by the special variable @_, and is local to the subroutine being called. Arguments are put into @_ each time you call a subroutine. @_ works much like the stack in C and C++.

Since the argument stack is an array, there is no limit to the number of arguments that can be passed to a function, each argument being a scalar. Let's take a moment and play with this concept. If you start with a function that looks like:

sub subroutineName

{

my (@args) = @_;

}

you could just as easily call this function with:

subroutineName(1);

or with:

subroutineName(1,2,3,'alpha','bravo','charlie');

or:

subroutineName(@argumentsToArray);

or even:

subroutineName(%hashValue);

This last example is, technically, allowable although it may not be advised: remember, hashes go into subroutines with no idea of order! It may be useful if you want to use the hash in the subroutine, however. But even here, you may want to use a reference to a hash instead (see references, chapter 9, and passing arguments by references, later in this chapter).

Hence, the length of argument stack does not matter when calling a subroutine. Perl will dutifully make @_ whatever is passed to your subroutine, and it is up to the subroutine to decide whether or not an argument is to be paid attention to.
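To see what Perl actually puts into @_ for each of these calls, here is a small sketch. The helper count_args and the sample data are assumptions for illustration only:

```perl
use strict;
use warnings;

# Hypothetical helper: reports how many scalars arrived in @_.
sub count_args
{
    return(scalar(@_));
}

my @list = ('alpha', 'bravo', 'charlie');
my %hash = (one => 1, two => 2);

print count_args(1), "\n";             # prints "1"
print count_args(1, 2, 3), "\n";       # prints "3"
print count_args(@list), "\n";         # prints "3"
print count_args(%hash), "\n";         # prints "4": two keys plus two values
print count_args(@list, %hash), "\n";  # prints "7": everything is flattened together
```

Note that the hash arrives as a flat list of keys and values, in no guaranteed order.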

Now, the length of argument stack does matter within the subroutine. Let's start with the following function call:

$value = add(1,2,3);

add is defined as:

sub add

{

my ($value1, $value2) = @_;

return($value1 + $value2);

}

You are setting yourself up for a bug search here. add ignores the third element in the list. To add insult to injury, the function does not tell you that the third element is being ignored.

Remember, it is up to you to check the validity of the passed-in arguments. Perl will not do it for you.

The variable nature of Perl's argument stack can be used to your advantage. If you want a more robust subroutine, try this:

1 sub add

2 {

3 my (@values) = @_;

4 my $return;

5 foreach $value (@values)

6 {

7 $return += $value;

8 }

9 return($return);

10 }

Here, you have not just gotten rid of your bug; you have actually made the subroutine more powerful. By going through each argument and adding it to $return (in lines 5-8), you have made a general purpose addition function, one that can take any number of arguments, not just two.

There are a couple of points that you need to know about the argument stack. These are discussed below.

Manipulating the Argument Stack

There are several ways of manipulating the argument stack. First, since the @_ array is like any other array, you can access each element by subscript. Therefore, $_[1] gives the second argument of the function (remember, array subscripts in Perl start at zero!).

Second, you can also access @_ through the shift and pop functions, which have some special 'magic' attached to them when used inside a subroutine.

mySubroutine(1,2,3,4);

 

sub mySubroutine

{

$firstArgument = shift; # accesses the first element on the stack. $firstArgument becomes ‘1’, @_ becomes (2,3,4).

$secondArgument = shift; # accesses the second element on the stack. $secondArgument becomes ‘2’, @_ becomes (3,4);

$lastArgument = pop; # accesses the last element on the stack. $lastArgument becomes ‘4’, @_ becomes (3).

}

In all the above examples, shift and pop are being used here with shorthand, and are equivalent to 'shift(@_)', and 'pop(@_)' respectively. Each time you call them, they take either the first argument (shift) or the last argument (pop) off the stack. They then store the argument in the variable on the left hand side, shortening '@_' as you go.
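The following sketch shows the same idea with lexical variables. The helper name first_last_rest is made up for illustration:

```perl
use strict;
use warnings;

# Hypothetical helper: peels arguments off both ends of @_ and
# reports what is left in the middle.
sub first_last_rest
{
    my $first = shift;      # same as shift(@_)
    my $last  = pop;        # same as pop(@_)
    my @rest  = @_;         # whatever remains
    return("first=$first last=$last rest=@rest");
}

print first_last_rest(1, 2, 3, 4), "\n";   # prints "first=1 last=4 rest=2 3"
```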

Following are some more examples of manipulating @_:

sub subroutine

{

my (@array) = @_; # simple. @array becomes the argument list.

}

 

sub sub # No conflict here! even though subroutine name is sub

{

my $firstarg = shift; # shift and pop automatically access the @_ array.

my $secondarg = shift; # $firstarg and $secondarg take the first two arguments, and @_ is shortened.

my $lastarg = pop; # $lastarg becomes the last argument, and @_ is shortened again.

my (@restofargs) = @_; # @restofargs gets whatever is left.

 

my $return;

$return = 'returnValue';# returns scalar returnValue to main return

}

Local @_ Stacks

Another important thing to remember about @_ is that it is localized. This means that if you call a function inside a function, you needn't worry about the first function impacting the second.

Therefore, you can do things like this:

a(1,2,3);

sub a

{

b(1,2,3,4);

print "@_\n";

}

Here, a's '@_' becomes '1,2,3', and b's '@_' becomes 1,2,3,4. Hence you can use @_ without worry here. The example prints "1 2 3".
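Here is a runnable sketch of the same point, with both subroutines defined. The names inner and outer and the @log array are assumptions for illustration:

```perl
use strict;
use warnings;

my @log;    # hypothetical record of what each subroutine saw

sub inner
{
    push(@log, "inner saw: @_");
    shift;                           # shortens only inner's own @_
}

sub outer
{
    inner(@_, 'extra');              # inner gets its own, longer @_
    push(@log, "outer saw: @_");     # outer's @_ still has three elements
}

outer(1, 2, 3);
print "$_\n" foreach @log;
# prints:
#   inner saw: 1 2 3 extra
#   outer saw: 1 2 3
```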

This flexibility allows you to do recursion in Perl, in which a subroutine is defined in terms of itself. Recursion is very handy for parsing through text and listing out permutations of strings. For example, the following subroutine will call itself eleven times, and then exit:

simple_recursion(0);

sub simple_recursion

{

my ($number) = @_;

return() if ($number > 10);

simple_recursion($number + 1);

}

Perl keeps track of the twelve separate argument stacks (one per call) necessary in order to complete this task, evaluating the above as:

simple_recursion(0)

calls simple_recursion(1);

calls simple_recursion(2);

...

calls simple_recursion(11);

returns '' to simple_recursion(10);

returns '' to simple_recursion(9);

...

returns '' to the main program;

which, as we shall see, lets you tackle complicated problems far more simply than would otherwise be possible.
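For a recursion that actually computes something, consider this sketch of a factorial. The subroutine name factorial is made up for illustration; each call gets its own @_ and its own $number:

```perl
use strict;
use warnings;

# Hypothetical example: n! defined in terms of (n - 1)!.
sub factorial
{
    my ($number) = @_;
    return(1) if ($number <= 1);                # base case stops the recursion
    return($number * factorial($number - 1));   # recursive case
}

print factorial(5), "\n";   # prints "120"
```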

Summary of the Argument stack (@_)

In short, having only one mechanism (@_) for passing arguments to subroutines is incredibly powerful. This concept lets you:

Make incredibly broad subroutines. These subroutines perform a general task, no matter how few or many arguments there are. The 'add' function up above, for example, simply adds numbers; not two numbers, or three, or four, but as many numbers as is passed in.

Make quick changes to subroutines, without having to formally declare those changes.

All this functionality while still giving you the flexibility to use such advanced techniques as recursion.

The Return Stack

The argument stack is the main way of having Perl pass arguments to subroutines. (In fact, aside from the sort built-in function, it is the only way!)

Its opposite is the return stack, which is Perl's method of returning values to the calling subroutine.

As you may recall, the usual method of calling a subroutine is:

@values = subroutine($arguments);

In this situation, the return stack of the function subroutine gets copied into the variable @values. And, like the argument stack, the return stack is also in array form.

There are two major ways for a subroutine to return values to the main (calling) routine:

1) by use of the special function return.

2) by default, i.e.: looking at the last expression in the subroutine.

Let's look at both these points in detail.

Return Keyword

The return keyword allows you to immediately cut short a subroutine, returning the values you give it to the caller. Hence, if you say something like:

sub dbroutine

{

my (@argument) = @_;

return(1);

return(2);

}

Then the 'return(2)' will never be called, since the return(1) has already returned the value '1' to the caller.

For example, let's write a simple routine to compare two dates. The dates are in the form "MM-DD-YYYY" or "MM/DD/YYYY". The function returns negative if the first date is earlier than the second, 0 if the first date is equal, and positive if the first date is greater than the second:

sub datecompare

{

my ($date1, $date2) = @_;

my ($month1, $day1, $year1) = split(m"[-/]", $date1); # splits by - or / into three elements

my ($month2, $day2, $year2) = split(m"[-/]", $date2); # does same for date2 -- see split

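return(-1) if ($year1 < $year2);

return(1) if ($year1 > $year2); # years differ: no need to look at months

return(-1) if ($month1 < $month2);

return(1) if ($month1 > $month2); # months differ: no need to look at days

return(-1) if ($day1 < $day2);

return(1) if ($day1 > $day2);

return(0); # everything matched, so the dates are equal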

}

This translates into 'first compare the year, then compare the month then compare the day.' If the first year is less than the second, then we need not go further, and so on.

In short, each return immediately goes back to the place where the function was called, so that we need not evaluate any further. This simplifies the logic, and makes it possible for us to write a rather convoluted bunch of if conditions on several short lines, rather than 'jamming' it together into a complicated if.

Now we could use our date compare function as in:

if (datecompare("11/30/1995", "12/11/1996") < 0)

{

print "11/30/1995 is earlier than 12/11/1996!\n";

}

to get a freeform comparison of dates. Let's extend this function to return the number of years, months, and days difference between two arbitrary dates (let's assume for simplicity 30 days to the month):

sub datediff

{

my ($date1, $date2) = @_;

my ($month1, $day1, $year1) = split(m"[-/]", $date1);

my ($month2, $day2, $year2) = split(m"[-/]", $date2);

my $days1 = 360 * $year1 + 30 * $month1 + $day1; # 12 months of 30 days each

my $days2 = 360 * $year2 + 30 * $month2 + $day2;

my $daysdiff = $days2 - $days1;

return(int($daysdiff/360), (int($daysdiff/30))%12, $daysdiff%30);

}

Here we calculate the number of days each date has, subtract them from each other, and then return the:

number of years in element 0 (int($daysdiff/360)),

months in element 1 (between 0 and 11)

and days in element 2 (between 0 and 29)

It is up to the place where datediff is called to assure that these three elements are used correctly.
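As a sketch of what using the elements correctly means on the calling side, the stand-in below returns three fixed values. The subroutine three_values and its numbers are assumptions for illustration, not output from datediff:

```perl
use strict;
use warnings;

# Hypothetical stand-in for a subroutine that returns three values.
sub three_values
{
    return(18, 0, 28);    # pretend these are years, months, days
}

my ($years, $months, $days)         = three_values();   # one variable per element
my ($onlyYears)                     = three_values();   # extra elements are discarded
my ($first, $second, $third, $more) = three_values();   # $more is left undefined

print "$years $months $days\n";                         # prints "18 0 28"
print "$onlyYears\n";                                   # prints "18"
print defined($more) ? "defined" : "undefined", "\n";   # prints "undefined"
```

Perl matches returned elements to variables by position only, so the caller has to know the order.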

Default Return

You might get tired of typing 'return' all of the time to indicate that you are returning a value to the main program. Enough Perl programmers were tired enough that it was decided that return was 'not a necessary thing', and could be ignored, or made implicit. Therefore, as a shorthand, you will see a lot of code doing the following:

sub subName

{

my (@arguments) = @_;

my $return;

$return;

}

Here the last statement in the subroutine is a scalar. You might also see:

sub subName

{

my (@arguments) = @_;

my @argsToReturn;

@argsToReturn;

}

in which the last statement in the subroutine is an array. You might even see:

sub subName

{

my (@arguments) = @_;

my @argsToReturn;

@argsToReturn = (1..10);

}

in which the last statement in the subroutine is actually an assignment. It assigns a list to the array @argsToReturn, and then @argsToReturn is returned to the stack.

In each case, it is immaterial whether the last statement is a hash, array, or scalar. In the absence of a return, the value of the last statement evaluated becomes the return stack, that is, the value handed back to the caller.
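A short runnable sketch of the default return follows. The subroutine names greeting and squares are made up for illustration:

```perl
use strict;
use warnings;

# Hypothetical helpers: neither uses the return keyword.
sub greeting
{
    my ($name) = @_;
    "Hello, $name!";            # last expression evaluated: a scalar
}

sub squares
{
    my (@numbers) = @_;
    map { $_ * $_ } @numbers;   # last expression evaluated: a list
}

print greeting('world'), "\n";    # prints "Hello, world!"
my @result = squares(1, 2, 3);
print "@result\n";                # prints "1 4 9"
```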

Using this logic, the following example is a simple way to return all the lines in a file, sorted alphabetically:

use FileHandle; # needed for FileHandle->new

sub sortedLinesInFile

{

my ($file) = @_;

my $fh = FileHandle->new($file) or die "Couldn't open $file: $!\n";

my @lines = sort(<$fh>);

}

@lines contains the sorted lines from the file $file (which is passed in as an argument). Being the last statement in the subroutine, this automatically becomes the return stack. This subroutine could be called as so:

my @fileLines = sortedLinesInFile("my_file");

to get, after execution, all the sorted lines in the variable @fileLines.

Note that it does not matter whether or not it is the last statement in the subroutine positionally. The return stack is the last statement evaluated. For example, if you have a subroutine that is one giant if clause, as in:

sub betweenLowerGreater

{

my ($firstValue, $secondValue, $compare) = @_;

if (($firstValue > $compare) && ($secondValue > $compare))

{

"less than";

}

elsif ((($firstValue > $compare) && ($secondValue < $compare)) || (($firstValue < $compare) && ($secondValue > $compare)))

{

"in between";

}

elsif (($firstValue == $compare) || ($secondValue == $compare))

{

"equal";

}

else

{

"greater than";

}

}

This will do what you expect, namely, return whether or not the '$compare' element is between, greater than, less than, or equal to the two other elements, because the last statement evaluated is either the 'less than', 'in between', 'equal', or 'greater than'.

'wantarray':

Now let's suppose that we want to improve on the date compare function that we were working with earlier. Remember that there were two separate incarnations of it, one that returned a scalar which indicated whether or not the first date was earlier or later than the second:

$earlier_or_later = datecompare('11/11/1996', '12/11/1996');

The other incarnation of the routine returned the number of years, months, and days the two dates were apart:

($years, $months, $days) = datediff('11/11/1996','12/11/1996');

These are basically the same function. If we wanted to use 'datediff' to implement 'datecompare', we could say:

my ($years, $months, $days) = datediff('11/11/1996','12/11/1996');

$earlier_or_later = ($years < 0 || $months < 0 || $days < 0)? -1 :

($years ==0 && $months == 0 && $days == 0)? 0 :

1;

because a negative difference in years, months, or days means that one date falls before the other, and so on. However, we cannot do something like:

if (datediff('11/11/1996','12/11/1996') < 0)

{

# do something

}

because datediff returns an array with years, months and days difference, and not a scalar. It would be great to have a function do 'double duty', able to return a scalar and an array.

The function wantarray is Perl's way to do double duty. wantarray senses whether or not a function is being used in a context that requires an array, or one that requires a scalar. From that information, the function can decide what to return. It is used like:

sub subName

{

my (@arguments) = @_;

wantarray() ? doSomething() : doSomethingElse();

}

If wantarray evaluates to true, this means that the subroutine has been called like:

@array = subroutine();

in which the caller is expecting an array. On the other hand, if wantarray evaluates to false, this means that the subroutine has been called like:

$scalar = subroutine();

Which can be represented pictorially as in figure 5.2:

Figure 5.2: 'wantarray', contexts, and usage.

With our knowledge of wantarray, let's rewrite the datediff routine to handle both returning a scalar or an array.

If called in scalar context, the return value will indicate if the first date is less than the second date. If the function is in array context, it will return an array containing the difference in years, months and days.

sub datediff

{

my ($date1, $date2) = @_;

my ($month1, $day1, $year1) = split(m"[-/]", $date1);

my ($month2, $day2, $year2) = split(m"[-/]", $date2);

my $days1 = 360 * $year1 + 30 * $month1 + $day1; # 12 months of 30 days each

my $days2 = 360 * $year2 + 30 * $month2 + $day2;

my $daysdiff = $days2 - $days1;

return wantarray() ?

(int($daysdiff/360), (int($daysdiff/30))%12, $daysdiff%30) :

($days1 <=> $days2); # numeric comparison: -1, 0, or 1

}

Now, if we say something like:

my ($years, $months, $days) = datediff("12/13/1966", "1/11/1985");

wantarray 'senses' that we are calling datediff from a context that needs an array, and therefore evaluates as true, and returns a three element array. If we say:

my $comparison = datediff("12/13/1966","1/11/1985");

instead, then this returns either -1, 0, or 1 depending on whether or not the first date is less than, equal to, or greater than the second.
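The same double duty can be seen in a smaller, self-contained sketch. The subroutine range_of is a made-up example, not part of the date routines:

```perl
use strict;
use warnings;

# Hypothetical helper: in list context returns (smallest, largest);
# in scalar context returns the spread between them.
sub range_of
{
    my (@numbers) = sort { $a <=> $b } @_;
    my ($low, $high) = ($numbers[0], $numbers[-1]);
    return wantarray() ? ($low, $high) : $high - $low;
}

my ($smallest, $largest) = range_of(7, 3, 10);   # list context
my $spread               = range_of(7, 3, 10);   # scalar context

print "$smallest $largest\n";   # prints "3 10"
print "$spread\n";              # prints "7"
```

The caller never mentions wantarray; the shape of the left hand side alone selects the answer.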

Passing Multiple Arrays or Hashes To Functions.

So far we have discussed passing scalars and arrays to a subroutine. You can also pass any other data structures that you desire to a subroutine. These arguments are passed in LIST context, however, so you must remember the cardinal rule of lists here. Lists are mangled into one, giant list, losing any concept of location within the list. In other words, separate elements are joined into one long stream of data.

Here are some bad ideas for function calls. In these function calls, @argument1, @argument2, %argument1, and %argument2 all lose their identity inside the function itself. In other words:

WrongFunction(@argument1, @argument2); # bad idea!

WrongFunction(%argument1, %argument2); # bad idea again!

WrongFunction(@argument1, $scalararg2); # still bad!

If you do this type of thing, it will bite you. There is only ONE array that is important here, from the point of view of the function: @_. This array contains both @argument1 and @argument2. This means that when you actually run the function:

sub WrongFunction

{

my (@argument1, @emptyargument) = @_; # does not work @argument1 gets all the values, @emptyargument gets none.

}

@argument1 is ‘greedy’, and it takes all the arguments passed in to the function into itself. Something like what you see in Figure 5.3:

Figure 5.3: List cramming.

There is no pointer that tells @_ that "the @argument1 array ends here" or "the @argument2 array starts here." Therefore, @emptyargument gets nothing passed into it from the calling routine.
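A quick way to convince yourself of this is to count what each array receives. The helper two_arrays and its data are assumptions for illustration:

```perl
use strict;
use warnings;

# Hypothetical helper: tries (and fails) to receive two separate arrays.
sub two_arrays
{
    my (@first, @second) = @_;                  # @first swallows everything
    return(scalar(@first), scalar(@second));
}

my @letters = ('a', 'b');
my @numbers = (1, 2, 3);

my ($firstCount, $secondCount) = two_arrays(@letters, @numbers);
print "$firstCount $secondCount\n";   # prints "5 0"
```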

Likewise, hashes lose their concept of key/value pairs when passed by value, and they need to be 'hashified' when they get copied out of the stack:

&hashargumentFunction(%hash);

 

sub hashargumentFunction

{

my (%argument1) = @_; # '%hash' gets transmuted to @_ which

# becomes an array. Array @_ gets copied back to hash %argument1!

}

This 'bottleneck' of having only one array (@_) for the argument stack causes three shortcomings in passing hashes or arrays when constructing subroutines. These are:

1) Only one array or hash can be passed by value to a subroutine and keep its identity.

2) If you want to name your arguments, you are forced to do a copy of the data structure into the form that you want (such as 'my (@arrayName) = @_;').

3) If you want to write back to the incoming arguments to the function, you need to directly access the @_ variable itself.

All of these are a bit of a nuisance and a pain, especially point number 1.

There are several cases in which you might want to pass more complicated structures around. These might be when you wish to pass an object, or a reference to a "glob" of data that you get from a database. In these cases, you cannot pass multiple arrays or hashes, again, because of that limitation given by '@_'.

To get around this, you will have to pass references, instead. The following is an alternate form (that works) to pass two arrays to a function. (We will get to the syntax more when we come to references):

&referenceArrayFunction(\@array1, \@array2);

Or two hashes:

&referenceHashFunction(\%hash1, \%hash2);

Or objects:

my $objectName = new Object();

&reference($objectName);

You could, for example, then access the arrays in referenceArrayFunction as follows:

sub referenceArrayFunction

{

my ($array1, $array2) = @_; # array references.

print "printing out array1 @$array1\n";

print "printing out array2 @$array2\n";

}

where @$array1 dereferences the reference passed into the subroutine, to get back the actual values in the array. But this is just a taste of things to come. We shall talk a lot about these concepts in chapters 7 and 8.
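As a slightly fuller sketch of passing by reference, the following keeps two arrays and two hashes apart. The helper names array_sizes and merge_hashes, and the sample data, are made up for illustration:

```perl
use strict;
use warnings;

# Hypothetical helper: receives two array references and keeps them apart.
sub array_sizes
{
    my ($array1, $array2) = @_;
    return(scalar(@$array1), scalar(@$array2));
}

# Hypothetical helper: receives two hash references and merges them.
sub merge_hashes
{
    my ($hash1, $hash2) = @_;
    my %merged = (%$hash1, %$hash2);   # keys from the second hash win
    return(\%merged);
}

my @letters = ('a', 'b');
my @numbers = (1, 2, 3);
my ($size1, $size2) = array_sizes(\@letters, \@numbers);
print "$size1 $size2\n";                      # prints "2 3"

my %defaults = (color => 'red', size => 'small');
my %override = (size => 'large');
my $merged = merge_hashes(\%defaults, \%override);
print "$merged->{color} $merged->{size}\n";   # prints "red large"
```

Because each reference is a single scalar, @_ holds exactly two elements in both calls and nothing is flattened.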

Perl function caveats

As we have seen over and over, Perl has a looser interpretation of how functions are composed (and of functions in general) than do most other languages.

There are no 'rules set in stone' about the number of arguments passed in, how many arguments should be returned, nor even the type of arguments or return values that a function should return. With this expressive freedom comes a cost, a cost that you should bear in mind. Proper contemplation of these caveats could save you hours of debugging. Following are some of the major things of which to be wary.

Caveat #1: Error Checking.

Perl has the philosophy that the programmer should worry about matching up the arguments in the function call to the subroutine.

This was hinted at above. If you have a function definition that looks like:

sub mySub

{

my ($argument1, $argument2) = @_;

}

and you then call this function in this manner:

mySub($argument1, $argument2, $argument3);

you may be surprised when argument3 drops off the edge of your subroutine into nothingness.

There are three ways to cope with this feature/bug:

1) Generalize the subroutine (make it broader).

2) Put error checking into the subroutine itself.

3) Use the '-w' flag to capture errors.

We shall take each of these in turn.

Generalizing Subroutines

Suppose that you have the following subroutine, which returns the size of a file that you pass in, or returns zero (if the file is a directory).

sub filesize

{

my ($file) = @_;

(-f $file) ? -s $file: 0;

}

This is a situation in which you are probably better off making this subroutine generalized so that it takes an unlimited number of arguments; so that it returns the combined size of all the files passed to it:

sub filesize

{

my (@files) = @_;

my ($file, $size);

foreach $file (@files)

{

$size += (-f $file) ? -s $file : 0;

}

$size;

}

By making the function more generalized, there is no need to have an exact count of the arguments being passed in. The function can handle any number of arguments. You don't have to think about the interface, and how many arguments it takes.

Putting Error Checking in Subroutines

Whether or not it makes sense to generalize the function, it does not hurt to put error checking in the subroutine. Or perhaps even build in a 'usage' for your subroutines, where the subroutine figures out how many values were passed into it. For example, you might want to put a check in the 'add' subroutine up above, to make sure each of the arguments is a number:

1 sub add

2 {

3 my (@numbers) = @_;

4 my (@nonNumbers, $return, $number);

5 if (!@numbers) { print "Usage: add(\@numbers)\n"; }

6 if (@nonNumbers = grep($_ == 0 && $_ ne '0', @numbers))

7 {

8 print "Warning! You passed the following non numbers to add! @nonNumbers\n";

9 }

10 foreach $number (@numbers) { $return += $number; }

11 $return;

12 }

Here, the usage statement is in line 5, where we tell the user of our function exactly how to call the subroutine. The meat of the error checking is in the if statement in lines 6-9. Note that the grep statement in line 6 is simply a fancy way to determine whether or not a scalar is a number.

More to the point, it checks each element in @numbers to see if the element evaluates to zero. If it does, chances are pretty good that it is a string (since strings evaluate to zero in an '=='). Just to make sure, the function compares that scalar with the string zero. This works 99.9% of the time, but you can fool it by passing in, for example, the string '0.00'.

You may want to consider always using '-w', which will always warn you if something is a non-number. However, it is hard to enforce everybody using the '-w' switch if they use your code. See '-w' below.

Another situation that calls for the use of error checking within a function is to make sure that only two arguments are passed to a function. It doesn't make sense, after all, to pass three arguments to a datediff function:

sub datediff

{

my ($date1, $date2) = @_;

warn "You need to pass two arguments to datediff!\n" if (@_ != 2);

# ...

}

This code manually checks how many arguments the user passes in to the function. You can get pretty sophisticated with this warning technique. For example, you could make it so that a user, passing in a special flag, could get the usage of the subroutine:

sub datediff

{

my ($date1, $date2) = @_;

warn "Usage: datediff('MM/DD/YYYY','MM/DD/YYYY')\n" if ($_[0] eq 'usage');

# rest of subroutine...

}

where, if the user said datediff('usage'), it would print out a special message on how the subroutine is used.
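If a warning is too gentle, a stricter variant is to refuse to run at all. The sketch below uses die instead of warn, which is a swapped-in technique rather than the approach shown above; the subroutine checked_subtract is made up for illustration:

```perl
use strict;
use warnings;

# Hypothetical helper: refuses to run unless it gets exactly two arguments.
sub checked_subtract
{
    die "Usage: checked_subtract(NUMBER, NUMBER)\n" if (@_ != 2);
    my ($left, $right) = @_;
    return($left - $right);
}

print checked_subtract(10, 4), "\n";    # prints "6"

# eval traps the die so that the script can keep running.
my $result = eval { checked_subtract(1, 2, 3) };
print "Caught: $@" if ($@);             # prints "Caught: Usage: checked_subtract(NUMBER, NUMBER)"
```

Whether to warn or die is a design choice: warn keeps old callers working, while die makes a bad call impossible to overlook.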

Using the '-w' flag

Let's take another look at the add function from above, line by line:

1 sub add

2 {

3 my (@numbers) = @_;

4 my (@nonNumbers, $return, $number);

5 if (@nonNumbers = grep($_ == 0 && $_ ne '0', @numbers))

6 {

7 print "Warning! You passed the following non numbers to add! @nonNumbers\n";

8 }

9 foreach $number (@numbers) { $return += $number; }

10 $return;

11 }

Perl provides you with a bundled package of warnings (which we shall talk about extensively in this book) that are triggered if you run the script with '-w'. If you use '-w' in your program, you can take out lines 4-8, and instead let Perl do the warning for you:

1 #!/usr/local/bin/perl5 -w

2 add('apples', 'bananas');

3 sub add

4 {

5 my (@numbers) = @_;

6 foreach $number (@numbers) { $return += $number; }

7 $return;

8 }

prompt% perl -w add.p

Argument "apples" isn't numeric in addition (+) at add.p line 6.

Argument "bananas" isn't numeric in addition (+) at add.p line 6.

Here, if you put '-w' on the command line with which you run Perl, or put it in the interpreter line '#!/usr/local/bin/perl5 -w', or even set the variable $^W, Perl itself provides detailed warnings. We shall talk more about this in 'debugging Perl'.

The more checks you put in your code like this, the happier the people who use your code will be. This 'user friendliness' makes all of the difference in how much your code is utilized. Perl lends itself to adding this feedback.

More to the point, it is up to you to decide how much user-friendliness, warnings, and so forth to put into your code. The language doesn't enforce this, you do. This is a tenet that we will see time and time again, and it is a philosophy worth getting used to with Perl.

Caveat #2: Passing by Reference and Passing by Value.

Another behavior that you should be aware of is that Perl never actually makes a copy of the arguments it passes to a function. It never really uses 'pass by value' in its functions. Instead, @_ is an alias for the list that is passed into the function. This code:

$scalar1 = 5;

&function($scalar1, $scalar2);

print "Scalar equals $scalar1\n";

 

sub function

{

$_[0] = 1;

}

actually overrides $scalar1, and prints out "Scalar equals 1".

Likewise, if you try to pass in a read-only element into a function:

&functionWithReadOnlyElements(3.1415925);

 

sub functionWithReadOnlyElements

{

$_[0] = 0; # trying to modify what was passed in.

# Since it is a read only value

# (3.1415925 is a number) this gives an error.

}

This gives the error:

'Modification of a read-only value attempted at <file> line <lineNo>.'

since you are trying to assign a new value directly to a constant. This is akin to the statement:

3.1415925 = 0;

which is, obviously, absurd.
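The difference between writing through @_ and working on a copy can be sketched as follows. The subroutine names double_in_place and double_copy are made up for illustration:

```perl
use strict;
use warnings;

# Hypothetical helper: writes through @_ and so changes the caller's variable.
sub double_in_place
{
    $_[0] = $_[0] * 2;
}

# Hypothetical helper: copies the argument first, leaving the caller's variable alone.
sub double_copy
{
    my ($number) = @_;
    $number = $number * 2;
    return($number);
}

my $value = 5;
my $copy  = double_copy($value);
print "$value $copy\n";     # prints "5 10"

double_in_place($value);
print "$value\n";           # prints "10"
```

Copying @_ into 'my' variables at the top of a subroutine is therefore the simplest protection against changing the caller's data by accident.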

Summary of Caveats:

Knowing these caveats can help you save hours of debugging effort:

1) You are responsible for doing error checking for subroutine calls. Perl will not help you. This gives you the freedom to do as much or as little verification as you want. (Sort of like going along with the UNIX philosophy of 'giving yourself enough rope to hang yourself'.) If you make your subroutines fairly user-friendly though, you will do yourself a great favor.

2) You would be wise to use -w in your programs. It helps you track down errors such as the number of return values not equaling the number of variables on the left hand side of the assignment:

'($value1, $value2, $value3) = subroutine();'

where subroutine is defined as

sub subroutine

{

return($value1, $value2);

}

This error is extremely difficult to find by yourself. $value3 simply becomes undefined. With -w, Perl at least warns you about an uninitialized value the first time you use $value3, which points you toward such bugs.

3) Perl really never passes by value. It passes by reference instead. Hence, if you change the elements of the @_ array, you change the values that you pass in.

4) In addition, be wary that you can only pass one array or one hash into a subroutine and hope to keep that array or hash's identity. If you pass more than one, Perl will flatten the two variables into one long list. To pass more than one array or hash, use references instead.
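Caveat 4 is easy to see in a few lines. This is only a sketch; countArgs and sizes are hypothetical helper names, not part of any library:

```perl
use strict;
use warnings;

# Counts how many arguments arrive in @_.
sub countArgs { return scalar(@_); }

# Takes two array references and reports the size of each array.
sub sizes
{
    my ($first, $second) = @_;
    return (scalar(@$first), scalar(@$second));
}

my @one = (1, 2, 3);
my @two = (4, 5);

my $flat = countArgs(@one, @two);            # both arrays merge into one list
my ($size1, $size2) = sizes(\@one, \@two);   # each array keeps its identity

print "flattened: $flat arguments\n";        # flattened: 5 arguments
print "by reference: $size1 and $size2\n";   # by reference: 3 and 2
```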

Summary of Functions

All in all, you should remember three things about Perl's functions:

1) They are called with the following structure:

RETURN_STACK = subroutine(ARGUMENT_STACK);

in which RETURN_STACK is either a scalar, array, or hash and represents the values coming out of the subroutine, and ARGUMENT_STACK is the list of values being passed into the subroutine.

2) The general form of a function:

sub subroutine

{

my(ARGUMENT_STACK) = @_;

# do stuff

return(RETURN_STACK);

}

where @_ is a special variable indicating the function's arguments, where ARGUMENT_STACK is an optional copy of the values coming into the subroutine, and RETURN_STACK represents the values going back to the call.

3) You are responsible for providing all of your own error checking (as much or as little as you would like.) Error checking takes many forms, but two of the most common error checking statements are accomplished by using 'use strict;' and '-w' at the top of your program.

Perl's Scoping Methods

Scope is the policy of variable management. It is absolutely crucial to understand Perl scoping methodology if you want to make programs that are larger than 100 or so lines. This stems from another fact that we have mentioned earlier: global variables spring into existence if they are not already there. This is, as lots of things are in Perl, both a blessing and a curse.

Consider, for example, the following simple subroutine:

sub isZeroByte

{

($file) = @_;

(-z $file) ? 1: 0;

}

Is there anything wrong with this example? Yes, there is, and it will bite you quite strongly some day if you don't learn scoping rules. Remember, by just saying '($file) = @_;', you are creating the global variable $file, which is visible everywhere. If you say something like:

$file = "otherFile";

if (isZeroByte("thisFile"))

{

print "thisFile is zero bytes long!\n";

}

 

open(FD, "$file");

$file is silently overwritten in the subroutine 'isZeroByte'. It is important to see here that the last open statement is not going to open otherFile. Instead, it is going to open thisFile, because it was silently changed by the call to 'isZeroByte'.

There is a good chance that you could spend hours tracking down why this is the case, precisely because the variable has changed silently on you.

The purpose of this section is to spare you this agony, which I have gone through quite a few times. Imagine if you had deleted $file instead of opening it, for example, and deleted the wrong file!
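As a preview of the cure, here is a minimal sketch of the same subroutine with a private $file. The our declaration is an assumption made only so that the global can be named under 'use strict'; the file names are the placeholders used above and need not exist:

```perl
use strict;
use warnings;

our $file = "otherFile";     # the caller's global

sub isZeroByte
{
    my ($file) = @_;         # private copy; the global $file is untouched
    return (-z $file) ? 1 : 0;
}

my $empty = isZeroByte("thisFile");
print "still going to open: $file\n";    # still going to open: otherFile
```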

There are three major concepts you need to know to effectively deal with your programs, and avoid what is called 'variable suicide' (in which you kill your variables by bad variable policy):

1) my

2) local

3) "use strict"

local and my are Perl's actual methods for scoping. 'use strict' is a pragma, a directive to the compiler, that makes Perl police the proper use of variables for you by requiring that you always declare them.

Using my and 'use strict' in your programs can make them nearly bulletproof to the variable suicide example shown above. local, on the other hand, is a hold over from Perl 4 which is used in specific instances that we shall discuss. Ultimately, local should go away, leaving the more stable my in its place.

We will take a look at these three items in some detail. But first, let's deal with a simple issue in Perl: exactly what comprises a scope?

Scope Syntax

A scope is simply an area within which a variable is usable and visible. Fortunately, Perl's rules for scoping are fairly simple. It works on areas in the code called blocks. There are two types of blocks:

1) a special block called a global block (which is the entire Perl file).

2) each '{ }' in an 'if (condition) { }', 'while (condition) { }', 'do { }', or any other conditional or loop defines a block. In fact, any use of braces where you can insert code (i.e.: any place besides the braces which make hashes) defines a block, not just subroutine definitions.

Blocks can be nested, internal to other blocks, and generally are completely wrangleable. (within reason! if you try to define a subroutine in an if block, you aren't going to get what you want unless you really know what you are doing!).

Figure 5.4 shows some common Perlish blocks.

Figure 5.4: Pictorial representation of Perlish blocks.

So what do blocks have to do with scoping? The special keywords my and local determine where variables are visible. Let's look at both of these in turn.

my and Lexical Scoping

The most common way to avoid globals is by the use of my. If you look back at the subroutines we created, you will notice that we made heavy use of it. We did this for good reason. Every single time you say something like:

my ($variable);

then Perl actually creates space for another, private variable called '$variable', which is good until the block that it is in goes away. For example:

for ($xx = 0; $xx < 10; $xx++)

{

my $data = 2;

}

print "$data\n";

prints only an empty line, since the block is the bracketed for loop: 'my $data = 2;' creates a new '$data' each time through the loop and destroys it when that iteration ends, so the $data in the print statement is a different (global, undefined) variable. You can see this in the following Perl script:

my $xx;

for ($xx = 0; $xx < 10; $xx++)

{

my $data = 2 if ($xx == 0);

print $data;

}

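Here, the only time that '2' is printed out is on the first pass through the loop (where the my assignment is executed). All of the other times it simply prints blank. This is because the my variable was destroyed at the end of the first iteration, and is now undefined. (Treat this only as an illustration: the perlsyn man page describes a my combined with a statement modifier such as if as undefined behavior, so don't rely on it in real code.)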

my variables have what is called 'lexical' scope. This means that they have the following two properties:

they are defined for the duration of the block that they are in

they are not visible to subroutines (except, as we shall see, in the case where they are defined at a 'global level').

For example, if you say something like:

my ($xx, $yy) = (1,1);

if ($xx > 0)

{ # OUTER BLOCK

my $variable = 'this is xx\'s';

if ($yy > 0) # INNER BLOCK

{

print "$variable\n";

} # INNER BLOCK

} # OUTER BLOCK

print "This prints nothing! $variable\n";

This prints out "this is xx's", since $variable is visible inside the inner block (marked as 'INNER BLOCK'). Then, after the outer block, the routine prints "This prints nothing!" and nothing more, since by then $variable has been destroyed.

However, in the following code:

if ($xx > 0)

{

my $variable = 'this won\'t work!';

function();

}

sub function

{

print "$variable\n";

}

Here, $variable won't be printed. Even though function() is called from the block where $variable was defined, the body of the subroutine lies outside that block, and therefore the my variable is invisible to the subroutine.

So, given these rules, guess what happens when you say:

my ($xx, $yy) = (1,1);

if ($xx > 0)

{

my $variable = 'this is xx\'s';

if ($yy > 0)

{

my $variable = 'this is yy\'s';

print "$variable\n";

}

print "$variable\n";

}

This actually prints out

this is yy's

this is xx's

Surprised? The inner print sees the inner $variable, and the outer print still says "this is xx's" even though you have defined a $variable in the meantime as "this is yy's". This is because the two '$variables' are actually altogether different. They happen to have the same name, but they are in different blocks. This small code snippet:

{

my (@array) = (1,2);

{

my (@array) = (2,3);

{

my (@array) = (3,4);

}

}

}

defines three separate arrays, and then destroys them at the end of their respective blocks, even though the syntax above is quite silly.
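To watch the separate variables at work, here is a small sketch along the same lines. The @seen array is added purely to record what each block saw:

```perl
use strict;
use warnings;

my @seen;    # records what each block saw, in order

{
    my @array = (1, 2);
    {
        my @array = (2, 3);          # a different variable with the same name
        push(@seen, "inner: @array");
    }
    push(@seen, "outer: @array");    # the outer array is unchanged
}

print "$_\n" foreach @seen;
# inner: 2 3
# outer: 1 2
```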

my Pseudo Globals and static variables

There is a really cool behavior that you should be aware of with my. When a my variable is created at the global level, that is, at the same level as the subroutine declarations, it is visible to all subroutines declared after it at that level.

These are sometimes called static variables, since the variables 'stay around' for the duration of the program.

For example, take the following code:

use FileHandle;

my $fh = new FileHandle("readFile");

myRead();

sub myRead

{

print <$fh>;

}

We have said before that my variables are not visible to subroutine calls. This actually does work, however, since $fh is defined at the 'same level' as the subroutine definition. This means that it prints out the entire file via print <$fh>, even though $fh was never passed into the function via a parameter. This behavior is a little alarming at first, but can be quite useful in creating static variables.

Static variables are variables that persist from function call to function call, but are not global. For example, the following function will keep a 'memory' of what has been added to it, returning a larger array every time it is called:

BEGIN

{

my @staticVarb;

sub addArray

{

my (@values) = @_;

push(@staticVarb, @values);

}

 

sub getArray

{

@staticVarb;

}

}

Now, if you call this like:

addArray(1,2,3,4);

addArray(5,6,7);

addArray(8,9,10);

print "@{[ getArray() ]}\n";

This prints out '1 2 3 4 5 6 7 8 9 10' since, even though @staticVarb is a my variable, it is defined at the same level as the subroutine declarations 'addArray' and 'getArray'. This routine also shows a use of BEGIN, which basically tells Perl to 'execute this code first, before anything else'. (BEGIN is described in more detail in chapter 13.)
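The same trick works for a simple counter. The sketch below uses a plain block instead of BEGIN, which is enough as long as the block is reached before the first call; nextId and resetId are hypothetical names:

```perl
use strict;
use warnings;

{
    my $count = 0;              # visible only to nextId and resetId

    sub nextId  { return ++$count; }
    sub resetId { $count = 0; }
}

my @ids = (nextId(), nextId(), nextId());
print "@ids\n";                 # 1 2 3
```

Nothing outside the block can see or change $count except through the two subroutines, which is exactly what makes it 'static' rather than global.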

my Caveats.

As with all things in Perl, my has a couple of caveats you should be aware of.

First of all, the following two statements are not equivalent:

my ($varb, $varb2);

my $varb, $varb2;

The first statement (my ($varb, $varb2)) does what you would expect, i.e., it defines two my variables. The second statement (my $varb, $varb2) translates into something more like:

my $varb;

$varb2;

In other words, it makes the first variable $varb a my variable, and the second one $varb2 a global variable. This is quite a common gotcha, and we shall see how to overcome it with 'use strict'.

Second, note that if you say something like

for (my $xx = 1; $xx < 10; $xx++)

{

print "$xx ";

}

expecting to see "1 2 3 4 5 6 7 8 9", well, in early releases of Perl 5 this didn't work either. From Perl 5.004 on, a my in the control expression of a loop works as you would expect, and $xx is private to the loop. Third, notice that:

if ($condition1)

{

my $variable = 'value1';

}

elsif ($condition2)

{

my $variable = 'value2';

}

also doesn't work. You are creating the variable $variable, but it is destroyed right afterwards, at the end of its block! Hence, you probably want to say something like this instead:

my $variable;

if ($condition1)

{

$variable = 'value1';

}

elsif ($condition2)

{

$variable = 'value2';

}

in which you declare the my variable beforehand. Finally, one of the biggest caveats is that you cannot use my to localize special Perl variables. If you say:

sub localArgv

{

my ($_) = "\n";

}

this will result in

Can't use global $_ in 'my'

which should be fixed soon (it's on the "to do" list for the Perl folks).

Summary of my

my is Perl's special keyword for declaring that a variable is lexically scoped, which is really just a fancy term to say that it belongs to the block in which it was created. my variables exist for the duration of the block in which they were created.

The main thing to remember about my variables is that they are not only visible to blocks that are 'underneath' where they are declared. They are also visible to subroutines declared at the same level. Hence:

if ($condition) { my $a = 1; a(); }

 

sub a

{

print $a;

}

This does not work, whereas:

my $a = 1;

a();

sub a

{

print $a;

}

works, because $a is declared at the same level as the subroutine (and before it). my is the main way to do scoping.

local

local was the main way that Perl did scoping before the improvements in Perl 5. I only mention it here because there is one place where you still need to use local, and it is with special variables such as $_, which we have seen briefly and shall talk about in-depth in chapter 12.

The following example does not actually create a new variable:

sub subRoutine

{

local($") = "|";

}

Instead, this changes the value of the already existing global variable for the duration of the enclosing block (here, the body of the subroutine). This is called dynamic scoping, which refers to the fact that the value of the global variable is dynamic (it changes) based on where local is called.

Based on this concept, the following code:

$hmm = "permanent variable!\n";

for ($xx = 0; $xx < 5; $xx++)

{

local($hmm) = "temporary change\n";

print "$hmm\n";

}

print "$hmm\n";

prints out:

temporary change

temporary change

temporary change

temporary change

temporary change

permanent variable!

since, even though the global has been set five times to 'temporary change' in the for loop, each change lasts only until the end of the block; the local variable is actually a global variable in disguise. So far, this looks the same as with my variables. But the main effect of local is to make the temporary value visible to the subroutines called from the block:

if ($a > $b)

{

local($varb) = 1000;

printvarb();

}

sub printvarb

{

print "$varb\n";

}

This example prints out '1000', since $varb is still a global variable and is visible everywhere. Its value is just temporarily changed for the duration of the block, which includes the subroutine call.
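Putting my and local side by side makes the difference plain. This is a sketch under 'use strict', so the global is declared with our (an assumption; the examples above use undeclared globals), and the subroutine names are invented:

```perl
use strict;
use warnings;

our $setting = 'global';       # a package (global) variable

sub show { return $setting; }  # always reads the global $setting

sub withLocal
{
    local $setting = 'local';  # temporarily changes the global
    return show();             # show() sees the temporary value
}

sub withMy
{
    my $setting = 'my';        # a new, private variable
    return show();             # show() still sees the global
}

my @results = (withLocal(), withMy(), show());
print "@results\n";            # local global global
```

The last call shows that the global is back to its old value once withLocal has returned.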

Again, you are only going to want to use local in cases in which you need to use a Perl special variable in a subroutine. Hence:

sub getWholeFile

{

my ($file) = @_;

local($/) = undef;

my $fh = new FileHandle("$file");

my $return = <$fh>;

return($return);

}

slurps the entire file named by $file into the string $return, and then returns its value to the caller. (See section 'special variables in Perl' for more information on $/. It is the variable in Perl that controls how much is read from a filehandle by the '<>' operator. Setting it to 'undef' makes it so <> reads in the entire file, '\n' makes it so <> reads in to the newline, etc.)

If we don't localize this variable, as sure as night follows day a situation will arise in which you are doing a bunch of file slurping and look what happens:

$line = getWholeFile("my_file"); # doesn't localize $/... $/ is now "undef"

my $fd = new FileHandle("my_other_file");

while ($otherline = <$fd>)

{

# process each line by itself? No! you are processing the entire file in one chunk!

}

Without 'local($/)', there is a very subtle bug here. 'getWholeFile' does get the entire file into the string $line, but also, as a side effect sets $/ to 'undef' for the rest of the program. This means that when you are expecting one behavior from <> (to read to the next newline), you get another behavior instead (reading to the end of the file!)

Hence, as of now, the primary use of local in Perl is to prevent this 'action at a distance'. We really don't want to set '$/' to undef in one place, and then find out a thousand lines later that one of our subroutines is using this changed value. Here is another common example of localizing a special variable:

sub printoutPipeDelimitedLines

{

my (@fields) = @_;

local($") = "|";

print "@fields\n";

}

in which $" controls what comes between elements when they are interpolated in "" (printoutPipeDelimitedLines(1,2,3) prints "1|2|3").

These are the two most common cases for localizing special variables. For more information on special variables, go to section 'Perl special variables' in chapter 12, and the perlvar man page.
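You can check for yourself that local puts the old value back. The sketch below reads from an in-memory filehandle (a feature of later Perl 5 releases, assumed here so that the example needs no file on disk); slurp is a hypothetical name:

```perl
use strict;
use warnings;

# Reads everything from an already opened filehandle in one gulp.
sub slurp
{
    my ($fh) = @_;
    local $/ = undef;          # restored automatically when slurp returns
    return scalar(<$fh>);
}

my $text = "line one\nline two\n";
open(my $in, '<', \$text) or die "cannot open in-memory file: $!";

my $whole = slurp($in);
close($in);

my $separatorIntact = ($/ eq "\n") ? 'yes' : 'no';
print "read ", length($whole), " characters; \$/ intact: $separatorIntact\n";
# read 18 characters; $/ intact: yes
```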

use strict

At this point you know pretty much all you need to know about effective scope management in Perl: you know how to avoid globals by the use of my and local. Now the question is, how do you effectively follow these rules?

As said before, Perl has some caveats with the my keyword. Especially unwelcome is the fact that when you say:

my $varb1, $varb2;

the my keyword takes only $varb1 to be a my variable, and ignores $varb2. Likewise, when you say:

if ($condition1)

{

my $varb1 = "probably a mistake";

}

print $varb1;

this doesn't exactly work as planned, as $varb1 has been destroyed at the end of the if block, and hence, instead of printing 'probably a mistake', it prints nothing!

Well, there is a way to avoid the use of globals, and effectively use my and local, without tracing through the code above for these mistakes (even when Perl is being obstinate as above), and that is to use the phrase 'use strict'. If you say:

use strict;

my $varb1, $varb2;

this prints out:

Global symbol "varb2" requires explicit package name at test.p line 2.

Execution of test.p aborted due to compilation errors.

indicating that you haven't actually used my and local effectively, and that there is a global variable lurking there somewhere. In this case, it is the global $varb2, which you can correct by saying:

use strict;

my ($varb1, $varb2);

'use strict' is absolutely essential for any program larger than 100 lines, and you shall see it in abundance in this book.
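Because 'use strict' complains at compile time, one way to see it in action without aborting your own program is to compile the suspect line at run time with eval. This is only a sketch, and compiles is a made-up helper name:

```perl
use strict;
use warnings;

# Compile a snippet at run time and report whether strict accepted it.
sub compiles
{
    my ($code) = @_;
    no warnings;                       # keep the demonstration quiet
    my $ok = eval "use strict; $code; 1";
    return $ok ? 1 : 0;
}

my $bad  = compiles('my $varb1, $varb2');     # $varb2 is an undeclared global
my $good = compiles('my ($varb1, $varb2)');   # both are my variables

print "without parentheses: $bad, with parentheses: $good\n";
# without parentheses: 0, with parentheses: 1
```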

Summary of Scoping Rules in Perl

There are four major things that you need to know about scoping:

1) Scoping happens in terms of blocks, which are logical pieces of code.

2) Variables are defined with the my keyword. This makes a variable bound to a given block ( 'if ($condition) { my $a = 1; }' binds $a to the if block). Variables inside the block get created by the my, and then are destroyed after the block ends.

3) 'use strict', and -w, will save you tons of time (hour upon hour) when it comes to actually debugging programs. 'use strict' makes you declare every variable you create (normally with my), and -w warns you about suspicious constructs such as the use of undefined values.

4) local is used to localize special variables (such as $/ and $_) where you want to get a special behavior out of Perl operators (like having <> slurp in an entire file). See section 'special variables' in chapter 12 for more information.

Examples of Subroutines

In this section we take what we have learned about scoping, and concentrate on form in the creation of subroutines. Let's take a look at examples of three types of subroutines:

1) subroutines that use recursion

2) subroutines that use references

3) subroutines that use wantarray

With that in mind, here are several examples of subroutines.

Examples of Subroutines That Use Recursion

Recursion can be used to turn subroutines that would usually take fifty or more lines of code into elegant ten-liners. Here are two examples of subroutines that use recursion to good effect.

The following example prints out a directory tree. The routine, named simple_find, is called like this:

simple_find("directory_name");

The subroutine looks like:

Listing 5.1: simple_find.p

1 use strict;

2

3 sub simple_find

4 {

5 my ($input) = @_;

6 my $file;

7 print "$input\n" if (-f $input);

8 if (-d $input)

9 {

10 opendir(FD, $input) or return;

11 my @files = grep { $_ ne '.' && $_ ne '..' } readdir(FD); # skip . and .., or we recurse forever

12 closedir(FD);

13 foreach $file (@files)

14 {

15 simple_find("$input/$file"); # recursive call

16 }

17 }

18 }

This example works by looking at the input: it prints out the input if it is a file, thereby stopping the recursion (line 7), or opens the directory and recursively applies itself to each of its contents (lines 8-17).

Notice the heavy use of my here. We enforce its use via 'use strict' (line 1) and then proceed to make all the variables internal to the subroutine my variables. This isn't just good practice, it is necessary here. If we didn't do this, the call to 'simple_find' (line 15) would overwrite the @files variable and give a big mess.

Likewise, the following will give all of the combinations of a string (strictly speaking, all of the orderings of its letters), returning an array of them. It works by 'picking apart' the string, and then calling itself on the substrings. Usage:

@combos = combinations("string")

Subroutine:

Listing 5.2: combinations.p

1 sub combinations

2 {

3 my ($string) = @_;

4

5 my %return;

6 return($string) if (length($string) == 1);

7 my (@letters) = split(//, $string);

8

9 my ($xx);

10 for ($xx = 0; $xx < length($string); $xx++)

11 {

12 @letters[0,$xx] = @letters[$xx, 0];

13 my ($first_letter, $sub_string) =

14 ($letters[0], join('', @letters[1..$#letters]));

15

16 my @permute_array = combinations($sub_string);

17 grep($return{$first_letter . $_} = 1, @permute_array);

18 }

19 return(keys %return);

20 }

This is a little bit difficult to envision. Note that in line 12 we swap each of the characters with the first one, in line 16 we do the actual recursive call, and in line 17 we 'mark' the fact that we have seen certain combinations of strings (using a hash). Pictorially, it kind of looks like Figure 5.5:

Figure 5.5: The combination function, and how it works.

This picture doesn't do the combination function true justice, though. In particular, line 6 (return($string) if (length($string) == 1);) prevents us from recursing indefinitely. And if we had forgotten one my, this function would not have worked. The my again makes it so each of the variables @permute_array, %return, and $xx is associated with only one call of the combination function. If we had not done this, then as the combination function was called recursively, the values of the variables would have collided.

In the large, the combinations of the larger string are defined in terms of combinations of each of the smaller strings. These two functions are simple examples of the expressive power you get by using recursion. Once you get in the habit of looking at certain operations as recursive, you can form very elegant solutions, especially in Perl!

Examples of Subroutines that use References

Although we haven't talked about them yet in detail, references will be your primary way to deal with complicated data structures. Hence, the following example shows you a couple of things that you can do with references when you pass them into functions. The following, for example, merges two hashes into one return hash. It is used as follows:

my %hash = hashMerge(\%hash1, \%hash2);

And the subroutine itself looks like:

Listing 5.3: hashMerge.p

1 use strict;

2

3 sub hashMerge

4 {

5 my ($hashref1, $hashref2) = @_;

6 my ($key, %return) = ('', ()); # () is an empty hash.

7 foreach $key (keys %$hashref1)

8 {

9 $return{$key} = $hashref1->{$key}; # $hashref1->{$key} gets the value stored under $key through the hash reference.

10 }

11 foreach $key (keys %$hashref2)

12 {

13 $return{$key} = $hashref2->{$key};

14 }

15 return(%return);

16 }

This works by stuffing %return with the hash pointed to by $hashref1 first, and then stuffing %return with the hash pointed to by $hashref2. Since no error checking is made here, the second loop may overwrite values stored by the first. For example, if called with arguments like:

%hash1 = (1 => 2);

%hash2 = (1 => 4);

my %return = hashMerge(\%hash1, \%hash2);

 

Then %return will become (1 => 4) because the second hash is overwriting the first.
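Since hashes flatten into lists just as arrays do, the two loops can also be collapsed into a single list assignment. This is an alternative sketch rather than the listing's own method, and hashMergeShort is a made-up name:

```perl
use strict;
use warnings;

# Same result as the two loops: later pairs overwrite earlier ones.
sub hashMergeShort
{
    my ($hashref1, $hashref2) = @_;
    my %return = (%$hashref1, %$hashref2);
    return %return;
}

my %hash1 = (1 => 2, 3 => 6);
my %hash2 = (1 => 4);

my %merged = hashMergeShort(\%hash1, \%hash2);
print join(', ', map { "$_ => $merged{$_}" } sort keys %merged), "\n";
# 1 => 4, 3 => 6
```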

The following switches the values of two arrays. It is called like this:

&switchArrays(\@argument1, \@argument2);

And corresponds to the following subroutine:

Listing 5.4: switchArrays.p

1 use strict;

2 sub switchArrays

3 {

4 my ($argument1, $argument2) = @_; # Array References.

5 my @tmp;

6 @tmp = @$argument1;

7 @$argument1 = @$argument2;

8 @$argument2 = @tmp;

9 }

This works because lines 7 and 8 'reach in' to the reference and set the value that the reference points to. Sort of like Figure 5.6:

Figure 5.6: Simple reference manipulation.

We shall have a lot more to say about references later. You can get a lot more complicated than this, having references to references, references to references to references, and so on. See Chapter 9 for more detail, or be patient, we will get there soon.

Subroutine Wantarray examples:

Let's see a couple more examples where wantarray may be useful. Consider a subroutine that it makes no sense to call in array context. Say you are computing compound interest, based on an amount, a percentage, and a number of time units:

Listing 5.5: compoundInterest.p

1 use Carp;

2 use strict;

3

4 my $origAmount = 100.00;

5 my $percent = .05;

6 my $timeUnits = 60;

7 my $amount = compoundInterest($origAmount, $percent, $timeUnits);

8

9 sub compoundInterest

10 {

11 my ($origAmount, $percent, $timeUnits) = @_;

12

13 carp "You need to use this function in scalar context!\n" if (wantarray());

14 my $xx;

15 for ($xx = 0; $xx < $timeUnits; $xx++)

16 {

17 $origAmount += $origAmount*$percent;

18 $origAmount = sprintf("%.2f", $origAmount); # rounds to two decimal places.

19 }

20 $origAmount;

21 }

Here, line 13 will protect you from yourself, raising a flag if you say something like:

my ($value1, $value2) = compoundInterest($origAmount, $percent, $timeUnits);

because the left hand side is expecting two values, when you are only providing one. Or perhaps you may decide that you want to have two usages:

1) in scalar context, return just the final value (based on the three parameters)

2) in array context, return an array of values (from time units 0 to $timeUnits)

You might want to implement it like:

Listing 5.6: compoundInterest2.p

1 use strict;

2

3 sub compoundInterest

4 {

5 my ($origAmount, $percent, $timeUnits) = @_;

6

7 my ($xx, @arrayOfAmounts);

8 for ($xx = 0; $xx < $timeUnits; $xx++)

9 {

10 push(@arrayOfAmounts, $origAmount);

11 $origAmount += $origAmount*$percent;

12 $origAmount = sprintf("%.2f", $origAmount);

13 }

14 wantarray()? @arrayOfAmounts: $origAmount;

15 }

where if you use this function like:

foreach $value (compoundInterest(100,.05, 10))

{

print "\$$value ";

}

This prints out a list of ten values, something like:

$100 $105.00 $110.25 $115.76 $121.55 $127.63 $134.01 $140.71 $147.75 $155.14

Whereas if you call it like:

$value = compoundInterest(100, .05, 10); print "\$$value\n";

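you get only the final amount, after all ten time units (about $162.90). Note that this is one compounding step beyond the last element of the list above, since each amount is pushed onto the array before the interest is added.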
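wantarray actually distinguishes three cases: it returns true in array context, false but defined in scalar context, and undef when the return value is thrown away (void context). A small sketch, in which the context subroutine and the @log array are invented for illustration:

```perl
use strict;
use warnings;

my @log;    # records the context of each call, in order

# Works out which context it was called in and remembers it.
sub context
{
    my $kind = wantarray()          ? 'list'
             : defined(wantarray()) ? 'scalar'
             :                        'void';
    push(@log, $kind);
    return $kind;
}

my @list   = context();    # array context
my $scalar = context();    # scalar context
context();                 # void context: the return value is ignored

print "@log\n";            # list scalar void
```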

Scope Example

Finally, let's consider a couple of extended scope examples. Look at it simply as several examples crammed together.

In the first example, we denote the value of each of the variables at any given time, next to the variable itself, along with the type of variable it is. Hence:

print $my_variable1; # undef;

indicates that $my_variable1 is not visible at the scope specified. Let's look first at the effect of scope on visibility in subroutines:

Listing 5.7: scopeExample.p

1 $global = 'global';

2 my $my_at_global_block = 'my at global block';

3 if (1 > 0)

4 {

5 my $my_in_if_block = 'my in if block';

6 local($local_in_if_block) = 'local in if block';

7 beginSub();

8 }

9

10 BEGIN

11 {

12 my $my_in_begin_block = 'my in begin block';

13 local($local_in_begin_block) = 'local in begin block';

14 sub beginSub

15 {

16 print $global; # 'global';

17 print $my_at_global_block; # 'my at global block'

18 print $my_in_begin_block; # 'my in begin block'

19 print $local_in_begin_block; # undef

20 print $my_in_if_block; # undef

21 print $local_in_if_block; # 'local in if block'

22 }

23 }

Here the subroutine 'beginSub' shows how each of the different variables prints out:

1) $global (line 16) prints out because it is a global.

2) $my_at_global_block (line 17) prints out because the sub beginSub { } declaration is defined in a sub-block of the global file.

3) $my_in_begin_block (line 18) prints out because the declaration of sub beginSub is defined inside the BEGIN block where the variable is declared, and because beginSub itself holds a reference to it.

4) $local_in_begin_block (line 19) does not print out because the BEGIN block has gone away by the time beginSub() is called, and the local value went away with it; there is no such thing as references to a local variable.

5) $my_in_if_block (line 20) does not print out because the subroutine beginSub() is not defined in the if block itself, like 'if () { my $a; sub beginSub { print $a }}'

6) $local_in_if_block (line 21) does print out because it is actually a global in disguise (a copy of a global) and globals are seen everywhere.

Now let's take a look what happens when we start making variables with the same names:

Listing 5.8: scopeExample2.p

24 while ($xx++ < 1)

25 {

26 my $my_in_while_block = 'while1';

27 local($local_in_while_block) = 'while1';

28 local($global) = 'while1';

29 while ($yy++ < 1)

30 {

31 my $my_in_while_block = 'while2';

32 print $my_in_while_block; # 'while2'

33 print $global; # 'while1' not 'global'

34 print $local_in_while_block; # 'while1'

35 }

36 print $my_in_while_block; # 'while1'

37 }

38 print $global; # 'global';

Here, in turn:

1) in line 32, the $my_in_while_block prints out 'while2', because it was defined in line 31, and this 'covers' the previous definition in line 26.

2) in line 33, the $global prints out 'while1', not global, since the local definition in line 28 covers the previous global definition.

3) in line 34, the $local_in_while_block prints out 'while1', since it was defined in line 27, and locals are seen everywhere.

4) in line 36, the $my_in_while_block prints out 'while1' since the inner definition at line 31 has gone out of scope.

Summary

The most important things to remember out of this chapter are:

1) subroutines have the form:

sub subName

{

my (@arguments) = @_;

return($return_value); # or @return_value

}

2) subroutines are called by:

my $value = subName(@arguments);

3) Perl defines a bunch of related code as blocks which are delineated by a '{ }'.

4) the keyword my makes variables non-global (in what is termed lexical scope). my variables cannot be seen inside subroutines called from their block (unless they were declared at the same level as the subroutine definition), but can be seen in blocks that are 'below' (i.e.: inside) the block where the my variable was defined.

5) 'use strict' and '-w' are crucial in warding off typical bugs and typical errors. If you don't use them, code at your own peril.

In the next chapter, we will go into more detail about an important issue that was raised here with wantarray.

We shall talk about contexts and what they mean to Perl. Contexts basically account for 80% of Perl's functionality (and headaches, of course) so if you are just starting with Perl, you will definitely want to pay close attention.


HTML conversions by Mega Space.

This page updated on October 14, 1997 by Webmaster.

Computing McGraw-Hill is an imprint of the McGraw-Hill Professional Book Group.
