© 1997 The McGraw-Hill Companies, Inc. All rights reserved.
Any use of this Beta Book is subject to the rules stated in the Terms of Use.

Chapter 6: Contexts in Perl 5

This chapter goes over a very important concept that sets Perl apart from most computer languages: Contexts.

Contexts are Perl's way of making the language seem more natural and less computer-like. With contexts, variables can mean different things depending on where they are put in an expression. This helps the language flow because you capture your thoughts in Perl-ish 'sentences'. In other words, sometimes I get the feeling that I am speaking Perl, not programming in it, because it feels like a natural language to me. (But then again, I've probably got to get out more!)

As usual, the on-line documentation is extremely helpful and detailed. Once you have mastered the basic concepts in this chapter, the relevant man page is perldata which deals with contexts.

It is essential that you understand contexts in order to use the power of Perl.

Chapter Overview

The richness of Perl's contexts is a feature pretty much unique to Perl itself. Since most other computer languages don't put as much emphasis on context as Perl does, we will go into contexts in great detail. This chapter is divided into four sections which try to take the process of learning contexts and make it as painless as possible.

First, we shall go over the definition and simple examples of scalar contexts. Scalar contexts are places where scalars, arrays or hashes are converted to a scalar; they account for about 80% of your lines of code. We talk about the basic process of this conversion, as well as 'gotchas' that people have when dealing with scalar context.

Next, we go into list contexts, and a special form of list context called an array context. List contexts are places where your datatypes (scalars, arrays, or hashes) are converted internally into a 'list'; we talk about slicing, and the possibility of your arrays getting truncated in a nasty side effect of array contexts called 'void' contexts.

Third, we take a look at what might be called 'context patterns'. These are ways that you can recognize if a perl statement is in either scalar, or list, context.

And fourth, finally, we take a look at putting together all the building blocks to make Perl do some pretty cool things; we shall talk about using the power in Perl's contexts to put into one line the functionality that other languages require you to spread over twenty.

Introduction to Data Context

Much of the confusion of programming in Perl comes from people using contexts incorrectly, although this is not overt: they simply don't realize that this is the mistake that they are making. The point of this section is to get familiar with the concept of context, see why it is important, and then drive home how to make sure you are using contexts correctly. In other words, the most important rule that you should learn from this section should be:

  • Make sure that you are using the variable in the context you meant it to be used in.

    So, what is a context? A place to start to understand context is to think about natural languages. Certain words in natural languages are used in totally different ways, depending on the context in which the word appears. The word 'set', for example, has the most definitions by far of any word in the English language.*

    * Well, at least according to the Guinness Book of World Records. If you take that for the gospel truth, then you'll agree. At any rate, it has a lot of definitions.

    Set means four different things in the following sentences:

    1) He set the book down on the table.

    2) After the point, it was game, set, and match.

    3) The agenda was set when the VP came by.

    4) She got set in her ways.

    In computer terms, data contexts are simply clues to the interpreter on how to interpret a given dataset (rather than word) in a given place. In other words, Perl 5 parses each expression, and decides then and there what the meaning of each variable is going to be, or how it is going to be interpreted.

    This means you can say:

    @arrayName = $scalarName;

    @arrayName = @arrayName2;

    @arrayName = %hashName;

    and have all of the above be syntactically legal. However, these statements mean quite different things. Note the different symbols again: '@' for arrays, '%' for hashes, and '$' for scalars. These symbols are important, for if you are off by a symbol, you are saying something completely different to the interpreter than what you may have intended.

    What these Perl sentences mean is based on the relationships between the variables on the right and left hand side of the expressions. Again, this is much like a natural language, only with far fewer shades of gray.

    Unlike natural languages, there are definite rules to determine what means what in each context. We cover the rules below.

    The two basic contexts in Perl 5 are:

  • Scalar

  • List

    In addition, you can also look on a third type, which is simply a helpful distinction, rather than something built in to Perl itself:

  • Array

    Array contexts are simply a type of list context. This section explores the subject of data context in more detail. A thorough understanding of context will greatly strengthen your Perl 5 programs.

    Scalar Contexts

    Scalar Contexts are expressions in which Perl datatypes are interpreted as a scalar, the basic Perl variable which holds an arbitrary string or number.

    Following are some simple examples of scalar context:

    $scalarName = "This is a scalar";

    $scalarName = "@arrayName";

    $scalarName = @arrayName;

    This can be represented pictorially in Figure 6.1.

    Figure 6.1: Scalar contexts in Perl.

    Here, the equal sign is being used to figure out the context of the right hand side by looking at the left hand side of the equal sign.

    In other words, the equals sign 'gauges' what the right hand side is supposed to be; and in the above examples, both sides of the equation are interpreted as scalars. Note in the third example that an array (or a hash, not shown here) can also be interpreted as a scalar in the correct context.

    Therefore, a variable on the right hand side (in this case @arrayName) does not determine the context. The left hand side is the determining factor. '$scalarName =' says to Perl 'I am assigning to a scalar.' Therefore, whatever is on the right hand side (i.e. @arrayName) should be interpreted as a scalar. In this case, $scalarName gets assigned the number of elements in @arrayName.
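As a minimal sketch of the assignments above (the sample values are invented for illustration), the same array produces two different scalars depending on how it is written on the right hand side:

```perl
use strict;
use warnings;

# Sample data, invented for this sketch.
my @arrayName = ('a', 'b', 'c');

my $count  = @arrayName;      # array in scalar context: the number of elements
my $joined = "@arrayName";    # array inside double quotes: elements joined by spaces

print "$count\n";     # prints 3
print "$joined\n";    # prints a b c
```

In both cases the '$' on the left asks for a single value; the double quotes only change how the array is turned into one.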

    List contexts

    List Contexts are places in which a Perl dataset is interpreted as a list. Again, lists are groups of scalars, which have a distinct order to them so you can look them up by their position in the list.

    Here are some simple examples of list context:

    ($scalar1, $scalar2) = (@array1);

    ($scalar1, $scalar2) = ($scalar2, $scalar1);

    ($scalar1, @array1) = (@array1, $scalar1);

    We have already seen some list contexts, in the form of subroutine calls:

    @returnStack = subroutine();

    in which the subroutine returns a list of values to the @returnStack on the left hand side of the expression.

    Anyway, all of these can be looked at pictorially like Figure 6.2:

    Figure 6.2: List contexts in Perl.

    The important thing here (as it was in scalar contexts) is that the left hand side of the expression determines that it is a list context. You can tell an expression is going to be a list context by the parentheses around the left hand side of the expression.

    In each one of these examples, the scalars on the right side are assigned, one at a time, to the scalars on the left hand side of the equal sign. Hence:

    ($scalar1, $scalar2) = ($scalar2, $scalar1);

    is a simple shorthand for:

    $tmp = $scalar1;

    $scalar1 = $scalar2;

    $scalar2 = $tmp;

    The difference between them is that you don't need a temporary variable, and it takes one line instead of three to write.

    List contexts are special in the sense that they munge (how's that for a technical term!), or collapse all of their elements into a big, long list before doing assignments. Therefore, if you say something like:

    (@array1, @array2) = (@array2, @array1);

    This looks pictorially something like:

    Figure 6.3: List assignment in list context. A caveat.

    We have already seen this when talking about functions: @array1 here gets all of the elements in @array2 concatenated with the elements in @array1. @array2 gets nothing! @array1 has been greedy, and eaten up all the elements on the right hand side of the equation.

    You can usually tell if something is in list context by the parentheses around it. Another thing that you should be aware of with list contexts is that balance of number of elements is important. If you say something such as:

    ($scalar1, $scalar2) = ($scalar2, $scalar1, $scalar3);

    then $scalar3 is dangling off the end, and therefore gets assigned to nothing. This logic can be very difficult to track down and may not be what you intended. The '-w' flag, discussed before, can help you some, but will not always catch these types of errors.
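The three behaviors described in this section (the swap, the greedy array, and the dangling element) can be sketched as follows. The values are made up for illustration:

```perl
use strict;
use warnings;

my ($scalar1, $scalar2) = ('one', 'two');

# Swap two scalars without a temporary variable.
($scalar1, $scalar2) = ($scalar2, $scalar1);
print "$scalar1 $scalar2\n";          # prints two one

# The first array on the left is greedy and takes everything.
my @array1 = (1, 2);
my @array2 = (3, 4, 5);
(@array1, @array2) = (@array2, @array1);
print "@array1\n";                    # prints 3 4 5 1 2
print scalar(@array2), "\n";          # prints 0: @array2 is now empty

# An extra element on the right is silently ignored.
my ($x, $y) = (10, 20, 30);
print "$x $y\n";                      # prints 10 20
```

Putting the scalars first and a single array last, as in ($scalar1, @array1), avoids the greedy-array surprise.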

    Array Contexts

    ARRAY contexts are a special form of LIST context in which the list is an array.*

    * Technically speaking, there really is no difference between an array context and a list context. Internally, Perl turns all arrays into lists, and then does assignment. It's just a helpful distinction because of the ($a,$b) = ($b,$a,$c) phenomenon in which $c does not get assigned to anything. This happens in list context, where there is a fixed number of elements on the left hand side of the '='. This never happens in array context, since Perl's arrays don't have a fixed number of elements.

    Here are some simple examples of array contexts.

    @arrayName1 = @arrayName2;

    %hashName1 = %hashName2;

    @arrayName = (1,2,3);

    Pictorially, what is going on is:

    Figure 6.4: Array contexts in action.

    In the first example, we copy @arrayName2 into @arrayName1. In the second, we copy %hashName2 into %hashName1. In the third example, we assign the list (1,2,3) to the array @arrayName. These examples show that you can have an array on one side of the equals, and a list on the other, and the assignment works fine.

    In other words, array and list contexts are basically interchangeable. You can assign an array to a list, and vice versa. However, list and scalar contexts are not interchangeable. Hence,

    $scalarName = @arrayName;

    has meaning, because @arrayName is an array, and this gets interpreted as the number of elements in @arrayName. '$scalarName =' hence forces @arrayName to be interpreted as a scalar.

    However,

    $scalarName = ('list', '2');

    is NOT OK. Again, this is due to the need for the list to balance its number of elements with the elements on the left hand side of the equation. This form will lose data if you try it, since the list ('list', '2') needs to get forced into scalar context. However, it does not do so by being interpreted as the number of elements in the list (i.e., the way @arrayName was). Since it is a list, it gets trimmed to fit: the element 'list' gets dropped off, and $scalarName is assigned '2'.*

    * Not quite true, again, but a helpful distinction. If you say something like:

    ($a, $b) = (1,2,exit());

    then what do you think will happen? The exit won't drop off, but instead will get evaluated, and your program will end. In other words, you can use parentheses and commas as a way to separate expressions, much in the same way as a ';' separates statements. Hence,

    ($arg1, $arg2, $arg3) = (shift(@array), shift(@array), shift(@array));

    shifts off arguments one, two, and three from the array @array, and assigns them to $arg1, $arg2, $arg3, much like:

    ($arg1,$arg2,$arg3) = splice(@array,0,3);

    does.
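Here is a small sketch of the two points in this footnote: every element on the right hand side is evaluated even if nothing receives it, and the shift and splice forms produce the same assignments. The tick helper and the sample data are hypothetical, introduced only for this illustration:

```perl
use strict;
use warnings;

my $calls = 0;

# Hypothetical helper: counts its calls and returns the running count.
sub tick {
    $calls++;
    return $calls;
}

# Three elements on the right, only two scalars on the left.
my ($first, $second) = (tick(), tick(), tick());
print "$first $second $calls\n";      # prints 1 2 3: the third call still ran

# shift three times versus one splice: same values, same leftover array.
my @array  = ('a', 'b', 'c', 'd');
my @array2 = @array;

my ($arg1, $arg2, $arg3) = (shift(@array), shift(@array), shift(@array));
my ($s1, $s2, $s3)       = splice(@array2, 0, 3);

print "$arg1 $arg2 $arg3 | @array\n";    # prints a b c | d
print "$s1 $s2 $s3 | @array2\n";         # prints a b c | d
```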

    What About Hashes?

    It is important to note that there is no hash context. When there is a hash on the left hand side of the equal sign, it denotes an ARRAY context in disguise. The interpreter treats the right hand side as a list and builds the hash from it. If you say something like:

    %hashName = @arrayName;

    you are actually making a hash where the keys are the first, third, fifth (and so on) elements of the array, and the values for those keys are the elements that immediately follow them. If @arrayName equals the list '(1,2,3,4)' then:

    %hashName = @arrayName;

    will make %hashName the value '(1=>2, 3=>4)'. Hence, you can look on hashes as denoting a special type of an array, when it comes to contexts.

    This example is exactly the same as doing something like:

    %hashName = (1 => 2, 3 => 4);

    which you may recognize from the discussion on variables, which again, is a usage of a context in disguise. There are a couple of points to be made here:

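    1) Note that when you assign a list to a hash like this, you run the risk of ending up with a key that has no value. If you say:

    %hashName = (1=>2, 3=>4, 5);

    the '5' has no partner. It becomes a key whose value is undefined, and Perl gives the mandatory 'severe warning':

    Odd number of elements in hash list

    because, well, hashes need 'key value' pairs to work. (Newer versions of Perl word this message 'Odd number of elements in hash assignment' and issue it only when warnings are enabled.)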

    2) This key value nature of hashes is why we have been using the '=>'. '=>' is really a spruced up comma, hence:

    %hashName = ( 1,2,3,4 );

    and

    %hashName = ( 1 => 2, 3 => 4);

    are identical. It is just that the second one is easier to read, because the => makes it plain that each key and value form a pair. The => also has one special property: if you use it, you don't have to put quotes around a simple string on the left side of the =>. Hence:

    %hashName = ( this => 'is', a => 'hash');

    is legal syntax.
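The equivalences described above can be checked with a short sketch. The variable names %fromArray, %fromList and %words are assumptions made for this illustration:

```perl
use strict;
use warnings;

my @arrayName = (1, 2, 3, 4);

my %fromArray = @arrayName;                   # keys 1 and 3, values 2 and 4
my %fromList  = (1 => 2, 3 => 4);             # the same hash, written with =>
my %words     = (this => 'is', a => 'hash');  # no quotes needed left of =>

# Sort the keys so that the output order is predictable.
for my $key (sort keys %fromArray) {
    print "$key => $fromArray{$key}\n";       # prints 1 => 2, then 3 => 4
}
print "$words{this} $words{a}\n";             # prints is hash
```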

    3) If you assign a hash to an array, going the opposite way, you will lose order in the hash.

    If you say something like:

    @array = %hash;

    the key value pairs of the hash will be put into the array. And again, since you don't know which elements in a hash are 'first', the array will come out in semi-random order. Hence,

    @array = %hash;

    ($value, $key) = (pop(@array), pop(@array));

    %hash = @array;

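    is an inefficient way of removing a random hash element, and returning it to '$key, $value'. (Note that the value is popped first, because it comes after its key in the array. Note also that splice works only on arrays, so you cannot shortcut this with splice on %hash.) A more direct way to do the same thing is:

    ($key, $value) = each(%hash);

    delete $hash{$key};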
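As a hedged sketch of removing one arbitrary pair from a hash, the following uses each and delete, which avoids copying the whole hash into an array and back. The sample hash and the %copy variable are invented for this illustration:

```perl
use strict;
use warnings;

my %hash = (apple => 1, banana => 2, cherry => 3);

# Flattening a hash gives key, value, key, value, ... in no particular order.
my @array = %hash;
print scalar(@array), "\n";           # prints 6

# Keep a copy so we can check which pair was removed.
my %copy = @array;

# each returns one (key, value) pair; deleting the key just returned is safe.
my ($key, $value) = each(%hash);
delete $hash{$key};

print scalar(keys %hash), "\n";       # prints 2
```

Which pair comes back is not predictable, so the sketch prints only the sizes.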

    4) Finally, if you put a hash in scalar context, the hash returns an idea of how many key value pairs are inside the hash. Hence, if you say:

    %hash = (hash => 'name');

    $usage = %hash;

    then $usage gets the value:

    1/8

    which gives you an idea of how big the hash in fact is: it indicates that there are eight 'buckets' (places to put keys) in the hash, and only one of the buckets is filled. This gives you an idea of how fast the hash access is because the closer the number on the left to the number on the right, the more efficient the hash is. (From Perl 5.26 on, a hash in scalar context simply returns the number of keys, which is 1 here.)

    Slicing

    There is one other thing that you should be aware of with array contexts, and that is a concept called slicing. Slicing is a way to assign to part of an array or hash, without affecting the whole thing. If you say something like:

    @array[1,2,3] = @array[3,1,2];

    Then you are in effect saying:

    $array[1] = $array[3];

    $array[2] = $array[1];

    $array[3] = $array[2];

    But you are doing it all at once. This means that you don't need to make any temporary variables (the above, as stated, would simply make @array's elements 1, 2, and 3 all $array[3].)

    This syntax works for hashes, too. Hence,

    @keys = (1,2,4);

    @hash{@keys} = @hash{reverse(@keys)};

    will do the same as the following:

    $hash{1} = $hash{4};

    $hash{2} = $hash{2};

    $hash{4} = $hash{1};

    only again, you don't need any temporary variables. The one thing to remember about slicing is that the following is not desirable:

    $hash{@keys} = $hash{reverse(@keys)};

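    which, because of the '$' in '$hash', interprets each subscript in scalar context, and therefore evaluates as:

    $hash{3} = $hash{421};

    because the number of elements in @keys is 3, and reverse in scalar context joins its arguments into the string '124' and reverses that string character by character.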
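The slice assignments in this section can be sketched as follows, with sample values invented for illustration. The last two lines use scalar() to show explicitly what the subscripts turn into when a '$' is written where a '@' was intended:

```perl
use strict;
use warnings;

# Array slice: all right-hand values are read before any are assigned.
my @array = ('zero', 'one', 'two', 'three');
@array[1, 2, 3] = @array[3, 1, 2];
print "@array\n";                          # prints zero three one two

# Hash slice: swap the values stored under keys 1 and 4.
my %hash = (1 => 'a', 2 => 'b', 4 => 'c');
my @keys = (1, 2, 4);
@hash{@keys} = @hash{reverse(@keys)};
print "$hash{1} $hash{2} $hash{4}\n";      # prints c b a

# What the subscripts become in scalar context.
my $leftKey  = scalar(@keys);              # 3, the number of elements
my $rightKey = scalar(reverse(@keys));     # '421', the joined string reversed
print "$leftKey $rightKey\n";              # prints 3 421
```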

    Ways to determine context.

    One of the easiest ways of determining context is if there is an assignment involved. The rule is simple: if the variable on the left hand side has a '$', then the Perl sentence is in scalar context. If the left hand side is a list, then the Perl sentence is in list context. If the variable on the left hand side is a '@' or '%', then the Perl sentence is in array context.

    We have seen several examples above of conversions between assignments. But what happens if you don't have an assignment?

    Fortunately, there is a clear set of simple rules to determine which context a given variable is in. These determinations are:

  • by built-in function

  • by operator

  • by location

    The following sections demonstrate the rules for determining context.

    Using Built-in Functions to Determine Data Type

    Perl has a bunch of handy, built-in functions, like print and time. These are functions that provide common functionality. We cover these in the chapter 'Perl built-in functions'.

    Certain built-in functions always require lists. Therefore, a variable given to that function will always be in LIST context. Likewise, certain built-in functions always require scalars. Therefore, a variable given to them will always be in SCALAR context. A simple example of a built in function that forces a certain context is the scalar function.

    As you might expect, the scalar function forces any variable that you give it into scalar context. This:

    scalar(@value)

    will interpret @value as a scalar. So if you say something like:

    @arrayName = scalar(@arrayName);

    then @arrayName becomes a one element array, something like what is occurring in Figure 6.5.

    Figure 6.5: Built-in functions and context.
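A minimal sketch of the scalar function at work, using made-up data. The second half shows a common use: getting a count inside a place that would otherwise be list context, such as the right hand side of an array assignment.

```perl
use strict;
use warnings;

my @arrayName = ('a', 'b', 'c');

# Without scalar(), this is an ordinary array copy.
my @copy = @arrayName;
print scalar(@copy), "\n";            # prints 3

# With scalar(), the array is replaced by a one-element array holding the count.
@arrayName = scalar(@arrayName);
print "@arrayName\n";                 # prints 3
print scalar(@arrayName), "\n";       # prints 1
```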

    Following are examples of built-in functions determining contexts:

    Listing 6.1: internalFunc.p

    @array = ('arrays', 'have', 'fleas');

    print int(@array); # int always takes a scalar; hence @array is in SCALAR

    # context and this prints '3' (the number of elements in the array).

    print sort(@array); # sort takes a LIST/ARRAY, and sorts it alphanumerically

    # (by default); hence this prints 'arraysfleashave' (print adds no spaces).

    print int((1,2,3,5)); # Legal, but A MISTAKE.

    # int takes a scalar, and (1,2,3,5) is a read only list;

    # hence 1, 2 and 3 are thrown out, and this prints '5'. Use -w!

    # (Without the inner parentheses, Perl rejects the call:

    # 'Too many arguments for int'.)

    grep($_ > 10, @array); # 'grep' is a function that takes an expression as its first

    # element, an array second; hence '@array' is in array context.

    @chars = grep($_ gt 'a', split(//, function()));

    # Here, the 'split(//, function())' is in array context.

    # This says: make an array of the characters that come out of

    # function() and are greater than the letter 'a'.

    Unfortunately, the context of built-in functions is dependent on the function itself. So how do you determine the context of a given function? For a starter, you can go to the chapter 'Perl 5 built-in functions'. This will give you a pointer to the most important Perl functions and the types of arguments they take. Alternatively, you can go to the perlfunc man page which lists all of the internal functions and the arguments they take.

    The perlfunc man page lists:

    chr SCALAR

    This means that the chr function takes a scalar, and only a scalar. chr is a built-in function that returns the character for a given ASCII code ('print chr(72);' prints 'H' for example). If you then say:

    print chr(@arrayName);

    this is probably not going to give you what you want. @arrayName becomes a scalar, this scalar is interpreted as a number, and the character whose ASCII code is the number of elements in the array is what is actually printed. If you so happen to have 72 elements in @arrayName, then this will print 'H'!
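The chr pitfall can be sketched like this. The 72-element array and the @codes array are invented for the illustration; the map at the end shows one way to apply chr to each element when that is what you meant:

```perl
use strict;
use warnings;

# An array that happens to have 72 elements.
my @arrayName = (1) x 72;

# chr takes a scalar, so it sees the element count, not the elements.
my $char = chr(@arrayName);
print "$char\n";                      # prints H

# To convert every element, call chr once per element.
my @codes = (72, 105);
my $word  = join('', map { chr($_) } @codes);
print "$word\n";                      # prints Hi
```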

    Using Operators to Determine Data Type

    The assignment operator (=) is special. As we have seen, the equal sign determines whether something is in scalar or array context based on the left hand side of the expression.

    Any other comparison or assignment operator, such as '.=' (which adds a scalar onto the end of another scalar) or '==' (which compares two numbers together) forces each of the variables on either side of the expression to be scalars, and the whole expression to be in scalar context. If you say something like:

    if ($size > @arrayName) { ... }

    Then Perl is interpreting it as in Figure 6.6:

    Figure 6.6: A non-assignment operator forcing scalar context.

    Here, the operator '>' indicates that both the scalar $size and the array @arrayName are going to be interpreted as scalars, and that the example above can be read as: 'if the scalar $size is greater than the number of elements in the array @arrayName'.

    This rule for scalar interpretation includes all the comparison operators (==, ne, >) and the increment/decrement operators (++, --). Perl 5 gives extremely strange errors if you try to do such weird things as decrement an array (@arrayName--), or increment a hash (%hashName++). Simply don't attempt these. They won't do anything useful, and you may even end up crashing Perl and getting a core file!

    The statement "@arrayName--" does not mean 'take the last element off of arrayName'. Since @arrayName is in scalar context, this statement is interpreted as 'try to decrement the number of elements in @arrayName', something akin to saying:

    (3)--;

    And since the number of elements in an array is a read only value, this is a syntax error. Use pop instead, like:

    $element = pop(@arrayName);

    Here are a few examples of contexts for variables with operators other than assignment. Note that they are all in SCALAR context.

    Listing 6.2: errors.p

    $a++; # incrementor of $a. Since SCALAR context.

    @a--; # decrementor of @a? SCALAR context natively on an array. SYNTAX ERROR!

    $a[1]++; # better. $a[1] represents a scalar, hence is viable in SCALAR context.

    if (%a == %b) # legal syntax -- but A MISTAKE

    { # %a resolves to SCALAR context, as well as %b.

    print "matched!\n";

    }

    Are you trying to compare two hashes? This does not work, since you are interpreting '%a' and '%b' in scalar context! You need to go through each element in the hash one at a time:

    die if ($a{'a'} eq $b{'a'});

    #legal and better. $a{'a'} represents SCALAR and is viable in scalar context.

    do

    {

    subroutine('a');

    } if (@a > @b);

    # legal -- but REALIZE WHAT YOU ARE

    # COMPARING. NOT the elements of @a and

    # @b, but the number of elements in each

    # set.. @a is a array in SCALAR context, and

    # resolves to a scalar (no. of elements.)

    Are you trying to compare two arrays, element by element? This does not work, either. You need to go through each element one at a time.

    @a .= @b;

    # Are you concatenating two arrays?

    # NO. this is a syntax error, since @a

    # resolves to a SCALAR as well as @b. Use push instead.

    This does not work since @a and @b, joined by '.=' (an operator other than the plain assignment operator '='), are both scalar values. Hence @a is read only in this context, and unwriteable. Use push instead, something like:

    push(@a, @b);
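A short sketch of the two replacements recommended in this section, push in place of '.=' and pop in place of '--', using made-up arrays (named @first and @second here):

```perl
use strict;
use warnings;

my @first  = (1, 2);
my @second = (3, 4);

# Concatenate: append every element of @second to @first.
push(@first, @second);
print "@first\n";                     # prints 1 2 3 4

# Shorten: remove the last element and keep it.
my $element = pop(@first);
print "$element\n";                   # prints 4
print "@first\n";                     # prints 1 2 3
```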

    Finally, consider this comparison example:

    if ((3,4,5) < (4,5,6)) # legal -- but A MISTAKE.

    { # < forces construct into SCALAR context.

    print "HERE!!!\n"; # Since (3,4,5) and (4,5,6) are lists,

    } # (3,4) and (4,5) are thrown away, and only

    # 5 and 6 are compared together.

    Again, since '(3,4,5)' and '(4,5,6)' are both in scalar context, this expression simply compares 5 and 6, and throws away the other two elements. Consider what you are trying to do, as well. Compare each individual element with each other, and only print 'HERE' if one is always less than the other? Or compare the number of elements in the array? In each case this would have a different solution, so be clear about what you want to do.
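If the intent is 'every element on the left is less than its partner on the right', one possible approach is an explicit loop, sketched below. The arrays and the flag names $sameLength and $allLess are assumptions made for this illustration:

```perl
use strict;
use warnings;

my @left  = (3, 4, 5);
my @right = (4, 5, 6);

# Comparing the arrays themselves compares only their element counts.
my $sameLength = (@left == @right) ? 1 : 0;

# Compare element by element; any failure clears the flag.
my $allLess = $sameLength;
for my $i (0 .. $#left) {
    $allLess = 0 unless $sameLength && $left[$i] < $right[$i];
}

print "$sameLength $allLess\n";       # prints 1 1
print "HERE!!!\n" if $allLess;        # prints HERE!!!
```

The same loop shape works for hashes if you walk over the keys of one hash and look each key up in the other.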

    Advanced Contexts

    Let’s go over a couple more examples just to hammer the points in. So far, we have been a bit simple, in that we have not done any subscripting on variables, and we haven't tried tricks. The three Perl symbols ('$'='scalar', '@'='array', and '%'='hash') can be extremely powerful to express what you need to say. But they can also be very misleading if you type them incorrectly. The following shows another wrinkle of how Perl determines contexts, and the order in which they evaluate:

    @arrayIndexes = (1,2,3);

    @arrayIndexes2 = (2,1,3);

    $arrayName[@arrayIndexes] = $arrayName[@arrayIndexes2];

    # tricky. AND A MISTAKE.

    # @arrayIndexes turns into a SCALAR context, as

    # well as @arrayIndexes2, because the left hand

    # side variable ('$arrayName[@arrayIndexes]') is a

    # scalar. You probably meant

    # @arrayName[@arrayIndexes] = @arrayName[@arrayIndexes2]

    # if you want to copy an array slice.

    Again, even a mess like the above example can be evaluated if you know the rules about contexts, and remember how to determine what is a scalar, and what is an array. The first thing to notice about the above example is that the left hand side of the equation, $arrayName[@arrayIndexes], denotes a SCALAR, and NOT an array, since it starts with a '$'. Therefore, this assignment is in scalar context, and both @arrayIndexes and @arrayIndexes2 get evaluated as scalars. This is just a fancy way of saying:

    $arrayName[3] = $arrayName[3];

    You probably meant something like:

    @array[@subscripts] = @array2[@subscripts];

    # somewhat tricky as well. OK if you are copying an array slice.

    # @array is in ARRAY context; hence

    # @subscripts is in array context on both

    # sides. Hence, if the @subscripts are

    # numeric, this will copy the array slice.

    instead, since '@array[@subscripts]' now denotes an array (because it begins with an '@'), and the expression is evaluated in array context.
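To make the difference concrete, here is a hedged sketch with invented data that runs both forms side by side: the '@' form copies a slice, while the '$' form copies the single element whose index is the number of subscripts.

```perl
use strict;
use warnings;

my @source     = ('s0', 's1', 's2', 's3');
my @subscripts = (1, 3);

# Slice copy: elements 1 and 3 are copied.
my @target = ('t0', 't1', 't2', 't3');
@target[@subscripts] = @source[@subscripts];
print "@target\n";                    # prints t0 s1 t2 s3

# The mistake: @subscripts becomes its element count (2) on both sides.
my @target2 = ('t0', 't1', 't2', 't3');
$target2[@subscripts] = $source[@subscripts];
print "@target2\n";                   # prints t0 t1 s2 t3
```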

    See the next example to see how tricky context can become. Try to follow along. If this makes sense to you, then you are well on your way to understanding how contexts work.

    @array1 = (1,2,5,4,3);

    @array2 = ('this','is','a','mistake');

    $scalar = (@array1, @array2); # another tricky one, and a

    # MISTAKE. @array1 and @array2 are

    # in SCALAR context because of $scalar

    # on the left hand side. Hence,

    # this is evaluated as

    # $scalar =

    # (number of elems in @array1,

    # number of elems in @array2);

    # and since lists and scalars don't mix,

    # the first element in the list gets

    # dropped off in translation to a scalar.

    # Hence, in this case

    # $scalar gets assigned the value 4, which is the number

    # of elements in @array2!
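The following sketch runs the mistaken statement and two alternatives that say what was probably meant. The names $total and @combined are assumptions for this illustration, and the 'void' warning category is switched off around the mistake only so that the sketch runs quietly:

```perl
use strict;
use warnings;

my @array1 = (1, 2, 5, 4, 3);
my @array2 = ('this', 'is', 'a', 'mistake');

my $scalar;
{
    # Perl would warn that @array1 is thrown away here.
    no warnings 'void';
    $scalar = (@array1, @array2);
}
print "$scalar\n";                    # prints 4, the size of @array2

# If you wanted the combined element count, add the two counts.
my $total = @array1 + @array2;
print "$total\n";                     # prints 9

# If you wanted the combined elements, assign to an array.
my @combined = (@array1, @array2);
print scalar(@combined), "\n";        # prints 9
```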

    As you can see, context can get complicated in a hurry, just as the English language can get complicated in a hurry. Remember, one person's idea of complex may be well within the comfort area of another person. One way to deal with Perl’s complexity is to avoid it. If you are not comfortable with doing something like:

    @arrayName[3,1,2] = @arrayName[1,2,3];

    to be a shorthand for

    $arrayName[3] = $arrayName[1];

    $arrayName[1] = $arrayName[2];

    $arrayName[2] = $arrayName[3];

    Then don't do it! It's better to be explicit and understand what you are doing, than to confuse yourself by complicated syntax. However, being explicit also clutters up your code. The best thing is to learn how Perl deals with variables via the constructs above, and then apply these principles to your own code.

    Using Location to Determine Data Type

    Location rules also dictate how Perl handles variables. There are four major rules to remember here. We preface the rules with the punctuation marks you need to look for:

    1) " " - scalars and arrays are interpreted in SCALAR context inside quotation marks, in a process known as interpolation.

    The relevant syntax for interpolation looks like this:

    print "@arrayName\n";

    2) ( ) - user functions: scalars, arrays, and hashes are interpreted in ARRAY context when they fall within user functions.

    The relevant syntax for function calls looks like this:

    myfunction($scalar1,$scalar2);

    3) [ ] - array references: scalars, arrays, and hashes are interpreted in ARRAY context when they fall within [ ], the symbols for array references.

    The relevant syntax for array references looks like this:

    $arrayRef = [$scalar1, $scalar2, @scalar3 ];

    4) { } - hash references: scalars, arrays and hashes are interpreted in ARRAY context when they fall within { }, the symbols for hash references.

    The relevant syntax for hash references looks like this:

    $hashref = {'key1' => 'value1', 'key2' => 'value2'};

    Let's go through each of these in detail.

    Context rules with Interpolation

    Scalars and arrays inside the double quotes ("..") are interpreted in SCALAR context, as far as the variable to the left hand side of any equal sign is concerned. There is a further twist here. Through the trick of interpolation, arrays are expanded into their corresponding elements, rather than denoting the number of elements in the array. Therefore, if you say something like:

    @arrayName = (1,2,3,4);

    $scalarName = "@arrayName";

    This sets $scalarName to '1 2 3 4'. Note the difference here. If you had just said:

    $scalarName = @arrayName;

    Then $scalarName would have gotten the value '4' instead. Perl is doing the following, in Figure 6.7:

    Figure 6.7: Interpolation in Perl.

    Hashes are not expanded inside double quotes. Neither are lists. The following does not do what you might expect, and print out 'this is a hash' and '1 2 3 4' respectively:

    %hash = ("this" => "is", "a" => "hash");

    print "%hash";

    print "(1,2,3,4)";

    Instead it prints out '%hash' and '(1,2,3,4)'. It treats the '%' and '(' as characters, not special symbols.
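A minimal sketch of these interpolation rules, with invented data. The last step copies the hash into an array first, which is one way to get its contents into a string:

```perl
use strict;
use warnings;

my @arrayName = (1, 2, 3, 4);
my %hash      = (this => 'is');

my $scalarName = "@arrayName";        # elements joined by spaces
my $count      = @arrayName;          # number of elements
my $hashText   = "%hash";             # hashes are not interpolated

print "$scalarName\n";                # prints 1 2 3 4
print "$count\n";                     # prints 4
print "$hashText\n";                  # prints %hash

# Flatten the hash into an array, then interpolate the array.
my @pairs = %hash;
print "@pairs\n";                     # prints this is
```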

    Special symbols in interpolation

    So what happens if you decide that you actually want to print out a '@', or a '$'? Well, you have two options:

    1. backslash that character in double quotes. The following statement:

    print "\@array\n";

    prints out '@array'.

    And note, you don't need to worry about printing spurious backslashes. If you backslash a punctuation character that doesn't need backslashing, the backslash disappears. In other words, "\@" is ALWAYS a @, "\\" is always a \, and "\#" is always a #, even though '#' does not need backslashing. (Letters are another matter: "\n" and "\t", for instance, have special meanings of their own.)

    2. Use single quotes instead.

    Interpolation only works with double quotes. If you say:

    print '$scalarName';

    It will do exactly as you told it to do, which is print the string '$scalarName'. Single quotes are your way to tell Perl to be literal. The '$', '@', '%', etc. actually mean a dollar sign, an at sign and a percent sign, rather than signifying a variable. If you want a literal ' (single quote), you need to backslash it. Apart from a backslash that sits right in front of a quote or another backslash, this is the only case in which you need to backslash inside single quotes.

    This:

    print 'This is an @ (at) sign';

    prints out "This is an @ (at) sign". And:

    print 'This is a \' (single quote)';

    prints out "This is a ' (single quote)". And

    print 'This is a \ (backslash)';

    prints out 'This is a \ (backslash)'.
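The quoting rules above can be checked with a short sketch. The strings are illustrative only:

```perl
use strict;
use warnings;

my $scalarName = 'value';

my $double    = "\@array costs \$5";               # backslashes stop interpolation
my $single    = '$scalarName';                     # single quotes are literal
my $quote     = 'This is a \' (single quote)';     # \' gives a single quote
my $backslash = 'This is a \ (backslash)';         # a lone backslash stays put

print "$double\n";       # prints @array costs $5
print "$single\n";       # prints $scalarName
print "$quote\n";        # prints This is a ' (single quote)
print "$backslash\n";    # prints This is a \ (backslash)
```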

    Contexts and function calls

    As we saw last chapter, when you are passing variables into a function, they are interpreted strictly as an array. We didn't say it at the time, but here is a perfect example of the importance of context in Perl. This is a case in which the assignment's role in determining context takes a back seat to the function call's role in determining context. When you say something like:

    $return_value = functionName(@array1, @array2);

    You might think that '@array1, @array2' would be somehow munged into a scalar because of the equals sign. But no, the arguments are insulated from being translated into a scalar simply because they are in a function call.

    This logic allows you to pass in arguments as you please. On the other side of the tracks, so to speak, the function call looks like:

    sub functionName

    {

    my (@arguments) = @_;

    # ... function body here.

    return($return_value);

    }

    Arguments are passed as one flat list, and the special variable '@_' holds all of the values passed in to the function. What's going on pictorially here is:

    [Figure 6.8: Function names and contexts]

    The boxes indicate what is being copied to, and where. Attentive readers will recognize this as the figure included in last chapter. For more details and pitfalls on writing user functions, please refer to the last chapter.
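
    As a small check of this insulation, the sketch below passes two arrays to a function whose result is assigned to a scalar. The helper count_args is hypothetical, written only for this demonstration.

```perl
use strict;
use warnings;

# Hypothetical helper: reports how many arguments it received.
sub count_args {
    my (@arguments) = @_;    # every argument arrives in one flat list
    return scalar @arguments;
}

my @array1 = (1, 2, 3);
my @array2 = (4, 5, 6, 7);

# Scalar on the left of =, yet the arguments stay in list context.
my $return_value = count_args(@array1, @array2);

print "$return_value\n";
```

    If the arguments had been coerced into scalars, the function would have received two numbers; instead it receives all seven elements.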

    Array References and Contexts.

    As we shall see, references are Perl's way of making 'pointers' to data (so you can make complicated data structures, a two-dimensional array being an array of pointers to other arrays). A reference constructor, denoted by '{}' (for hash references) and '[]' (for array references), always produces a special type of scalar whose contents are insulated from their surroundings. In other words, the elements inside it cannot be coerced into scalars. This means that if you do something such as:

    @array1 = (1,2,3); @array2 = (4,5,6,7);

    $arrayRef = [@array1, @array2];

    @array1 and @array2 are not interpreted as 'the number of elements in @array1 and @array2'. This means that $arrayRef above does NOT become 4 (the number of elements in @array2, with @array1 thrown away), as it would if you said:

    $arrayRef = (@array1, @array2);

    Instead, think of the internals of [ ] and { } as a mini LIST context, insulated from the outside world, just like subroutine calls. Therefore, $arrayRef becomes [1,2,3,4,5,6,7], which is read as '$arrayRef points to the array that has the values [1,2,3,4,5,6,7]'.

    Likewise, just like in the case with lists, if you put TWO arrays into one of the above constructs, then they will be interpreted internally as one big list and lose their identity.
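
    The difference between the two forms can be checked directly. This is a minimal sketch; the 'no warnings' line is there only because Perl would otherwise warn about the discarded array in the parenthesized form.

```perl
use strict;
use warnings;

my @array1 = (1, 2, 3);
my @array2 = (4, 5, 6, 7);

# Inside [ ], both arrays are flattened into one anonymous array.
my $arrayRef = [@array1, @array2];
my $count    = scalar @$arrayRef;

# With plain parentheses the comma operator runs in scalar context:
# @array1 is thrown away and @array2 yields its element count.
my $scalar;
{
    no warnings 'void';
    $scalar = (@array1, @array2);
}

print "reference holds $count elements: @$arrayRef\n";
print "parenthesized form gives: $scalar\n";
```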

    Here are some more examples of this insulation.

    @array = ([1,2,3,4,5],[6,7,8,9,10]);

    Here, each of the array references is treated as a scalar, and @array becomes 2 elements long, each being an array reference. This is, by the way, Perl's way to construct a two dimensional array. (we shall see next chapter how to get the data out!) This example:

    sort([1,2,3,4,5]);

    is just plain nonsense, since the built-in function sort expects a list, and you are passing it a single reference (which, being a scalar, becomes a one-element list).

    And:

    if ({'a' => 'hash'} > {'another' => 'hash'})

    is also nonsense, since this does not somehow get the number of elements in the hashes and compare them (as 'if (@a > @b)' would do with arrays). Instead, it compares the two references as numbers, which means comparing their memory addresses, and that tells you nothing useful.

    Again, the importance of special characters rears its ugly head in Perl, and if you haven't gotten them straight, it will cause you no end of pain. Remember:

    1 $scalar = (@array);

    and

    2 $scalar = [@array];

    are very different! 1 makes $scalar the number of elements in @array; 2 makes $scalar a reference to a new anonymous array holding a copy of @array's elements. (A reference to @array itself is written \@array.)
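
    The following sketch puts the parenthesized and bracketed forms next to a plain \@array reference, then changes the array to show which of them notice. The variable names are illustrative only.

```perl
use strict;
use warnings;

my @array = (1, 2, 3);

my $count = (@array);    # scalar context: number of elements
my $copy  = [@array];    # reference to a new array holding a copy
my $alias = \@array;     # reference to @array itself

push @array, 4;          # only $alias sees this change

print "count: $count\n";
print "copy:  @$copy\n";
print "alias: @$alias\n";
```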

    Control structures and contexts.

    Control loops have their own special rules for contexts. However, most of these rules are fairly natural and pretty elegant (if you think about it). Table 6.1 shows the basic rules:

    Table 6.1

    Construct           Context
    ------------------  -------
    if, unless (...)    scalar
    while (...)         scalar
    foreach (...)       list
    for (;;)            scalar
    for (...)           list

    The first for construct is the standard 'for ($xx = 0; $xx < 10; $xx++)'. The second for is the less standard 'for (@array)' construct, which is really a synonym for foreach.

    The main thing to remember here is that each one of these context rules works fairly well with its associated construct. Consider while, for instance. while's job is to evaluate the expression which exists inside the parentheses after it, and, if true, evaluate the code associated with its block. This example of a while loop:

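    while (defined($line = <FD>))

    {

    }

    evaluates the expression 'defined($line = <FD>)' once per pass, using the knowledge that <FD> will return the first line, then the second line, and so forth, and finally an undefined value at the end of the file. (The parentheses after defined matter: without them Perl tries to assign to 'defined $line' and refuses to compile.) while therefore evaluates this in scalar context. Hence,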

    while (@array)

    {

    }

    is either an infinite loop or a no-operation (unless the body changes @array), since @array evaluates to the number of elements in the array: either a number greater than zero (in which case while evaluates it as true, and it is an infinite loop) or zero (in which case it evaluates to false, and the while loop ends). Therefore, it makes sense for while to evaluate expressions in a scalar context.

    Likewise, it makes sense for foreach to evaluate expressions in a list context. The job of foreach is to iterate over a list of elements. This means that when you say:

    foreach $line (<FD>)

    {

    print "$line";

    }

    you are basically getting the same result as the equivalent while loop (printing out the lines in the file denoted by the file handle FD), but realize that with <FD> being evaluated in list context, you are slurping the entire file into a list, which foreach subsequently processes.

    On the other hand, with this code:

    while (defined($line = <FD>))

    {

    print "$line";

    }

    the scalar context makes it so that only one line is being processed at a time.
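
    Here is a self-contained sketch of the two loops. It assumes an in-memory filehandle (opened on the made-up string $text) in place of FD, so that it runs without a real file.

```perl
use strict;
use warnings;

my $text = "one\ntwo\nthree\n";

# In-memory filehandle standing in for FD (an assumption for the demo).
open(my $fh, '<', \$text) or die "cannot open in-memory file: $!";

my @seen_by_while;
while (defined(my $line = <$fh>)) {    # scalar context: one line per read
    push @seen_by_while, $line;
}

open($fh, '<', \$text) or die "cannot reopen in-memory file: $!";

my @seen_by_foreach;
foreach my $line (<$fh>) {             # list context: all lines read up front
    push @seen_by_foreach, $line;
}
close($fh);

print "while saw ", scalar(@seen_by_while), " lines\n";
print "foreach saw ", scalar(@seen_by_foreach), " lines\n";
```

    Both loops visit the same lines; the difference is only in how much of the file is held in memory at once.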

    Finally, consider if. Since if does things in a scalar context, you can say things like:

    if (@array)

    {

    }

    to test whether @array has any elements or not, or:

    if ($returnValue = myFunction())

    {

    }

    which will test if the function 'myFunction()' returns a true or false value, and only do the associated if clause if $returnValue is true.

    Summary

    Although there are only five basic rules for understanding 99% of contexts, they can get interesting fast. You can twist and bend Perl syntax into whatever shape that you want, although that is not the best policy sometimes.

    Instead, the point is that if you understand the examples above, you will know contexts pretty well, at least well enough to come up with clever solutions of your own. Hopefully, your code will not be as 'flashy' as some of the examples below. The code below trades understandability for terseness. You should weigh how much of this is worth it against the ability to understand your own code down the pike, let alone have somebody else understand it.

    Here are the five rules again, for review:

    1 If the variables are tied together by an operator that is not the assignment operator (=), then the variables on both the left and right hand sides are in scalar context.

    2 If the variable is inside a special function then that function determines the variable context.

    3 If the variables are tied together by an assignment operator (=), then the left hand side of the statement determines the right hand side's context.

    4 If you have variables in certain contexts (user defined functions, array references, and hash references) then those variables are in list context, and 'insulated' from assignment. (i.e.: in $a = myFunction(@b);, @b does not get coerced to be a scalar, but is a list)

    5 Items in a while loop are natively evaluated in scalar context, as are items in an 'if' clause and the test in a 'for (;;)' loop. Items in a foreach loop (or in 'for' used as a synonym for foreach) are in list context.
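
    The five rules can be sketched in a few lines each. Everything named here (@b, how_many and the result variables) is made up for the illustration.

```perl
use strict;
use warnings;

my @b = (10, 20, 30);

# Rule 1: a non-assignment operator puts both sides in scalar context.
my $sum = @b + 1;                     # element count plus one

# Rule 2: a built-in function decides the context of its arguments.
my $joined = join('-', @b);           # join takes a list

# Rule 3: with =, the left-hand side decides the right-hand side's context.
my $count   = @b;                     # scalar on the left: element count
my ($first) = @b;                     # list on the left: first element

# Rule 4: user-function arguments and [ ] are insulated list contexts.
sub how_many { return scalar @_ }     # hypothetical helper
my $args = how_many(@b, @b);
my $ref  = [@b, @b];

# Rule 5: if and while test in scalar context; foreach iterates a list.
my $nonempty = 0;
if (@b) { $nonempty = 1 }
my $total = 0;
foreach my $item (@b) { $total += $item }

print "$sum $joined $count $first $args ", scalar(@$ref), " $nonempty $total\n";
```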

    Now let's take a look at how Perl's syntax can bend using these five rules. One of the main complexities in Perl syntax, as we shall see, is that you can arrange it so that the output of one expression is used as the input of another. This process is known as chaining. With chaining, the possibilities for manipulating data are endless, as we shall see next.

    Examples

    We said earlier that Perl's approach to contexts sets Perl apart from other languages. This section is designed to show off that flexibility.

    One of the big things that Perl does is let you fit functions together as if they were tinker toys. Say that a function happens to return an array. Then the output of that function could be passed on to another function, which in turn alters the output in some way. Look at this example:

    my @uppercase = split(//, uc($variable));

    where $variable = 'this must be so' returns

    ('T','H','I','S',' ','M','U','S','T',' ','B','E',' ','S','O');

    Taking the result of this, and then reversing it, say:

    my @uppercase = reverse(split(//, uc($variable)));

    returns

    ('O','S',' ','E','B',' ','T','S','U','M',' ','S','I','H','T');

    Now go ahead and join the resulting string:

    my $uppercase = join('',reverse(split(//,uc($variable))));

    returns:

    'OS EB TSUM SIHT'

    In short we have done something that looks like Figure 6.9:

    [Figure 6.9: Stringing functions together]

    With a few lines of code, we have done a relatively complicated task: reversing a string and making it uppercase, no less! We did this by taking relatively simple functions (join, reverse, split, uc) and realizing the contexts that those functions require. The only thing remotely difficult here is the 'split(//' part, and that is because split takes a regular expression in the first part. This results in splitting up the string into component characters, which we will talk about at length in chapter 11.

    We have also happened to re-invent the wheel. The following does exactly the same thing, without dealing with split:

    my $reverseUC = uc(reverse('this must be so'));

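    This works because reverse, when it is called in scalar context (as it is here, since uc expects a scalar), reverses the characters of a string instead of the order of a list.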
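
    A quick way to convince yourself that the two approaches agree is to run them side by side. In this sketch the word 'scalar' is added in front of reverse purely to spell out the context; that is a choice made for the demonstration, not something the shorter form requires.

```perl
use strict;
use warnings;

my $variable = 'this must be so';

# The long way: uppercase, split into characters, reverse the list, rejoin.
my $long = join('', reverse(split(//, uc($variable))));

# The short way: reverse the string itself, then uppercase it.
my $short = uc(scalar reverse($variable));

print "$long\n$short\n";
```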

    There are two points to be made here. First, since it is relatively easy to tell, even by trial and error, what contexts functions take, it is easy to do this stacking. After all, there are only two choices at any given point, scalar or list. Second, it pays huge dividends to know what Perl's internal functions do, inside and out. You'll save yourself a lot of time just by not doing this elaborate stacking.

    With this in mind, here are some other examples of putting contexts together to do pragmatic things.

    Example #1: Reversing a hash:

    The statement:

    %hash = reverse(%hash);

    takes a hash that looks like:

    %hash = ('key1' => 'value1', 'key2' => 'value2','key3' => 'value3');

    and turns it into a hash that looks like:

    %hash = ('value1' => 'key1','value2' => 'key2','value3' => 'key3');

    It works, again, because of contexts. Since reverse takes a list as its argument, %hash is flattened into a list, something like:

    ('key1','value1','key2','value2','key3','value3');

    Reverse then reverses it, to become:

    ('value3','key3','value2','key2','value1','key1');

    This then gets 'stuffed' back into the hash, with the values now becoming the keys (the odd elements) and the keys becoming the values (the even elements). Note that this 'reversal' is non-deterministic when values repeat: if two or more keys share the same value, only one of them survives, and which one depends on the order in which Perl happens to store the hash, i.e.:

    ('key1' => 'value1','key2' => 'value1','key3' => 'value1');

    becomes:

    ('value1' => 'key1') or ('value1'=> 'key2') or ('value1' => 'key3')
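
    Both behaviors can be checked with a short sketch. The hash names %inverted, %dupes and %collapsed are invented for the demonstration.

```perl
use strict;
use warnings;

my %hash     = ('key1' => 'value1', 'key2' => 'value2', 'key3' => 'value3');
my %inverted = reverse(%hash);

# Duplicate values collapse to a single key. Which original key survives
# depends on the hash's internal order, so only the count is predictable.
my %dupes     = ('key1' => 'value1', 'key2' => 'value1', 'key3' => 'value1');
my %collapsed = reverse(%dupes);

print join(', ', map { "$_ => $inverted{$_}" } sort keys %inverted), "\n";
print scalar(keys %collapsed), " key left after collapsing\n";
```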

    Example #2: Reading from standard input until a certain string is typed.

    This is used all the time. If you say:

    while (($line = <STDIN>)!~m"end")

    {

    print (join('', reverse(split(//, $line)))); # sample code

    }

    then $line = <STDIN>, in scalar context, will catch a line of input that someone types in at a keyboard. The '!~ m"end"' then checks that the line does not have the string 'end' in it. (Again, see the section on regular expressions for more detail; they are absolutely essential for understanding Perl!)

    If it doesn't have the string 'end' in it ('!~' means 'doesn't match'), then go ahead and run the body of the while loop, which in this case means printing out the reverse of what they just typed. Be aware that this loop does not notice the end of input by itself: at end of file <STDIN> returns an undefined value, which does not match 'end' either, so somebody has to type 'end' for the loop to stop.
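
    Here is a hedged variation that can be run without a keyboard. It assumes an in-memory filehandle over the made-up string $input in place of STDIN, and it adds a defined test so that the loop also stops at end of input.

```perl
use strict;
use warnings;

my $input = "hello\nworld\nthe end\nnever read\n";
open(my $in, '<', \$input) or die "cannot open in-memory input: $!";

my @reversed;
while (defined(my $line = <$in>)) {    # stops cleanly at end of input
    last if $line =~ m"end";           # stops at the sentinel word
    chomp($line);
    push @reversed, join('', reverse(split(//, $line)));
}
close($in);

print "$_\n" for @reversed;
```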

    Example #3: Splitting up a string into chunks 10 characters long:

    This example is a 'teaser'. It shows you what you can do with regular expressions, and is meant to entice you to read the section on them (regular expressions really are quite useful).

    Anyway, here is the code:

    $line = 'aaaaaaaaaabbbbbbbbbbcccccccccc';

    @split = ($line =~ m".{1,10}"sg);

    A little bit of explanation is in order. The m".{1,10}" means match one to ten characters in the string $line ('.' means any character, and {1,10} means one to ten of them, the more the better). Of the trailing sg, the s lets '.' match newlines as well, and the g asks for every match in the string rather than just the first. In list context a match with g returns all of the matched pieces of text, which then go into the array. Something like Figure 6.10:

    [Figure 6.10: Regular Expression array context]

    This does quite a bit of work, something equivalent to the code:

    $counter = 0;

    while ($counter < length($line))

    {

    push(@split, substr($line, $counter, 10));

    $counter+=10;

    }

    which itself is a good example of contexts in general (and which you might want to do yourself until you have got regular expressions down).
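
    To check that the two versions really agree, the sketch below runs both on a string whose length is not a multiple of ten. The names @by_regex and @by_substr are made up for the comparison.

```perl
use strict;
use warnings;

my $line = 'aaaaaaaaaabbbbbbbbbbcccccccccc12345';

# Regular expression in list context: every chunk of up to ten characters.
my @by_regex = ($line =~ m".{1,10}"sg);

# The explicit loop: step through the string ten characters at a time.
my @by_substr;
my $counter = 0;
while ($counter < length($line)) {
    push(@by_substr, substr($line, $counter, 10));
    $counter += 10;
}

print join('|', @by_regex), "\n";
print join('|', @by_substr), "\n";
```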

    Example #4: Checking to see if a file has the same number of lines as another file:

    use FileHandle;

    my $FD1 = new FileHandle("file1");

    my $FD2 = new FileHandle("file2");

    if (@{[ <$FD1> ]} == @{[ <$FD2> ]})

    {

    print "Same Number of lines!\n";

    }

    This example uses our old trick to make function calls into arrays, i.e.: @{[ functionCall()]}. Now, you should have an idea of how to decode this (when we get to references, you'll have even more of an idea). When you say:

    [<$FD1>]

    it reads the filehandle $FD1 in list context, getting all of the lines out and putting them into an anonymous array reference. The construct:

    @{[ <$FD1> ]}

    then dereferences the reference, turning it into a real array. And finally:

    if ( @{[ <$FD1> ]} == @{[ <$FD2> ]} )

    takes the two arrays, turns them into scalars (meaning the number of elements in the array) and then compares that value, to see if they are equal. If you don't want that much magic going on, you can say:

    @lines1 = <$FD1>;

    @lines2 = <$FD2>;

    if (@lines1 == @lines2)

    which is probably a saner way of doing it, anyway.
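
    Both forms can be tried without real files. This sketch assumes in-memory filehandles over two made-up strings, and the helper open_string is hypothetical, written only to keep the demonstration short.

```perl
use strict;
use warnings;

my $text1 = "a\nb\nc\n";
my $text2 = "x\ny\nz\n";

# Hypothetical helper: opens a string as a read-only in-memory file.
sub open_string {
    my ($string_ref) = @_;
    open(my $fh, '<', $string_ref) or die "cannot open in-memory file: $!";
    return $fh;
}

# Terse form: read, wrap in a reference, dereference, compare the counts.
my ($FD1, $FD2) = (open_string(\$text1), open_string(\$text2));
my $same_terse = (@{[ <$FD1> ]} == @{[ <$FD2> ]}) ? 1 : 0;

# Explicit form: read into named arrays, then compare the counts.
($FD1, $FD2) = (open_string(\$text1), open_string(\$text2));
my @lines1 = <$FD1>;
my @lines2 = <$FD2>;
my $same_explicit = (@lines1 == @lines2) ? 1 : 0;

print "Same Number of lines!\n" if $same_terse && $same_explicit;
```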

    Example #5: Finding out whether or not one file has more occurrences of the word 'the' than another file.

    Just to show you how sickening contexts can get, this last example, in one line (OK, technically six), counts the number of occurrences of the word 'the' in one file and compares it to the number of occurrences in another file. Here's the code (brace yourself):

    undef $/; # makes it so perl gets the whole file in one <> read (see special variables)

    my $FD1 = new FileHandle("file1");

    my $FD2 = new FileHandle("file2");

    if (@{[ <$FD1> =~ m"\bthe\b"sg ]} > @{[ <$FD2> =~ m"\bthe\b"sg ]})

    {

    print "file1 has more occurrences of the word 'the' than file2!\n";

    }

    Confused yet? Well, again, it is simply a matter of unraveling contexts. (Undefining $/ makes a single read of <> return the whole file at once.) Again:

    [ <$FD1> =~ m"\bthe\b"sg ]

    is in list context. The <$FD1> part means 'read all of the data from $FD1' (because $/ is undefined; if it were not, this would get just one line). Then the regular expression matches every occurrence of 'the' that stands as a word on its own (\b means match at a word boundary). A string like:

    the sin of the flesh

    would match at both places where 'the' appears, but

    theocracy

    would not, because there is no word boundary after 'the' in theocracy. Anyway, for the string 'the sin of the flesh',

    [ <$FD1> =~ m"\bthe\b"sg ]

    becomes

    [ 'the','the']

    because of the two occurrences of the word 'the' that matched. Hence:

    @{[ 'the','the']}

    becomes the array:

    @array = ('the','the')

    which in the scalar context:

    if (@{[ ... ]} > @{[ ... ]})

    becomes the number:

    2

    which indicates the number of occurrences of the word 'the' in the string 'the sin of the flesh'. Whew!

    You probably would be better off doing this:

    undef $/;

    my @words1 = split(' ', <$FD1>);

    my @words2 = split(' ', <$FD2>);

    my @the1 = grep(m"\bthe\b", @words1);

    my @the2 = grep(m"\bthe\b", @words2);

    if (@the1 > @the2)

    {

    }

    which does the same thing as the above example, but makes it a little more explicit. grep, if you aren't familiar with it, is Perl's way of making a 'filter' on an array. The grep above, for example, only lets pass into the array @the1, the words that match the pattern '\bthe\b'.

    Or you may want to unroll this syntax even further. Start simple, and then get fancy.
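
    One way to unroll it is to put the counting into a small function. This is only a sketch: the helper count_the is hypothetical, and the two strings stand in for the contents of file1 and file2 so that the code runs without real files.

```perl
use strict;
use warnings;

# Stand-ins for the contents of the two files (assumed for the demo).
my $text1 = "the sin of the flesh\nthe theocracy\n";
my $text2 = "other words, then the end\n";

# Hypothetical helper: number of whole-word occurrences of 'the'.
sub count_the {
    my ($text) = @_;
    my @matches = ($text =~ m"\bthe\b"sg);    # list context: every match
    return scalar @matches;                   # scalar context: how many
}

my $count1 = count_the($text1);
my $count2 = count_the($text2);

if ($count1 > $count2) {
    print "file1 has more occurrences of the word 'the' than file2!\n";
}
```

    Note that 'theocracy', 'other' and 'then' are not counted, because in each of them 'the' lacks a word boundary on at least one side.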



    HTML conversions by Mega Space.

    This page updated on October 14, 1997 by Webmaster.

    Computing McGraw-Hill is an imprint of the McGraw-Hill Professional Book Group.
