© 1997 The McGraw-Hill Companies, Inc. All rights reserved.
Any use of this Beta Book is subject to the rules stated in the Terms of Use.

Chapter 2: Perl at 30,000 Feet: An Overview of Perl

This chapter demonstrates how to run Perl 5 and introduces Perl syntax. The following chapters build on this one, so much so that you might consider this chapter a map for the chapters to come. Before proceeding, I strongly suggest that you have Perl installed, and the man pages printed out, so that you may test the code examples given here.

If you are already familiar with Perl you might want to go to the next chapter. For a Perl veteran, the examples in this chapter will be fairly simple. However, if you are a bit rusty and want to get back into Perl programming, this chapter is for you.

Chapter Overview

One of the best ways of learning a language is to look at it as a whole, without concerning yourself about the parts. This chapter is designed around that principle: to give you a smattering of everything before we take out our microscope and focus on a bit at a time. Hence, there will be four major parts in this chapter.

First, we will go over some of the simple uses of Perl, uses which have been around as long as Perl has been around: solving problems which 'fall in the gap' between what an operating system provides and what companies provide products for.

Second, we shall go over how to actually run Perl on Unix and Windows 95/Windows NT platforms. In particular, we shall show several ways to make Perl work with the windowing platform, and some of the 'flags' that we can use with the perl executable.

Third, we give an overview of perlish syntax. This is where we enter the '30,000 feet' territory which gives the chapter its name. We consider perl variables, functions, comments, statements (simple and compound), control structures, and common perl errors. In short, we will discuss in 20 pages what we will later expand to 200.

And finally, we shall give some examples of perl in action, going over some examples of how you can become instantly productive in perl. We shall consider six such examples ranging from logging into an internet service provider to interacting with excel via OLE.

As the chapter title says, this is the 30,000 foot view. For those of you who like to be closer to the ground, the rest of the book shall go a bit more methodically.

Introduction

Perl was designed as a 'just do it' language, and many of our Perl programs today are simple time-savers. The standard saying is that 'Perl was designed as a combination of the best features of sh, C, awk, and sed', the standard UNIX toolkits. Perl, in other words, was and still is a language for tool building.

Although we will see that Perl is much, much more - it is capable of managing million dollar projects, mission critical data, and large websites - never forget that one of its primary functions was as tool builder. If you find yourself confronted with a repetitive task, chances are that you can code a simple, 10 line script that will save you lots and lots of time.

Some Simple Uses of Perl

As we have seen, the original use of Perl came out of the shortcomings of other shell tools. One of the primary uses of Perl was, and still is, to do simple administrative tasks, and do them quickly.

Consider the following tasks:

1) Emailing members of your team when a process is done.

2) Cleaning up a disk with lots of junk files on it.

3) Performing version control (checking to make sure that a database, or files, applications, etc. are up to date.)

4) Summing up a large amount of text into a report (what Perl was actually designed for in the first place!)

Consider an environment which has lots of these little, critical path problems. These tasks can't be ignored, because if you do so, then other people can't do their job. But if you pay attention to them, and do them by hand, then you can't do your job! In other words, they become the housekeeping tasks that take over a project.

If you have lots of junk files, then you potentially will run out of disk space. If an important process finishes, and you don't inform people automatically, you are setting yourself up. If you forget, they will be twiddling their thumbs waiting for an already finished process! And effective reports can save you and your coworkers lots of time.

This is where Perl is a lifesaver, and this is why I was attracted to it about 6 years ago. Perl has a very short 'ramping up' time: people can learn it, and learn it quickly. Add to that the fact that Perl has become a serious programming language (object-oriented, with debuggers, embedded documentation, and links to C/C++), and it becomes the perfect language for learning programming technique.

Perl allows you to become almost instantly productive, yet it won't 'close any doors' on you. You can learn object-oriented programming, modular programming, database interfaces, even effective data management, and at the same time be productive in your job. What a concept! You won't have to take off six months to be productive. The magic of Perl is that it allows you to learn while doing.

  1. Running Perl

There are quite a few ways of running perl; each operating system tends to develop its own. Unix users favor using the #! syntax at the beginning of their commands, Macintosh users favor running their commands through a GUI, and Windows NT users haven't actually figured out what they like the best quite yet.

Therefore, we will go through a number of these uses. We will start with the generic way of running a Perl command (which will work everywhere, except perhaps on the Macintosh) and then turn to specialized ways of running Perl on Unix and Windows.

But first of all, we shall talk about what is actually 'going on' when you run a script through perl in the 'Generic' portion below.

Running Perl Generically

Perl is an interpreted language that runs on a multitude of different operating systems. This means that you don't need to worry about not finding it on an obscure system (there are versions of it for the Amiga brand of computers, for example).*

If you really need the AmigaOS version of Perl, or one of several others, go to:

http://mox.perl.com/perl/info/software.html

and click on the /CPAN/ports link there. This will take you directly to the place where 'Alien' Perls live.

With Perl, you don't have to worry about compiling your program into a form that the computer can read, as you would C or C++. The general syntax for a Perl command that will work on pretty much any platform is:

prompt% perl ScriptName

where 'prompt%' is the prompt that you would get in a UNIX-like environment, or:

C:\> perl ScriptName

which is the equivalent syntax on NT or Windows 95, typed at an MS-DOS window (which you open by double clicking on the MS-DOS icon). In other words, Perl takes whatever input is in the file ScriptName and checks its syntax, i.e., it verifies that what you gave it is in fact a Perl file, and then executes it right then and there if it is.

Note that the above usage assumes that you have a file called 'ScriptName'. Alternatively, if you don't want to create a file, you can type out a simple one line Perl script directly on the command line by saying:

prompt% perl -e 'ScriptText'

which is simply a way for Perl to interpret the text between the single quotes on the command line as a small Perl program and run it. The following command:

prompt% perl -e 'print "Hello, World\n";'

prints out 'Hello, World', followed by a newline, on the screen and then exits.

Running Perl Tenets

As you can see, this philosophy differs quite a bit from that of other computer languages. Almost any other language, apart from languages like Tcl and shell, requires a certain amount of rigor to make something like this work.

Let's say that I was using C/C++ instead. On my UNIX box, in order to write this and use it, I'd have to:

a) open the editor

b) type in the following file and save it with a specific suffix, namely '.c':

#include <stdio.h>

int main(void)
{
    printf("Hello World!\n");
    return 0;
}

c) close my editor

d) type 'cc hello.c' (assuming that is what we named the file) to compile the program into machine language in a file called 'a.out'. The exact command differs from platform to platform.

e) type 'a.out'

This cycle is edit/formally define/compile/execute. Perl avoids it almost completely.* Look at the sample Perl code again:

prompt% perl -e 'print "Hello, World\n";'

If you are a C programmer, note the lack of a main() subroutine, the lack of declarations, and the lack of everything except the instruction to print out 'Hello, World' followed by a newline.

More accurately, Perl lets you decide how much of the cycle you want to participate in. If you want to formally define things, you can. If you want to make a subroutine main, you can. You just don't have to; it is simply not enforced by Perl. Adding as much or as little rigor as you want is a common topic throughout the book.

In other words, if you give Perl a string or file which is proper Perl syntax, whatever you have programmed will be executed. There are no manual compilation or linking steps required, and no executable is created. This is all done transparently by Perl itself.

So much for the 'simplicity of running perl' part. However, there is another facet to perl and how it executes your scripts that you will appreciate in times to come.

The other main thing to be aware of in running a Perl script is that Perl checks the syntax fully before it actually starts running. Unlike other 'simple to use' tools such as older versions of Tcl or the shell, Perl isn't so slack as to execute only part of your program before finding a syntax error and dying. There are no half measures in Perl: it checks the full syntax of the program before executing it.

Those other languages execute a program one statement at a time, which means they can go halfway through a script and then die, with very unpleasant effects. If Perl does not like the script, it will complain loudly. Listing 2.1 is an example of a simple compilation mistake:

prompt% perl -e "print 'hello, world' print 'hello, world2'"

 

syntax error at -e line 1, near "'hello, world' print"

Execution of -e aborted due to compilation errors.

This dies because there is no semi-colon between the two print statements. For more information on common syntax, and common syntax problems, see section 'General Perl Syntax' below.
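The same all-or-nothing behavior can be seen from inside a script. The following is a minimal sketch, not part of Listing 2.1: it hands the broken program text to Perl's eval function as a string, and uses a variable we have made up for illustration, $ran, to record whether any of that text was executed.

```perl
use strict;
use warnings;

# Hypothetical flag: the first statement of the broken program sets it,
# so we can tell afterwards whether any part of that program ran.
our $ran = 0;

# The same mistake as in Listing 2.1: no semicolon between the two prints.
my $broken = q{
    $main::ran = 1;
    print 'hello, world' print 'hello, world2';
};

# A string eval compiles the whole string first. If compilation fails,
# nothing in the string is executed and the message is left in $@.
eval $broken;
my $error = $@;

print "statements executed: ", ($ran ? "some" : "none"), "\n";
print "compile error: $error" if $error;
```

If Perl executed programs a statement at a time, $ran would have been set to 1 before the missing semicolon was noticed. Because the whole string is compiled first, it stays at 0.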

Now, this is not the only generic way to run perl through the command line. Perl also offers a number of 'perl switches' which ease the task of doing certain common actions with perl. The switch '-e' that we used above was one example. We now turn to look at some others.

Perl Switches

There are quite a few specialized switches that Perl recognizes when executing a Perl program. One of them, '-w', you will hear about quite a bit in the course of this book. It tells Perl to give warnings if the variables in a program are probably not being used correctly, if a function appears to be recursing infinitely deep, and about numerous other potential problems.

Common switches that we will be concerned with in this book include:

-w warning switch: you will almost always want to make this a standard. This is because it does much of the debugging for you by warning you when variables are used before they have been given a value, used only once, and lots of other things.

-d to run the Perl debugger.

-D to use the 'debugging' flags, which give indications of how Perl is parsing the program. This switch is not available by default; you need to compile Perl with the 'debugging flag' (see section 'installing Perl' for more details on how to do this). '-D' will be the focus of chapter 22 when we talk about the debugger.

-c to use the syntax checker. With '-c', Perl checks the syntax of the program without actually running it.

-S to execute a script by searching the user's path. This flag makes Perl search through your path in order to find the script named after '-S', and execute it if it has the correct permissions. It is used most often with a method for running Perl that we shall describe below.

These are just a few of the switches that are available for use in Perl. If you are interested in further command line switches, we strongly suggest that you consult the Perlrun manpage which comes with this documentation for more details.
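To give a feel for what '-w' catches, here is a small sketch. It assumes that setting the special variable $^W (the variable behind the '-w' switch) to 1 inside the script turns on the same run-time warnings; warnings that '-w' issues at compile time are not covered by this trick. The array @warnings and the variables $total and $sum are made up for the illustration.

```perl
use strict;

# Collect warnings in an array instead of letting them go to STDERR,
# so that the effect of the switch is easy to inspect.
my @warnings;
local $SIG{__WARN__} = sub { push @warnings, $_[0] };

# $^W is the variable behind '-w'; setting it here enables
# run-time warnings for the statements that follow.
$^W = 1;

my $total;               # declared, but never given a value
my $sum = $total + 1;    # arithmetic on a variable that has no value yet

print "Perl warned: $_" for @warnings;
```

Without the warning flag, the addition would quietly treat $total as zero, which is exactly the kind of slip that is painful to track down later.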

Running Perl Specifically

Unless using the perl switches, as above, hardly anybody at all ever types 'perl scriptname' to get a perl script to run. It is somewhat redundant - you want people to be focusing on the script after all, and not on the perl executable.

In this section, we will therefore be concentrating on the different ways to run Perl on Unix and Windows NT/95.

Running Perl on Unix

Unix is the parent operating system on which Perl was developed, and as such has a couple of well-entrenched ways of executing a Perl script. They are listed below.

1) The #! Method

The #! method is available in any shell that supports the #! syntax for programs, which means pretty much any shell out there. The trick is simply to put the full path to the perl interpreter at the top of the script and change the permissions of that script so it becomes executable. Therefore, if your perl binary was located in /usr/local/bin, you would say something like:

#!/usr/local/bin/perl

at the beginning of your script to run it, and then you would say a statement like:

prompt% chmod 755 script.p

to change the permissions of the script so that it is runnable by everybody. Assuming Perl is located in /usr/local/bin, the simple 'Hello world' example then becomes:

#!/usr/local/bin/perl

print "Hello World\n";

After creating this Perl program, type the following at the command line:

prompt% chmod u+x script.p

which changes the mode on the script to make it executable by the user. Then, the program script.p can be executed without explicitly typing 'perl':

prompt% script.p

Your shell internally translates this line into:

prompt% /usr/local/bin/perl script.p

which makes for a convenient way to run your scripts. You can also put flags on the #! line, hence:

#!/usr/local/bin/perl -w

turns into

prompt% /usr/local/bin/perl -w script.p

when the script is run from the command line.

2) Running scripts via 'eval'

The #! shortcut does not always work correctly, and is definitely not bombproof. If you move your script to a machine where Perl is not installed in '/usr/local/bin', or port it to Win95/NT, it will not work.

The result is something like 'command not found'. Some operating systems, like HP-UX, also mangle long paths like:

"#!/this/is/an/extremely/long/path/to/perl/so/it/will/not/work"

by simply truncating the line to the first 32 characters.

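For a much better method you can say something like:

#!/bin/sh
#! -*-perl-*-
eval 'exec perl -x -S $0 ${1+"$@"}'
    if $running_under_some_shell;

This takes advantage of the fact that sh (the Bourne shell) has a command named eval, which evaluates a given string as a shell command and so can execute whichever program named perl happens to be first in the user's path. The '-S' switch, which must come before the script name '$0', tells Perl to search the path for the script, and '-x' tells Perl to skip the '#!/bin/sh' line and start at the first '#!' line that mentions perl. When Perl itself reads the eval statement, the variable $running_under_some_shell is never set, so Perl skips the eval and runs the rest of the script.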

This otherwise horrid construct will make your scripts much more portable - at least to other Unix machines. For example, if Perl is installed in '/usr/local/bin' on one machine, and '/usr/bin' in another, you can run the same script on both machines, since the eval 'exec ..' makes it so you can dynamically determine where Perl is.

Otherwise, when you move your scripts over to the new machine, they will no longer find the perl executable, and therefore will not work.

 

Running Perl on Windows 95 and Windows NT

Since perl has become popular on Windows 95 and NT rather recently, the number of ways of running perl on NT has become large. Time will tell which one is the best, but for now I will give them all, apart from the 'generic' use which we discussed above.

As it stands, Win95/NT does not have the #! syntax available by default. The MSDOS shell 'cmd.exe' is just not sophisticated enough to understand #!, and at any rate scripts on NT are made executable by certain file extensions (like '.bat' or '.exe'), so a '.p' or '.pl' extension will not work by default. However, there are three basic ways you can run Perl scripts on the Win32 platform:

Using an NT to UNIX converter

If you want to use the #! syntax so that your scripts are totally portable from UNIX, you should install gnu-win32 and bash for Win95/NT (http://www.cygnus.com) or djgpp (http://www.delorie.com).

Both provide packages for basically turning your NT box into a Unix box. If I had to choose between the two, I'd choose djgpp because it works on a larger group of platforms (Win31(!), Win95, and Win NT), but that's a pretty slim margin. They both give you Richard Stallman's excellent GNU utilities ( http://www.fsf.org ) which make the NT world a little bit easier for Unix users.

Both products come on the CD that comes with this book, and gnu-win32 can also be found at ftp://ftp.cygnus.com/pub/gnu-win32/latest. These packages emulate UNIX functionality in a Windows environment. If you have them installed and set up, you can type:

C:\> bash

to get a working UNIX shell, and assuming that script.p has '#!/path/to/perl' as the first line, you can say:

bash$ script.p

to execute the Perl program.

This method is especially helpful for those of you who need to run Perl scripts transparently between platforms. It is a bit of a hack, but all you have to do is either:

1) install perl inside a path equivalent to the path where you have perl on your Unix box. ( something like C:/usr/local ).

2) install bash inside ( C:/bin ) and make a copy of bash.exe to sh.exe.

Then, consult the section on running perl on Unix above. Either of these ways can work wonders if you are dealing with a perl project that has to work on both machines. If you don't have this 'cross-platform runnability', you will find it an extreme annoyance when you have to change each and every script, just so perl can run on one platform or another.

Making Standalone Scripts via Batch Files

If you don't want the overhead of getting a Unix-to-NT converter (saying no to GNU), and still want to say simply 'ScriptName.bat' to invoke a script, the Windows release of Perl provides two different scripts in the standard distribution to help you out.

1) the script pl2bat.bat in the standard distribution. If you say something like:

C:\> pl2bat a.pl

then pl2bat will take a.pl, strip off the '.pl' and create a batch file called a.bat. Then you can run the script as:

C:\> a

which will then run your script named 'a'. When using this method, you can only have scripts with no extension, or have an extension with a '.pl' on the end. This is because if you have a script named 'a.p', and you call the program pl2bat with:

C:\> pl2bat a.p

then it will create a script, 'a.p.bat', which does not know how to execute correctly due to the DOS command line not knowing what to do with the '.p.bat' extension.

2) the script runperl.bat in the standard distribution. 'runperl.bat' takes the opposite tack of pl2bat.bat. Instead of copying the code that you have into another file and appending a header, you copy runperl.bat into something like a.bat and then this batch file runs your original script. 'runperl.bat' is available inside the 'bin' directory associated with your perl installation. If you say something like:

C:\perl\bin> copy runperl.bat a.bat

and then type

C:\perl\bin> a

then DOS will run the associated script 'a' for you.

'runperl.bat' has some benefits associated with it over pl2bat. First of all, since you are copying the wrapper runperl.bat around instead of the underlying code (as you are doing with pl2bat) you don't maintain two separate versions of the code.

Second, running your perl scripts is a one step process. Once you have made the original copy from runperl.bat to a.bat, each time you make a change to your original script 'a', you can run the script with the associated changes straight off.

However, it shares the same limitations as pl2bat in that you are forced to have Perl scripts without a suffix. It also causes some overhead, because a function named 'exec' has to be run each time you start up a script that has been 'runperl'ized.

pl2bat.bat, by contrast, when run on a Perl script, puts header information at the beginning of the program to turn it into a DOS 'bat' file.

This is pretty messy; about 20 lines of extra header, but it works for the most part. By the time of this publication, batwrap.bat should work to make your scripts portable from UNIX to NT, doing all the 'dirty' work for you in making your scripts runnable everywhere.

Associating Perl Scripts with Icons

On Windows 95/NT, you can also make it so you can double click on an icon to launch a Perl application. To do this, go into Windows Explorer, and select a Perl file (such as ScriptName.p). Now, select Open With, and scroll down the list that this option provides (see Figure 2.1) until you find the executable 'Perl':

fig21.gif

Figure 2.1

Caption: 'associating a Perl script with an icon'

You have now associated all of the files that end with '.p' with Perl. Hence, if you name all of your scripts with a consistent suffix (script1.p, script2.p, script3.p, etc.) you will only need to do this step once.

There are advantages and disadvantages to associating a Perl script with an icon. The good thing is that you can double click on an icon to get the behavior that you want. However, when you use Perl this way, you lose out on a lot of the power that Perl provides in the form of command line arguments. On the command line, by contrast, you could say:

perl script.p

and

perl script.p -input_file my_input

instead of making two different scripts (one that handles an input file and one that doesn't).
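How might one script handle both invocations? One possible sketch is given below. It assumes the standard Getopt::Long module, which the text above does not discuss, and the subroutine name parse_args is made up for the illustration.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Hypothetical helper: returns the file named after '-input_file',
# or undef when the option was not given.
sub parse_args {
    my @args = @_;
    my $input_file;
    GetOptionsFromArray(\@args, 'input_file=s' => \$input_file)
        or die "usage: script.p [-input_file FILE]\n";
    return $input_file;
}

# The same script serves both invocations shown above.
my $input_file = parse_args(@ARGV);
if (defined $input_file) {
    print "Reading input from $input_file\n";
} else {
    print "No input file given\n";
}
```

Saying 'perl script.p' takes the second branch, and 'perl script.p -input_file my_input' takes the first. An icon can only ever give you one of the two.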

Another disadvantage about associating an icon with a Perl script is that only a couple of switches will work, and you will have to hard code them into the script with the associated Perl variable:

$^D = 512;

$a =~ m"pattern";

turns on the debugging flag for regular expressions, for example. See perlrun for more detail.

And yet another disadvantage of this method is that your scripts will execute, and then disappear after they are done executing. Perl is a DOS application and as such opens a DOS window to do its processing. Therefore, when you click on the icon you have formed by associating a script with the perl executable, your script may run really fast, showing output, and then - boom - close down the MSDOS prompt, taking your output with it.

To get the flexibility of the command line in a GUI form, you might also consider making a Perl/Tk GUI out of your scripts. We consider this in chapters 12, 24, and 25. Or, in fact, you might consider using one of the Tk examples we develop in chapter 12 (runscript.p, which captures the text from a Perl script run into a GUI window). You might also want to consider cutting and pasting Perl code into Visual Basic (instructions on CD).

Summary of Running Perl on UNIX and Win95/NT

In short, you can run Perl pretty much anywhere with the following statement:

C:\> perl script_name.p

where script_name.p is a file containing a legal series of Perl statements. If you want more convenient usage for Unix, you can use a shell trick:

#!/usr/local/bin/perl

<text_of_script_here>

But this will only work out of the box for UNIX. For NT, your best bet is something like:

C:\perl\bin> copy runperl.bat a.bat

where 'C:\perl\bin' is the place where perl is installed on your machine, and 'a.bat' is the name of your Perl script plus the '.bat' extension. This solution, however, will only allow you to run scripts that have no extension like '.p' or '.pl'. In order to run your scripts in a Unix-like way on Win95/NT, you are best off getting a UNIX emulator such as the djgpp or gnu-win32 packages described above, which we have made available on the disk that comes along with this book.

General Perl Syntax

Up to this point you should have Perl installed and running on your system. You should also have a general idea of some of the tasks that Perl can do for you. Finally, you know how to create and invoke a Perl program. Now let's take a 30,000 foot view, so to speak, of the general syntax of Perl.

In this section we will take an introductory look at:

1) special characters

2) functions, variables and subroutines

3) gluing those elements into statements

4) the most common errors that are made in Perl.

I may be accused of syntactic terrorism here by throwing you directly into the thick of Perl, but let's take the chance. Perl is notorious for all of its special characters. To the untrained eye, a Perl script can look like noise, about as familiar as the clicks and hums of the !Kung language sound to an English speaker.

But this disorientation is only temporary. Once you take the chance to learn a few Perl principles and take a look at a few Perl scripts, Perl seems very natural. It 'flows well'. In fact, of all the programming languages that I have used, it seems the most like English.

All this said, if you are new to Perl, just accept the strangeness for now -- I have simply found that one of the best ways of learning a language is to simply jump in.

Perl Variables

Unlike most languages, Perl represents its different types of variables by special characters, rather than by declarations such as the 'char name[80]' used in C. Also unlike other languages, Perl lets you have variables that share the same name but are actually different variables. This is shown below:

@variable, $variable, and %variable

point to three separate types of variables. @variable is an array, $variable is a scalar, and %variable is a hash. These are the three major types of variables and we take each in turn.

Scalars

$ denotes a scalar, which is a variable that can contain anything (numbers, letters, special characters) of any size.

One can remember the denotation of scalar by thinking '$' as an 's' as in 'scalar', or 'single'. Its purpose is to point to a single value, which could be anything (a string, a number, a bunch of binary data, whatever). I will have lots to say about this in chapters to come, but here are a couple of examples of scalar assignment:

$string = "This is a string!\n";

$number = 33.4;

In other words, when assigning a string to a scalar, you use either "" or '' to indicate where the string begins and ends; you do not need these quotes if you are assigning a number.

Unlike other languages, you don't need to worry about how these variables are actually stored on the computer. They grow and shrink on demand. A common mistake for C programmers is to say something like:

print $string[0];

and expect to see 'T' be printed. They expect $string to be an array of characters, as it would be in C. This just doesn't work. Instead, this statement looks for the first element of the array called @string. You have to do something like:

print substr($string,0,1);

to do that, which takes a substring of the text in $string.
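The points about scalars can be pulled together in one short sketch. The variable names $first_char and $length are made up for the illustration; substr and length are ordinary Perl built-ins.

```perl
use strict;
use warnings;

my $string = "This is a string!\n";
my $number = 33.4;

# A scalar is not an array of characters; ask for a substring instead.
my $first_char = substr($string, 0, 1);

# Scalars grow on demand: appending needs no resizing or declarations.
$string .= "And now it is a longer string!\n";
my $length = length($string);

print "first character: $first_char\n";
print "length after appending: $length\n";
print "number plus one: ", $number + 1, "\n";
```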

Arrays

@ denotes an array, which is a variable that contains a list of items (scalars) which can be referred to by number. An example is $ARGV[0], which is the first element in the array @ARGV, the list of command line arguments.

If a scalar is a single value, then an array is a bunch of scalars. You can remember it by thinking '@' = 'at', 'at' = 'array'. Since arrays are bunches of scalars, you can get at an individual element in an array by subscripting that array with an [<element_number>].

Again, we have devoted pretty much the next chapter to these variable types, but here are a couple of examples:

@arrayName = (1,2,3,4,'hello','goodbye'); # sets the array '@arrayName'

print $arrayName[0]; # prints '1'

print $arrayName[5]; # prints 'goodbye'

Note a couple of things about these examples. First, you can intermix any type of text in an array (strings, numbers, anything). Second, you don't need to tell Perl how many elements are in the array. Perl does it for you automatically.
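As a small sketch of that last point, the following adds an element to the array from the example and asks Perl how many elements it holds. The names $count_before, $count_after, and $last are made up for the illustration.

```perl
use strict;
use warnings;

my @arrayName = (1, 2, 3, 4, 'hello', 'goodbye');

# An array used where a single value is expected gives its element count.
my $count_before = scalar(@arrayName);

# The array grows automatically; no size was ever declared.
push(@arrayName, 'one more');
my $count_after = scalar(@arrayName);

# $#arrayName is the index of the last element.
my $last = $arrayName[$#arrayName];

print "elements before: $count_before, after: $count_after\n";
print "last element: $last\n";
```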

Hashes

% denotes a hash, which is a variable that contains a series of items that can be referred to by a string, instead of by an element number as in arrays. An example is $ENV{'PATH'}, which refers to the environment variable 'PATH' in the operating system.

Hashes are Perl's way of making a dictionary. If you think about it, when you open up a dictionary, and look for a definition, you are looking for a string, and trying to get the value for that string. For example,

dog (n.) mammal of the canine family, domesticated by humans.

Perl's way of encoding this relationship would be to say something like:

$dictionary{'dog'} = '(n.) mammal of the canine family, domesticated by humans.';

See how this works? This statement says: the definition of 'dog' in the 'dictionary' is '(n.) mammal, etc. etc.'

Here are a few more examples:

%hashName = ('key' => 'value', 'key2' => 'value2' );

print $hashName{'key'};

print $hashName{'key2'};

The first example sets a hash. Just like a dictionary, a hash contains a series of 'key value' pairs, indicated by the syntax 'key' => 'value'. To get a value out of the hash, say $hashName{'key'}, which corresponds to the definition of 'key' in %hashName. So, the first print statement prints 'value', and the second prints 'value2'.
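Looking up one key at a time is only part of the story. Here is a minimal sketch of walking through the whole 'dictionary'; the key 'key3' and the array @report are additions made up for the illustration.

```perl
use strict;
use warnings;

my %hashName = ('key' => 'value', 'key2' => 'value2');

# Adding an entry is just an assignment to a new key.
$hashName{'key3'} = 'value3';

# 'keys' returns the keys in no particular order, so sort them first.
my @report;
foreach my $key (sort keys %hashName) {
    push @report, "$key => $hashName{$key}";
}
print "$_\n" for @report;

# 'exists' asks whether a word is in the dictionary at all.
my $has_key4 = exists $hashName{'key4'} ? 'yes' : 'no';
print "is key4 defined? $has_key4\n";
```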

Hashes are infinitely valuable. They are also the hardest structure to get used to for programmers of other languages. You simply have to get used to them through examples (of which we shall have several).*

Unfortunately, I can't think of a good mnemonic for them! Unless you think of the percent sign as sort of a scratch mark -- which is sort of a 'hash' mark -- but that really is stretching it... Ah well.

FileHandles

The FH in open(FH, "file"); and close(FH); denotes a filehandle. A filehandle is a variable that allows Perl to get data from and write data to files. For example, $line = <FH>; denotes a read on a filehandle. The read actually reads a line from an input source and stuffs it into the variable $line. The next call to $line = <FH>; will read the next line, and so on.

File Handles are a bit of an oddity in Perl. They don't have a special character denoting them (like hashes = '%', arrays, and scalars do). They are pretty much a left-over from the early days of Perl from when it was strictly a shell-like language.

You may prefer to use

use FileHandle;
my $FH = new FileHandle("file");

which does the same thing as

open(FH, "file");

but does it in an object oriented way. It makes a filehandle look like a scalar, which is then nice because you only have three special cases to remember. FileHandles are Perl's main way of interfacing with the 'outside world'.
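To see a filehandle doing its job from start to finish, consider the following sketch. So that it can run anywhere, it first creates a small temporary file with the standard File::Temp module, which is an assumption of this example and is not discussed in the text; the file contents are also made up.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create a throwaway file so that there is something to read.
my ($tmp_fh, $tmp_name) = tempfile(UNLINK => 1);
print $tmp_fh "line one\nline two\nline three\n";
close($tmp_fh);

# Open the file for reading through the filehandle FH.
open(FH, '<', $tmp_name) or die "Cannot open $tmp_name: $!";

# Each read of <FH> returns the next line, until the file runs out.
my @lines;
while (my $line = <FH>) {
    chomp($line);            # strip the trailing newline
    push @lines, $line;
}
close(FH);

print "read ", scalar(@lines), " lines; the last was '$lines[-1]'\n";
```

Note the 'or die' after open: checking that the file really was opened is a habit worth forming early.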

These are just the basics of variables. How to manipulate them, their operators and so forth will be covered in the next chapter, section 'Elements of Perl'.

Other Oddments

Here are some of the other items you shall see in the following examples.

Functions

function_name() denotes a function call in Perl, to either a built-in or programmer-predefined function.

To make a function call in Perl, you simply say something like:

function_name('argument1','argument2','argument3');

and the corresponding function definition is:

sub function_name #denotes a function definition in Perl.

{

my ($argument1, $argument2, $argument3) = @_;

}

Again, this is a simple overview of what functions look like.
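To make the shape concrete, here is a tiny sketch of a function that takes three arguments and hands back a result. The name add_three is hypothetical, chosen only for this example.

```perl
use strict;
use warnings;

# add_three is a made-up function used only for illustration.
sub add_three
{
    my ($argument1, $argument2, $argument3) = @_;    # unpack the arguments
    return $argument1 + $argument2 + $argument3;     # hand back a result
}

my $total = add_three(1, 2, 3);
print "total is $total\n";
```

The arguments arrive in the special array @_, and copying them into named 'my' variables on the first line keeps the rest of the function readable.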

Regular Expressions

One of the most useful features of Perl -- one that sets it aside from almost every other language out there -- is its ability to match patterns in a string. Say you had a variable (a scalar) that looked something like:

$scalarName = 'this is a scalar with a pattern in it';

Perl gives you the ability to look for a pattern inside that variable. This is called pattern matching and it is done by what are called regular expressions. If you say something like:

$scalarName =~ m"pattern"; # denotes a 'matching' regular expression in Perl.

then Perl looks into the string $scalarName, doing something like what you see in Figure 2.2:

fig22.gif

Figure 2.2

Caption 'how Perl matches a regular expression'

If you then say something like:

if ($dogName =~ m"bowser")

{

print "HERE!\n";

}

then this will only print 'HERE!' if the variable $dogName contains the pattern 'bowser'.

Now, say you want to search for, and replace, a pattern. If you say something like:

$line =~ s"pattern"other_pattern";

then this looks for the first occurrence of the string 'pattern' in $line, and replaces it with the string 'other_pattern', something like what is shown in Figure 2.3:

fig23.gif

Figure 2.3

caption 'Substitution' in regular expressions

This comes in extremely handy, for creating reports, sifting through log files looking for pertinent data, giving a web page the ability to do keyword searches (if you have ever seen Yahoo, or Lycos, you know what I mean). You can do thousands of things with regular expressions. They are really a language unto themselves. We have devoted an entire chapter to their usage.
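The two operations can be tried out with a sketch like the following. The helper names has_pattern and replace_first are invented for this example; the m"..." and s"..."..." forms are the ones shown above.

```perl
use strict;
use warnings;

# Return 1 if $string contains $pattern, 0 otherwise.
sub has_pattern
{
    my ($string, $pattern) = @_;
    if ($string =~ m"$pattern") { return 1; }
    return 0;
}

# Return a copy of $string with the first match of $old replaced by $new.
sub replace_first
{
    my ($string, $old, $new) = @_;
    $string =~ s"$old"$new";
    return $string;
}

my $scalarName = 'this is a scalar with a pattern in it';
print "found it\n" if (has_pattern($scalarName, 'pattern'));
print replace_first($scalarName, 'pattern', 'other_pattern'), "\n";
```

Because the substitution works on a copy inside the function, $scalarName itself is left untouched; saying $scalarName =~ s"pattern"other_pattern"; directly would change it in place.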

Simple Perl Syntax Rules to Remember

Up to this point, we have seen some important concepts in Perl including the variables -- scalars, arrays, and hashes -- filehandles, functions, and regular expressions. Now how do you glue these structures together in a meaningful way?

Perl syntax is a difficult thing to grasp, and an even more difficult thing to define. When you are speaking, you are aware if someone isn't 'getting' the language. The statement 'He cleaned the brush with the dog' is just plain wrong, but does it help anyone who is learning the language to say 'Oh, you mixed up the object of the sentence with the noun in the prepositional phrase'?

Probably not. The explanation is more complicated than the problem it solves. In this case, one is better off pointing out what the error looks like, and how to correct it.

Hence, we shall take the same approach. If you want a complete guide to the syntax of Perl, see the Perl man page named perlsyn. In fact, one could consider this small section to be a primer to that document.

Now, in the rough, a given Perl program consists of a series of comments, statements (simple and compound), and declarations.

Comments

Comments are, well, 'comments': places to document what you are doing in your code. They are indicated by a '#'. The rule for comments is simple. If you place a '#' marker, then anything that follows that # up to the newline is ignored by Perl. Hence:

exit(); # exits the program, because of foo

exits the program and the comment tells other programmers why you are doing it.

Simple and Compound Statements

Simple statements are like sentences in Perl. Tell the computer to do something, and then finish the intention with a semicolon. In other words, they are of the form

do_something;

There are no restrictions on whether your statements are on one line or several. Perl is a free form language: whitespace doesn't count. Hence you could say

do

something;

and Perl wouldn't mind...

Listing 2.1 shows some simple statements:

Listing 2.1

chop($answer = <STDIN>);

This says to take a line of input from the keyboard, assign it to the variable $answer, and then chop off the end character (i.e.: get rid of the newline).

open (FILE, "fileName")

|| die "Couldn't open fileName";

This says to open the file named fileName. If it can't be opened, stop the program with the statement 'Couldn't open fileName'.

print "answer equals 'y'" if ($answer eq 'Y');

This statement says to print 'answer equals 'y'' only if the variable $answer equals the value 'Y'.

function_call('value','value2','value3');

This is an example of a user function call. It calls the function 'function_call' with the arguments 'value', 'value2', and 'value3'.

All of these are valid simple statements.

Compound statements are to Perl what compound sentences are to English: they contain multiple simple statements, and group a set of statements to be executed based on a 'modifier' or 'clause'. They look like:

if (something) { simple_statement1; simple_statement2; }

while (something) { simple_statement1; simple_statement2; }

In other words, compound statements are groups of simple statements controlled by a conditional ('if') or a looping mechanism ('while'). The main thing to remember here is that they are opened and closed by matching squiggly brackets ('{' and '}'). Perl is rich in these control structures.

Sample compound statements are shown in listings 2.2, 2.3, and 2.4:

Listing 2.2 if_block.p

if ($answer eq 'Y')

{

print "You answered yes....";

print "Deleting file now\n";

unlink($filename);

}

Listing 2.3 reading_file.p

use FileHandle;

my $FD = new FileHandle("log_file");

while ($line = <$FD>)

{

if ($line =~ m"ERROR")

{

push (@error_list, $line);

}

}

Listing 2.4 countdown.p

$variable = 10;

print "Counting Down from 10 to 1!\n";

while ($variable)

{

print "$variable ";

$variable--;

}

The first example prints out 'You answered yes....Deleting file now' and then goes ahead and unlinks the file named by the variable $filename, but does so only if the variable $answer equals the character 'Y'.

The second example goes through each line of the file 'log_file' and checks to see if the line has the string 'ERROR' in it (as per regular expressions). If it does have the string 'ERROR' in it, then it gets added to the variable @error_list.

The third example prints out 'Counting Down from 10 to 1!', a newline, and then '10 9 8 7 6 5 4 3 2 1'. The variable $variable is decremented by one each time through the loop. When the count reaches zero, 'while ($variable)' ceases to be true, and the compound statement ends.

Simple and compound statements are the heart of Perl. Although you need to declare subroutines (and, optionally, packages), most of Perl is just telling the computer to do stuff.

Declarations

Finally, declarations simply associate a piece of code with a name. The only things that you need to declare in Perl are subroutines. We have seen their form already, and they look like:

sub subroutineName

{

do_something;

do_something_else;

}

Notice again that this is a simple extension of the compound statement. There are open and closed brackets at the beginning and ending of the subroutine definition.

Common Errors

All of these rules of syntax, variables, and functions translate into some common errors for beginning Perl programmers. If you want a full list of errors, please turn to the perldiag man page (that comes with the distribution).

Perl errors can be classified into two types: variable and syntax errors.

Variable Errors:

Here are some simple variable errors that pretty much everyone makes:

 

1) forgetting '$', '@', or '%' in front of variables.

Since all variables in Perl are prefixed by a special character, a common error is forgetting that special character. Hence:

variable = 'value';

array = (1,2,3,4);

is not going to work. It's going to say:

Unquoted string "variable" may clash with future reserved word at script.p line 4.

Can't modify constant item in scalar assignment at script.p line 4, near "'value';"

script.p had compilation errors.

Perl is not clairvoyant: you need to tell it somehow that you are dealing with either an array, hash, or scalar.

2) Using an array when you mean a scalar, a scalar when you mean an array, and so forth.

Perl's syntax here is flexible, but can do unexpected things. If you say:

$scalarName = @arrayName;

this is not an error, per se. It just may have unexpected consequences. This simply translates @arrayName (a list of elements) into a form that $scalarName can understand. In this case, it sets $scalarName equal to the number of elements in @arrayName.

This is called context and we shall have much more to say about this in the section 'Contexts in Perl'.
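A short sketch of the difference, using a made-up array and a hypothetical helper count_elements: the same array gives its element count in scalar context and its elements in list context.

```perl
use strict;
use warnings;

my @arrayName = ('a', 'b', 'c', 'd');

my $count   = @arrayName;        # scalar context: the number of elements (4)
my ($first) = @arrayName;        # list context: the first element ('a')
my $last    = $arrayName[-1];    # a single element is itself a scalar ('d')

# Return how many arguments were passed in.
sub count_elements
{
    my (@list) = @_;
    return scalar(@list);        # scalar() forces scalar context explicitly
}

print "count=$count first=$first last=$last\n";
print "three things: ", count_elements('x', 'y', 'z'), "\n";
```

The parentheses around $first are what make the second assignment a list assignment; without them, $first would receive the count instead.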

Syntax Errors

Likewise, here are some common syntax errors that people make:

1) Assuming that Perl has multi-line comments

There are no multi-line comments in Perl, but beginning Perl programmers often forget this, and try to 'bounce' (match) between the '#'. For example,

# this is the beginning of a comment.

which is multi line but won't work #

Again, there are no multi line comments in Perl. This simply will not work, instead complaining about the 'which' statement on line 2 not existing.

2) Forgetting to put a semicolon after each statement.

Listing 2.5 shows a piece of code that has this syntax error:

Listing 2.5 mistake1.p

print "Hello, World" # mistake.

print "The above line has a mistake in it!";

Now, when you type:

prompt% perl mistake1.p

The result is:

syntax error at mistake1.p line 3, near "print"

Execution of mistake1.p aborted due to compilation errors.

because Perl is interpreting this as

'print "Hello, World" print "The above line has a mistake in it!";'

and ignoring all of the white space in between, concatenating the two statements together into one. Perl thinks that the second 'print' is an argument to the first 'print' and gets confused.

3) Balancing of quotation marks.

Since Perl uses single quotes ('') or double quotes ("") to mark strings, a common error is to omit either an opening quote or a closing quote.

Hence --

print 'aha; print 'here'

will not work because the quotation mark that opens 'aha; is never closed. Likewise:

print "asdfasdf; print "here";

does not work because the " aren't balanced.

4) Balancing of parentheses, and forgetting commas in arguments.

Perl uses () a lot. In particular, it is used in functions, lists, and conditionals. Since Perlish functions look like:

function_name($argument1, $argument2, $argument3);

And lists look like:

( 1, 2, 'string3' );

And conditionals look like:

if ($variable1 > $variable2) { do_this(); }

A common error is to not balance, or to forget, the parentheses. To say:

function_name($argument1;

($variable1, $variable2, $variable3;

if $variable1 > $variable2

are all mistakes. As is:

function_name ($argument1 $argument2);

because Perl uses a comma to separate two elements.

5) Balancing of '{}'.

Since compound statements use '{}' to mark their beginning and end, a common error is omitting them. Hence, the following are syntax errors:

Listing 2.6

Error #1: using a 'then' in an if clause

if ($condition) then do_something($variable1, $variable2);

Error #2: Forgetting the closing bracket at the end of a function definition

sub function_name

{
do_something();

Error #3: Forgetting the closing bracket at the end of a while block:

while ($condition)

{

do_something();

Summary of 30,000 foot view of Perl

If you are a C programmer, or a shell programmer, or a Pascal programmer, (or Fortran, Ada, or BASIC), then at least some of the syntax above should look pretty familiar to you. That's because Perl is the ultimate pidgin language, developed out of combinations of already existing languages.

If you stick to the above rules, and keep your Perl programming simple, you should be able to learn it fairly quickly. The above syntax should satisfy you 90% of the time. Lots of people get in trouble with Perl because they get overly complicated, overly fast. Perl has extreme expressive power, as we shall see. Sometimes you can, for convenience sake, ignore these rules.

You are doing yourself a favor by sticking to the basics. The important thing is to gain mastery of the main concepts of Perl: that '$' stands for a single, scalar variable, '@' stands for an array, and '%' stands for a hash, which is sort of like a dictionary. And be as vanilla as you can when programming, at first. Then and only then should you try more difficult syntax.

Perl Examples

Each of the examples below represents a concrete instance that I have encountered in the workplace. In each case, the problem was solved by a simple Perl script.

Do not be alarmed if your first reaction to these programs is 'what is that mess on the page?'. This is especially true if you have never seen a Perl script before. This is a natural reaction. If you are unsure what to make of the syntax, refer to the capsule definitions we have given up above.

Each of these parts of Perl is discussed in much detail in the chapters that follow. We are just providing enough here to get you started.

Figure 2.4 shows a list of the special symbols that we have covered above, and what they mean:

text24.txt

Figure 2.4

caption special characters in Perl.

This is just a small taste of Perl. Take a look at the following real-life problems and Perl solutions for a wider view.

Example #1: Accessing Data, and Printing Data from a bunch of flat files, ASCII format.

Background: Let's start with a subject that is near and dear to the hearts of many a business: the processing and flow of data. Consider that you have a bunch of reports (hundreds), all in flat files with extensions '.rpt', and all of the format in Table 2.1:

Table 2.1

Person:    Expense Amount    Rationale for Expense
---------------------------------------------------
Joe N.     45.00             Office supplies
Sally S.   415.00            Travel to conference
....

In other words, these are flat files. The first eleven characters represent the name, the next eighteen represent the amount, and the final twenty represent the reason.

Now suppose you want to nail down the travel budget for a given person. This simple Perl script will do the job:

Listing 2.7 -- simple_report.p

#!/usr/local/bin/perl
# Usage: simple_report.p (file_list).

use FileHandle;

my @files = @ARGV;
my ($file, $line, %expenses);

foreach $file (@files)
{
    my $FH = new FileHandle("$file") or die "Couldn't open $file\n";
    while ($line = <$FH>)
    {
        my ($person, $expense, $reason) =
            (substr($line,0,11),
             substr($line,11,18),
             substr($line,29,20));
        if ($reason =~ m"travel"i)
        {
            $expenses{$person} += $expense;
        }
    }
}

foreach $person ( keys %expenses)
{
    print "$person => $expenses{$person}\n";
}

Now for a little bit of explanation of what is going on here.

First, the usage of this script is going to be:

simple_report.p <file_list>

in which file_list is a list of all the files given as an argument. We see this by the line '@files = @ARGV;'. @ARGV is simply a list of all the arguments given at the command line. This statement simply copies the argument list to another variable which makes more sense to the programmer.

The second line (my ($file, $line, %expenses)) simply states which variables we are going to be using. We didn't have to do this (only subroutines need to be declared), but the keyword my lets Perl do some pretty cool, extra debugging checks.

Third, the code goes through a foreach loop. This is a construct like while, but it goes through each element in an array, setting the value of what is termed the index variable (foreach $file (@files)) to that element.

foreach $file (1,2,3)

{

}

sets $file to 1 first (perform the loop), 2 second (perform the loop), and 3 third (and then perform the loop). In our script, this construct goes through each of the files on the command line.

For each file, the code:

1) opens it (my $FH = new FileHandle("$file");)

2) goes through each line in the file (while ($line = <$FH>))

3) for each line, splits it up into its three component parts, based on position in the line (($person, $expense...) = (substr($line,0,11), ...)). This cuts up the line based on character positions.

4) checks to see if the reason for the expense had something to do with travel (if ($reason =~ m"travel"i)). The 'i' means to match without regard to case (case 'insensitive').

5) if so, adds the amount to the expenses for that person ($expenses{$person} += $expense;)

What we have done is go through all the lines in all the files that we have, and collapse the report into the one hash called %expenses. After we are done, %expenses will contain (again, as in a dictionary) the information on how much each person spent on travel. If you don't believe this, consider that the first time we see a name, say 'Joe', the value $expenses{'Joe'} is empty. Next, when we say:

$expenses{'Joe'} += 200.00;

it will increment this to 200.00.

The second time we run through, and say:

$expenses{'Joe'} += 50.00;

the hash has remembered that Joe's expenses were 200.00, and thus increments them by 50 (to 250.00).

So after all is said and done, we have the data we need to collapse all of the files into a summary of who spent what. The statement:

foreach $person (keys %expenses)

{

print "$person => $expenses{$person}\n";

}

is a simple way of spilling all of this information out. 'keys %expenses' is basically a way of getting out what words are in our dictionary. If we only had two people who spent money on travel ('Joe' and 'Sally') then 'keys %expenses' would return the list ('Joe', 'Sally'), in no particular order, and the loop would print each person's total.
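To watch the hash accumulate totals without needing any report files, here is a sketch that runs the same substr logic over a few in-memory lines. The sample rows and the helper travel_totals are invented for illustration, and, unlike the script above, this version trims the padding from each field so that the keys read 'Joe N.' rather than 'Joe N.' followed by spaces.

```perl
use strict;
use warnings;

# Sum travel expenses per person from fixed-width report lines.
# Columns follow Table 2.1: 11 characters of name, 18 of amount, 20 of reason.
sub travel_totals
{
    my (@lines) = @_;
    my %expenses;
    foreach my $line (@lines)
    {
        chomp($line);
        next if (length($line) < 29);     # too short to hold a reason column
        my $person  = substr($line, 0, 11);
        my $expense = substr($line, 11, 18);
        my $reason  = substr($line, 29, 20);
        $person  =~ s/\s+$//;             # trim trailing padding from the name
        $expense =~ s/\s+//g;             # strip padding around the amount
        if ($reason =~ m"travel"i)
        {
            $expenses{$person} += $expense;
        }
    }
    return %expenses;
}

# Sample rows, made up for this sketch; sprintf pads each column to width.
my @report = (
    sprintf("%-11s%-18s%-20s", 'Joe N.',   '45.00',  'Office supplies'),
    sprintf("%-11s%-18s%-20s", 'Sally S.', '415.00', 'Travel to conference'),
    sprintf("%-11s%-18s%-20s", 'Joe N.',   '200.00', 'Travel to client'),
    sprintf("%-11s%-18s%-20s", 'Joe N.',   '50.00',  'travel meals'),
);

my %totals = travel_totals(@report);
foreach my $person (sort keys %totals)
{
    print "$person => $totals{$person}\n";
}
```

With these rows, Joe's office supplies are skipped, his two travel lines add up to 250, and Sally's single travel line gives 415.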

Example #1b: Accessing Data, and Printing Data from a bunch of flat files -- Excel format.

Let's take the same example, and use Perl to tie to Excel spreadsheets instead. We may be getting ahead of ourselves a little, but oftentimes PC reports don't come in the form of ASCII files (although, with Excel's help, you can easily make them ASCII). You will need the ActiveWare port of Perl (see the section 'Installing Perl on NT with ActiveWare' in the last chapter). You will also need Excel.

Consider if you have the same problem, but in this case, you have hundreds of Excel files, all that look like:

fig25.gif

Figure 2.5 -- Excel spreadsheet

caption 'Excel spreadsheets interacting with Perl'.

Now these Excel spreadsheets have exactly the same data as the text report listed above. It is just the format that is different. One example goes through flat files and the other through Excel. Hence, the logic will be the same, but we will be using OLE to do the various functions (since the Excel files can't be read directly). Listing 2.8 shows this way of using Perl:

Listing 2.8 Excel1.p

#!/perl/bin/perl

use OLE;

@files = @ARGV;
my $app = CreateObject OLE "Excel.Application" or die "Couldn't open Excel\n";
my ($file, $counter, %expenses);

foreach $file (@files)
{
    $app->Workbooks->Open($file);
    $done = 0; $counter = 1;
    while (!$done)
    {
        my ($person, $expense, $reason) =
            ($app->Range("A$counter")->{'Value'},
             $app->Range("B$counter")->{'Value'},
             $app->Range("C$counter")->{'Value'}
            );
        if ($reason =~ m"travel"i)
        {
            $expenses{$person} += $expense;
        }
        if (!$person && !$expense && !$reason)    # an empty row means we are done
        {
            $done = 1;
        }
        $counter++;
    }
    $app->Workbooks->Close($file);
}
$app->Quit();

foreach $person ( keys %expenses)
{
    print "$person => $expenses{$person}\n";
}

Note that this is almost a direct translation over of the logic from the flat file example, but since we cannot directly access the Excel spreadsheets (to do the manipulation ourselves), we need to go through the interface called OLE.

OLE lets you control windows applications like Excel through Perl. You create a new OLE object through the call

my $app = CreateObject OLE "Excel.Application" or die "Couldn't open Excel\n";

and then proceed to open up files via:

$app->Workbooks->Open($file);

and close them with:

$app->Workbooks->Close($file);

The '$app->Range("A$counter")->{'Value'}' call then gets a value out of the opened Excel spreadsheet, which we then manipulate via Perl's syntax. The scalar $counter is incremented to access each and every row. We access:

$app->Range("A1")->{'Value'};

first, then

$app->Range("A2")->{'Value'};

second and so on. We therefore iterate through each row in the Excel spreadsheet, sucking the data out, and putting it into a hash.

Run this by saying either:

C:\> perl Excel1.p *.xls

or (if you have bash):

prompt$ Excel1.p *.xls

from the directory where the Excel files live.

This is really cool: going through thousands of Excel spreadsheets and summing their contents can save quite a bit of time. However, note that the syntax for OLE is a little bit more complicated than the regular Perl syntax. In fact, it is a good example of Perl's object oriented capabilities, which we shall go into in great detail when we get to the second part of the book. If you install ActiveWare's port, there is also the oleauto man page, which shows more about how to automate OLE.

For now, if you are comfortable with this syntax, go for it. A good way to get comfortable with this syntax is to look at Visual Basic programs which do OLE. The syntax is almost exactly the same. We shall also occasionally give some OLE examples.

If not, a good way to get around in Windows is to save Excel spreadsheets as flat files, and then use the 'flat file' metaphor in the example above this one. This makes life easy, since you can then manipulate the reports in straight text format.
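As a rough sketch of that flat-file route, suppose the spreadsheets have been saved as tab-delimited text. That is an assumption for this example: Excel offers several text formats, and the column layout here simply mirrors the report above. Each row can then be pulled apart with split. The helper name parse_row is made up.

```perl
use strict;
use warnings;

# Split one tab-delimited row into (person, expense, reason).
sub parse_row
{
    my ($line) = @_;
    chomp($line);                                   # drop the trailing newline
    my ($person, $expense, $reason) = split(/\t/, $line);
    return ($person, $expense, $reason);
}

# A made-up sample row in the assumed tab-delimited layout.
my ($person, $expense, $reason) =
    parse_row("Sally S.\t415.00\tTravel to conference\n");
print "$person spent $expense on: $reason\n";
```

Splitting on a delimiter avoids counting character positions, at the price of trusting that no field contains a tab of its own.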

Example #2: Emailing Members of Your Project When a Process is Complete

Suppose there is a process that is extremely important, and needs attention when it is finished. Everyone in your group needs to be notified when it is done. One solution is to email a notification message to everyone. With the Perl system command, you can perform any command the local operating system allows.

Listing 2.9 MailAfterDone.p

#!/usr/local/bin/perl5

system("important_process");

open (FILE, "> /tmp/process_complete")
    || die "Couldn't write /tmp/process_complete";
print FILE "hey all -- the process is complete!\n";
close(FILE);

foreach $user ('abel', 'baker', 'charlie')
{
    system("elm -s 'process is complete' $user < /tmp/process_complete");
}

OK, now for a pseudo-code explanation of what happened:

a) do the process: system("important_process");

b) open a temporary file, write the message that we are to send in that file, and close the file. This actually creates a file on the system which you can look at through any editor.

c) go through the list of users 'abel', 'baker', and 'charlie', and mail off the message. This occurs in the line (system ("elm ...")). You may, of course, substitute the command line mailer of your choice. There is also a module called Mail::Send (which we have included on the CD) that does this sort of thing for you (but requires the command 'sendmail', which is primarily a UNIX command).

Or you could make your own simple wrapper module, which we will also do in the section on objects.

This example shows that at its most simple, Perl has a great role as 'traffic cop'. Keep portability in mind, however. This example will not work on an NT system unless you substitute elm with the proper NT command. In particular, for Microsoft Exchange, you would access this functionality via OLE, since Microsoft Exchange is fully OLE automatable.
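One refinement a traffic cop might want is to look at whether the process actually succeeded before announcing it. The sketch below checks the exit code that system leaves in $?. The helper name run_and_check is made up, and the running perl itself ($^X) stands in for 'important_process' so that the example runs anywhere.

```perl
use strict;
use warnings;

# Run a command (given as a list of words) and return its exit code.
sub run_and_check
{
    my (@command) = @_;
    system(@command);
    return $? >> 8;        # the exit code lives in the high byte of $?
}

# $^X is the path of the perl that is running this script; it is a
# stand-in for 'important_process' in this sketch.
my $status = run_and_check($^X, '-e', 'exit 0');
if ($status == 0)
{
    print "process succeeded; safe to send the mail\n";
}
else
{
    print "process failed with exit code $status\n";
}
```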

Example #3: Connecting to an Internet Service Provider

Example #3 is an example of a convenience script, one that saves you strain on the typing muscles. The idea is to connect to an Internet Service Provider from a Linux system. Before this script, it was necessary to:

make a temporary copy of a file

edit that copy, and change one thing

save the file

invoke a shell script on the copied file.

Why do all that if Perl can turn it into a one step affair?

To connect to an outside network, Linux provides pppd and chat, which basically automate the login process. A chat file looks like this (not Perl syntax, but still pretty ugly):

chat.p:

exec /usr/local/bin/chat -v \

TIMEOUT 10 \

ABORT '\nBUSY\r' \

ABORT '\nNO ANSWER\r' \

ABORT '\nRINGING\r\n\r\nRINGING\r' \

'' AT \

'OK-+++\c-OK' ATH0 \

TIMEOUT 45 \

OK ATDT 555-1212 \

CONNECT '' \

BIS '' \

Username:--Username: YOUR_ACCOUNT_HERE \

Password: PASSWORD \

'>' ppp

 

After creating the script, run something like:

pppd modem /dev/tty01 19200 'chat'

to connect to the ISP. (pppd is the process which actually creates the link).

chat is an appropriate name for this process. When run, chat starts a dialogue with a server on the other side of the phone line. The process:

1) picks up the phone (with the 'OK-+++' line)

2) dials the number (with the 'OK ATDT 555-1212' line)

3) waits till it sees the string 'Username:', sends the string 'YOUR_ACCOUNT_HERE'.

4) when it sees the string 'Password:', sends the string 'PASSWORD'.

5) sends the string 'ppp' after logged in.

Now this is all fine and dandy, but what happens if you aren't assigned a stable password? This is the case with systems which use a SecurId card. With a SecurId or similar card, the password changes every minute, which makes things very secure, but also very inconvenient for scripting. In fact, this security feature makes a script such as the one above impossible.

Consider, what do you put in the 'PASSWORD' entry in the script? If there were a stable password, foobar, you could say something such as:

Password: foobar

and it would connect fine. But the SecurId card makes this impossible, since the password is constantly changing. What is needed is a way to vary the chat script every single time you log in, depending on the reading on the SecurId card. Here is where Perl comes in.

Listing 2.10 login_helper.p:

 1 #!/usr/local/bin/perl
 2
 3 open(TEMPLATE,"chat_template");   # opens the chat file above for reading
 4 open(CHATFILE, "> /tmp/chat");    # opens a 'temporary' chat file for writing
 5
 6 print "What does your SecurId card say:\n";   # prints to STDOUT, asking a question
 7 chop($answer = <STDIN>);          # gets an answer from STDIN.
 8
 9 while ($line = <TEMPLATE>)        # goes through the file, a line at a time.
10 {
11     if ($line =~ m"PASSWORD")     # looks for the placeholder string
12     {
13         $line =~ s"PASSWORD"$answer";
14     }
15     print CHATFILE $line;
16 }
17 close(TEMPLATE);
18 close(CHATFILE);
19
20 system("pppd modem /dev/tty01 19200 '/tmp/chat'");
21 unlink("/tmp/chat");

And that's it. What this script does is:

1) opens the 'chat_template' file listed above (line 3)

2) opens a 'temporary' chatfile called '/tmp/chat' (line 4).

3) prompts the user for a string (lines 6 and 7). This is where you would enter your SecurId number.

4) loops through the file, a line at a time (line 9)

5) looks for the placeholder 'PASSWORD', and substitutes it with what the user typed in (lines 11 through 14). The match is case sensitive, so the prompt string 'Password:' is left alone. This is an example of 'regular expressions', the process of 'matching'

6) prints out the line, substitutions included, to a temporary chatfile (line 15).

7) closes the file handles, flushing the output to disk (lines 17-18)

8) launches the command ("pppd modem /dev/tty01 19200 /tmp/chat") via a 'system' call (line 20).

9) deletes the temporary file that you just created (unlink("/tmp/chat"); on line 21)

When run, the script output is:

What does your SecurId card say:

Perl is now waiting for your input. Suppose you type:

What does your SecurId card say:

passwd1

Perl will then execute the command 'pppd modem /dev/tty01 19200 /tmp/chat'. And voila! If you have given the correct temporary password, then you will get connected. To see why, simply look at '/tmp/chat' (before the script deletes it). It will look exactly like 'chat_template', with one difference:

....

....

Password: passwd1 \

 

 

See how this works? Perl has copied the file, with ONE difference: it has filled in the password that you have supplied. The chat function now works totally seamlessly with dynamic passwords. Just type:

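prompt% login_helper.p

answer the prompt with the current reading on your card, and Perl connects to the Internet for you. And, since the password changes every minute, the copy that sits briefly in /tmp/chat is of little use to anyone who finds it later.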

This little trick works well to automate any program that uses a config file. If a program reads from a flat ascii file, you can simply make a copy of that file, 'flip a couple bits' to modify its execution, and then run it from the command line.
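The same idea can be sketched as a small general-purpose routine: hand it the template text and a hash of placeholders, and it returns the filled-in copy. The name fill_template and the sample placeholders are assumptions made for this example, and \Q...\E is used so that each placeholder is matched literally rather than as a regular expression.

```perl
use strict;
use warnings;

# Return a copy of $template with every placeholder (a hash key)
# replaced by its value. Placeholder names are matched literally.
sub fill_template
{
    my ($template, %values) = @_;
    foreach my $placeholder (keys %values)
    {
        $template =~ s/\Q$placeholder\E/$values{$placeholder}/g;
    }
    return $template;
}

# A two-line stand-in for a config file; the values are made up.
my $template = "Username: YOUR_ACCOUNT_HERE\nPassword: PASSWORD\n";
print fill_template($template,
                    'YOUR_ACCOUNT_HERE' => 'joe',
                    'PASSWORD'          => 'passwd1');
```

This sketch assumes that no replacement value itself contains another placeholder; if one did, the result would depend on the order in which the keys happen to be visited.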

Example #4: Unsupplied Functions on Differing Systems: cat

Now, we (hopefully) aren't going to get in a religious war here, but one of the good things that Perl does is give the power of the command line interface to Windows applications. Unfortunately, this particular issue is the cause of much active debate (and flame wars) on the internet. Hence a bit of diplomacy is in order here.

One might say that the command line interface is 'obsolete'. Some folks are perfectly satisfied with GUI applications. On the opposite end, one might say that GUI interfaces are overhyped. Some folks are only satisfied with command line interfaces.

Whatever. Like it or not, the truth is that GUI applications have their benefits and drawbacks, and command line tools have their benefits and drawbacks. Neither is predominant, or we would be living in a 'one or the other' world.

This section concentrates on the prospect of giving you the power of both on the Windows platform (although the commands listed here will work on UNIX as well).

There are three well known UNIX functions which have no direct equivalent functions in the 'command.com' world: cat, grep, and find.

<side note> This is not exactly true. There are several packages out there that do functions cat, grep, and find. These packages give UNIX functionality to NT. Three of these packages are: NuTCracker (which costs a considerable amount of money), MKS toolkit (which costs money) and the gnu-win32 project (which is free, but takes some time and knowledge to install). There are also public domain tools which do these functions via GUI. But the point still stands. If you are doing things in bulk, and you need ultimate flexibility, you are probably better off programming it yourself.
<end side note>

cat displays file listings to the screen. Take the following file:

List me out to the screen..

I dare you..

 

The command 'cat file' will echo out the contents of that file, kind of like the Windows shell's type: *

prompt% cat file

List me out to the screen..

I dare you..

Here is cat as programmed in Perl.

Listing 2.11 cat.p:

#!/usr/local/bin/perl

foreach $file (@ARGV)           # we iterate through the command list
{
    open(FILE, $file);          # we open each file from command line
    while ($line = <FILE>)      # we iterate through each line in the file.
    {
        print $line;            # we print out the current line (it already
                                # ends in a newline, so we don't add one)
    }
    close(FILE);                # we close the file.
}

In pseudocode, the process is:

1) loop through the argument list via 'foreach $file (@ARGV)'

2) open a file handle to that file: open(FILE,$file);

3) loop through the lines in the file: while ($line = <FILE>)

4) print that line to the screen: print $line; (the line still carries its own newline, so we do not add another)

5) close the file: close(FILE).

Here is the minimalist form of cat:

Listing 2.12 cat_minimal.p

#!/usr/local/bin/perl

while (<>) { print; }   # we iterate through every line in the file,
                        # and print it out, transparently
                        # opening and closing files.

Surprisingly, this code does exactly the same thing as the verbose version above. It uses a lot of special variables that Perl uses as defaults. 'while (<>)', for example, goes through each line, in every file specified on the command line. And 'print;' by itself, prints out the same line set by the 'while(<>)' loop, the one set in the '$_' variable.

We stay away from minimalist Perl in this book. It hinders, rather than helps, understanding the code, and can be downright difficult to debug for anything but the simplest scripts. However, it is helpful if you need to write a simple, throwaway script. Those of you interested in minimalist Perl can go to the perlvar man page for more examples.

Problem #5: Unsupplied Functions on Differing Systems: grep

grep looks for a pattern in a group of files, and prints out matching lines. For those of you unfamiliar with grep, it is extremely useful for debugging, tracking down dependencies in code, looking for examples of code usage, and about 1,000,000 other things. Here is the simplest version of grep in Perl (minimalist again):

Listing 2.13 minimal_grep.p

#!/usr/local/bin/perl

$pattern = shift @ARGV;   # make first argument the pattern we are looking
                          # for, and shift it off the @ARGV array.
while (<>)                # go through each line in each file named in
{                         # the @ARGV array.
    if (m"$pattern") { print; }   # match the pattern against this line
}

Let's take the above and expand the code to see what is really going on. For fun, we will even add a command switch, '-l', which shows only which files match a given pattern, not the matching lines themselves:

Listing 2.14 fuller_grep.p

#!/usr/local/bin/perl

use Getopt::Std;   # gives you access to 'simple' command line processing.

getopts('l');      # adds the command switch 'l', which takes no argument,
                   # and sets the variable $opt_l.

my $pattern = shift @ARGV;   # The pattern '$pattern' is first argument

foreach $file (@ARGV)        # we now loop through files on command line.
{
    open(FILE, $file) or die "Cannot open $file: $!\n";   # we open each file.
    while ($line = <FILE>)   # we go through each line of the file.
    {
        if ($line =~ m"$pattern")   # is the pattern in the line?
        {
            print "$file\n" if ($opt_l);   # if so, and -l given, print filename
            last if ($opt_l);              # we've found the pattern.. we don't
                                           # need to go through the file again.

            print "$file: $line" if (!$opt_l);   # if no (i.e. !) -l given, print
                                                 # the file, and line.
        }
    }
    close(FILE);
}

As in the cat example, the processing for grep is straightforward:

1) include the module Getopt::Std. This gives us the function getopts, which lets us do command line processing.

2) process the arguments on the command line with getopts('l'). If we say something such as 'grep.p -l', this will set the variable $opt_l, which records that the user has typed '-l'.

3) Perl provides a special array variable, @ARGV, which corresponds to the arguments on the command line. We shift, that is, take the first element off the array, and set it to the variable $pattern ($pattern = shift @ARGV)

4) loop through each file in the argument list, using 'foreach $file (@ARGV)'.

5) open each file for reading to search for patterns (open(FILE, $file) or die ...;)

6) go through each line of the file using the 'while ($line = <FILE>)' construct.

7) look for the pattern (if ($line =~ m"$pattern")) in each line of the file.

8) if the line matches the pattern, check to see if the user has typed '-l' on the command line. If so, simply print the name of the file and go to the next file. (print "$file\n" if ($opt_l); last if ($opt_l);)

If the user did NOT type '-l', print each occurrence to the screen (print "$file: $line" if (!$opt_l)). The line already ends in a newline, so none is added.
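
To see steps 2 and 3 in isolation, here is a small sketch of what getopts and shift do to the argument list. The subroutine parse_grep_args is a made-up name used only for this illustration. It also uses the form of getopts that stores switches in a hash (%opts) rather than in $opt_l, which is a variation on the listing above, not a copy of it.

```perl
#!/usr/local/bin/perl
use strict;
use warnings;
use Getopt::Std;

# parse_grep_args is a hypothetical helper for illustration only.
# It takes a list that looks like a command line and returns:
#   (was -l given?, the pattern, the remaining file names)
sub parse_grep_args
{
    local @ARGV = @_;            # getopts works on @ARGV, so fill in a private copy
    my %opts;
    getopts('l', \%opts);        # strips '-l' from @ARGV and sets $opts{l} if present
    my $pattern = shift @ARGV;   # first remaining argument is the pattern
    return ($opts{l} ? 1 : 0, $pattern, @ARGV);
}

my ($list_only, $pattern, @files) = parse_grep_args('-l', 'PATTERN', 'grepfile');
print "list only: $list_only\n";   # prints: list only: 1
print "pattern:   $pattern\n";     # prints: pattern:   PATTERN
print "files:     @files\n";       # prints: files:     grepfile
```

Notice that by the time the pattern is shifted off, getopts has already removed '-l', so only the file names are left behind in @ARGV.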

That pretty much sums up the processing. Using the file grepfile listed below:

PATTERN1

PATTERN2

PATTERN3

we can say:

%prompt grep.p PATTERN1 grepfile

grepfile: PATTERN1

This matches only the line containing 'PATTERN1' in the file grepfile. A less specific pattern matches more lines:

%prompt grep.p PATTERN grepfile

grepfile: PATTERN1

grepfile: PATTERN2

grepfile: PATTERN3

This matches every line, since 'PATTERN' is in all lines in grepfile. The command:

%prompt grep.p -l PATTERN grepfile

gives the output:

grepfile

This simply shows that PATTERN is somewhere inside file grepfile.

There are many versions of grep, and a good number of them are written in Perl. Amongst the more common:

cgrep.p -- context grep. If a pattern match happens, print the 5 or so lines surrounding the line that contains that pattern.

rgrep.p -- recursive grep. Look for a pattern recursively, through a directory.

greplist.p -- look for a LIST of patterns in a file, rather than just one.

All three of these commands are very useful, and we shall see how to implement them in chapter 12. Let's now look at one more example in the realm of the simple commands: find. UNIX buffs are very familiar with find. If you are in system administration, it is what makes your job possible.

Example #6: Unsupplied Functions on Differing Systems: find

find lists the files in a directory that meet given criteria. Suppose you forgot where a file named 'cow_report' was located in an environment with thousands of files. find is a handy way to, well, find the file. If you are from the Windows world, it's like the Explorer GUI, but much more powerful. For example, suppose there was the following directory structure:

directory1/
    file1
    subdirectory/
        file2
        cow_report

In UNIX, to locate file 'cow_report' the command is:

find directory1 -name 'cow_report' -print

And given the criterion 'the name of the file is cow_report', this will print out:

directory1/subdirectory/cow_report

This syntax is quite cryptic, and it is no good for portability. Even inside the world of UNIX, you cannot count on find to behave identically everywhere: AIX's find and Solaris's find may return subtly different results for the same find expression. And of course, Windows NT does not have a bundled command line equivalent of the UNIX find. This is where Perl comes in. Let's make a find that says something like:

prompt% find.p cow_report directory1 directory2

which will look for cow_report in directory1 and directory2 and print out exactly the same output as the find above. Here it is:

Listing 2.15 find.p

#!/usr/local/bin/perl5

use File::Find;   # 'File::Find' is a pre-packaged library that we use.
                  # Supplies the function 'find' given below.

my $pattern = shift @ARGV;   # gets an argument off of the command stack @ARGV.
my @directories = @ARGV;     # we take the directories from rest of command line.

find (\&matchPattern, @directories);   # function call in Perl, with callback.

sub matchPattern   # A subroutine in Perl.
{
    if ($File::Find::name =~ m"$pattern")   # if the file name ($File::Find::name)
                                            # matches the pattern given:
    {
        print "$File::Find::name\n";   # print it out.
    }
}

Note that this is a bit of a paradigm shift from what we did with grep. Now, the details of looping through directories, as well as the inconsistencies between Windows NT and UNIX, and how they handle directories, are hidden behind the function find. This is a good example of programming abstraction.

The pseudocode for the program find.p is:

First include the module with 'use File::Find;'. This gives the function find, which we will use later.

Then take the first argument off the command line ('cow_report' in the command below) and stick it into the variable $pattern:

find.p cow_report directory1 directory2

The rest of the command stack ('directory1' and 'directory2') we assume are directories, and stick them into the array @directories.

The line 'find (\&matchPattern, @directories);' is a little tricky: what does it mean? First, find is a function. And @directories is a list of directories, a simple array which is passed into the function as an argument. But what about \&matchPattern? '\&matchPattern' is what is termed a callback. A callback is not a function call. It is a reference to a function (much like a function pointer in C), passed to another function to tell it how to do its job.
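
If callbacks are new to you, the following sketch may help. It leaves File::Find out entirely and uses a made-up function, for_each_name, to play the part of find. The names for_each_name and rememberCows are ours, chosen only for this illustration.

```perl
#!/usr/local/bin/perl
use strict;
use warnings;

# for_each_name is a made-up function standing in for find: it does the
# looping, and calls whatever subroutine it was handed for each item.
sub for_each_name
{
    my ($callback, @names) = @_;
    foreach my $name (@names)
    {
        $callback->($name);      # call the subroutine behind the reference
    }
}

my @matches;

sub rememberCows                 # the callback: called once per name
{
    my ($name) = @_;
    push(@matches, $name) if ($name =~ m"cow");
}

for_each_name(\&rememberCows, 'file1', 'file2', 'cow_report');
print "@matches\n";              # prints: cow_report
```

The function doing the looping knows nothing about cows. All of the decision making lives in the subroutine that was passed in by reference.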

In this particular case, the function find does all the looping and iterating through the directories, and calls the function matchPattern each time it loops through.

Each time it loops through, it sets the variable $File::Find::name to the name of the file or directory it is dealing with, and then calls the function matchPattern.

In the 'cow_report' case, the command:

prompt% find.p cow_report directory1 directory2

on the directory tree:

directory1/
    file1
    subdirectory/
        file2
        cow_report

results in:

directory1/subdirectory/cow_report

The detailed flow of processing is described below.

Loop #1:

$File::Find::name set to 'directory1'

Calls matchPattern -- does 'directory1' contain the string 'cow_report'? No...

(i.e.: $File::Find::name =~ m"$pattern")

Loop #2:

$File::Find::name set to 'directory1/file1'

Calls matchPattern -- does 'directory1/file1' contain 'cow_report'? No...

Loop #3:

$File::Find::name set to 'directory1/subdirectory'

Calls matchPattern -- does 'directory1/subdirectory' contain 'cow_report'? No...

Loop #4:

$File::Find::name set to 'directory1/subdirectory/file2'

Calls matchPattern -- does 'directory1/subdirectory/file2' contain 'cow_report'? No...

Loop #5:

$File::Find::name set to 'directory1/subdirectory/cow_report'

Calls matchPattern -- does 'directory1/subdirectory/cow_report' contain 'cow_report'? Yes...

print it out!

(The order in which find visits the entries within a directory may vary from system to system.)
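
If you would like to watch find make these calls yourself, here is a sketch that builds a scratch copy of the example tree in a temporary directory and records every name handed to the callback. It assumes the standard File::Temp and File::Path modules are installed, and a UNIX-style path separator. The callback name tracePattern and the arrays @visited and @matched are our own inventions for this illustration.

```perl
#!/usr/local/bin/perl
use strict;
use warnings;
use File::Find;
use File::Path qw(mkpath);
use File::Temp qw(tempdir);

# Build a scratch copy of the example tree in a temporary directory.
my $top = tempdir(CLEANUP => 1);
mkpath("$top/directory1/subdirectory");
foreach my $file ('directory1/file1',
                  'directory1/subdirectory/file2',
                  'directory1/subdirectory/cow_report')
{
    open(my $fh, '>', "$top/$file") or die "Cannot create $file: $!\n";
    close($fh);
}

my @visited;     # every name find hands to the callback
my @matched;     # the names that contain 'cow_report'

sub tracePattern
{
    my $name = $File::Find::name;
    $name =~ s{^\Q$top\E/}{};          # drop the temporary prefix for display
    push(@visited, $name);
    push(@matched, $name) if ($name =~ m"cow_report");
}

find(\&tracePattern, "$top/directory1");

print "visited: $_\n" foreach (sort @visited);
print "matched: $_\n" foreach (@matched);
```

The script reports five visited names (the two directories and the three files) and a single match, directory1/subdirectory/cow_report.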

find, like grep, is endlessly useful, and is the wellspring of thousands of commands. In particular, you could combine this example with the one that did reporting, so that when you run the program it automatically looks through all of the Excel spreadsheets in a certain directory. Or perhaps you could use find to delete garbage files (although I would be very careful about doing this, since a mistake here could have serious repercussions for your system).
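
As one small illustration of combining find with an earlier example, here is a rough sketch of the recursive grep idea mentioned above (rgrep.p). It is only a sketch under our own assumptions, not the version developed later in the book. The name rgrep_lines is made up, and the callback is written as an anonymous subroutine (sub { ... }), a piece of syntax covered in later chapters.

```perl
#!/usr/local/bin/perl
use strict;
use warnings;
use File::Find;

# rgrep_lines is a hypothetical helper, named here for illustration only.
# It returns a list of "filename: line" strings for every line, in every
# file beneath the given directories, that matches the pattern.
sub rgrep_lines
{
    my ($pattern, @directories) = @_;
    my @hits;

    my $searchFile = sub          # the callback that find calls for each name
    {
        return unless (-f $_);    # find sets $_ to the current name; skip directories
        my $name = $File::Find::name;
        open(my $fh, '<', $_) or return;    # quietly skip unreadable files
        while (my $line = <$fh>)
        {
            push(@hits, "$name: $line") if ($line =~ m"$pattern");
        }
        close($fh);
    };

    find($searchFile, @directories);
    return @hits;
}

# usage: rgrep.p pattern directory [directory ...]
if (@ARGV >= 2)
{
    my ($pattern, @directories) = @ARGV;
    print rgrep_lines($pattern, @directories);
}
```

Here find supplies the looping over directories, and the loop over lines is the same one we used in grep.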

Summary

This has been a bit of a whirlwind chapter. It touched a lot of subject matter, and touched it quickly. If you are new to Perl and don't understand all the syntax, don't worry. All of these aspects of Perl are covered in the chapters to come, along with a lot more examples. The point of this chapter is to get you acclimated to Perl syntax, and start thinking in the 'Perlish' way.

We hope to make Perl syntax almost second nature to you, so that you can pound out these scripts in absolutely no time. This is essential when we get to the heart of this book, which is object-oriented programming in Perl.

If you must have a list of things to keep in mind:

1) Perl requires no declarations, except in the case of subroutines. Hence you can just do what you need and exit.

2) Perl can be run in quite a few ways. The most common are saying:

prompt% perl <script>

at the command line, and placing:

#!/usr/local/bin/perl

at the beginning of a Perl file.

3) The important thing in getting productive in Perl is to start constantly thinking in terms of automating tasks. What, in my job, am I doing day after day? And how can I change this? If you think this way, you will get hundreds of programming examples for practice and at the same time improve productivity on the job.

4) Be CAREFUL when you program Perl scripts, especially if you are going to do such nasties as delete backup files (see above). Always use caution, wrapping sensitive commands such as 'unlink' with prompting that will tell you what a given program is going to do. And learn some rigor in your programming; that is, take the time to learn the next few chapters fairly well.
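
As a sketch of what 'wrapping unlink with prompting' might look like, consider the following. The subroutine careful_unlink is a hypothetical name of our own, not a Perl built-in. It takes an optional second argument, a file handle to read the answer from, which defaults to the keyboard (STDIN).

```perl
#!/usr/local/bin/perl
use strict;
use warnings;

# careful_unlink is a hypothetical wrapper, named here for illustration.
# It says what it is about to delete, and only calls unlink if the
# answer read from $answers begins with 'y'.
# Returns 1 if the file was deleted, 0 if the user declined.
sub careful_unlink
{
    my ($file, $answers) = @_;
    $answers = \*STDIN unless (defined $answers);   # normally ask the user

    print "About to delete '$file'. Proceed? (y/n) ";
    my $answer = <$answers>;
    return 0 unless (defined $answer && $answer =~ m"^y"i);

    unlink($file) or die "Cannot delete $file: $!\n";
    return 1;
}

# usage: careful_unlink.p file [file ...]
foreach my $file (@ARGV)
{
    careful_unlink($file);
}
```

Anything other than an answer starting with 'y' (including no answer at all) leaves the file alone, which is the safe default for a destructive command.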

If you are going to remember any syntax, remember the table above with all the special characters and their definitions. The next few chapters are going to be in-depth analysis on these symbols, and how they work.



HTML conversions by Mega Space.

This page updated on October 14, 1997 by Webmaster.

Computing McGraw-Hill is an imprint of the McGraw-Hill Professional Book Group.
