© 1997 The McGraw-Hill Companies, Inc. All rights reserved.
Any use of this Beta Book is subject to the rules stated in the Terms of Use.

Chapter 18: Turning Old Code into Object Code

This chapter takes the next big step in learning object oriented programming. First, we went over the syntax of Perl objects (Chapter 16). Then we worked through a mock class, implementing a fundamental class (the Piece class) for the game Strategio, and saw some of the methods that objects commonly have.

Now is the time to look at your own code to see how much of it should become modules and objects, and how much should be left alone. If you are just starting to learn Perl, this chapter will help you dig into somebody else's code that you may need to maintain.

Chapter Overview

This chapter is divided into two sections. First, we will consider some simple decisions that you will have to make. The primary design decision that anybody faces when doing object oriented programming is:

Is the overhead of making an object worth the flexibility it gains you?

We will talk about this overhead, consider three different cases, go through the pros and cons for object versus module, and then implement what we decide is the best choice.

Second, we will go on a scavenger hunt through old code, searching for objects. We will consider the ftp.p and expect.p scripts that we have written, and dig through them to unearth any objects we can find. Finally, we will rewrite expect.p in an object oriented way to see how much it buys us.

The emphasis here will be on examples. There is a lot of work to do here so let's get going! The first step is to look at the Module vs. Object design decision that all object oriented programmers need to make.

Design Decision: Modules Versus Objects

Modules versus objects is the first decision to make when you are planning your code. The syntax for objects and for modules is a bit different. The mindset for modular programming and object oriented programming is different too: modular programming seems to be easier for most people to grasp.

So we need to recognize the benefits and drawbacks of each of these two approaches. Since modular programming is the easier one, we start with it.

Modular Programming Features

A quick review of modular programming is in order here. Modular programming is the process of taking subroutines out of your code and centralizing them in a separate place, the package.

In Perl, a package usually lives in a '.pm' file. The use statement includes it, and Perl finds the file by searching, in order, the directories listed in the array @INC. These principles are pretty much the same in any computer language; here they just come under different guises.
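As a small illustration of the search path, the sketch below adds a directory to @INC and prints the list that use and require would walk through. The directory name './lib' is only an assumption for the example; substitute wherever your own '.pm' files live.

```perl
use strict;
use warnings;
use lib './lib';    # hypothetical directory, added to the front of @INC

# Show every directory that 'use' and 'require' will search, in order.
print "$_\n" foreach @INC;
```

Because use lib puts its argument at the front of the list, a module found there would be picked up before one of the same name installed elsewhere.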

The concept of gathering subroutines together into packages is very powerful, but notice that it does not deal with any of the issues associated with data. There is a distinction between:

a) the functions in the module themselves

b) the data that the functions take as arguments.

For example, suppose that you were writing a 'Vector' module, and defined the following functions:

Listing 18.1 VectorModule.pm

package VectorModule;

# Add two vectors (array references) element by element.
sub add
{
    my ($firstvec, $secondvec) = @_;
    my $returnvec = [];
    my $element;
    die "You cannot add vectors of differing lengths!\n"
        if (@$firstvec != @$secondvec);
    # '<' rather than '<=': valid indexes run from 0 to length - 1.
    for ($element = 0; $element < @$firstvec; $element++)
    {
        $returnvec->[$element]
            = $firstvec->[$element] + $secondvec->[$element];
    }
    $returnvec;
}

# Subtract the second vector from the first, element by element.
sub subtract
{
    my ($firstvec, $secondvec) = @_;
    my $returnvec = [];
    my $element;
    die "You cannot subtract vectors of differing lengths!\n"
        if (@$firstvec != @$secondvec);
    for ($element = 0; $element < @$firstvec; $element++)
    {
        $returnvec->[$element]
            = $firstvec->[$element] - $secondvec->[$element];
    }
    $returnvec;
}

1;    # a module file must end with a true value

Note that the data ($firstvec, $secondvec, $returnvec) is separate from the functions (add, subtract). To use this module, you would say something like what is shown in Listing 18.2:

Listing 18.2 use_vecmodule.p

use VectorModule;
my $vec1 = [0,0,0];
my $vec2 = [0,1,2];
my $vec3 = VectorModule::add($vec1, $vec2);

where $vec1 and $vec2 are defined by the client, and $vec3 is returned by the module. This separation between data and functions is the limiting factor in scaling up modular code: if you change the module to use a different representation for vectors, then the clients that use it will break.

The clients that use the module will still be representing the vector in an 'old way', and will not work with the new representation inside the module. This means that you will need to change the module and all clients that use the module. This is the flaw that object oriented programming overcomes.

Object Oriented Programming and Encapsulation

The great advance in object oriented programming is that it eliminates the separation between data and functions. A capsule definition of object oriented programming could be:

Object Oriented programming is the process of taking modules, and the data that these modules use in their interface, and stuffing that data inside the module so that multiple copies of the module can exist.

Another term for this is encapsulation. The definition follows:

Encapsulation is the process of taking functions and associated data, and 'packing' them in objects so that the internal structure of the object can change without needing to change all of the clients surrounding it.

This is the main difference between objects and modules. Modules are packs of functions, whose functionality can be reused effectively. Objects are packs of functions combined with their own private bits of data.

To illustrate this point, let's rewrite the 'VectorModule' module as an object:

Listing 18.3 VectorObject.pm

package VectorObject;

use overload '+' => \&add, '-' => \&subtract;

# Constructor: stores the elements inside the object.
sub new
{
    my ($type, @elements) = @_;
    my $self = {};
    $self->{elements} = \@elements;
    bless $self, $type;
}

# Number of elements in the vector.
sub length
{
    my ($self) = @_;
    my $elements = $self->{elements};
    my $length = @$elements;
    return($length);
}

sub add
{
    my ($self, $other) = @_;

    my ($element, $returnelem) = ('', []);

    my ($elements1, $elements2) =
        ( $self->{elements}, $other->{elements});

    die "You cannot add vectors of differing lengths!\n"
        if ($other->length() != $self->length());

    # '<' rather than '<=': valid indexes run from 0 to length - 1.
    for ($element = 0; $element < $self->length(); $element++)
    {
        $returnelem->[$element] =
            $elements1->[$element] + $elements2->[$element];
    }
    return (new VectorObject(@$returnelem));
}

sub subtract
{
    my ($self, $other) = @_;

    my ($element, $returnelem) = ('', []);
    my ($elements1, $elements2) =
        ( $self->{elements}, $other->{elements});

    die "You cannot subtract vectors of differing lengths!\n"
        if ($other->length() != $self->length());

    for ($element = 0; $element < $self->length(); $element++)
    {
        $returnelem->[$element] =
            $elements1->[$element] - $elements2->[$element];
    }
    return(new VectorObject(@$returnelem));
}

1;    # a module file must end with a true value

Now, forget about the code for a second, and concentrate on the interface, how people will use the code. With the module, if you wanted to add two vectors together, you would say something like this:

Listing 18.4 use_vecmodule.p

use VectorModule;
my $vec1 = [0,0,0];
my $vec2 = [0,1,2];
my $vec3 = VectorModule::add($vec1, $vec2);

and, after this addition, you would get the new vector (an array reference) inside $vec3.

With the object, however, if you wanted to add two vectors together, you would say something like this:

Listing 18.5 use_vecobject.p

use VectorObject;
my $vec = new VectorObject(0,0,0);
my $vec2 = new VectorObject(0,1,2);
my $vec3 = $vec->add($vec2);

or if you want to use the overloading that we provided:

Listing 18.6 use_vecobject2.p

use VectorObject;
my $vec = new VectorObject(0,0,0);
my $vec2 = new VectorObject(0,1,2);
my $vec3 = $vec + $vec2;

A lot of coding for such a small change in syntax! What exactly have we gained from the additional expense of twenty-six extra lines of code, and two extra functions?

The main thing that we have gained is the encapsulation of the data, the addition of the data to the package. In particular, notice that

my $vec = [0,0,0];

directly sets what vector is, and what the data is inside the vector. Whereas

my $vec = new VectorObject(0,0,0);

has a function call surrounding the creation of the vector. This function call removes the responsibility of knowing what actually makes up a vector. In the first case, we can see that it is an array reference. In the second, one doesn't need to know.

Since only the creator of the module needs to know how the object is implemented, he or she can change it at will, as long as the interface preserves 'backwards compatibility'. For example, if we wanted to add a field that holds the distance of the vector from the origin (0,0,0...), we could change the new function as such:

sub new
{
    my ($type, @elements) = @_;
    my $self = {};
    my $sumsq = 0;
    foreach (@elements) { $sumsq += $_ ** 2; }
    # The distance from the origin is the square root of the sum of squares.
    $self->{distance} = sqrt($sumsq);
    $self->{elements} = \@elements;
    bless $self, $type;
}
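To make the point concrete, here is a minimal sketch of how a client might read the new field without knowing how it is stored. The package name VectorSketch and the accessor distance() are assumptions made for this illustration; they are not part of the VectorObject listing, and the distance is taken here to be the ordinary Euclidean length.

```perl
package VectorSketch;    # hypothetical class, for illustration only
use strict;
use warnings;

sub new
{
    my ($type, @elements) = @_;
    my $sumsq = 0;
    $sumsq += $_ ** 2 foreach @elements;
    my $self = { elements => [@elements], distance => sqrt($sumsq) };
    return bless $self, $type;
}

# Accessor: clients ask for the distance instead of reaching into the hash.
sub distance
{
    my ($self) = @_;
    return $self->{distance};
}

package main;

my $vec = VectorSketch->new(3, 4);
print $vec->distance(), "\n";    # prints 5
```

If the class later decided to compute the distance on demand rather than store it, only the body of distance() would change; the client line would stay the same.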

In the module version, there would be no clean way to add this to the data structure. We could change it such that our functions add and subtract take the following data structures:

use VectorModule;

my ($vec) =
{
    distance => VectorModule::getDistance([0,1,2]),
    elements => [0,1,2]
};

my ($vec2) =
{
    distance => VectorModule::getDistance([0,0,0]),
    elements => [0,0,0]
};

my $vec3 = VectorModule::add($vec, $vec2);

in which the two hash references are the data structures that we pass to 'VectorModule::add()'. Now, this does exactly the same thing, only not as neatly.

But more importantly, it does it in the client, and this is exactly the point. It is more difficult to change the code that uses the module (the client) than it is to change the module itself (the server).

After all, you actually use your module/class in clients many times, but you code the module/class once. As an analogy, you can think of both object oriented programming and modular programming as basically 'putting all your eggs in one basket': coding the functionality that you need one time, and then guarding that functionality with care so that existing clients that use that code don't have to change.

The difference, then, between the two, is that sometimes it makes sense to make your baskets contain just functions (in the case of modules) and sometimes contain functions with their own data (in the case of objects).

Choosing Between Object and Module

So, how do you choose between a module and an object? This will be the first design decision you make when tackling a problem. Let's consider three separate examples to illustrate the choices between object and module:

1) Diff - getting the difference between two data structures

2) PathNames - turning path names from absolute to relative, and vice versa

3) LogFile - creating a log file with a certain format

One very prudent approach is to always prefer objects over modules, since objects are more flexible and extendible than modules (a module is simply an object with no data in it). However, you may want to consider otherwise.

One reason is that objects, although more flexible and extendible than modules, are also more complicated. Just look at our 'Vector' example above. The physical differences between the two were:

1) the VectorObject had 26 more lines of code than the VectorModule

2) the VectorObject had 2 more functions than the VectorModule

3) the VectorObject had 3 more variables than the VectorModule

This is a cost associated with the use of objects in maintenance, programming, and complexity. Objects also have a cost in ease of usage and performance.

However, if you find out that you in fact need a class when you have a module instead, you may not find it that easy to back up in your tracks. This is especially true the more complicated your modules get. If you decide that you someday need two or more of the same module working concurrently, you will be able to hack your way through (Perl is great at that!) but it will not be the most fun to do this hacking. Therefore, the decision isn't an easy one to make.

Let's take a look at the three different problems in which you might be called on to make such a decision.

Example #1: Diff Revisited

One of the most common tasks of a Perl programmer is finding differences. It might be finding differences in arrays and hashes, in anonymous data structures, or in tables, indexes, directory structures, and so forth.

In fact, we have already started such a 'Diff' code repository. In Chapter 15 we made a Diff module. The two functions defined were:

checkEq();

checkData();

These functions were designed to look at two arbitrary data structures, check to see if they were equal, and return a structure with the difference between the two.

You may have a situation in which you would want to extend this functionality. Let's consider a function that returns the difference between two arrays. Diff::array(), we might call it:

Listing 18.7 Diff.pm continued.

package Diff;

# Other subs checkData, checkEq, etc.

# Return a reference to the sorted list of elements that appear
# in one of the two arrays but not in both.
sub array
{
    my ($ref1, $ref2) = @_;
    my (%MARK1, %MARK2, %MARKTOTAL);
    my $returnRef = [];
    die "Need to pass two array references!\n"
        if ((ref($ref1) ne 'ARRAY') || (ref($ref2) ne 'ARRAY'));
    foreach (@$ref1) { $MARK1{$_} = 1; }
    foreach (@$ref2) { $MARK2{$_} = 1; }
    %MARKTOTAL = (%MARK1, %MARK2);
    foreach (sort keys(%MARKTOTAL))
    {
        if (!defined($MARK1{$_}) || !defined ($MARK2{$_}))
        {
            push (@$returnRef, $_);
        }
    }
    return($returnRef);
}

Given this type of code, you then could call this function via:

use Diff;

my ($ref1, $ref2) = ([1,2,3,4,6], [4,3,2,1,5]);

my $ref3 = Diff::array($ref1, $ref2);

which then would make $ref3 equal to '[ 5,6 ]' because 5 and 6 are the only differences between the two array references.

Now, the question is whether this should be a module or an object. Let's look at the same client code, only coded as an object:

use Array;

my ($ref1, $ref2) = (new Array(1,2,3,4,6), new Array(4,3,2,1,5));

my $ref3 = $ref1->diff($ref2);

Here, we make Array a 'first class object', meaning that we would go through the Array object methods to manipulate the Array. As such, diff is a simple property of the object Array.

The design decision comes down to this: is 'diff' something that you do (and hence should be in its own module), or should 'diff' be a property of the data structure which calls it (and be part of an object)? In other words, is 'diff' being used as a verb or as a noun? If 'diff' is being used as a verb, then the decision should be to make this a module. If 'diff' is being used as a noun, then it should be made an object.

There are arguments to both sides here. The first type of code, where diff is made a module:

Diff::array($array1, $array2);

is more standard. "Traditional" Perl programmers will recognize it, and be able to acclimate to it more quickly.

The second type of code where diff is a property of an Array object:

my $a = new Array(1,2,3);

my $b = new Array(1,2,4);

my $c = $a->diff($b);

seems cleaner. We could add other functionality to the array that does not have to do with 'diff'ing; merge strikes me as another function that we could attach to an Array object. This function would merge all of the elements of an Array, take out the duplicates, and stuff them into its own Array. So, what to do?

In this case, I would vote for the module. The kicker? In order to use the diff function when attached to an Array object, you need to use the Array object pretty much everywhere.

What happens if you have a regular array reference, and then try to use the diff function by mistake, as in:

my $array = [1,2,3,4,5];

my $secarray = [1,2,3,4,5,6];

my $thirdarray = $secarray->diff($array);

In this case you would get

Can't call method "diff" on unblessed reference at script.p line 7.

This is a fatal error because Perl cannot find the method diff for a regular array reference. The code would then die.

Since there is no way of guaranteeing that all of the array references that you will encounter will be Array objects, you are limiting yourself if you attach diff to a specialized Array object instead of making it global. Put it into a module, and wait for the days when Perl data structures (hash references, array references, scalar references) are objects in their own right.
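If you do end up with a mix of plain references and objects, one way to avoid the fatal error is to test before calling. The sketch below uses blessed() from the core Scalar::Util module together with the universal can() method; the helper name can_call and the class ArraySketch are assumptions made up for this example.

```perl
use strict;
use warnings;
use Scalar::Util qw(blessed);

# Returns true only when $thing is an object whose class provides $method.
sub can_call
{
    my ($thing, $method) = @_;
    return (blessed($thing) && $thing->can($method)) ? 1 : 0;
}

sub ArraySketch::diff { return [] }             # stand-in method, hypothetical class

my $plain  = [1, 2, 3];                         # ordinary array reference
my $object = bless [1, 2, 3], 'ArraySketch';    # the same data, blessed

print can_call($plain,  'diff') ? "object\n" : "plain reference\n";
print can_call($object, 'diff') ? "object\n" : "plain reference\n";
```

The first line printed is "plain reference" and the second is "object". A check like this keeps the program alive, but it also shows the cost of attaching diff to a class: every caller has to care which kind of array it was handed.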

Example #2: PathNames

The whole problem of path names is a nagging one. Suppose that you want people to be able to enter in, as usage for a script:

script.p ..\..\..\scripts\path_to_file

in which 'path_to_file' is a 'relative' path - the '..\..\..' goes up three directories, down into the 'scripts' directory, and points to 'path_to_file' which so happens to be sitting there. In this situation, you might want to check to see if '..\..\..\scripts' is in a 'legal' place for script.p to work on it (such as if it is in the '\windows32' sub directory).

What is required is a path converter: a routine that takes '..\..\..\scripts\path_to_file' and turns it into '\windows32\scripts\path_to_file' for you. Again, let's look at the potential solutions, from the point of view of the clients who use them:

Object:

my $path = new Path('..\..\..\scripts\path_to_file');

print $path->abs(); # prints absolute path

print $path->rel(); # prints relative path, to current working directory

Module:

my $abs = Path::rel2abs('..\..\..\scripts\path_to_file');

my $rel = Path::abs2rel('\windows32\scripts\path_to_file');

Again, this is a difficult choice. Does the advantage of having things encapsulated, as in the object, outweigh the disadvantage of having the code be more complicated? After all, a path to a file is a string, and it is not going to become much more complicated than that. This means that issues such as 'independence of representation' and 'changeability' are relatively minor.

On the flip side, there is not too much of a cost in making it an object.

In cases like this, it is best to play safe and go with the object even though it may take a little extra time. Since objects are more flexible, you gain a little bit of insurance, just in case you think of something really cool that you wanted to do to Path as an object.

You may be able to do this 'cool thing' to an object and not a module. Always keep this in mind; if the arguments weigh evenly between a module and an object, pick the object. If you have miscalculated the complexity of what you are designing, the object leaves you more room to maneuver.

Here are the 'Path' functions we talked about, written as an object:

Listing 18.8 Path.pm

1 package Path;

2 use strict;

3 use Carp;

4 use CommonVarbs qw ($CWD $DSEP); # developed in Chapter 15..

5 # gives common variables for OSes

6 use File::Basename; # gives basenames, filenames.

7

8 sub new

9 {

10 my ($type, $path) = @_;

11 my $self = \$path; # reference to a scalar

12

13 bless $self, $type;

14 }

15

One note about this code: notice in line 11 that I chose not to make this object a hash reference. Why? I thought that a scalar would be more flexible for working with other code. Since $path is a scalar reference as well as a Path object, you can say:

my $path = new Path('..\..\..');

if (-d $$path) { print "$$path is a directory!!!\n"; }

where you can simply test whether or not $path is a directory by dereferencing it.

The cost of this design choice is that you do not have nearly as much room to maneuver in your object. If you decide that you need a hash-based object (instead of a scalar), then you are stuck, and will have to either:

a) change all of your clients to reflect this change, or

b) create a totally new object without this limitation.

The code "(-d $$path)" is a subtle way of breaking encapsulation. We are reaching into the object and stealing its data.

However, instinctively, I feel that giving the object Path all of the power of Perl's file operators compensates for this loss of flexibility. You can treat a $path object as if it were a path to a file, just by dereferencing it. Anyway, the code continues:

Listing 18.9 abs() in Path.pm

16 sub abs

17 {

18 my ($self) = @_;

19 my ($dir, $file);

20 my $path = $$self;

21 my $cwd = &$CWD(); # could say 'use Cwd'; cwd();

22 if (-d $path)

23 {

24 $dir = $path;

25 $file = '';

26 }

27 else

28 {

29 $dir = dirname ($path); # gotten from File::Basename.

30 $file = basename($path); # gets directory name, file name

31 }

32

33 chdir($dir);

34 my $return = &$CWD();

35 chdir($cwd) || warn ("Could not change back to $cwd!\n");

36 $return .= $DSEP . $file if ($file);

37 return($return);

38 }

Here, we are a little bit sneaky. If you think about it a bit, a relative path is a parsing nightmare: '.' means the current directory, '..' means the directory above the current one, '.\' is a relative path to the current working directory, '.\.\.\.' is redundantly the same thing, and so on. Therefore, it is easier to chicken out and let the shell do the hard work. If $path is a directory (lines 22-26), then we simply note that $path is the directory we are looking for.

If not, we split the path into its directory and file parts (lines 29-30). Processing then proceeds to lines 33-35, where we actually change to the directory found above, and let the shell itself figure out the path in line 34. The shell knows better than we do what the path is. Although writing a parser for this is possible, it sure is tedious, and a parser doesn't have the knowledge that the shell does about what actually exists in the environment.

On the downside, using the shell to get information about the path creates a dependency. We are now dependent on having access to the directory that we want to change, in order to use the Path object.
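The same trick can be sketched as a standalone function using only core modules. This is an illustration under assumptions, not the Path.pm listing: the function name abs_path_sketch is made up, Cwd's cwd() stands in for the $CWD code reference, and File::Spec's catfile() stands in for $DSEP.

```perl
use strict;
use warnings;
use Cwd qw(cwd);
use File::Basename qw(dirname basename);
use File::Spec;

# Resolve $path by changing into its directory and asking where we are.
sub abs_path_sketch
{
    my ($path) = @_;
    my $start = cwd();
    my ($dir, $file) = (-d $path)
        ? ($path, '')
        : (dirname($path), basename($path));

    chdir($dir)   or die "Cannot chdir to $dir: $!\n";
    my $abs = cwd();
    chdir($start) or die "Cannot return to $start: $!\n";

    return $file ne '' ? File::Spec->catfile($abs, $file) : $abs;
}

print abs_path_sketch('.'), "\n";    # prints the current working directory
```

Note that the sketch dies if it cannot enter the directory, which is exactly the dependency described above: the directory has to exist and be accessible for the answer to be computed at all.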

The function rel() isn't nearly as important as the abs() because once you have a full path, who needs a partial one? A full path can be used to locate anything on the system without mistake, but a relative path (as in "..") varies from place to place. However, there are some uses. HTML pages, for example, sometimes have the links inside them as relative paths so that the pages can be moved around from system to system.

Hint on the exercise: if you make the path 'absolute' first (no matter what), then you can use the same algorithm to compute relativeness. If your current path is:

c:\winnt\system32\tmp\tmp2\tmp3

and you want to go to:

c:\winnt\system32\tmp\tmp1

then you can 'cancel out' winnt\system32\tmp, and manipulate the rest to get '..\..\tmp1'.
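The 'cancel out' step can be sketched as follows. The function name rel_from is hypothetical, and the sketch assumes both arguments are already absolute, backslash-separated paths on the same drive, compared without regard to case.

```perl
use strict;
use warnings;

# Compute the relative path that leads from directory $from to $to.
sub rel_from
{
    my ($from, $to) = @_;
    my @from = split /\\/, $from;
    my @to   = split /\\/, $to;

    # Cancel out the leading components that the two paths share.
    while (@from && @to && lc($from[0]) eq lc($to[0]))
    {
        shift @from;
        shift @to;
    }

    # One '..' for every directory left in the starting path,
    # followed by whatever remains of the destination.
    my @parts = (('..') x @from, @to);
    return @parts ? join('\\', @parts) : '.';
}

print rel_from('c:\winnt\system32\tmp\tmp2\tmp3',
               'c:\winnt\system32\tmp\tmp1'), "\n";    # prints ..\..\tmp1
```

The core File::Spec module also provides an abs2rel() method for this job, which is worth considering before rolling your own.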

Example #3: Making Logs with LogFile

In Chapter 15, we went through the process of taking some procedural code and modularizing it by extracting the common functions and putting them into their own API. As part of this API, we came up with a Log module, whose interface looked like:

Log::open("log_file");

# ..... all the other stuff.

# .....

Log::write(@output);

Log::close();

Now, should we be happy with this interface? Should Log stay a module or should we make it an object?

We should definitely make it an object. We tested the waters of implementing a simple Log module, and let's see what we came up with:

Listing 18.10 Log module

1 package Log;

2 my $fh;

3 sub open

4 {

5 my ($log) = @_;

6 $fh = new FileHandle(">> $log");

7 }

8 sub close

9 {

10 close($fh);

11 }

12 sub write

13 {

14 my (@output) = @_;

15 print $fh "@output\n";

16 }

17 1;

Now, right off there are a couple of things that you should notice about this code, which should tip you off to the design decision to make. They are:

1) there is data that is separate from the functions themselves.

We define the filehandle $fh in line #2, which is shared with the whole module. If we try to say something like:

Log::open("log_file");

Log::open("log_file2");

then we are going to have problems. The first call to open will work, but the second one will overwrite the first one, since it is using the same filehandle internally. This can be changed by adding a parameter to the write function:

my $fd = new FileHandle("> log_file");

Log::write($fd, @output);

# etc...

but this hardly does the job. The log isn't self contained, and the code isn't any cleaner than:

print $fd "@output\n";

which basically does the same thing without the overhead of a module call.

2) We have made some design decisions about what the output should look like.

The second thing that we should notice is that we have made some design decisions on what the log should look like. Line 15:

print $fh "@output\n";

specifies that our log will simply consist of space separated input.

Now, what happens if we want to have more than one type of format? For example, one log might have a timestamp, another might be pipe delimited, another might have the associated script name attached to it, and another might mail users when problems occur. In this case, the one write function will not suffice.

By making Log a module we are splitting the data from the functions, and limiting our flexibility.

Let's see how the interface of an object would change this:

1 my $log = new LogObject( "log_file",

2 { type => 'regular', action =>'append' });

3 my $log2 = new LogObject("log_file2",

4 { type => 'stamped',action => 'overwrite'});

5 $log2->open();

6 $log2->write("@data\n");

Lines 1-4 now associate the 1) name of the file, 2) the type of log we are going to be using, and 3) the action that it is going to be taking (either appending or overwriting the file).

When we get to line 5, we can pick and choose which log to open and which log to write. $log2->write("@data\n") then writes to the file "log_file2", with the addition of a timestamp. We implement it like so.

Listing 18.11 LogObject.pm - headers, new()

1 package LogObject;

2 use strict;

3 use Carp;

4 use FileHandle;

5 use Diff;

6 use Data::Dumper;

7

8 my $_defaultConfig = { 'type' => 'regular', 'action' => 'append' };

9 my $_legal = { 'action' => { 'append' => '>> ','overwrite' =>'> ' },

10 'type' => { 'regular' => 1, 'stamped' => 1 }

11 };

12

13 sub new

14 {

15 my ($type, $filename, $config) = @_;

16 my $self = {};

17 my %fullconfig;

18

19 confess "Config has to be a hash!\n" if($config && ref($config) ne 'HASH');

20 $config = $config || {};

21

22 $self->{filename} = $filename;

23 $self->{type} = $type;

24

25 %fullconfig = (%$_defaultConfig, %$config);

26 # print Dumper(\%fullconfig); # uncomment to debug the merged config

27 bless $self, $type;

28 $self->{config} = \%fullconfig;

29 $self->_validate();

30 $self;

31 }

32

The constructor, new, is pretty straightforward, but with one added twist that we haven't seen before. The third parameter to new is a config hash: it contains information that is pretty much essential for telling the "LogObject" how to work. In line 8 we are being nice; $_defaultConfig gives a default configuration so users do not have to type as much. If they say:

my $log = new LogObject("filename");

then filename will be interpreted as a regular, appending sort of Log. Lines 9-11 are more niceties, telling us which configurations are legal. Lines 19-20, together with the call to _validate() in line 29, are still more niceties: the routine checks that users have typed the correct information before continuing.

These niceties are almost always a necessary evil when writing real modules. The less strict users have to be, and the freer they are to pass anything into a method and still have the computer understand what they mean, the more accepted your modules and objects will become.

Believe me, you will be glad that you took the time to code the extra functionality here, and be glad that Perl gives you the flexibility to do your own checking easily.
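The default-filling step relies on a simple property of Perl hashes: when a list with repeated keys is assigned to a hash, the later value wins. A minimal sketch of the idiom, with the made-up helper name merge_config, looks like this:

```perl
use strict;
use warnings;

my %defaults = (type => 'regular', action => 'append');

# Return a new hash reference: the defaults, overridden by the caller's settings.
sub merge_config
{
    my ($config) = @_;
    $config ||= {};                      # no config given: use defaults only
    return { %defaults, %$config };      # later keys win
}

my $merged = merge_config({ type => 'stamped' });
print "$_ => $merged->{$_}\n" foreach sort keys %$merged;
# prints:
# action => append
# type => stamped
```

Because a fresh hash is built on every call, the caller's hash and the defaults are both left untouched.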

Once the new routine has completed, there is a legal LogObject available for our use. Since the checks have been done up front, we do not need to worry about checking for legality in arguments anywhere else. Processing then proceeds to open and write:

Listing 18.12 LogObject.pm (open,write)

33 sub open

34 {

35 my ($self) = @_;

36 my ($config) = $self->{config};

37 my $legalacts = $_legal->{'action'};

38 my ($action, $filename) =

39 ( $legalacts->{$config->{action}}, $self->{filename} );

40 my $fh = new FileHandle("$action $filename");

41 $self->{filehandle} = $fh;

42 }

43

44 sub write

45 {

46 my ($self, @text) = @_;

47 my $config = $self->{config};

48 my ($type, $fh) = ($config->{type}, $self->{filehandle});

49 if ($type eq 'regular') { $self->_writeRegular(@text); }

50 elsif ($type eq 'stamped') { $self->_writeStamped(@text); }

51 }

52

53 sub close { close($_[0]->{filehandle}); }

54

55 sub _writeRegular

56 {

57 my ($self, @text) = @_;

58 my $fh = $self->{filehandle};

59 print $fh "@text\n";

60 }

61

62 sub _writeStamped

63 {

64 my ($self, @text) = @_;

65 my $fh = $self->{filehandle};

66 my $script = $0;

67 my $time = localtime();

68 print $fh "$script:$time: @text\n";

69 }

There are more shenanigans here. Line #40 opens up a filehandle and we store it for further use in Line #41.

In order to determine how to open the filehandle, the method open() looks at '$config->{action}'. An action of 'append' opens the file with '>> ', and an action of 'overwrite' opens it with '> '.

Lines #49 and #50 determine exactly how the write is going to occur. If the type of log is a 'regular' one, then the private function _writeRegular() is called. If it is 'stamped', _writeStamped() is called instead. The two write to the log file in different ways, which keeps the actual details of how the writing is done out of sight.
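As the number of log types grows, a chain of if/elsif tests can be swapped for a dispatch table: a hash that maps each type to the routine that formats it. The following is only a sketch of that alternative technique, not the LogObject code; the function name format_line and the 'piped' type are assumptions added for illustration.

```perl
use strict;
use warnings;

# Map each log type to a routine that formats one line of output.
my %formatter = (
    regular => sub { return "@_\n" },
    stamped => sub { return "$0:" . localtime() . ": @_\n" },
    piped   => sub { return join('|', @_) . "\n" },    # hypothetical extra type
);

sub format_line
{
    my ($type, @text) = @_;
    my $code = $formatter{$type} or die "Unknown log type '$type'\n";
    return $code->(@text);
}

print format_line('regular', 'disk', 'full');    # prints: disk full
print format_line('piped',   'disk', 'full');    # prints: disk|full
```

Adding a new format then means adding one entry to the hash, and the keys of the hash double as the list of legal types.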

All this code is relatively clean because we have placed all of the validation logic into one bucket, and it isn't a very pretty sight.

Listing 18.13 LogObject.pm (_validate)

70 sub _validate

71 {

72 my ($self) = @_;

73 my (@errors);

74 my ($hash, $filename) = ($self->{config}, $self->{filename});

75

76 my ($_actions, $_types ) =

77 ($_legal->{'action'}, $_legal->{'type'});

78

79 push (@errors, "Incorrect action $hash->{'action'}! Needs

80 to be one of :@{[ keys %$_actions ]}:\n")

81 if (!defined $_actions->{$hash->{'action'}});

82

83 push (@errors, "Incorrect type $hash->{'type'}! Needs to be one of

84 :@{[keys %$_types]}:\n")

85 if (!defined $_types->{$hash->{'type'}});

86

87 my @keys = keys (%$hash); my @legal = keys (%$_legal);

88 my $diff = Diff::array(\@keys, \@legal);

89 push(@errors, "Incorrect keys :@$diff: passed to LogObject!\n")

90 if (@$diff);

91 push(@errors, "Unwriteable log file! $filename\n")

92 if (!(new FileHandle(">> $filename")));

93 confess ("@errors") if (@errors);

94 }

95 1;

All of this is devoted to making the user behave, to make them type the correct responses. Lines 79 through 92 make a stack of errors, things that _validate() has found wrong with the input parameters. If it has found any, it confesses them, and dies in line 93.

So if the user types:

my $log = new LogObject("file", { 'type' => 'reglur' });

Perl will catch the spelling error for you and die. If the user types:

my $log = new LogObject("file");

and the "file" happens to be unwritable, then Perl will die. Since all of the errors are written to a stack, Perl keeps track of each error for us:

my $log = new LogObject("file", { 'type' => 'rglr' });

# both illegal type and unwritable.

When you run this script now, it will say something like:

Incorrect type rglr! Needs to be one of :regular stamped:

Unwriteable log file! file

LogObject::_validate called at LogObject.pm line 29

LogObject::new called at script.p line 5
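The error-stack idea can be shown in isolation. In the sketch below every check pushes a message instead of dying right away, so the caller sees all of the problems at once. The function name check_config and the 'filename' key are assumptions for this example and do not match the LogObject internals exactly.

```perl
use strict;
use warnings;

my %legal_type = (regular => 1, stamped => 1);

# Run every check, collecting the complaints rather than stopping at the first.
sub check_config
{
    my ($config) = @_;
    my @errors;
    my $type = defined $config->{type} ? $config->{type} : '(none)';

    push @errors, "Incorrect type $type!\n"
        unless $legal_type{$type};
    push @errors, "Missing filename!\n"
        unless defined $config->{filename} && length $config->{filename};

    return @errors;
}

my @errors = check_config({ type => 'rglr' });
print @errors;
# prints:
# Incorrect type rglr!
# Missing filename!
```

A caller could then pass the collected messages to confess, as _validate() does, or simply return them to the user.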

Good log files are pretty much essential for you getting your job done. They are windows into what the computer is doing, and can save your skin a thousand times over. If something goes wrong, it had better be logged somewhere because a problem that is untraceable is one that is unfixable, and unfixable problems make people very nervous.

We are finished with this example for now, but we aren't finished with it for good. In the next chapter we shall see how it can be improved by inheritance, and we will throw in a couple more features just for good measure. The key to improving it is to think about all of the validation code, and to get rid of it so that the module is easier to maintain.

Summary of Modules Versus Objects

The three examples in this section demonstrate design decisions between making routines modules or objects. In each case we had a different criterion for choosing.

In the first case, Diff (functionality to get the difference between two simple data structures), we decided that the routine was better off as a module by itself. In order to make it an effective object, it would have to be tied to an Array object (or Hash object or Scalar object), and that would mean that in order to use Diff, you would have to use the Array object exclusively (rather than the built-in data structure).

In the second case, we decided that PathNames - functions to convert a path from absolute to relative and vice versa - should be an object, even though the deck was stacked about 50/50 in either case. The tipping factor was that even though we did not think that PathNames could get much more complicated, we weren't sure. Since we weren't sure, it was better to play safe and make it an object.

In the third case - the LogFile - we definitely decided to make it an object since we want LogFile to be pretty much bulletproof, yet customizable. We will need all the power of the object in order to do this.

So what can we say about this whole decision process? The following are two good basic, non-bulletproof, rules to follow:

1) If the thing you want to program has to do with a verb, then make it (prototype it) as a module. Verb implies action, and action implies a function, by itself. Here, we had diff as the function we wanted to program. If you prototype it, and use the module, it will take a lot less effort if you decide to convert it to an object.

2) If the thing that you are programming is a noun, then make it an object. Noun implies that there is data associated with whatever it is you are modeling - hence PathNames and LogFile.

Finally, realize that you always want to favor the object in your decision making process if you are uncertain. Sometimes, all it takes is a 90-degree shift in viewpoint to think of 'verbs' in terms of 'nouns'; in the diff case, all we did was think about what we were acting on (namely an array) to see a plausible object implementation. The only reason we hesitated is that this would force you to make all of your arrays objects.

Turning Procedural Code into Objects

Suppose for an instant that you land a new job, or that you have a large body of code that is procedural (scripts with functions sitting at the bottom of them.) Now, you are going to try to re-use lots of that code if feasible - since the code is basically a gift-for-free. If a script works to automate a given problem, then no matter how spaghetti-like that script is, it has value. (The problem comes when people try to cling to that spaghetti-like code, causing lots of lost time in the process.)

Whenever you get a chance, recycle that code! Below are two examples of code that we will recycle into objects. We originally saw the scripts behind them in chapter 12; they did some very useful things for us (automating ftp, telnet, and doing a directory compare). By turning them into objects, we can make them even better because they don't have to be tied to a specific command.

Example #1: ftp and telnet 'Expect' object

Now the first step in recycling any procedural code and turning it into objects is to take out the structure of that code to analyze it. This means identifying:

1) global variables

2) a function calling tree

3) arguments to the calling tree.

Let's do this for both the ftp and telnet examples of Chapter 12. First, the ftp example, shown in Figure 18.1:

Figure 18.1

ftp calling tree

Then for telnet, shown in Figure 18.2:

Figure 18.2

telnet calling tree

Look somewhat similar, don't they? We are basically doing two things:

1) generating expect code.

2) executing expect code.*

* In addition, there are about 150 lines of filler that deal with interfaces. These interfaces include getting arguments from the command line and getting arguments from a file. Both of these are dealt with in other sections of this book; the first one we dealt with in Chapter 15, and the second one we will deal with in this chapter.

In addition, we have one global variable, $opt, which is filled either by the command line or a configuration file. Finally, the four functions _genTelnetCode(), _genFtpCode(), _execFtpCode() and _execTelnetCode() are pretty much ready to order, doing some very basic functionality for us.

In fact, _execFtpCode() and _execTelnetCode() are exactly the same! It looks like we could gain some savings in code lines if we combined the two into an object.

We now have to come up with what will be the name of the object, and what the interface into the object will be. Since what the module automates is Don Libes' 'Expect' program, let's choose that as a name.

From this flows our interface. We shall make a new Expect object:

my $obj = new Expect ( { 'configuration' => 'of', 'expect' => 'application' } );

and we will then stuff it with 'things to do':

$obj->set('filetoget1','filetoget2','filetoget3');

$obj->set('filetoget4');

and then we will let it do its stuff:

$obj->execute();

which will then go to actually execute the expect code that we have created. So the functions we shall have are:

new - constructor

set - sets what the ftp or telneting is going to do (which files to get, etc.)

reset - erases the former instructions that the Expect object was going to do, and sets them to blank

execute - generates Expect code to either get files or run commands (depending on whether the object is of type ftp or telnet), and then runs that code

Here is a sample of what our interface is going to look like. This particular piece of code goes off and gets the latest version of Perl, and then installs it via a different machine:

Listing 18.14 expect_example.p

1 #!/usr/local/bin/perl5 -w

2

3 use Expect;

4 use Data::Dumper;

5

6 my $objftp = new Expect

7 (

8 {

9 'type' => 'bin' , 'site' => 'ftp.cs.colorado.edu' ,

10 'objtype' => 'ftp'

11 }

12 );

13

14 my $objtelnet = new Expect

15 (

16 {

17 'objtype' => 'telnet' , 'site' => 'rock_lobster',

18 'userprompt' => 'ogin:','passwordprompt'=>'Password:',

19 'user' => 'epeschko'

20 }

21 );

22

23 chdir ("/net/mount");

24 $objftp->set('/pub/perl/CPAN/src/5.0/latest.tar.gz');

25 $objftp->execute();

26

27 $objtelnet->set

28 (

29 ['epeschko', 'cd /net/mount'],

30 ['epeschko', 'gunzip latest.tar.gz'],

31 ['epeschko', 'tar xvf latest.tar'],

32 ['epeschko', 'cd perl5.005'],

33 ['epeschko','sh configure -prefix="/home/perl/install"'],

34 ['epeschko', 'make'],

35 ['epeschko', 'make test'],

36 ['epeschko', 'make install'],

37 ['epeschko', 'exit']

38 );

39 $objtelnet->execute();

We simply set up our Expect object exactly as we set up the telnet.p and ftp.p scripts: we configure it via a hash. Here the hash is provided by the calling code, whereas in the script examples the hash was filled from the command line.

Then we let the script chug along by running execute().

The only difference between this - and the scripts ftp.p and expect.p - is that this object is much more easily re-used than the script's code. Of course you could use the 'telnet.p' and 'ftp.p' scripts of chapter 12 in your code by saying:

system("telnet.p -file telnetfile");

system("ftp.p -site my.site");

instead, where your code re-use is dependent on lots of system calls to scripts that you have already written. This is good if your project is not going to grow very large.

However, as soon as you try to scale this up, you will hit a major brick wall. I know, because I have tried to scale up code dependent on system calls, before Perl version 5 and before Perl's version of the object.

The reason why this doesn't work so well has to do with communication. When you depend on system calls like this, you can't get anything back from the call except for a status, which tells you whether or not the code has succeeded.

The operating system hands this status back as a simple number, something like 0, which Perl stores in $?. Of course you could say:

$text = `ftp.p -file ftpfile`;

and then get back the text which this system call generates, but even then you are on slippery ground. You are dependent on the structure of the file that ftp.p reads, which itself is dependent on how ftp.p has implemented this. All these dependencies tend to make your code like a house of cards; a simple puff of wind from any direction can bring it down.
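To make the two communication channels concrete, here is a small sketch. It does not call ftp.p or telnet.p; as an assumption for the sake of a self-contained example, it uses the running perl binary ($^X) as a stand-in child process.

```perl
use strict;
use warnings;

my $perl = $^X;    # path of the running perl, used here as a stand-in child

# system() hands back only the wait status, which is also left in $?.
system($perl, '-e', 'exit 3');
my $exit_code = $? >> 8;    # the exit code lives in the high byte

# Backticks hand back only text; any structure has to be parsed out again.
my $text = `"$perl" -e "print 42"`;

print "exit code: $exit_code, captured text: $text\n";
```

A number and a blob of text are all the caller ever sees. An object, by contrast, can hand back hashes, arrays, or other objects directly.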

While you are going through the following, note the emboldened code; this is code that we have directly stolen from the ftp.p and expect.p scripts that were introduced in Chapter 12.

First we deal with the constructor and headers:

Listing 18.15 Expect.pm header and constructor

1 package Expect;

2

3 use Carp;

4 use Term::ReadKey;

5 use strict;

6 use Data::Dumper;

7

8 my $_legalTypes =

9 {

10 telnet =>

11 { 'site' => 1, 'expect' => 1, 'pass' => 1,

12 'telnet' => 1,'debug' => 1, 'objtype' => 1,

13 'user' => 1,'pass' =>1,'passwordprompt'=>1,

14 'userprompt' => 1,'commands'=>1,'RESET'=> 1,

15 'ignore' => 1

16 },

17 ftp =>

18 { 'site' => 1, 'expect'=> 1, 'ftp' => 1,

19 'user' => 1, 'pass' => 1, 'type' => 1,

20 'debug' => 1, 'objtype' => 1, 'RESET' => 1,

21 'ignore' => 1, 'files' => 1 # 'files' is set by _fill() and _ftpset()

22 }

23 };

24

25 sub new

26 {

27 my ($type, $config) = @_;

28 my $self = {};

29 bless $self, $type;

30 $self->{config} = $config || {};

31 $self->_validate();

32 $self->_fill();

33 $self;

34 }

Note two things about the hash reference $_legalTypes, which was copied directly from the ftp.p and telnet.p examples of chapter 12 with a few modifications. First, we removed the stuff that didn't have anything to do with Expect at all (getting options from a file, for instance). Second, we added an 'ignore' flag, which tells the Expect module to skip the legality checks.

Why do this? For backwards compatibility. We will want to slide this module in, as easily as possible, to existing code that we have. At this point, this hash contains all of the information about what options an Expect object can take.

We then come to the constructor, new, which takes a configuration hash from any client that wants to make a new Expect object. Line #30 takes this $config hash reference and stores it in the object itself (an empty hash is used if none was passed). $config tells us what the object actually looks like. When you say:

my $objtelnet = new Expect

(

{

'objtype' => 'telnet' , 'site' => 'rock_lobster',

'userprompt' => 'ogin:','passwordprompt'=>'Password:',

'user' => 'edward'

}

);

we are making an object of type 'telnet' that is going to connect to rock_lobster, and so on.

In line #31, we then check to make sure that the configuration hash that the user passed us is legal, with $self->_validate(). And line #32 helps the user with $self->_fill(), which fills in some common defaults so the user doesn't have to fill in each gory detail.

These two functions are private, and therefore we put them at the end of our module. In fact, we need not write them until the rest of the module is written; we simply assume that they are there so we can assume that we have a legal Expect configuration that we can deal with.

As a further convenience, we also provide a configure function, as below:

Listing 18.16 configure()

35 sub configure

36 {

37 my ($self, $configure) = @_;

38 if (defined ($configure->{'RESET'}))

39 {

40 $self->{config} = ($configure)

41 }

42 else

43 {

44 %{$self->{config}} = (%{$self->{config}}, %$configure);

45 }

46 $self->_validate();

47 }

48

This configure function allows us to reuse an existing Expect object. Say we defined the Expect object as above:

my $objftp = new Expect

(

{

'type' => 'bin' , 'site' => 'ftp.cs.colorado.edu' ,

'objtype' => 'ftp'

}

);

Now it would be a pain to have to type all of this over again, when say, all we wanted to do is change the type from binary to ASCII, for example. So instead, we could say:

$objftp->configure( { 'type' => 'ascii' });

which would override the type of file transfer in the constructor call, but leave everything else intact, as in the site, password, etc.
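The merge that configure() relies on is ordinary Perl list assignment. Here is a minimal standalone sketch of it, using plain hash references rather than an Expect object; the variable names $config and $change are only illustrative.

```perl
use strict;
use warnings;

my $config = { 'type' => 'bin', 'site' => 'ftp.cs.colorado.edu', 'objtype' => 'ftp' };
my $change = { 'type' => 'ascii' };

# Later pairs win when a hash is built from a list, so %$change overrides
# matching keys and every other key of %$config survives.
%$config = (%$config, %$change);

print join(', ', map { "$_=$config->{$_}" } sort keys %$config), "\n";
```

The order of the two hashes on the right-hand side matters: swapping them would let the old settings override the new ones.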

With this convenience out of the way, all we have to do is write the three major functions, namely set, reset, and execute. Then, of course, fill in the private functions.

Listing 18.17 set(), reset(), execute()

49 sub set

50 {

51 my ($self, @options) = @_;

52

53 my $opt = $self->{'config'};

54 if ($opt->{'objtype'} eq 'telnet')

55 {

56 $self->_telnetset(\@options);

57 }

58 else

59 {

60 $self->_ftpset(\@options);

61 }

62 }

63

64 sub reset

65 {

66 my ($self, @options) = @_;

67 if ($self->{'config'}->{'objtype'} eq 'telnet')

68 {

69 $self->_telnetset(\@options, 'RESET');

70 }

71 else

72 {

73 $self->_ftpset(\@options, 'RESET');

74 }

75 }

76

77 sub execute

78 {

79 my ($self) = @_;

80 my $opt = $self->{'config'};

81 if ($opt->{'objtype'} eq 'telnet')

82 {

83 $self->_genTelnetCode();

84 }

85 else

86 {

87 $self->_genFtpCode();

88 }

89 $self->_execCode();

90 }

set and reset are simply more ways of setting attributes of the object. If the object is of type 'telnet', then we call the private function _telnetset(), and if the type is 'ftp' we call _ftpset().

execute is the one function that actually does anything useful! Again, if the object type is 'telnet', we call _genTelnetCode(), and if the object type is 'ftp' we call _genFtpCode(), which generates the code that we are to run. And line #89, _execCode() actually executes this code.

Hence, out of the 90 lines of code that we have written so far, only one line, line #89, even alludes to doing anything. We haven't yet reused the bulk of the old code!

This is what we are talking about when we talk about the overhead of programming objects. Although objects themselves are powerful, in programming them, one tends to go through a lot of this interface stuff. Interface, although it doesn't do anything directly, will become really important when you come to actually using your objects.

Notice another thing. We are doing quite a bit of programming like:

sub XXXXX

{

my ($self) = @_;

if ($self->{'config'}->{'objtype'} eq 'ftp') { $self->_doFTP(); } else { $self->_doTelnet(); }

}

If you see this pattern in your code (if something is of type ftp, then do this, otherwise do something else), this is a good clue that you should be using inheritance to split out the code into more manageable chunks.

Why? Because, when you add more and more pieces to this code (say you added a mail module, which automatically read your mail for you through expect) then you are not going to want to be constrained by saying set and execute, since these functions don't really fit with Expect::mail().
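As a rough preview of that idea, here is a sketch of how the type test can disappear once each type becomes its own subclass. The package names (Sketch::Base, Sketch::Ftp, Sketch::Telnet) and the method _gen_code() are hypothetical and are not the classes developed in this book; the point is only that the method lookup replaces the if/else.

```perl
use strict;
use warnings;

package Sketch::Base;

sub new
{
    my ($class, %args) = @_;
    return bless { %args }, $class;
}

# Shared driver: no test on the type, the subclass supplies _gen_code().
sub execute
{
    my ($self) = @_;
    return $self->_gen_code();
}

package Sketch::Ftp;
our @ISA = ('Sketch::Base');
sub _gen_code { return "ftp code" }

package Sketch::Telnet;
our @ISA = ('Sketch::Base');
sub _gen_code { return "telnet code" }

package main;

my @objects = (Sketch::Ftp->new(), Sketch::Telnet->new());
my @results = map { $_->execute() } @objects;
print "@results\n";
```

Adding a third type then means adding a third package, without touching execute() or any of the existing branches.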

Anyway, we shall go through all of this in chapter 19, when we actually split this out via inheritance. For now, all we have to do is 'fill in the blanks', for this particular module. First, the set private functions:

Listing 18.18 _telnetset(), _ftpset()

91

92 sub _telnetset

93 {

94 my ($self, $options, $flag) = @_;

95 my $opt = $self->{'config'};

96 if (defined($flag))

97 {

98 $opt->{'commands'} = [];

99 }

100 else

101 {

102 push (@{$opt->{'commands'}}, @$options);

103 }

104 }

105

106 sub _ftpset

107 {

108 my ($self, $options, $flag) = @_;

109 my $opt = $self->{'config'};

110 if (defined($flag))

111 {

112 $opt->{'files'} = [];

113 }

114 else

115 {

116 push (@{$opt->{'files'}}, @$options);

117 }

118 }

119

These actually set the internal attributes for us: telnet has a list of commands, which are set in line #102, and ftp has a list of files, which are set in line #116. Now, we turn to the validate function, which we used in the constructor:

Listing 18.19 _validate()

120 sub _validate

121 {

122 my ($self) = @_;

123 my $config = $self->{config};

124 my $type = $config->{objtype};

125 my $legals = $_legalTypes->{$type};

126

127 my (@errors);

128

129 if (!defined ($_legalTypes->{$type}))

130 {

131 push (@errors, "A Type of :$type: isn't legal!\n");

132 }

133 else

134 {

135 my $key;

136 foreach $key (keys(%$config))

137 {

138 if ((!defined ($legals->{$key})) && (!$config->{'ignore'}))

139 {

140 push (@errors, "The key :$key: isn't legal!\n");

141 }

142 }

143 }

144 confess "@errors" if (@errors);

145 }

146

We have seen this type of function before in the LogObject up above; we simply go through each of the keys in the configuration, check it against the keys that we know are legal, push the errors that we find onto a stack, and then confess them if we find any, so the user can fix the code (line #144). As a sop to backwards compatibility we make it so that we can pass a special config flag, 'ignore', to override these legality checks.

The _fill() function comes directly out of ftp.p and telnet.p. Remember, we had a bunch of defaults in the ftp and telnet scripts. We don't want these defaults to go to waste, so we 'cut and paste' them into _fill():

Listing 18.20 _fill()

147 sub _fill

148 {

149 my ($self) = @_;

150 my $opt = $self->{'config'};

151

152 if ($opt->{'objtype'} eq 'ftp')

153 {

154 $opt->{'expect'}= $opt->{'expect'} || $ENV{'EXPECT_EXEC'} ||

155 "expect -i";

156

157 $opt->{'user'} = $opt->{'user'}|| $ENV{'EXPECT_USER'}||

158 "anonymous";

159 $opt->{'pass'} = $opt->{'pass'}||$ENV{'EXPECT_PASS'}||"me@";

160 $opt->{'ftp'} = $opt->{'ftp'} ||$ENV{'FTP'} ||"ftp";

161 $opt->{'type'} = $opt->{'type'}||$ENV{'EXPECT_TYPE'}||"bin";

162 $opt->{'site'} = $opt->{'site'}||$ENV{'EXPECT_FTP_SITE'} ||

163 confess "You need to define a site to go to!\n";

164 $opt->{'files'} = $opt->{'files'} || [];

165 }

166 elsif ($opt->{'objtype'} eq 'telnet')

167 {

168

169 $opt->{'site'} =

170 (defined($opt->{'site'}))? $opt->{'site'} :

171 (defined $ENV{'EXPECT_TELNET_SITE'})?$ENV{'EXPECT_TELNET_SITE'}:

172 confess "You need to define a site to go to!\n";

173

174 $opt->{'telnet'} = $opt->{'telnet'}||$ENV{'TELNET'}|| "telnet";

175 $opt->{'expect'} = $opt->{'expect'}|| $ENV{'EXPECT_EXEC'}

176 ||"expect -i";

177

178 $opt->{'user'} = $opt->{'user'} || $ENV{'TELNET_USER'} ||

179 $ENV{'USERNAME'} || getpwuid($<) ||

180 confess "Couldn't get a user!\n";

181

182 $opt->{'passwordprompt'} = $opt->{'passwordprompt'} ||

183 confess "You need to define a password prompt!\n";

184

185 $opt->{'userprompt'} = $opt->{'userprompt'} ||

186 confess "You need to define a userprompt!\n";

187 $opt->{'commands'} = $opt->{'commands'} || [];

188 $opt->{'pass'} = $opt->{'pass'} || 'INTERACT_PASSWORD';

189 }

190 }

With the exception of line #188 (which we shall put to good use later), this is exactly the same code as what is in ftp.p and telnet.p, only munged a little bit to fit. (We are taking time to rethink our password policy: before, setting the password had to be done before filling in defaults occurred.) We set up an if clause:

if ($opt->{'objtype'} eq 'ftp')

{

# set up ftp options from ftp.p

}

else

{

# set up telnet options from telnet.p

}

and then dump all of the options, defaults, and so forth that we had from ftp.p and telnet.p into the if clause.
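The defaulting idiom that fills most of _fill() is a chain of ||: take the explicit option, else an environment variable, else a hardcoded fallback. The sketch below isolates it; the helper pick_default() and the SKETCH_* environment variable names are invented for this example and do not appear in ftp.p or telnet.p.

```perl
use strict;
use warnings;

# Hypothetical helper: first true value among option, environment, fallback.
sub pick_default
{
    my ($opt, $key, $env_name, $fallback) = @_;
    return $opt->{$key} || $ENV{$env_name} || $fallback;
}

# Set up a known environment for the demonstration.
$ENV{'SKETCH_USER'} = 'from_env';
delete $ENV{'SKETCH_SITE'};
delete $ENV{'SKETCH_TYPE'};

my $opt  = { 'site' => 'ftp.example.com' };
my $site = pick_default($opt, 'site', 'SKETCH_SITE', 'localhost');  # option wins
my $user = pick_default($opt, 'user', 'SKETCH_USER', 'anonymous');  # environment wins
my $type = pick_default($opt, 'type', 'SKETCH_TYPE', 'bin');        # fallback wins

print "$site $user $type\n";
```

One caveat of this idiom is that || tests for truth, so an explicit value of '' or 0 is treated as if it had not been given. The telnet branch of _fill() avoids this for 'site' by testing with defined() instead.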

Finally, we have the _genFtpCode(), _genTelnetCode(), and _execCode() functions. These are directly lifted, in entirety, from the ftp.p and telnet.p. With minor changes, we make them work with the module:

Listing 18.21 _genFtpCode()

191 sub _genFtpCode

192 {

193 my ($self) = @_;

194 my $opt = $self->{'config'};

195

196 my $line = '';

197 my $key;

# ...........

# lines 198 through 281 directly taken from ftp.p-See chapter 12,

# page XXXXXXXX, or the CD that comes with this book for

# more detail; Makes ftp code and stuffs it in $line.

# ...........

283 $self->{'code'} = $line;

284 }

We cut out the lines in question, merely to emphasize the changes that we need to make the _genFtpCode function work with the Expect class. (If you want to see the _genFtpCode() function turn to chapter 12 or go to the CD associated with this book.)

The main thing to notice here is that the hash $opt, which we used in the actual ftp.p code and which came from the command line (and ftp file) is now $self->{'config'} which is set by the constructor.

To make things easy, we simply alias $self->{'config'} to $opt, so as to use the code without modification. (If you can do this, you will save a lot of testing time, since the borrowed code has already been shown to work.) We now proceed to do the same thing with telnet:

Listing 18.22 _genTelnetCode()

285

286 sub _genTelnetCode

287 {

288 my ($self) = @_;

289 my $opt = $self->{'config'};

290

# ...........

# lines 291 through 352 directly taken from telnet.p - Again,

# see chapter 12 page XXXXXXXX, or the CD that comes with

# this book for more detail;

# ...........

 

353 ;

354 $self->{'code'} = $line;

355 }

Again, we make the code out of the $opt hash, stuff the code into $line, and then into the object variable $self->{'code'}. Since this code is a legal string for expect, when someone says $obj->execute() we then actually execute it via _execCode():

Listing 18.23 _execCode()

356

357 sub _execCode

358 {

359 my ($self) = @_;

360

361 my $opt = $self->{'config'};

362 my $exec = ($opt->{'expect'});

363

364 if ($opt->{'pass'} eq 'INTERACT_PASSWORD')

365 {

366 print "Enter password for $opt->{'user'}\n";

367 ReadMode 2; chomp($opt->{'pass'} = <STDIN>); ReadMode 0;

368 $self->{'code'} =~ s"INTERACT_PASSWORD"$opt->{'pass'}"sg;

369 }

370

371 if ($opt->{'debug'})

372 {

373 print "Generated code:\n$self->{'code'}\n";

374 }

375 else

376 {

377 open (EXPECT, "| $exec");

378 print EXPECT $self->{'code'};

379 close(EXPECT);

380 }

381 $self->{'code'} = '';

382 }

Again, we are being a little sly here. Instead of directly copying over the function from either ftp.p or telnet.p, we take the chance to better handle passwords.

We give the user the chance to have an interactive password, by setting $opt->{'pass'} to 'INTERACT_PASSWORD' (which _fill() does by default when no password is given). When _execCode() sees this, it knows that the user needs to give the class a password. It then 'hands over' control to the keyboard in lines 364-369, getting the user to type in a password.

But there is one problem: by this time, we have generated the expect code, and the expect code has - as a password inside it - a line that looks like:

expect {

#......

"Password:", send "INTERACT_PASSWORD\r".

#......

}

If we simply executed this code as is, it would fail. So what to do? Well, there are certain advantages of linking the getting of the password with the actual time the code is executed. This ensures that we have the password inside the code for the smallest amount of time possible, and makes the code more secure, for instance. (After all, we set the code back to '' after completion.)

We again take advantage of Perl's flexibility. In line #368, we simply substitute the string INTERACT_PASSWORD with the string we have received from the user. This turns the code into something that looks like:

expect {

#......

"Password:", send "my_password \r".

#......

}

where 'my_password' is what the user has typed on the keyboard.
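The late substitution can be tried on its own. The sketch below is not the Expect.pm code: the generated text is a shortened stand-in for what the code generators produce, and the password is a hardcoded string with a trailing newline that simulates a line read from the keyboard.

```perl
use strict;
use warnings;

# Stand-in for the generated expect code, with the placeholder still in it.
my $code = qq{expect "Password:"\nsend "INTERACT_PASSWORD\\r"\n};

# Simulated keyboard input; a real line from STDIN also ends in a newline.
my $pass = "s3cret\n";
chomp($pass);    # without this, the newline would end up inside the code

# Work on a copy, so that the template itself never holds the password.
(my $ready = $code) =~ s/INTERACT_PASSWORD/$pass/g;

print $ready;
```

The chomp matters: a password read with a plain line read keeps its newline, and that newline would be pasted into the middle of the send command.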

This ends the first part of our code replacement exercise. Out of 368 lines, we have re-used 203: over half of our old code. On the other hand, 165 are new lines of code, the price of making it a class.

But for now, until you get used to Perl's object syntax, try to keep things as simple as possible, and not to go into inheritance too deeply. Although this code is in some ways more difficult to maintain, it is also a simpler design than splitting up your modules via inheritance, and you should always feel comfortable with the syntax of something before moving on to the next level.

Example #2: Configuration Files

Let's continue with the expect.p and ftp.p scripts, and see what other pieces of information we can abstract out. Remember, in ftp.p and expect.p, we had the option 'execfile' specified in GetOptions:

GetOptions

(

$opt, "--site:s", "--execfile:s", "--expect:s", "--pass:s",

"--telnet:s", "--debug"

);

This option let us take our input from a file, which seems a good thing to do. Not only does it seem a good thing to do, it seems like a common thing to do, something that we would like to do again.

Therefore, we decide that this is an especially good thing to abstract out: a reusable object that lets us take options from a file. Let's look at what we have so far. In line #34 of telnet.p we have

34 $fileopt = (defined ($opt->{'execfile'}))? _parseFile($opt) : {};

which defines $fileopt to be either empty or the output of _parseFile(). The file that we want to parse looks like:

site: rock_lobster

 

"ogin:", peschko

"assword:", INTERACTIVE_PASSWORD

"epeschko*", tcsh

"epeschko*", ls

"epeschko*", exit

"epeschko*", exit

where we have two categories of entries:

1) <flag>: <value>

2) a series of comma separated values, which are by themselves on a line.

Our strategy will be to reuse the code in _parseFile(), writing a new class to substitute for it. Let's call this object 'Config::CSV' indicating that it is a Config file class of subtype CSV. We will define two functions that are accessible to the outside world:

get()

new()

new() will parse our configuration file for us, returning legal values. get() will be our interface to actually get those values out of the hash.

Here goes. First we define the headers and the constructor:

Listing 18.24 Config::CSV header and constructor

1 package Config::CSV;

2

3 use FileHandle;

4 use strict;

5 use Text::ParseWords;

6 use Carp;

7 use String::Edit;

8

9 my $_legalTypes =

10 {

11 'special' => 1, 'firstline' => 1,

12 'secondline' => 1, 'thirdline' => 1,

13 'elementsperline' => 1

14 };

15

16 my $_config = { 'special' => 'specialelem' };

17

18 sub new

19 {

20 my ($type, $file, $config) = @_;

21 my $fh = new FileHandle("$file") || confess "Couldn't open $file!\n";

22

23 my $self = bless {}, $type;

24

25 $self->{fh} = $fh;

26 $self->{config} = { %$_config }; # copy, so the shared defaults are never modified

27 %{$self->{config}} = (%$_config, %$config) if (defined ($config));

28

29 $self->_parse();

30 $self;

31 }

32

A couple of notes here. First, just for fun (and for the sake of being quick) we define a default configuration hash $_config, rather than defining _validate() and _fill() functions like we did before in the Expect object.

This was partly just to have variety; but it also lets you not commit yourself until you have used the object quite a bit. It is simpler to be flexible with a couple of statements rather than writing a complex validation routine and having to rewrite it later.

Second, line 29 (_parse()) does the actual parsing, and line 30 returns the object to the caller.

Now, let's look at what our usage is going to be, given the parameters. If we say something like:

my ($config) = new Config::CSV("configfile",

{

'special' => 'commands',

'firstline' => ['userprompt','user'],

'secondline'=> ['passwordprompt','password']

}

);

this is saying "OK, let's open the file named "configfile", and then, from there, take all the flags (lines like "site: my.site.edu") and then process them." Hence our target transformation looks something like Figure 18.3

Figure 18.3

Target transformation
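Written out as a Perl data structure, the target of the transformation would look roughly like the following. This is a reconstruction from the sample configuration file and the constructor call shown above, following the parsing rules described in this section; it is not a copy of the original figure.

```perl
use strict;
use warnings;

# What get() should hand back for the sample file, given
# 'firstline', 'secondline' and 'special' => 'commands'.
my $values =
{
    'site'           => 'rock_lobster',            # from the flag line
    'userprompt'     => 'ogin:',                   # first comma-separated line
    'user'           => 'peschko',
    'passwordprompt' => 'assword:',                # second comma-separated line
    'password'       => 'INTERACTIVE_PASSWORD',
    'commands'       =>                            # every remaining line
    [
        [ 'epeschko*', 'tcsh' ],
        [ 'epeschko*', 'ls'   ],
        [ 'epeschko*', 'exit' ],
        [ 'epeschko*', 'exit' ],
    ],
};

print "second command: $values->{'commands'}->[1]->[1]\n";
```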

The subroutine that is going to do this for us is called _parse(). For now, we don't worry how to implement it; we just go on to do our other, non-private function, get().

Listing 18.25 get()

33

34 sub get

35 {

36 my ($self, $elementName) = @_;

37 my $values = $self->{values};

38 if (defined ($elementName))

39 {

40 if (!defined ($values->{$elementName}))

41 {

42 print "Warning!!! :$elementName: not defined!\n";

43 }

44 else

45 {

46 return($values->{$elementName});

47 }

48 }

49 else

50 {

51 return($values);

52 }

53 }

54

55

Here, again, this is pretty simple. All we do is assume that we have already parsed the file, and stored the values in $self->{values}. To make it even easier (and compatible with how telnet.p and ftp.p worked before) we give get() a special case: it returns all of the configuration values if no parameter is given. Hence:

my $user = $config->get('user');

would return the user name that was in a config file whose attached object was $config, and

my $vals = $config->get();

would return the whole thing. The difficult bit, the parsing, is left until last:

Listing 18.26 _parse()

56 sub _parse

57 {

58 my ($self) = @_;

59 my $return = {};

60 $self->{values} = $return;

61

62 my $fh = $self->{fh};

63 my $opt = $self->{config};

64 my @lines = <$fh>;

65

66

67 my ($line, $xx, $keep) = ('', 0, '');

68

69 foreach $line (@lines)

70 {

71 chomp($line);

72 next if (!$line); # ignore blanks

73

74 my @array;

75 if (@array = _isaFlag($line))

76 {

77 $return->{$array[0]} = $array[1];

78 }

79 else

80 {

81 if (defined ($opt->{firstline}) && ($xx == 0))

82 {

83 my (@elements) = $self->_parseCommand($line);

84 my $keys = $opt->{firstline};

85 $self->_set($keys, \@elements);

86 }

87 elsif (defined ($opt->{secondline}) && ($xx == 1))

88 {

89 my (@elements) = $self->_parseCommand($line);

90 my $keys = $opt->{secondline};

91 $self->_set($keys, \@elements);

92 }

93 elsif (defined ($opt->{thirdline}) && ($xx == 2))

94 {

95 my (@elements) = $self->_parseCommand($line);

96 my $keys = $opt->{thirdline};

97 $self->_set($keys, \@elements);

98 }

99 else

100 {

101 my (@elements) = $self->_parseCommand($line);

102 my $key = $opt->{'special'};

103 push (@{$return->{$key}}, \@elements);

104 }

105 $xx++;

106 }

107 }

108 return($return);

109 }

Bleah. Not pretty. But neither is our problem, so that is OK. The bolded lines are either wholly or partially borrowed from telnet.p, so it's not as bad as it looks. Basically, @lines contains all of the lines in the configuration file, and we go through each of them, one by one in the foreach loop, just like before.

However, whereas before we hardcoded the values, like:

my ($passwordprompt, $password ) = _parseCommand($line);

$opt->{'passwordprompt'} = $passwordprompt;

in our original script, we generalize this to fill our hash based on the config hash given in $self->{'config'}. If the line was

"epeschko*", ls

and $opt->{'special'} was equal to 'commands', then:

my $key = $opt->{'special'};

makes $key equal to 'commands', and

push (@{$return->{$key}}, \@elements);

is equal to

push (@{$return->{'commands'}}, [ "epeschko*", "ls" ]);

Or, in other words, our new code returns the same results as it did when stuffed at the bottom of telnet.p! It is simply more general than the code before. It has to be; we are making a general module here, and we can't have hard-coded elements.
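The step from hardcoded keys to keys supplied as data can be shown in a few lines. This is only a sketch of the idea, using a hash slice on a plain hash with made-up sample values; the module itself does the equivalent with a loop.

```perl
use strict;
use warnings;

my %values;

# The key names arrive as data (as they would from 'secondline'),
# so nothing about 'passwordprompt' is hardcoded in the parsing logic.
my @keys     = ('passwordprompt', 'password');
my @elements = ('assword:', 'secret');
@values{@keys} = @elements;    # hash slice: pair keys with elements by position

# Lines with no named position are collected under the 'special' key.
my $special = 'commands';
push(@{$values{$special}}, [ 'epeschko*', 'ls' ]);

print "$values{'passwordprompt'} / $values{'commands'}->[0]->[1]\n";
```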

All that is left to do is do the actual parsing; the data structure that we have made is in place, and we have to fill in the blanks. Bolded code, again, is code that we have reused:

Listing 18.27 _set(), _parseCommand(), _isaFlag()

110

111 sub _set

112 {

113 my ($self, $keys, $values) = @_;

114 my $return = $self->{values};

115 my $xx = 0;

116 for ($xx = 0; $xx < @$values; $xx++)

117 {

118 $return->{$keys->[$xx]} = $values->[$xx];

119 }

120 }

121

122 sub _parseCommand

123 {

124 my ($self,$line) = @_;

125 my $keep = 0;

126

127 my (@args) = quotewords('\s*,\s*', $keep, $line);

128 if (defined ($self->{config}->{elementsperline}) &&

129 @args != $self->{config}->{elementsperline})

130 {

131 print "Warning!!!! Number of arguments in line :$line: is not correct!

132 Should be $self->{config}->{elementsperline}\n";

133 }

134 @args = trim(@args);

135 return(@args);

136 }

137

138 sub _isaFlag

139 {

140 my ($line) = @_;

141 my $keep = 0;

142 my (@array) = quotewords('\s*:\s*', $keep, $line);

143 @array = trim(@array);

144 if (@array == 2)

145 {

146 return (@array);

147 }

148 else

149 {

150 return(());

151 }

152 }

153 1;
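As a quick check on the two parsing helpers, the sketch below runs quotewords (from the standard Text::ParseWords module) over a command line and a flag line in our file format. The trim_all function is a hypothetical local stand-in for the trim call in the listing, and it is assumed to do nothing more than strip leading and trailing whitespace. Note the single quotes around the delimiter pattern: inside double quotes, Perl would reduce \s to a plain s before quotewords ever saw it.

```perl
use strict;
use warnings;
use Text::ParseWords;    # provides quotewords()

# Hypothetical stand-in for trim(): strip leading and trailing whitespace.
sub trim_all
{
    my (@items) = @_;
    foreach my $item (@items) { $item =~ s/^\s+|\s+$//g; }
    return (@items);
}

my $keep = 0;    # 0 means: drop the quote characters from the result

# A command line splits on commas, a flag line splits on the colon.
my @command = trim_all(quotewords('\s*,\s*', $keep, '"epeschko*", ls'));
my @flag    = trim_all(quotewords('\s*:\s*', $keep, 'site: rocky_horror'));

print join('|', @command), "\n";
print join('|', @flag), "\n";
```

Each call should come back with two elements, which is exactly the condition that _isaFlag tests for.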

Example #3: Rewriting telnet.p

With these two objects in place, we are now in a position to rewrite the telnet.p from Chapter 12. Considering that our original telnet.p was 216 lines long, let's see exactly how much we can improve on it.

As a simple preamble, let's consider our use statements:

use strict;

use FileHandle; # inside Config::CSV

use Term::ReadKey; # inside Expect

use Getopt::Long;

use Text::ParseWords; # inside Config::CSV

use String::Edit; # inside Config::CSV

We still need strict and Getopt::Long, but FileHandle, Term::ReadKey, Text::ParseWords, and String::Edit are no longer used directly; they now live inside either Config::CSV or Expect.

Now our headers look like:

use strict;

use Getopt::Long;

use Config::CSV;

use Expect;

We continue to do this for the whole code, marking up where it makes sense to put stuff in the objects that we have written. The result looks something like Figure 18.4:

Figure 18.4

Marked up code -- where things go.

Having had a preliminary look at the code, we are ready to draw up the new code. First, we make a stub file, one that shows all of the new modules that we are going to be using:

Listing 18.28 telnetobj.p

1 use strict;

2 use Getopt::Long;

3 use Config::CSV;

4 use Expect;

5 use Data::Dumper;

6

Then we test the combination with a 'perl -c':

prompt% perl -c telnetobj.p

and it compiles (assuming that all of these modules are in a directory listed in @INC).

Second, we look at what we need to change and then add the changes piece by piece. The first thing that telnet.p does is take its arguments from a file (given by the flag '--execfile'), and then give the user the chance to override these arguments from the command line.

Well, since we made 'Config::CSV', the 'arguments taken from a file' turns from a subroutine to a method call:

Listing 18.29 telnetobj.p (continued)

7

8

9 main();

10

11 sub main

12 {

13 my $opt = {};

14 GetOptions

15 (

16 $opt, "--site:s", "--execfile:s", "--expect:s", "--pass:s",

17 "--telnet:s", "--debug"

18 );

19

20 if ($opt->{'execfile'})

21 {

22 my $filecfg = new Config::CSV

23 (

24 $opt->{'execfile'},

25 {

26 'firstline' =>['userprompt', 'user'],

27 'secondline'=>['passwordprompt','pass'],

28 'special' => 'commands'

29 }

30 );

31 my $fileopt = $filecfg->get();

32 %$opt = (%$fileopt, %$opt);

33 }

(Ignore lines 32-33 for now. They belong to the next section.)

The key lines here are in bold. Our objective is to get the $fileopt hash, which holds the parsed contents of the file named in $opt->{'execfile'} (lines #22-#31), to look right. Now we make a simple test file:

site: rocky_horror

"ogin:", epeschko

"assword:", INTERACT_PASSWORD

"epeschko*", tcsh

"epeschko*", ls

"epeschko*", exit

"epeschko*", exit

and then add the line:

print Dumper($fileopt);

to our script. When we say:

prompt% telnetobj.p -execfile telnetfile

we hope to have a print out that looks something like Figure 18.5:

Figure 18.5

Correspondence between file and object

We then cycle through, debugging as we go, until we get this output.
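As a rough sketch of the shape we are aiming for, the hash below is built by hand from the test file. It assumes the key names given in lines #26-#28 and assumes that the site: flag line ends up under the key 'site'. It illustrates the target structure; it is not captured output from Config::CSV.

```perl
use strict;
use warnings;
use Data::Dumper;

# Hand-built illustration of the target structure (not real module output).
my $expected = {
    'site'           => 'rocky_horror',
    'userprompt'     => 'ogin:',
    'user'           => 'epeschko',
    'passwordprompt' => 'assword:',
    'pass'           => 'INTERACT_PASSWORD',
    'commands'       => [
        [ 'epeschko*', 'tcsh' ],
        [ 'epeschko*', 'ls' ],
        [ 'epeschko*', 'exit' ],
        [ 'epeschko*', 'exit' ],
    ],
};

$Data::Dumper::Sortkeys = 1;    # stable key order makes two dumps easy to compare
print Dumper($expected);
```

Dumping a hand-built hash like this next to the real $fileopt makes it easy to spot which key or which nesting level is off while debugging.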

Finally, we add the merge of the command line and the file that our old file had, as well as the call to Expect:

Listing 18.30 telnetobj.p (end)

33 %$opt = (%$fileopt, %$opt);

34 }

35

36 my $expectobj = new Expect({ 'objtype' => 'telnet', 'ignore' => 1, %$opt });

37 $expectobj->execute;

38 }

That is it. I admit it is difficult to believe that this 38-line program (which could probably be condensed more) does the same job as the 216-line behemoth that we wrote before! Yet I hope it hasn't been too much of a surprise.

The key here is in the bolded %$opt. Line #33 merges the options from the command line with the options from the file named on the command line. But (being sneaky as always) we have designed it so that %$opt is compatible with the Expect object.
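The merge relies on a standard Perl idiom: when two hashes are flattened into one list and assigned to a hash, a key that appears twice keeps the value listed last. The sketch below uses made-up option values to show that putting the command-line hash last lets the command line override the file.

```perl
use strict;
use warnings;

# Made-up values for illustration only.
my $fileopt = { 'site' => 'rocky_horror', 'user'  => 'epeschko' };
my $opt     = { 'site' => 'other_host',   'debug' => 1 };

# Later pairs win, so %$opt (command line) overrides %$fileopt (file).
my %merged = (%$fileopt, %$opt);

print "$merged{'site'}\n";    # prints "other_host"
print "$merged{'user'}\n";    # prints "epeschko"
```

Swapping the two hashes inside the parentheses would reverse the priority and let the file win instead.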

Think about this for a second: each one of the entries in the hash shown in Figure 18.X has a use in Expect, the module. It should; we were the ones who typed the file!

In a sense then, there is a usage relationship between Config::CSV and Expect here. Config::CSV has the job of sifting the input (which can be quite complicated), and then Expect knows that this input has been 'validated'. Now, it can do its job, which is to actually run the telnet job!

This did not come together totally seamlessly. I had to do testing, yes, and the module Data::Dumper was invaluable for knowing exactly what was going on.

Summary of Turning Procedural Code into Object Code

Sometimes object oriented programming is the most fun, and the most worthwhile, when you are searching your old code (or somebody else's code) for objects. Much of the time, it cannot be done as quickly as in the examples above, but there are times when you can glean diamonds out of your own code, and then ease them into existing code in a very seamless way.

The last example above may be a bit extreme, going from a 216-line client to a 38-line one, but it is by no means the only time that this has happened to me and to programmers I know. The trick to doing this is:

1) A heavy analysis phase: understanding what you have and documenting in detail how your code works (for example, with a calling tree).

2) Actually ripping your code apart for scraps, and pigeonholing what you see into several categories.

3) Designing your new interface with the old one in mind.

4) Coding the new interface, slipping it in, and testing it a step at a time.

You will get a lot out of doing this recycling job. You will learn more about your own thinking processes if you strip-mine your code and build it back in a better way. I am constantly doing it; if a module becomes unwieldy, I always sell it for parts. Often this is the only way to learn, and I recommend it highly.

In this chapter we have 1) gone through some of the common methods, 2) learned some ways to decide between modules and objects, and 3) learned to strip-mine old code for objects.

In the next chapter, let's build a real object from scratch. We will consider everything: problem recognition, available resources, design, and implementation -- everything.

HTML conversions by Mega Space.

This page updated on October 14, 1997 by Webmaster.

Computing McGraw-Hill is an imprint of the McGraw-Hill Professional Book Group.
