Classes, Objects, and Ties (Perl Cookbook, 2nd Edition)

13.0. Introduction

Although Perl was not initially conceived of as an object-oriented language, within a few years of its initial release, complete support for object-oriented programming had been added. As usual, Perl doesn't try to enforce one true style, but embraces many. This helps more people do their job the way they want to do it.

Unlike in Java, where programs are instances of objects, you don't have to use objects to write Perl programs. If you want to, though, you can write Perl programs that use nearly every weapon in the object-oriented arsenal. Perl supports classes and objects, single and multiple inheritance, instance methods and class methods, access to overridden methods, constructors and destructors, operator overloading, proxy methods through autoloading, delegation, a rooted hierarchy for all objects, and two levels of garbage collection.

You can use as many or as few object-oriented techniques as you want and need. Ties are the only part of Perl where you must use object orientation. And even then, only the module implementor need be aware of this; the casual user gets to remain blissfully unaware of the internal mechanics. Ties, discussed in Recipe 13.15, let you transparently intercept access to a variable. For example, you can use ties to create hashes that support lookups by key or value instead of just by key.
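To give a feel for what intercepting access to a variable means, here is a minimal sketch of a tied scalar. The class name UpperCase and its behavior are our own invention for illustration; Recipe 13.15 covers ties properly.

```perl
use strict;
use warnings;

# Hypothetical class: a scalar that upper-cases whatever is stored in it.
package UpperCase;

sub TIESCALAR {                      # called by tie()
    my ($class) = @_;
    my $value = "";
    return bless \$value, $class;    # the object is a blessed scalar reference
}

sub FETCH {                          # called whenever the variable is read
    my ($self) = @_;
    return $$self;
}

sub STORE {                          # called whenever the variable is assigned
    my ($self, $new) = @_;
    $$self = uc $new;
}

package main;

tie my $shout, "UpperCase";
$shout = "hello";                    # goes through UpperCase::STORE
print "$shout\n";                    # goes through UpperCase::FETCH; prints HELLO
```

The code that uses $shout never mentions a method; only the implementor of UpperCase has to think in terms of objects.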

13.0.1. Under the Hood

If you ask 10 people what object orientation is, you'll get 10 different answers. People bandy about terms like abstraction and encapsulation, trying to isolate the basic units of object-oriented programming languages and give them big names to write papers and books about. Not all object-oriented languages offer the same features, yet they are still deemed object-oriented. This, of course, produces more papers and books.

We follow the nomenclature used in Perl's documentation, the perlobj(1) manpage, and Chapter 12 of Programming Perl. An object is a variable that belongs to a class. Methods are functions associated with a class. In Perl, a class is a package—and usually a module. An object is a reference to something associated with a class.

Once associated with a class, something is said to be blessed into that class. There's nothing ecclesiastical or spooky going on here. Blessing merely associates a referent with a class, and this is done with the bless function, which takes one or two arguments. The first is a reference to the thing you want associated with the class; the second is the package with which to make that association.

$object = {  };                          # hash reference
bless($object, "Data::Encoder");    # bless $object into Data::Encoder class
bless($object);                     # bless $object into current package

The class name is the package name (Data::Encoder in this example). Because classes are modules (usually), the code for the Data::Encoder class resides in the file Data/Encoder.pm. As with traditional modules, the directory structure is purely for convenience; it implies nothing about inheritance, variable sharing, or anything else. Unlike a traditional module, though, an object module seldom if ever uses the Exporter. Access should be through methods only, not imported functions or variables.

Once an object has been blessed, calling the ref function on its reference returns the name of its class instead of the fundamental type of referent:

$obj = [3,5];
print ref($obj), " ", $obj->[1], "\n";
bless($obj, "Human::Cannibal");
print ref($obj), " ", $obj->[1], "\n";

ARRAY 5
Human::Cannibal 5

As you can see, you can still dereference a reference once it has been blessed. Most frequently, objects are implemented as blessed hash references. You can use any kind of reference you want, but hash references are the most flexible because they allow arbitrarily named data fields in an object.

$obj->{Stomach} = "Empty";   # directly accessing an object's contents
$obj->{NAME}    = "Thag";
# uppercase field name to make it stand out (optional)
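If you need the underlying type of a blessed referent rather than its class, the standard Scalar::Util module can report both. The following is a small sketch; the variable names are ours.

```perl
use strict;
use warnings;
use Scalar::Util qw(blessed reftype);

my $plain = { NAME => "Thag" };                            # ordinary hash reference
my $obj   = bless { NAME => "Thag" }, "Human::Cannibal";   # blessed hash reference

my $plain_type = ref($plain);        # "HASH"
my $class      = ref($obj);          # "Human::Cannibal"
my $underlying = reftype($obj);      # "HASH", the type hidden by the blessing
my $is_object  = defined blessed($plain) ? "yes" : "no";   # "no"

print "$plain_type $class $underlying $is_object\n";
# prints: HASH Human::Cannibal HASH no
```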

Although Perl permits it, it's considered poor form for code outside the class to directly access the contents of an object. The point of objects, everyone agrees, is to give you an abstract something with mediated access through designated methods. This lets the maintainer of the class change its implementation without needing to change all application code that uses the class.
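One common way to mediate access is a combined getter and setter method. This is only a sketch under our own assumptions: the Person class and its NAME field are hypothetical.

```perl
use strict;
use warnings;

package Person;                      # hypothetical class

sub new {
    my ($class, %args) = @_;
    my $self = { NAME => $args{name} };
    return bless $self, $class;
}

# Accessor: returns the name, and sets it first if given an argument.
sub name {
    my $self = shift;
    $self->{NAME} = shift if @_;
    return $self->{NAME};
}

package main;

my $p = Person->new(name => "Thag");
$p->name("Grunt");                   # set through the method, not $p->{NAME}
print $p->name, "\n";                # prints Grunt
```

If Person later stores its data in an array or a closure instead of a hash, callers that stick to name( ) keep working unchanged.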

13.0.2. Methods

The whole purpose for blessing—that is, associating a package with a referent—is so that Perl can determine the package namespace in which to find functions when you invoke methods against an object. To invoke a method, use ->. Here, we invoke the encode( ) method of $object with the argument "data" and store the return value in $encoded:

$encoded = $object->encode("data");

The lefthand operand of the -> operator is said to be the method's invocant. Think of the invocant as the entity on whose behalf the method was called. Methods always involve invocants. Here we have an object method because we invoke the method on an object. We can also have class methods where the invocant is a string representing the package—meaning, of course, the class.

$encoded = Data::Encoder->encode("data");

Invoking a method calls the function in the corresponding class, implicitly passing its invocant as the initial argument to that function: a reference for object methods, a string for class methods. It isn't always obvious which of the two invocation types you have, because the invocant could be a variable holding a class name instead of one holding a reference that's been blessed.

$class = "Animal::" . ($aquatic ? "Fish" : "Mammal");
$beastie = $class->create( );

That will sometimes invoke the create method from class Animal::Fish and sometimes invoke the create method from class Animal::Mammal. This might even end up being the same underlying function if those two classes share a common ancestral class. Here you don't know the class until runtime. Recipe 13.8 shows how to invoke a method where the method name isn't determined until runtime.
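Here is a self-contained sketch of that situation. We assume, purely for illustration, that both classes inherit a single create method from a parent class named Animal (inheritance through @ISA is covered later in this introduction).

```perl
use strict;
use warnings;

package Animal;
sub create {
    my $class = shift;               # the invocant: a class name string here
    return bless { KIND => $class }, $class;
}

package Animal::Fish;
our @ISA = ("Animal");               # inherit create from Animal

package Animal::Mammal;
our @ISA = ("Animal");

package main;

my @made;
for my $aquatic (1, 0) {
    my $class   = "Animal::" . ($aquatic ? "Fish" : "Mammal");
    my $beastie = $class->create();  # class chosen at runtime
    push @made, ref($beastie);
}
print "@made\n";                     # prints Animal::Fish Animal::Mammal
```

Both calls run the same underlying function, Animal::create, yet each object ends up in the class named by the invocant.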

Most classes provide constructor methods, which return new objects. Unlike in some object-oriented languages, constructor methods in Perl are not specially named. In fact, you can name them anything you like. C++ programmers have a penchant for naming their Perl constructors new. We recommend that you name your constructors whatever makes sense in the context of the problem you're solving. For example, constructors in the Tk extension to Perl are named after the widgets they create. A less common approach is to export a function with the same name as the class; see Section 13.14.4 in Recipe 13.14 for an example.

A typical constructor used as a class method looks like this:

sub new {
    my $class = shift;
    my $self  = {  };         # allocate new hash for object
    bless($self, $class);
    return $self;
}

Call the constructor with:

$object = Classname->new( );

If there isn't any inheritance or other monkey business working behind the scenes, this is effectively the same as:

$object = Classname::new("Classname");

The new function's first argument here is the name of the class—hence, the package—to bless the new reference into. A constructor should pass that string as the second argument to bless.
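The equivalence is easy to check for yourself. In this sketch the Counter class is a hypothetical stand-in for Classname.

```perl
use strict;
use warnings;

package Counter;                     # hypothetical class

sub new {
    my $class = shift;               # "Counter" in both calls below
    my $self  = { COUNT => 0 };
    return bless($self, $class);
}

package main;

my $via_method   = Counter->new();            # method invocation
my $via_function = Counter::new("Counter");   # plain function call, class passed by hand

print ref($via_method), " ", ref($via_function), "\n";   # prints Counter Counter
```

The function-call form bypasses inheritance entirely, which is why the two are the same only when no inheritance is involved.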

Recipe 13.1 also talks about functions that return blessed references. Constructors don't have to be class methods; it's often useful to have object methods that themselves return new objects, as discussed in Recipe 13.6 and Recipe 13.7.

A destructor is a subroutine that runs when an object's referent is garbage collected, which happens when its internal reference count becomes zero. Because it is invoked implicitly by Perl, unlike a constructor, you have no choice in naming a destructor. You must name your destructor method DESTROY. This method, if it exists, is invoked on an object immediately prior to memory deallocation. Destructors, described in Recipe 13.2, are optional in Perl.
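A short sketch shows when DESTROY runs. The Scratch class and its @log array are hypothetical names of ours, used only to record the order of events.

```perl
use strict;
use warnings;

package Scratch;                     # hypothetical class
our @log;                            # records events in order

sub new {
    my ($class, $name) = @_;
    my $self = bless { NAME => $name }, $class;
    push @log, "created $name";
    return $self;
}

sub DESTROY {                        # the name is fixed; Perl calls it implicitly
    my $self = shift;
    push @log, "destroyed $self->{NAME}";
}

package main;

{
    my $tmp = Scratch->new("scratch");
    push @Scratch::log, "in scope";
}                                    # last reference disappears here
push @Scratch::log, "after scope";

print "$_\n" for @Scratch::log;
# prints: created scratch, in scope, destroyed scratch, after scope (one per line)
```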

Some languages syntactically allow the compiler to restrict access to a class's methods. Perl does not—it allows code to invoke any method of an object. The author of a class should clearly document the public methods (those that may be used), and the user of a class should avoid undocumented (implicitly private) methods.

Perl doesn't distinguish between methods that can be invoked on a class (class methods) and methods that can be invoked on an object (instance methods). If you want a particular method to be invoked as a class method only, do something like this:

use Carp;
sub class_only_method {
    my $class = shift;
    croak "class method invoked on object" if ref $class;
    # more code here
}

If you want to allow a particular method to be invoked as an instance method only, do something like this:

use Carp;
sub instance_only_method {
    my $self = shift;
    croak "instance method invoked on class" unless ref $self;
    # more code here
}

If your code invokes an undefined method on an object, Perl won't complain at compile time, but this will trigger an exception at runtime. Methods are just function calls whose package is determined at runtime. Like all indirect functions, they can have no prototype checking, because that happens at compile time. Even if methods were aware of prototypes, in Perl the compiler never checks the precise types or ranges of arguments to functions. Perl prototypes are used to coerce a function argument's context, not to check ranges. Recipe 10.11 details Perl's peculiar perspective on prototypes.
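Because the failure is an ordinary runtime exception, you can trap it with eval, or ask beforehand with the can method that every class inherits from UNIVERSAL. In this sketch the body of encode is a stand-in of ours, not a real Data::Encoder implementation.

```perl
use strict;
use warnings;

package Data::Encoder;
sub new    { my $class = shift; return bless {}, $class }
sub encode { my ($self, $text) = @_; return uc $text }   # stand-in body

package main;

my $object = Data::Encoder->new;

# decode was never defined: this compiles, but dies when executed.
my $result = eval { $object->decode("data") };
my $error  = $@;
print "runtime exception: $error" if $error;

# Asking first avoids the exception altogether.
my $can_encode = $object->can("encode") ? 1 : 0;
my $can_decode = $object->can("decode") ? 1 : 0;
print "encode: $can_encode, decode: $can_decode\n";   # prints encode: 1, decode: 0
```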

You can prevent Perl from triggering an exception for undefined methods by using the AUTOLOAD mechanism to catch calls to nonexistent methods. We show an application of this in Recipe 13.12.
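As a rough preview, a catch-all AUTOLOAD might look like the following. The Lenient class and the message it returns are hypothetical; a real class would usually do something more useful with the intercepted call.

```perl
use strict;
use warnings;

package Lenient;                     # hypothetical class
our $AUTOLOAD;                       # Perl stores the requested method name here

sub new { my $class = shift; return bless {}, $class }

sub AUTOLOAD {
    my $self = shift;
    my $name = $AUTOLOAD;
    $name =~ s/.*:://;               # strip the package qualifier
    return if $name eq "DESTROY";    # don't treat destruction as a missing method
    return "no method '$name', but no exception either";
}

package main;

my $obj   = Lenient->new;
my $reply = $obj->frobnicate(42);    # frobnicate is not defined anywhere
print "$reply\n";
# prints: no method 'frobnicate', but no exception either
```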

13.0.3. Inheritance

Inheritance defines a hierarchy of classes. Calls to methods not defined in a class search this hierarchy for a method of that name. The first method found is used. Inheritance means allowing one class to piggyback on top of another so you don't have to write the same code again and again. This is a form of software reuse, and therefore related to Laziness, the principal virtue of a programmer.

Some languages provide special syntax for inheritance. In Perl, each class (package) can put its list of superclasses (parents in the hierarchy) into the package variable @ISA. This list is searched at runtime when a method that is not defined in the object's class is invoked. If the first package listed in @ISA doesn't have the method but that package has its own @ISA, Perl looks first in that package's own @ISA, recursively, before going on.

If the inheritance search fails, the same check is run again, this time looking for a method named AUTOLOAD. The lookup sequence for $invocant->meth( ), where $invocant is either a package name or a reference to something blessed into that package, is:

1. P::meth
2. All packages S in @P::ISA, recursively, for any S::meth( )
3. UNIVERSAL::meth
4. The P::AUTOLOAD subroutine
5. All packages S in @P::ISA, recursively, for any S::AUTOLOAD( )
6. The UNIVERSAL::AUTOLOAD subroutine
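The lookup sequence above can be watched in action with a small experiment. All class and method names here (Base, Other, Derived, hello, wave, missing) are hypothetical.

```perl
use strict;
use warnings;

package Base;
our $AUTOLOAD;
sub hello { return "Base::hello" }
sub AUTOLOAD {
    return if $AUTOLOAD =~ /::DESTROY$/;
    return "Base::AUTOLOAD caught $AUTOLOAD";
}

package Other;
sub hello { return "Other::hello" }
sub wave  { return "Other::wave" }

package Derived;
our @ISA = ("Base", "Other");        # multiple inheritance, searched left to right

package main;

my $hello   = Derived->hello;        # found in Base, the first parent
my $wave    = Derived->wave;         # real method in Other beats any AUTOLOAD
my $is_a    = Derived->isa("Other"); # not in the hierarchy, so UNIVERSAL::isa
my $missing = Derived->missing;      # nothing found; falls back to Base::AUTOLOAD

print "$hello\n$wave\n", ($is_a ? "isa Other" : "not Other"), "\n$missing\n";
```

Note that wave is found in the second parent even though the first parent has an AUTOLOAD: every package is searched for a real method before any AUTOLOAD is considered.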

Most classes have just one item in their @ISA array, a situation called single inheritance. Classes with more than one element in @ISA represent multiple inheritance. The benefits of multiple inheritance are widely contested, but it is supported by Perl.

Recipe 13.10 talks about the basics of inheritance and designing a class so it can be easily subclassed. In Recipe 13.11, we show how a subclass can invoke overridden methods in its superclasses.

Perl doesn't support inheritance of data values. You could say that Perl supports only interface (method) inheritance, not implementation (data) inheritance. A class usually can, but seldom should, touch another's data directly. This violates the envelope and ruins the abstraction. If you follow the advice in Recipe 13.11, this won't be much of an issue.
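Since only methods are inherited, a subclass that wants its parent's fields set up has to ask the parent to do so. A minimal sketch, assuming hypothetical Person and Employee classes, is to invoke the parent's constructor through SUPER and then add the subclass's own data:

```perl
use strict;
use warnings;

package Person;                      # hypothetical base class
sub new {
    my ($class, %args) = @_;
    my $self = bless {}, $class;
    $self->{NAME} = $args{name};
    return $self;
}
sub name { my $self = shift; return $self->{NAME} }

package Employee;                    # hypothetical derived class
our @ISA = ("Person");
sub new {
    my ($class, %args) = @_;
    my $self = $class->SUPER::new(%args);   # Person initializes its own fields
    $self->{SALARY} = $args{salary};        # Employee adds only what it owns
    return $self;
}
sub salary { my $self = shift; return $self->{SALARY} }

package main;

my $e = Employee->new(name => "Thag", salary => 40);
print $e->name, " earns ", $e->salary, "\n";   # prints Thag earns 40
```

Employee reads the name through the inherited name( ) method rather than reaching into the NAME field itself.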

13.0.4. A Warning on Indirect Object Notation

The indirect notation for method invocations:

$lector = new Human::Cannibal;
feed $lector "Zak";
move $lector "New York";

is an alternative syntax for:

$lector = Human::Cannibal->new( );
$lector->feed("Zak");
$lector->move("New York");

This indirect object notation is appealing to English speakers and familiar to C++ programmers, who use new this way. However, it suffers from several tricky problems. One is that the construct follows the same quirky rules as the filehandle slot in print and printf:

printf STDERR "stuff here\n";

This slot, if filled, is limited to a bare symbol, a block, or a scalar variable name; it can't be just any old scalar expression. This can lead to horribly confusing precedence problems, as in these next two lines:

move $obj->{FIELD};                 # probably wrong
move $ary[$i];                      # probably wrong

Surprisingly, those actually parse as:

$obj->move->{FIELD};                # Surprise!
$ary->move->[$i];                   # Surprise!

rather than as you might have expected:

$obj->{FIELD}->move( );              # Nope, you wish
$ary[$i]->move;                     # Nope, you wish

As with printf, you can fix this by wrapping the expression in braces to make it a block:

move { $obj->{FIELD} };                 # These work
move { $ary[$i] };

Furthermore, just like print or printf with a filehandle in the indirect object slot, parentheses are optional, and the method invocation becomes a list operator syntactically. Therefore, if you write:

move $obj (3 * $position) + 2;
print STDERR (3 * $position) + 2;

that will end up being taken to mean:

$obj->move(3 * $position) + 2;
STDERR->print(3 * $position) + 2;

So you'd need to put in an extra set of parentheses:

move $obj ((3 * $position) + 2);
print STDERR ((3 * $position) + 2);

The other problem is that Perl must guess at compile time whether names such as new and move are functions or methods. If you write:

$obj = new Game;

that could, depending on what's in scope and what the compiler has seen, mean any of the following:

$obj = new("Game");                        
$obj = new(Game( ));                                
$obj = "Game"->new( );

of which only the third is the one you want. In fact, even using the infix arrow operator for method invocation has a potential problem. For example:

$obj = Game->new( );

could end up being interpreted as:

$obj = Game( )->new( );

under slightly esoteric circumstances: when there's a function in the current package named Game( ). Usually Perl gets it right, but when it doesn't, you get a function call compiled as a method invocation, or vice versa. This can introduce incredibly subtle bugs that are hard to unravel.

The surest way to disambiguate this is to put a double-colon after the package (class) name.

$obj = new Game::;             # always "Game"->new( )
$obj = Game::->new;            # always "Game"->new( )

Now it doesn't matter whether there is a function named Game or new visible in the current package; you'll always get the method invocation. When you use a package-quoted class like this, the invocant has the double-colon stripped off again when the method is run, as the comments indicate.
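The following sketch sets up the troublesome case on purpose, with a function named Game in the calling package, and shows that the package-quoted form still reaches the class. The INVOCANT field is a hypothetical addition of ours so that the constructor's argument can be inspected afterward.

```perl
use strict;
use warnings;

package Game;
sub new {
    my $class = shift;
    return bless { INVOCANT => $class }, $class;   # remember what we were passed
}

package main;

# A function that happens to share the class's name.
sub Game { return "Surprise" }

my $obj = Game::->new;               # package-quoted: always the class Game

print ref($obj), "\n";               # prints Game
print $obj->{INVOCANT}, "\n";        # prints Game (no trailing double-colon)
```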

To be honest, you can almost always get away with using just the bare class name and omitting the ugly trailing double-colon, provided two things are true. First, there must be no subroutine of the same name as the class. (If you follow the convention that subroutine names like new start with a lowercase letter and class names like Game start with an uppercase letter, this is never a problem.) Second, the class needs to have been loaded with one of:

use Game;
require Game;

Either of these declarations ensures that Perl knows Game is a module name. This forces any bare name like new before the class name Game to be interpreted as a method invocation, even if you happen to have declared a new subroutine of your own in the current package. People don't generally get into trouble with indirect objects unless they start cramming multiple classes into the same file, in which case Perl might not know that a particular package name was supposed to be a class name. People who name subroutines with names that look like ModuleNames also come to grief eventually.

For more information about this, see the sections on "Syntactic Snafus with Indirect Objects" and "Package-Quoted Classes" in Chapter 12 of Programming Perl.

13.0.5. Some Notes on Object Terminology

In the object-oriented world, many words describe only a few concepts. If you've programmed in another object-oriented language, you might like to know how familiar terms and concepts map onto Perl.

For example, it's common to refer to objects as instances of a class and those objects' methods instance methods. Data fields peculiar to each object are often called instance data or object attributes, and data fields common to all members of that class are class data, class attributes, or static data members.

Also, base class, generic class, and superclass all describe the same notion (a parent or similar ancestor in the inheritance hierarchy), whereas derived class, specific class, and subclass describe the opposite relationship (a child or descendant in the inheritance hierarchy).

C++ programmers have static methods, virtual methods, and instance methods, but Perl only has class methods and object methods. Actually, Perl only has methods. It accepts any sort of invocant you choose to employ. Whether a method acts as a class or an object method is determined solely by actual usage. You could call a class method (one expecting a string argument) on an object or an object method (one expecting a reference) on a class, but you shouldn't expect reasonable results if you do.
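A method can find out which kind of invocant it was given by testing it with ref. In this sketch the Greeter class and its describe method are hypothetical.

```perl
use strict;
use warnings;

package Greeter;                     # hypothetical class

sub new { my $class = shift; return bless {}, $class }

# Works with either kind of invocant and reports which one it received.
sub describe {
    my $invocant = shift;
    return ref($invocant)
        ? "object of class " . ref($invocant)
        : "class $invocant";
}

package main;

my $as_class  = Greeter->describe;        # invocant is the string "Greeter"
my $as_object = Greeter->new->describe;   # invocant is a blessed reference

print "$as_class\n$as_object\n";
# prints: class Greeter
#         object of class Greeter
```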

A C++ programmer thinks about global (class) constructors and destructors. These correspond to module initialization code and per-module END{ } blocks, respectively.

From the C++ perspective, all methods in Perl are virtual. This is why their arguments are never checked against function prototypes, as the arguments of built-in and user-defined functions can be. Prototypes are checked by the compiler at compile time, and the function called by a method invocation can't be determined until runtime.

13.0.6. Philosophical Aside

In its OO programming, Perl gives you a lot of freedom: the ability to do things more than one way (you can bless any data type to make an object), to inspect and modify classes you didn't write (adding functions to their packages), and to use these to write tangled pits of misery—if that's really what you want to do.

Less flexible programming languages are usually more restrictive. Many are fanatically devoted to enforced privacy, compile-time type checking, complex function signatures, and a smorgasbord of other features. Perl doesn't provide these things with objects because it doesn't provide them anywhere else, either. Keep this in mind if you find Perl's object-oriented implementation weird. You only think it's weird because you're used to another language's philosophy. Perl's treatment of OO is perfectly sensible—if you think in Perl. For every problem that you can't solve by writing Perl as though it were Java or C++, there is a native Perl solution that works perfectly. The absolutely paranoid programmer can even have complete privacy: the perltoot(1) manpage describes how to bless closures to produce objects that are as private as those in C++ (and more so).

Perl's objects are not wrong; they're differently right.
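For the curious, the blessed-closure approach to privacy mentioned above can be sketched roughly as follows. This is our own minimal illustration, with a hypothetical Secret class, and not the code from perltoot(1).

```perl
use strict;
use warnings;

package Secret;                      # hypothetical class

sub new {
    my ($class, %init) = @_;
    my %data = %init;                # reachable only through the closure below
    my $self = sub {
        my $field = shift;
        $data{$field} = shift if @_; # optional second argument sets the field
        return $data{$field};
    };
    return bless $self, $class;      # the object is a blessed code reference
}

sub name {
    my $self = shift;
    return $self->("NAME", @_);      # all access goes through the closure
}

package main;

my $s = Secret->new(NAME => "Thag");
$s->name("Grunt");
print $s->name, "\n";                # prints Grunt
```

Because the object is a code reference rather than a hash, there is no $s->{NAME} for outside code to poke at.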

13.0.7. See Also

The general literature on object-oriented programming rarely refers directly to Perl. The documentation that came with Perl is a good place to begin learning about object-oriented programming, particularly the object tutorials perltoot(1) and perlboot(1). For a reference, read perlobj(1) and Chapter 12 of Programming Perl. You might need it when you read perlbot(1), which is full of object-oriented tricks.

Damian Conway's Object Oriented Perl (Manning) is the best introduction and reference for object-oriented programming in Perl. It's readable, accurate, and comprehensive.
