This series has discussed Perl as a language for numbers, strings, and files, the original purpose of the language (see A Beginner's Introduction to Perl 5.10, A Beginner's Introduction to Files and Strings with Perl 5.10, and A Beginner's Introduction to Perl Regular Expressions). Then it showed how to use Perl 5.10 for Web Programming.
The previous articles assumed that you'll write all of the code for your programs yourself, in one big file per program. Perl has a huge advantage: you don't need to do this. Over 7,000 people have contributed over 16,400 add-on libraries, or module distributions, for common tasks.
This installment explains how modules work; you'll build one, and along the way you'll learn a bit about object-oriented programming in Perl.
What Is an Object?
Think back to the first article in this series, which discussed two basic data types in Perl, strings and numbers. There's another basic data type: the object. You saw them in the previous article, A Beginner's Introduction to Perl Web Programming.
Objects are a convenient way of packaging information with the things you actually do with that information. An object contains data in its attributes or properties, and can perform actions through methods.
For example, you might have an AddressEntry object for an
address book program. This object would contain properties that store
a person's name, mailing address, phone number, and e-mail address; as well as
methods that print a nicely formatted mailing label or allow you to
change the person's phone number.
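As a rough sketch of that idea (the syntax is explained later in this article), an AddressEntry class might look something like this. The property names, the method names mailing_label, phone, and set_phone, and the sample data are all invented for illustration; they are not part of any real module.

```perl
use strict;
use warnings;

{
    package AddressEntry;    # hypothetical class for this sketch

    # Constructor: store the properties in a hash and turn it into an object.
    sub new {
        my ($class, %args) = @_;
        my $self = {
            name    => $args{name},
            address => $args{address},
            phone   => $args{phone},
            email   => $args{email},
        };
        return bless $self, $class;
    }

    # Method: return a formatted mailing label.
    sub mailing_label {
        my ($self) = @_;
        return "$self->{name}\n$self->{address}\n";
    }

    # Method: return the current phone number.
    sub phone {
        my ($self) = @_;
        return $self->{phone};
    }

    # Method: change the person's phone number.
    sub set_phone {
        my ($self, $phone) = @_;
        $self->{phone} = $phone;
        return $self->{phone};
    }
}

my $entry = AddressEntry->new(
    name    => 'Jane Example',
    address => '123 Example Street',
    phone   => '555-0100',
    email   => 'jane@example.com',
);

print $entry->mailing_label();
$entry->set_phone('555-0199');
print "New phone: ", $entry->phone(), "\n";
```

The data (name, address, phone, e-mail) and the actions (print a label, change the phone number) travel together in one package.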
A New Goal
So far, the configuration information for the code developed in previous articles appears directly in the source code of those programs. This isn't a good approach. You may want to install a program and allow multiple users to run it, each with their own preferences, or you may want to store common groups of options for later. What you need is a configuration file to store these options.
The INI-style format is particularly easy to use; it's a simple plain-text
format, which groups name and value pairs into sections. Header names in
brackets delineate sections. To refer to the value of a specific key in the
configuration file, use the section.name syntax. For instance,
the value of author.firstname in this simple file is
Doug:
[author]
firstname=Doug
lastname=Sheppard
[site]
name=Perl.com
url=http://www.perl.com/
If you used Windows in the ancient days when versions had numbers, not years, you'll recognize this format. Note also that an existing well-tested and well-maintained CPAN module already handles this format: Config::INI. In real-world code, use that module instead -- but it's simple enough to write a module to handle the very basic form of that format that this makes a good didactic exercise.
With the real-world purpose of this module defined, it's time to think about
what properties and methods it will have. What do
TutorialConfig objects store, and what can you do with them?
The first part is simple: the object's properties will be the values in the configuration file.
The second part is more complex. Start by listing the two things you
need to do: read a configuration file and retrieve a value from it.
Call these two methods read and fetch. Finally, add
another method to store or change a value from within your program. Call it
store. These three methods will cover nearly everything you want
to do.
Starting Off
Perl class names often use the StudlyCaps or CamelCase style. Use the name
TutorialConfig. Because Perl looks for a module by its filename,
the filename should be TutorialConfig.pm.
Put the following into a file called TutorialConfig.pm:
package TutorialConfig;
warn "TutorialConfig is successfully loaded!\n";
1;
(I've sprinkled debugging statements throughout the code. You can take them
out in practice. The warn keyword is useful to bring things to
the user's attention without ending the program the way die
would.)
The package keyword tells Perl the name of the class you're
defining. This is generally the same as the module name. (It doesn't
have to be, but it's a good idea!) The 1; will return a
true value to Perl, which indicates that the module has loaded completely and
successfully. If you forget this (and you will), Perl will give you an error
message saying that your package did not return a true value.
You now have a simple module called TutorialConfig, which you
can use in your code with the use keyword. Run this very simple,
one-line program:
use TutorialConfig;
... and you should see:
TutorialConfig is successfully loaded!
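If Perl complains that it can't locate TutorialConfig.pm, the directory holding the file is probably not in @INC, the list of directories Perl searches for modules. Recent versions of Perl (5.26 and later) no longer search the current directory by default, so you may need to run perl -I. or add use lib '.'; to your program. The following sketch shows the lookup in action. It writes a throwaway module into a temporary directory; the module name DemoConfig and its loaded() function are made up for this illustration.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Spec;

# Write a tiny module (DemoConfig is a made-up name) into a temporary directory.
my $dir  = tempdir( CLEANUP => 1 );
my $path = File::Spec->catfile( $dir, 'DemoConfig.pm' );
open my $out, '>', $path or die "Can't write $path: $!";
print {$out} "package DemoConfig;\nsub loaded { return 'yes' }\n1;\n";
close $out or die "Can't close $path: $!";

# Perl searches the directories listed in @INC for DemoConfig.pm.
unshift @INC, $dir;
require DemoConfig;

# %INC records which file satisfied the request.
print "Loaded from: $INC{'DemoConfig.pm'}\n";
print "DemoConfig::loaded() says: ", DemoConfig::loaded(), "\n";
```

The use keyword does the same search at compile time; require is used here only because the file doesn't exist until the program has started running.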
What Does an Object Do?
Before you can create an object, you need to know how to create it.
That means you must write a method called new that will initialize
and return an object. Sometimes you need to initialize objects and sometimes
you don't; this constructor is where you do so.
Add this new method to TutorialConfig.pm right after the package declaration:
sub new {
    my ($class_name) = @_;
    my $self = {};
    warn "We just created our new variable...\n";
    bless ($self, $class_name);
    warn "and now it's a $class_name object!\n";
    $self->{_created} = 1;
    return $self;
}
(Again, you won't need those warn statements in actual
practice.)
First, notice that methods use the sub keyword as well. (All
methods are really just a special sort of sub.) This new method
takes one parameter: the type of object to create, stored in a private
variable called $class_name. (You can also pass extra parameters
to new if you want. Some modules use this for special
initialization routines.)
Next, the code uses a hash-based object by creating a new anonymous hash. This works just like a regular hash, except that it has no name and you must dereference it to use it. More on that in a moment.
The bless operator takes two parameters: a variable containing
a reference that you want to make into an object, and the type of object you
want it to be. This is the line that makes the magic happen! All
bless does is associate a class with a reference so that you can
call methods on it.
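To see what bless actually changes, here is a small sketch that inspects a reference before and after blessing it. It uses the core ref function and the blessed function from the core Scalar::Util module; the property name example is invented for the demonstration.

```perl
use strict;
use warnings;
use Scalar::Util qw(blessed);

my $self = {};    # an ordinary anonymous hash reference

print "Before bless, ref() says: ", ref($self), "\n";
print "Is it an object yet? ", ( blessed($self) ? "yes" : "no" ), "\n";

# Associate the reference with a class name.
bless $self, 'TutorialConfig';

print "After bless, ref() says: ", ref($self), "\n";
print "Is it an object now? ", ( blessed($self) ? "yes" : "no" ), "\n";

# The reference still behaves like a hash.
$self->{example} = 'still a hash underneath';
print "Stored value: $self->{example}\n";
```

Before the call, ref reports HASH; afterward it reports TutorialConfig. Nothing else about the hash changes.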
The code next stores a property called _created.
This property isn't really that useful, but it does show the syntax for
accessing the contents of a hash reference:
$object_name->{property_name}.
Finally, having made $self into a new
TutorialConfig object, the code returns it.
A program to create a TutorialConfig object looks like:
use TutorialConfig;
my $tut = TutorialConfig->new();
In this case, new() is a class method; you call it on
a class rather than an object itself. Notice that the method calling operator
-> looks exactly like the operator used to dereference the
anonymous hash. This is on purpose. All objects are references.
When you run this code, you'll see:
TutorialConfig is successfully loaded!
We just created our new variable...
and now it's a TutorialConfig object!
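Here is a minimal sketch of what arrives in @_ for each kind of call, including the extra constructor parameters mentioned earlier. The Counter class, its start parameter, and its increment method are hypothetical names chosen for this example.

```perl
use strict;
use warnings;

{
    package Counter;    # a made-up class for this sketch

    sub new {
        # The class name arrives first, followed by any extra arguments.
        my ($class_name, %args) = @_;
        my $self = { count => ( defined $args{start} ? $args{start} : 0 ) };
        return bless $self, $class_name;
    }

    sub increment {
        # For an object method, the object itself arrives first.
        my ($self) = @_;
        return ++$self->{count};
    }
}

my $default = Counter->new();                 # class method, no extras
my $custom  = Counter->new( start => 10 );    # class method with an extra argument

print "Default counter after increment: ", $default->increment(), "\n";
print "Custom counter after increment:  ", $custom->increment(), "\n";
```

Whatever appears to the left of -> becomes the first argument: the class name for Counter->new(), the object for $custom->increment().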
Now that you have a class and can create objects with it, it's time to make the class do something!
The Goal, Part 2
Remember the goals for the example program? You need to write three methods
for the TutorialConfig module: read,
fetch, and store.
The first method, read, obviously requires the name of a file
to read. Notice that this method requires two parameters. The first
parameter is the object to use, and the second is the filename to read. The
returned value indicates whether the method successfully read the
file.
sub read {
    my ($self, $file) = @_;
    # given/when needs "use 5.010;" near the top of TutorialConfig.pm.
    open my $config_fh, '<', $file or return 0;
    # Store a special property containing the name of the file.
    $self->{_filename} = $file;
    my $section;
    while (my $line = <$config_fh>) {
        chomp $line;
        given ($line) {
            when (/^\[(.*)\]/) { $section = $1 }
            when (/^(?<key>[^=]+)=(?<value>.*)/) {
                $section //= '';
                # %+ holds the named captures (key and value) from the match.
                $self->{"$section.$+{key}"} = $+{value};
            }
        }
    }
    close $config_fh;
    return 1;
}
Surprisingly, that code handles most of the work of the configuration
object. Now that the class knows how to read a configuration file, you can add
a method to retrieve a value from the object. fetch is
simple:
sub fetch {
    my ($self, $key) = @_;
    return $self->{$key};
}
These two methods are really all you need to begin experimenting with our
TutorialConfig object. Save the sample configuration file as
tutc.txt, and then run this sample TutorialConfig program:
use 5.010;
use TutorialConfig;
my $tut = TutorialConfig->new();
$tut->read('tutc.txt') or die "Couldn't read config file: $!";
say "The author's first name is ",
    $tut->fetch('author.firstname'),
    ".";
When you run this program, you'll see something like:
TutorialConfig is successfully loaded!
We just created our new variable...
and now it's a TutorialConfig object!
The author's first name is Doug.
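If you're curious which keys end up in the object, the following stand-alone sketch runs the same kind of parsing loop over the sample configuration and prints every section.name pair. It makes two substitutions to stay self-contained: the text comes from a string opened as an in-memory filehandle rather than from tutc.txt, and a plain if/elsif chain with numbered captures stands in for given/when. The %config hash is just a stand-in for the object's properties.

```perl
use strict;
use warnings;

# The sample configuration from earlier, held in a string instead of a file.
my $ini_text = <<'END_INI';
[author]
firstname=Doug
lastname=Sheppard
[site]
name=Perl.com
url=http://www.perl.com/
END_INI

# Opening a reference to a string gives a filehandle that reads from it.
open my $config_fh, '<', \$ini_text or die "Can't open in-memory file: $!";

my %config;
my $section = '';
while ( my $line = <$config_fh> ) {
    chomp $line;
    if ( $line =~ /^\[(.*)\]/ ) {
        $section = $1;                  # a new [section] header
    }
    elsif ( $line =~ /^([^=]+)=(.*)/ ) {
        $config{"$section.$1"} = $2;    # store under section.name
    }
}
close $config_fh;

print "$_ = $config{$_}\n" for sort keys %config;
```

Each value is filed under a single flat key such as author.firstname, which is exactly the string you pass to fetch.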
You now have an object that will read configuration files and show values
inside those files. This is good enough, but there was one more goal: to write
a store method that allows you to add or change configuration
values from within a program. This is almost as simple as
fetch:
sub store {
    my ($self, $key, $value) = @_;
    $self->{$key} = $value;
}
Now test it:
use 5.010;
use TutorialConfig;
my $tut = TutorialConfig->new();
$tut->read('tutc.txt') or die "Can't read config file: $!";
$tut->store('author.country', 'Canada');
say $tut->fetch('author.firstname'), " lives in ",
    $tut->fetch('author.country'), ".";
These three methods (read, fetch, and
store) are everything necessary for this simple
TutorialConfig.pm module. More complex modules might have dozens
of methods!
Encapsulation
You may be wondering why the code has fetch and
store methods at all. Why use
$tut->store('author.country', 'Canada') when
$tut->{'author.country'} = 'Canada' works just as well? There
are multiple reasons to use methods instead of playing directly with an
object's properties.
First, you can generally trust that a module won't change its methods, no
matter how much their implementation changes. Someday, you might want to
switch from using text files to hold configuration information to using a
database such as MySQL or PostgreSQL. The new TutorialConfig
module might have new, read, fetch and
store methods that look like:
sub new {
    my ($class) = @_;
    bless {}, $class;
}
sub read {
    my ($self, $file) = @_;
    my ($db) = database_connect($file);
    return 0 unless $db;
    $self->{_db} = $db;
}
sub fetch {
    my ($self, $key) = @_;
    my $db = $self->{_db};
    return database_lookup($db, $key);
}
sub store {
    my ($self, $key, $value) = @_;
    my $db = $self->{_db};
    return database_store($db, $key, $value);
}
(Assume that the database_connect(),
database_lookup(), and database_store() routines
appear elsewhere and do just what their names imply.)
Even though the entire module's source code has changed, all of the methods still have the same names and syntax. The external interface to this code remains the same. Code that uses these methods will continue working just fine, but code that directly manipulates properties will break!
Suppose that you have some code which stores a configuration value:
$tut->{'author.country'} = 'Canada';
This works fine with the original TutorialConfig, because when
you call $tut->fetch('author.country'), it looks in the
object's properties and returns Canada just like you expected.
However, when you upgrade to the new version that uses databases, the code will
no longer return the correct result. Instead of fetch() looking
in the object's properties, it'll go to the database, which won't contain the
correct value for author.country! If you'd used
$tut->store('author.country', 'Canada') all along, things would
work fine.
As a module author, writing methods will let you make changes (bug fixes, enhancements, or even complete rewrites) without requiring your module's users to rewrite any of their code.
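The breakage is easy to reproduce without a real database. In this sketch, two made-up classes share the same store and fetch interface: HashConfig keeps values directly in the object's hash, while BackendConfig keeps them in a separate inner hash that merely stands in for the database. Neither class comes from the article's module; both are assumptions for illustration.

```perl
use strict;
use warnings;

{
    package HashConfig;       # keeps values directly in the object's hash
    sub new   { my ($class) = @_; return bless {}, $class }
    sub store { my ($self, $key, $value) = @_; $self->{$key} = $value }
    sub fetch { my ($self, $key) = @_; return $self->{$key} }
}

{
    package BackendConfig;    # keeps values in a separate store (database stand-in)
    sub new   { my ($class) = @_; return bless { _db => {} }, $class }
    sub store { my ($self, $key, $value) = @_; $self->{_db}{$key} = $value }
    sub fetch { my ($self, $key) = @_; return $self->{_db}{$key} }
}

for my $class (qw(HashConfig BackendConfig)) {
    my $polite = $class->new();
    $polite->store( 'author.country', 'Canada' );    # through the method

    my $rude = $class->new();
    $rude->{'author.country'} = 'Canada';             # bypassing the method

    for my $case ( [ 'store()', $polite ], [ 'direct poke', $rude ] ) {
        my ($label, $object) = @$case;
        my $value = $object->fetch('author.country');
        printf "%-13s %-12s fetch returns %s\n", $class, $label,
            defined $value ? $value : '(nothing)';
    }
}
```

With HashConfig, both approaches appear to work. With BackendConfig, only the value saved through store() comes back; the directly poked value lands somewhere fetch() never looks.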
A related benefit is that you can further customize the behavior of this
module by subclassing or other polymorphic behavior. Polymorphism is a
four-dollar word which means "Anything that has the same interface -- the same
public attributes and methods -- behaves the same way." That is, if you had
multiple active TutorialConfig objects, you can treat them all the
same way, perhaps using the one that works with INI files to read in old
configuration data and write to the object that works with a database
backend.
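A small sketch of that idea: the copy_settings routine below knows nothing about how either object stores its data, only that both provide fetch and store. The class names FlatConfig and NestedConfig and the routine itself are hypothetical, and the sketch assumes every key contains a dot.

```perl
use strict;
use warnings;

{
    package FlatConfig;      # stores "section.name" keys in one flat hash
    sub new   { my ($class) = @_; return bless { data => {} }, $class }
    sub store { my ($self, $key, $value) = @_; $self->{data}{$key} = $value }
    sub fetch { my ($self, $key) = @_; return $self->{data}{$key} }
}

{
    package NestedConfig;    # stores values as data -> section -> name
    sub new { my ($class) = @_; return bless { data => {} }, $class }

    sub store {
        my ($self, $key, $value) = @_;
        my ($section, $name) = split /\./, $key, 2;
        $self->{data}{$section}{$name} = $value;
    }

    sub fetch {
        my ($self, $key) = @_;
        my ($section, $name) = split /\./, $key, 2;
        return $self->{data}{$section}{$name};
    }
}

# Works with any pair of objects that provide fetch() and store().
sub copy_settings {
    my ($from, $to, @keys) = @_;
    $to->store( $_, $from->fetch($_) ) for @keys;
    return;
}

my $old = FlatConfig->new();
$old->store( 'author.firstname', 'Doug' );
$old->store( 'site.name',        'Perl.com' );

my $new = NestedConfig->new();
copy_settings( $old, $new, 'author.firstname', 'site.name' );

print "Copied: ", $new->fetch('author.firstname'), " / ",
    $new->fetch('site.name'), "\n";
```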
Second, using methods lets you avoid impossible values. You might have an
object that takes a person's age as a property. A person's age must be a
positive number (you can't be -2 years old, unless you have a time machine, in
which case the author has a business plan and is willing to split the
proceeds!), so the age() method for this object will reject
negative numbers. If you bypass the method and directly manipulate
$obj->{age}, you may cause problems elsewhere in the code. A
routine to calculate the person's birth year, for example, might fail or
produce an odd result.
As a module author, you can use methods to help programmers who use your module write better software. You can write a good error-checking routine once, and reuse it many times.
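As an illustration, here is one way such an age() method might look. The Person class is hypothetical, and the check shown (digits with an optional decimal part) is only one possible validation rule.

```perl
use strict;
use warnings;

{
    package Person;    # hypothetical class for this sketch

    sub new { my ($class) = @_; return bless { age => 0 }, $class }

    # Combined getter/setter that refuses impossible values.
    sub age {
        my ($self, $age) = @_;
        if ( defined $age ) {
            die "Age must be a non-negative number, not '$age'\n"
                unless $age =~ /^\d+(?:\.\d+)?$/;
            $self->{age} = $age;
        }
        return $self->{age};
    }
}

my $person = Person->new();
$person->age(42);
print "Age is now ", $person->age(), "\n";

# The bad value is rejected, and the stored age is unchanged.
my $ok = eval { $person->age(-2); 1 };
print "Setting -2 failed: $@" unless $ok;
print "Age is still ", $person->age(), "\n";
```

Every caller that goes through age() gets the same check for free; a caller that writes to $person->{age} directly gets none.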
Some languages, by the way, enforce encapsulation, by giving you the ability to make certain properties private. Perl doesn't do this. In Perl, encapsulation isn't the law. It's just a very good idea.
This encapsulation goes as far as using methods to access an object's
properties, rather than poking in the blessed hash directly. If you're writing your classes by hand, you might declare a filename() accessor method to get and set the name of the configuration file:
sub filename
{
    my ($self, $filename) = @_;
    $self->{_filename} = $filename if defined $filename;
    return $self->{_filename};
}
... and then use $self->filename() instead of accessing
$self->{_filename} directly. Not only does this confine the
details of how you store the name of the configuration file to a single place
(the filename() method), but it also allows you to change the
representation and storage of the filename, and of the object as a whole, in
this class itself or in polymorphic variants of the class.
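In use, the same method both sets and gets the value. This sketch wraps the accessor in a bare-bones TutorialConfig class with only a constructor, so that it runs on its own:

```perl
use strict;
use warnings;

{
    package TutorialConfig;    # trimmed-down stand-in for the real module

    sub new { my ($class) = @_; return bless {}, $class }

    # Get the filename, or set it when given an argument.
    sub filename {
        my ($self, $filename) = @_;
        $self->{_filename} = $filename if defined $filename;
        return $self->{_filename};
    }
}

my $tut = TutorialConfig->new();
$tut->filename('tutc.txt');                       # set
print "Config file: ", $tut->filename(), "\n";    # get
```

Because of the defined check, calling filename() with no argument never clears the stored name.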
Writing all of those accessors by hand can be a little tedious, but this is Perl. There's more than one way to do things.
Declarative Objects
If you've used objects in other languages, you may rightfully wonder "What's
with all of that weird bless a reference stuff? Why can't I just
declare my class and its attributes and have Perl take care of everything for
me?" Fortunately, you can.
The Moose distribution from the CPAN builds on Perl 5's standard object system to provide many more features in a declarative fashion that's easier to use in many ways. Moose borrows liberally from Perl 6's object system (as well as some nice features from Smalltalk and Common Lisp). The result is much more powerful and introspective.
Moose can be intimidating; it has many features. Mouse is a similar distribution which provides a gateway to Moose. It supports the most common, basic operations in the same way, though it's easier to install and understand.
This is not to say that there's anything wrong with Perl 5's default object system. It works. It's deliberately minimal, but you can build anything you want out of it. Sometimes reaching for the Mouse or the Moose can help you write (and especially maintain) larger programs more effectively.
With Mouse and Moose, you use the module, then declare your
object's attributes and some metadata about them. You get a constructor and
accessor methods in return without having to write them. A
TutorialConfig class might look like this instead:
package TutorialConfig;
use 5.010;
use Mouse;
has 'filename'   => ( is => 'rw' );
has 'properties' => ( is => 'ro', default => sub { {} } );
sub read
{
    my ($self, $file) = @_;
    open my $config_fh, '<', $file or return 0;
    $self->filename( $file );
    my $section = '';
    while (my $line = <$config_fh>) {
        chomp $line;
        given ($line) {
            when (/^\[(.*)\]/) { $section = $1 . '.' }
            when (/^(?<key>[^=]+)=(?<value>.*)/) {
                $self->store( $section . $+{key}, $+{value} );
            }
        }
    }
    close $config_fh;
    return 1;
}
sub store
{
    my ($self, $key, $value) = @_;
    $self->properties->{$key} = $value;
}
sub fetch
{
    my ($self, $key) = @_;
    return $self->properties->{$key};
}
1;
Most of the special Mouse magic is in the first few lines of the file.
use Mouse; in the TutorialConfig package creates a
class named TutorialConfig. This class has two attributes,
filename and properties. The has
keyword (okay, it's a list-ary function, but it looks and behaves like a
keyword here!) takes the name of an attribute to add and a list of the
attribute's properties.
In this case, the filename is a rw property. It's readable and
writeable, so Mouse will generate a read/write accessor for it, named
filename.
properties is a hash. This attribute is more complex; it's not
a good idea to allow users to replace the properties hash, so it's a
read-only attribute (thus the properties() method only returns the
hash reference; you can't set it). There's also a default value: a hash
reference.
Mouse exposes one drawback of Perl in the syntax to set a default value for
the property. Every new TutorialConfig object needs its own
unique properties, so the default value provided here is an anonymous Perl
function which returns a new hash reference. If the line read instead
default => {}, all instances would share the same hash
reference. It's a slight infelicity. (Python programmers may recognize
something similar in default function parameters.)
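The underlying issue has nothing to do with Mouse itself; it's how references work in Perl. This plain-Perl sketch (no Mouse required) contrasts a single hash reference handed to several data structures with a code reference that builds a fresh hash each time it's called. The variable names and the color key are invented for the demonstration.

```perl
use strict;
use warnings;

# One hash reference, created once and handed to every "object".
my $shared_default = {};
my $first  = { properties => $shared_default };
my $second = { properties => $shared_default };

$first->{properties}{color} = 'red';
print "Shared default: second sees color = $second->{properties}{color}\n";

# A code reference is called once per object, so each gets a fresh hash.
my $make_default = sub { return {} };
my $third  = { properties => $make_default->() };
my $fourth = { properties => $make_default->() };

$third->{properties}{color} = 'blue';
print "Fresh default: fourth has a color? ",
    ( exists $fourth->{properties}{color} ? "yes" : "no" ), "\n";
```

A change made through $first shows up in $second because both point at the same hash, while $third and $fourth stay independent.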
It's little trouble to write the equivalent code by hand, but consider all of the code you don't have to write to declare a class, its attributes, and its accessors. What's left is declarative and clear to understand. Even better, it emphasizes the essential differences between the parameters; they stand out in ways that multiple near-boilerplate accessor method declarations cannot provide.
You don't have to use Mouse or Moose to use objects in Perl effectively, but even for an example as simple as this, their benefits are obvious. In a more complex object, the value of this approach is even clearer.
Play Around!
There are plenty of opportunities for you to modify this code as you experiment with object oriented Perl.
- The TutorialConfig class could use a method that will write a new configuration file to any filename you desire. Write your own write() method (use keys %$self to fetch the keys of the object's properties, in the standard Perl object version). Be sure to handle the error condition if Perl cannot open the file!
- Write a BankAccount module. Your BankAccount object should have deposit, withdraw, and balance methods. Make the withdraw method fail if you try to withdraw more money than you have, or deposit or withdraw a negative amount of money.
- A big advantage of using CGI objects (see the previous article) is that you can store and retrieve queries on disk. Take a look in the CGI documentation to learn how to use the save() method to store queries, and how to pass a filehandle to new to read them from disk. Try writing a CGI program that saves recently used queries for easy retrieval.
- One reason many novices avoid using modules in Perl is because they don't know how to distribute them. The Module::Build distribution (a core library in Perl 5.10) helps configure, build, test, and install Perl modules. A simple Build.PL file takes only a few minutes to write. Bundle TutorialConfig and BankAccount into their own distributions with Module::Build.

I never considered plugging Mouse as a gateway to Moose. I'll refactor the documentation to be geared toward people being first exposed to Moose through Mouse.
If you're not sold on Moose, I urge you to give Moose::Unsweetened a read. It defines two classes, first with Moose then with plain Perl 5 OO. If nothing else, it should exemplify why people are so enthusiastic about Moose.
use 5.010?
If only perl and python had setup their scoping so that the 'self' wasn't always required to access class members. This leads to so many problems for people used to C++ or Java, especially if the language is silently creating new variables when you think you're accessing class members.
If there's one thing I'd like to see in the next major perl or python revision it is that.
@Ciaran, that's almost nearly impossible with the dynamic nature of Perl and Python objects. At compile time, C++ and Java can resolve named symbols to attribute access. Because Perl and Python use very late binding (and their type systems prefer polymorphic equivalence to strict taxonomic identities), there's no good way of distinguishing between free variable and attribute access at compile time.
Python has it worse; lack of variable declarations makes this impossible in the general case.
I had to replace $config_name and $config_val in this line to get it to work. Why doesn't the way presented work for me?
$self->{"$section.$+{'key'}"} = $+{'value'};
Lets see one of these articles for python!
Here is a very good perl guide for beginners:
Object-Oriented Perl Guide
use strict;
And no more silent creating new variables