Perl Cookbook

Chapter 5. Hashes

Contents:

Introduction
Adding an Element to a Hash
Testing for the Presence of a Key in a Hash
Creating a Hash with Immutable Keys or Values
Deleting from a Hash
Traversing a Hash
Printing a Hash
Retrieving from a Hash in Insertion Order
Hashes with Multiple Values per Key
Inverting a Hash
Sorting a Hash
Merging Hashes
Finding Common or Different Keys in Two Hashes
Hashing References
Presizing a Hash
Finding the Most Common Anything
Representing Relationships Between Data
Program: dutree

Doing linear scans over an associative array is like trying to club someone to death with a loaded Uzi.

Larry Wall

5.0. Introduction

People and parts of computer programs interact in all sorts of ways. Single scalar variables are like hermits, living a solitary existence whose only meaning comes from within the individual. Arrays are like cults, where multitudes marshal themselves under the name of a charismatic leader. In the middle lies the comfortable, intimate ground of the one-to-one relationship that is the hash. (Older documentation for Perl often called hashes associative arrays, but that's a mouthful. Other languages that support similar constructs sometimes use different terms for them; you may hear about hash tables, tables, dictionaries, mappings, or even alists, depending on the language.)

Unfortunately, this isn't a relationship of equals. The relationship encoded in a hash is that of the genitive case or the possessive, like the word "of" in English, or like "'s". We could encode that the boss of Nat is Tim. A hash gives you a convenient way to look up Nat's boss, but not the reverse: you can't directly ask whose boss Tim is. Finding the answer to that question is a recipe in this chapter.
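A minimal sketch of that asymmetry follows. Only Nat and Tim come from the text; the other names are made up for illustration, and the scan is just one simple way to answer the reverse question, not necessarily the approach the later recipe takes.

```perl
use strict;
use warnings;

# Hypothetical data: each key is a person, each value is that person's boss.
my %boss = ( "Nat",   "Tim",
             "Jules", "Tim",
             "Josh",  "Larry" );

# Forward lookup: "the boss of Nat" is a single hash access.
my $nats_boss = $boss{"Nat"};

# Reverse lookup: "whose boss is Tim?" has no direct access path,
# so this sketch walks every key and keeps those whose value matches.
my @tims_reports = grep { $boss{$_} eq "Tim" } sort keys %boss;

print "Nat's boss: $nats_boss\n";
print "Tim is the boss of: @tims_reports\n";
```

The forward lookup costs the same however large the hash grows, while the reverse scan touches every entry.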

Fortunately, hashes have their own special benefits, just like relationships. Hashes are a built-in data type in Perl. Their use reduces many complex algorithms to simple variable accesses. They are also a fast and convenient way to build indices and quick lookup tables.
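As a rough illustration of a quick lookup table, the sketch below turns a list into a hash once and then answers membership questions with a single access each. The array @staff and the hash %is_staff are hypothetical names chosen for this example.

```perl
use strict;
use warnings;

# Hypothetical list of names to test membership against.
my @staff = ("Nat", "Jules", "Josh");

# Build the lookup table once: every name becomes a key with a true value.
my %is_staff;
$is_staff{$_} = 1 for @staff;

# Each membership test is now one hash access, not a loop over @staff.
for my $name ("Jules", "Tim") {
    if ($is_staff{$name}) {
        print "$name is on staff\n";
    }
    else {
        print "$name is not on staff\n";
    }
}
```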

Use the % only when referring to the hash as a whole, such as %boss. The value associated with a particular key is a single scalar, so it takes a $, just as one element of an array does. This means that "the boss of Nat" would be written as $boss{"Nat"}. We can assign "Tim" to that:

$boss{"Nat"} = "Tim";

It's time to put names to these notions. A hash is best named for the relationship it embodies, which is why the hash above is called %boss. The dollar sign in the previous example might surprise you, since this is a hash, not a scalar. But we're setting a single scalar value in that hash, so a dollar sign is correct. Where a lone scalar has $ as its type identifier and an entire array has @, an entire hash has %.
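The three type identifiers can be seen side by side in a short sketch. The variables @people and $count, and the entry for Jules, are introduced here only for illustration.

```perl
use strict;
use warnings;

my %boss;                       # % names the entire hash
$boss{"Nat"}   = "Tim";         # $ because one element is a single scalar
$boss{"Jules"} = "Tim";

my @people = sort keys %boss;   # @ because the sorted keys form a list
my $count  = keys %boss;        # in scalar context, keys gives the number of keys

print "People with a boss: @people\n";
print "Count: $count\n";
```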

A regular array uses integers for indices, but the indices of a hash are always strings. The values of a hash may be arbitrary scalars, including references. With references as values, you can create hashes that hold not merely strings or numbers, but also arrays, other hashes, or objects. (Or rather, references to arrays, hashes, or objects.)
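Both points can be sketched briefly. The hash %staff and its contents are hypothetical, and the layout is only one of many ways such data could be arranged.

```perl
use strict;
use warnings;

# Hypothetical data: values here are references, not plain strings.
my %staff;
$staff{"Tim"} = [ "Nat", "Jules" ];             # reference to an array
$staff{"Nat"} = { "age", 30, "boss", "Tim" };   # reference to a hash

# Dereference with -> to reach the data behind each reference.
my $first_report = $staff{"Tim"}->[0];
my $nats_age     = $staff{"Nat"}->{"age"};

# Keys are always strings: a numeric key is stringified on the way in,
# so 1 and "1" name the same element.
my %h;
$h{1}   = "number one";
$h{"1"} = "string one";
my $key_count = keys %h;

print "Tim's first report: $first_report\n";
print "Nat's age: $nats_age\n";
print "Keys in %h: $key_count\n";
```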

An entire hash can be initialized with a list, where elements of the list are key and value pairs:

%age = ( "Nat",   30,
         "Jules", 31,
         "Josh",  23 );

This is equivalent to:

$age{"Nat"}   = 30;
$age{"Jules"} = 31;
$age{"Josh"}  = 23;

To make it easier to read and write hash initializations, the => operator, sometimes known as a comma arrow, was created. Mostly it behaves like a better-looking comma. For example, you can write a hash initialization this way:

%food_color = (
               "Apple"  => "red",
               "Banana" => "yellow",
               "Lemon"  => "yellow",
               "Carrot" => "orange"
              );

(This particular hash is used in many examples in this chapter.) This initialization is also an example of hash-list equivalence—hashes behave in some ways as though they were lists of key-value pairs. We'll use this in a number of recipes, including the merging and inverting recipes.
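A small sketch of hash-list equivalence follows. The second hash, %drink_color, is invented for this example, and the merge shown is only the simplest form of the idea.

```perl
use strict;
use warnings;

my %food_color = (
    "Apple"  => "red",
    "Banana" => "yellow",
    "Lemon"  => "yellow",
    "Carrot" => "orange",
);

# In list context a hash flattens to key, value, key, value, ...
my @pairs = %food_color;    # 8 elements; the order of the pairs is unpredictable

# A list of pairs can be assigned straight back into a hash.
my %copy = @pairs;

# Two flattened hashes side by side make one longer list of pairs.
# If a key appears twice, the later pair wins.
my %drink_color = ( "Milk" => "white", "Apple" => "amber" );   # hypothetical
my %merged      = ( %food_color, %drink_color );

my $apple = $merged{"Apple"};
print scalar(@pairs), " list elements\n";
print "Apple is now $apple\n";
```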

Unlike a regular comma, the comma arrow has a special property: it quotes a simple word (a bare identifier) immediately preceding it, which means you can safely omit the quotes and improve legibility. Single-word hash keys are also automatically quoted when they occur inside braces, which means you can write $hash{somekey} instead of $hash{"somekey"}. You could rewrite the preceding initialization of %food_color as:

%food_color = (
                Apple  => "red",
                Banana => "yellow",
                Lemon  => "yellow",
                Carrot => "orange"
               );
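A short sketch of both quoting rules follows. The "Sweet Potato" entry is made up to show a key that is not a single word and so still needs its quotes.

```perl
use strict;
use warnings;

# Simple words to the left of => need no quotes.
my %food_color = ( Apple => "red", Banana => "yellow" );

# A simple word inside the braces is quoted for you as well.
my $bare   = $food_color{Apple};
my $quoted = $food_color{"Apple"};

# A key containing a space is not a simple word, so it keeps its quotes.
$food_color{"Sweet Potato"} = "orange";    # hypothetical entry

print "bare: $bare, quoted: $quoted\n";
```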

One important issue to be aware of regarding hashes is that their elements are stored in an internal order convenient for efficient retrieval. This means that no matter what order you insert your data in, it will come back out in an unpredictable order.
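When a predictable order matters, one common approach is to sort the keys at the point of use, as in this sketch. The array names @internal and @sorted are chosen here for illustration.

```perl
use strict;
use warnings;

my %food_color = (
    Apple  => "red",
    Banana => "yellow",
    Lemon  => "yellow",
    Carrot => "orange",
);

# keys returns the keys in the hash's internal order; do not rely on it.
my @internal = keys %food_color;

# Sorting the keys imposes a predictable order where it is needed.
my @sorted = sort keys %food_color;

for my $food (@sorted) {
    print "$food is $food_color{$food}\n";
}
```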

5.0.1. See Also

The perldata(1) manpage; the two sections on "Hashes" in the first and second chapters of Programming Perl



Copyright © 2003 O'Reilly & Associates. All rights reserved.