Learning Perl Objects, References & ModulesLearning Perl Objects, References & ModulesSearch this book

Chapter 7. Practical Reference Tricks

Contents:

Review of Sorting
Sorting with Indices
Sorting Efficiently
The Schwartzian Transform
Recursively Defined Data
Building Recursively Defined Data
Displaying Recursively Defined Data
Exercises

This chapter looks at optimizing sorting and dealing with recursively defined data.

7.1. Review of Sorting

Perl's built-in sort operator sorts text strings in their natural text order, by default.[29]

[29]My friends call that the "ASCIIbetical" ordering. Normally modern Perl doesn't use ASCII; instead, it uses a default sort order, depending on the current locale and character set; see the perllocale (not perllocal!) manpage.

This is fine if you want to sort text strings:

my @sorted = sort qw(Gilligan Skipper Professor Ginger Mary_Ann);

but gets pretty messy when you want to sort numbers:

my @wrongly_sorted = sort 1, 2, 4, 8, 16, 32;

The resulting list is 1, 16, 2, 32, 4, 8. Why didn't sort order these properly? It treats each item as a string and sorts them in string order. Any string that begins with 3 sorts before any string that begins with 4.

You can fix this by overriding how Perl compares pairs of items in the list. By default, as Perl orders the items, a string comparison is used. A new comparison is specified using a sort block, placed between the sort keyword and the list of things to sort.[30]

[30]A sort block can also be specified as a subroutine name, causing the subroutine to be invoked when a comparison is needed.

Within the sort block, $a and $b stand in for two of the items to be sorted. The last evaluated expression must return a -1, 0, or +1 value.[31] If the value is -1, the value currently in $a must appear before the value in $b in the final sorted list. If the value is +1, then the value in $a must appear after the value in $b in the final sorted list. If the value is 0, you don't know or can't tell, so the results are unpredictable.[32]

[31]Actually, you can use any negative or positive number in place of -1 and +1, respectively.

[32]Recent Perl versions include a default sorting engine that is stable, so zero returns from the sort block cause the relative ordering of $a and $b to reflect their order in the original list. Older versions of Perl didn't guarantee such stability, and a future version might not use a stable sort, so don't rely on it.

For example, to sort those numbers in their proper order, you can use a sort block comparing $a and $b, like so:

my @numerically_sorted = sort {
  if ($a < $b)    { -1 }
  elsif ($a > $b) { +1 }
  else            {  0 }
} 1, 2, 4, 8, 16, 32;

Now you have a proper numeric comparison, so you have a proper numeric sort. Of course, this is far too much typing, so you can use the spaceship operator instead:

my @numerically_sorted = sort { $a <=> $b } 1, 2, 4, 8, 16, 32;

The spaceship operator returns -1, 0, and +1, according to the rules discussed. A descending sort is as simple as reversing the position of $a and $b:

my @numerically_descending = sort { $b <=> $a } 1, 2, 4, 8, 16, 32;

In every place the previous sort expression returned -1, this expression returns +1, and vice versa. Thus, the sort is in the opposite order. It's also easy to remember because if $a is to the left of $b, you get out the lower items first, just like a and b would be in the resulting list.

Likewise, a string sort can be indicated with cmp, although this is used less often because it is the default comparison. The cmp operator is handier when you have a more complex comparison, as you'll see shortly.



Library Navigation Links

Copyright © 2003 O'Reilly & Associates. All rights reserved.