Perl & XML

4.4. Stream Applications

Stream processing is great for many XML tasks. Here are a few of them:

Filter

A filter outputs an almost identical copy of the source document, with a few small changes. Every occurrence of an <A> element might be converted into a <B> element, for example. The handler is simple: it outputs what it receives unchanged, except for a small alteration when it detects a specific event.

Selector

If you want a specific piece of information from a document, without the rest of the content, you can write a selector program. This program combs through events, looking for an element or attribute that contains a particular bit of unique data called a key, and stops as soon as it finds it. The final job of the program is to output the sought-after record, possibly reformatted.

Summarizer

This program type consumes a document and spits out a short summary. For example, an accounting program might calculate a final balance from many transaction records; a program might generate a table of contents by outputting the titles of sections; an index generator might create a list of links to certain keywords highlighted in the text. The handler for this kind of program has to remember portions of the document so that it can repackage them after the parser is finished reading the file.

Converter

This sophisticated type of program turns your XML-formatted document into another format -- possibly another application of XML. For example, turning DocBook XML into HTML can be done in this way. This kind of processing pushes stream processing to its limits.
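To make the first and third program types above more concrete, here is a minimal sketch of a filter and a summarizer. It assumes the XML::Parser module is installed. The sample document, its element names, and the helper names (escape, $rename, @titles) are invented for illustration. The filter registers handlers only for start tags, end tags, and character data, so comments, processing instructions, and the XML declaration would be dropped by this simplified version.

```perl
use strict;
use warnings;
use XML::Parser;

# Hypothetical input: element names are placeholders, not a real schema.
my $xml = '<doc><title>Streams</title><A id="1">one</A>'
        . '<title>Trees</title><A>two</A></doc>';

# Escape characters that cannot appear literally in XML output.
sub escape {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    $text =~ s/"/&quot;/g;
    return $text;
}

# --- Filter: copy every event to the output, renaming A to B. ---
my $filtered = '';
my $rename   = sub { $_[0] eq 'A' ? 'B' : $_[0] };

XML::Parser->new(
    Handlers => {
        Start => sub {
            my ($expat, $name, @attrs) = @_;
            $filtered .= '<' . $rename->($name);
            # Attributes arrive as a flat list of name/value pairs.
            while (@attrs) {
                my ($key, $value) = splice(@attrs, 0, 2);
                $filtered .= qq{ $key="} . escape($value) . '"';
            }
            $filtered .= '>';
        },
        End  => sub { $filtered .= '</' . $rename->($_[1]) . '>' },
        Char => sub { $filtered .= escape($_[1]) },
    },
)->parse($xml);

# --- Summarizer: remember each title, report after parsing ends. ---
my @titles;
my $in_title = 0;

XML::Parser->new(
    Handlers => {
        Start => sub {
            my ($expat, $name) = @_;
            if ($name eq 'title') {
                $in_title = 1;
                push @titles, '';    # start collecting a new title
            }
        },
        Char => sub {
            my ($expat, $text) = @_;
            # Text may arrive in several pieces, so append rather than assign.
            $titles[-1] .= $text if $in_title;
        },
        End => sub {
            my ($expat, $name) = @_;
            $in_title = 0 if $name eq 'title';
        },
    },
)->parse($xml);

print "$filtered\n";
print "Contents: ", join(', ', @titles), "\n";
```

The filter holds no state beyond its output buffer, while the summarizer needs a flag and a list to carry information from one event to the next. A selector would look much like the summarizer, except that it would stop collecting once the key has been matched.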

XML stream processing works well for a wide variety of tasks, but it does have limitations. The biggest problem is that everything is driven by the parser, and the parser has a mind of its own. Your program has to take what it gets in the order given. It can't say, "Hold on, I need to look at the token you gave me ten steps back" or "Could you give me a sneak peek at a token twenty steps down the line?" You can look back into the parsing past by giving your program a memory: well-chosen data structures can record recent events. However, if you need to look behind a lot, or look ahead even a little, you probably need to switch to a different strategy: tree processing, the topic of Chapter 6, "Tree Processing".
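One common way to give a handler a memory is a stack of the elements that are currently open. The sketch below again assumes XML::Parser; the sample document and the names @open, @chapter_titles, and in_chapter_title are hypothetical. It collects only those titles that sit directly inside a chapter, which requires knowing what the parser reported one step earlier.

```perl
use strict;
use warnings;
use XML::Parser;

# Hypothetical input: the element name "title" appears at two depths.
my $xml = '<book><title>Perl</title>'
        . '<chapter><title>Streams</title><para>text</para></chapter>'
        . '<chapter><title>Trees</title></chapter></book>';

my @open;            # names of the currently open elements, outermost first
my @chapter_titles;  # text of each title found directly inside a chapter

# The stack is the handler's memory: it answers "where am I?" without
# asking the parser to go back.
sub in_chapter_title {
    return @open >= 2 && $open[-1] eq 'title' && $open[-2] eq 'chapter';
}

my $parser = XML::Parser->new(
    Handlers => {
        Start => sub {
            my ($expat, $name) = @_;
            push @open, $name;
            # Begin a new result only for a title whose parent is a chapter.
            push @chapter_titles, '' if in_chapter_title();
        },
        Char => sub {
            my ($expat, $text) = @_;
            $chapter_titles[-1] .= $text if in_chapter_title();
        },
        End => sub { pop @open },
    },
);

$parser->parse($xml);
print join("\n", @chapter_titles), "\n";
```

The stack grows only as deep as the document nests, so this kind of memory stays cheap. Looking ahead is a different matter: the handler would have to buffer events until it had seen enough, which is where tree processing starts to look attractive.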

Now you have the grounding for XML stream processing. Let's move on to specific examples and see how to wrangle with XML streams in real life.




Copyright © 2002 O'Reilly & Associates. All rights reserved.