Content Syndication (Java & XML, 2nd Edition)

In the last two chapters, I tried to give you a "web services" view of the world. In other words, you saw how to write applications that communicate with each other through web services technologies like WSDL, UDDI, and SOAP. However, as you also saw, some things in this worldview are still a bit shaky, like WSDL generation and support (at least if you're using open source implementations like Apache SOAP). Today, you may want to consider other options for business-to-business communication. In this chapter, I present an alternative solution for communicating across businesses to round out your skill set.

In this chapter, I look at using different XML specifications to provide this sort of communication across application and company lines, using some companies invented for the purpose. To begin with, I'll examine the Foobar Public Library, a library that allows its suppliers to enter, online, the new books they are shipping to the library. These books are then added to the library's data store for later use. Unfortunately, the library is having a hard time finding good Java developers, so it has implemented a Perl-based CGI solution. New books are entered online and then stored by a Perl script. Already, you can see that alternatives to web services would be handy, as finding a good Perl SOAP implementation is not easy (at least not yet!).

I'll also look at another company, mytechbooks.com. mytechbooks.com sells technical and computing books (such as this one) online through various partnerships with large bookstores. It has recently signed an agreement with the Foobar Public Library to obtain books from the library. mytechbooks.com will pay for the shipping and inventory costs of the books, while the library agrees to order extra books at its discounted costs; these extra books are then sold by mytechbooks.com. mytechbooks.com needs to be able to access the new books entered into the Foobar Public Library by suppliers to know when new offerings are available, and then advertise those new offerings. However, mytechbooks.com has no idea how to interface with the Foobar Public Library's Perl-based system. Additionally, there are no protected network connections between the two organizations, so normal HTTP must be used for communication. And just to get us out of the web services world, mytechbooks.com would rather wait until web services are more fleshed out and WSDL support is more firmly integrated; for now, it wants a more stable solution (or at least one that has been in use a little longer).

Finally, I'll look at customers of mytechbooks.com. The bookstore targets people who are active online, so it wants to advertise on sites like Netscape Netcenter; it also wants to allow people to easily obtain information from its site when new offerings are available. However, as in the situation with the Foobar Public Library, the people at mytechbooks.com have no idea how to achieve this goal. Seeing as they've read O'Reilly books and articles on http://www.oreillynet.com, they've heard RSS's spec lead, Rael Dornfest, talk about how cool RSS is, and want to try it. Of course, Rael is right, so that's what I talk about in this chapter.

We tackle this common scenario by starting with the Foobar Public Library and examining its Perl system. Moving out to mytechbooks.com and then the customers of the bookstore, I show you how to enable this business-to-business (to-customer) application by using XML as a communication tool between each layer.

14.1. The Foobar Public Library

To start the creation of a business-to-business system, I describe the system currently in place at the Foobar Public Library. Before diving into the code, though, it's necessary to examine the library's requirements so that you do not create a system it cannot support.

14.1.1. Evaluating the Requirements

All too often, good solutions to a problem are not appropriate solutions for the company with the problem. The Foobar Library is a perfect example of this: certainly a Java servlet that could communicate with servlets built by mytechbooks.com could quickly solve the two organizations' problems. However, this ignores the library's requirements. Before creating a solution, the library detailed its requirements:

The solution must be Perl-based; no Java engineers are on staff.
The solution must not involve new software or library installations.
The solution must not impact the existing order-entry system (no interface changes).

While these are not extremely stringent requirements, they rule out Java servlets, along with any other Java-based solution. Of course, as this is a book on XML, you should be thinking that storing the data about new books in an XML format could allow the library to then supply that XML to clients through an HTTP request, thus enabling those clients to use the data in any way they wish. In fact, this is a much better solution than servlet-to-servlet communication, as the XML can be used by any company or client in its applications, rather than tying the library (and its books) to a specific company. This then defines the goal for updating the Foobar Public Library's system: save the entered information as XML data, and then provide HTTP access to that XML data for clients and customers.

14.1.2. Entering the Books

We need to examine the existing HTML interface for suppliers entering new books into the system. Example 14-1 shows the static HTML used to generate this form.

Example 14-1. Static HTML for Foobar Public Library interface

<html>

<head>
  <title>Foobar Public Library: Add Books</title>
  <style>
<!--
body         { font-family: Arial }
h1           { color: #000080 }
-->
  </style>
</head>

<body link="#FFFF00" vlink="#FFFF00" alink="#FFFF00">
 <table border="0" width="100%" cellpadding="0" cellspacing="0">
  <tr>
   <td width="15%" bgcolor="#000080" valign="top" align="center">
    <b><i>
     <font color="#FFFFFF" size="4">Options</font>
    </i></b>
   <p><b>
     <font color="#FFFFFF">
      <a href="/javaxml/foobar">Main Menu</a>
     </font>
   </b></p>
   <p><b>
    <font color="#FFFFFF">
     <a href="/javaxml/foobar/catalog.html">Catalog</a>
    </font>
   </b></p>
   <p><b>
    <i><font color="#FFFF00">Add Books</font></i>
   </b></p>
   <p><b>
    <font color="#FFFFFF">
     <a href="/javaxml/foobar/logout.html">Log Out</a>
    </font>
   </b></p></td>
   <td width="*" valign="top" align="center">
    <h1 align="center">The Foobar Public Library</h1>
    <h3 align="center"><i>- Add Books -</i></h3>

<!-- This will need to point at your CGI directory and script, which
     we look at next -->
    <form method="POST" action="/cgi/addBook.pl">

     <table border="0" cellpadding="5" width="100%">
      <tr>
       <td width="100%" valign="top" align="center" colspan="2">
        Title&nbsp;
        <input type="text" name="title" size="20">
        <hr width="85%" />
       </td>
      </tr>
      <tr>
       <td width="50%" valign="top" align="right">Author&nbsp;
        <input type="text" name="author" size="20">
       </td>
       <td width="50%" valign="top" align="left">Subject&nbsp;
        <select size="1" name="subject">
         <option>Fiction</option>
         <option>Biography</option>
         <option>Science</option>
         <option>Industry</option>
         <option>Computers</option>
        </select></td>
       </tr>
       <tr>
        <td width="50%" valign="top" align="right">Publisher&nbsp;
         <input type="text" name="publisher" size="20">
        </td>
        <td width="50%" valign="top" align="left">ISBN&nbsp;
         <input type="text" name="isbn" size="20">
        </td>
       </tr>
       <tr>
        <td width="50%" valign="top" align="right">Price&nbsp;
         <input type="text" name="price" size="20">
        </td>
        <td width="50%" valign="top" align="left">Pages&nbsp;
         <input type="text" name="numPages" size="20">
        </td>
       </tr>
       <tr>
        <td width="100%" valign="top" align="center" colspan="2">
         Description&nbsp;
         <textarea rows="2" name="description" cols="20"></textarea>
        </td>
       </tr>
      </table>
      <p>
       <input type="submit" value="Add this Book" name="addBook"> 
       <input type="reset" value="Reset Form" name="reset">
       <input type="button" value="Cancel" name="cancel">
      </p>
    </form>
   </td>
  </tr>
 </table>
</body>
</html>

This file, saved as addBooks.html, provides the portion of the library application allowing suppliers to add new books they are sending to the library.

NOTE: In Example 14-1 and throughout the rest of the chapter, complete code and HTML listings are given so that you can create the example applications and walk through the process of enabling XML communication across the applications. Additionally, the code examples in this chapter assume you are using the filenames supplied in the text; you will need to change the code and examples if you use your own filenames. Code that may need to be changed to reference different filenames or scripts is emphasized in the listings to help you walk through the examples.

The HTML in Example 14-1, when accessed through a web server, results in the output shown in Figure 14-1. Although we do not look at the other menu options, the supplier can also view the library's catalog, go to the application's main menu, and log out of the application by using the menu on the left of the screen.

Figure 14-1. HTML user interface for Foobar Public Library

This form allows the supplier to enter the details about each book it is sending to the library. The supplier enters the book's essentials (title, author, publisher, pages, and a description), as well as a subject to categorize the book, and sales details, which include the price and ISBN.

Once this information has been entered, it is submitted to a Perl CGI script:

<form method="POST" action="/cgi/addBook.pl">

This script, then, must produce XML output. The easiest solution would be to download a Perl library that handled XML parsing, such as Xerces-Perl; however, remember that one requirement of the library was that no libraries or software could be added. While this may seem silly and frustrating, keep in mind that many companies have very strict lock-downs on their production systems. In this case, the Foobar Public Library is just beginning to introduce applications on the Internet, and it does not have resources to support additional software.

Luckily, the code only has to output XML; this is done fairly easily by generating a file with information on the entered books by brute force. Things would be much trickier if parsing incoming XML were required. Because the library needs to keep any existing books, each new entry is appended to an existing file, instead of creating a new file upon a new request. Writing the Perl is almost trivial, and the complete Perl program to read the request parameters and append the information to an existing file is shown in Example 14-2.

Example 14-2. Perl CGI script to generate XML entries from entered books

#!/usr/local/bin/perl

# This should be the directory you wish to write files to
$baseDir = "/home/bmclaugh/javaxml/foobar/books/";

# This should be the filename to use
$filename = "books.txt";

$bookFile = $baseDir . $filename;

# Get the user's input
use CGI;
$query = new CGI;

$title = $query->param('title');
$author = $query->param('author');
$subject = $query->param('subject');
$publisher = $query->param('publisher');
$isbn = $query->param('isbn');
$price = $query->param('price');
$numPages = $query->param('numPages');
$description = $query->param('description');

# Save the book to a file in XML
if (open(FILE, ">>" . $bookFile)) {
  print FILE "<book subject=\"" . $subject . "\">\n";
  print FILE " <title><![CDATA[" . $title . "]]></title>\n";
  print FILE " <author><![CDATA[" . $author . "]]></author>\n";
  print FILE " <publisher><![CDATA[" . $publisher . "]]></publisher>\n";
  print FILE " <numPages>" . $numPages . "</numPages>\n";
  print FILE " <saleDetails>\n";
  print FILE "  <isbn>" . $isbn . "</isbn>\n";
  print FILE "  <price>" . $price . "</price>\n";
  print FILE " </saleDetails>\n";
  print FILE " <description>";
  print FILE "<![CDATA[" . $description . "]]>";
  print FILE "</description>\n";
  print FILE "</book>\n\n";

  # Give the user a confirmation
  print <<"EOF";
Content-type: text/html

  <html>
   <head>
    <title>Foobar Public Library: Confirmation</title>
   </head>
   <body>
    <h1 align="center">Book Added</h1>
    <p align="center">
     Thank you.  The book you submitted has been added to the Library.
    </p>
   </body>
  </html>
EOF

} else {
  print <<"EOF";
Content-type: text/html

  <html>
   <head>
    <title>Foobar Public Library: Error</title>
   </head>
   <body>
    <h1 align="center">Error in Adding Book</h1>
    <p align="center">
     We're sorry.  The book you submitted has <i>not</i> been added to 
     the Library.
    </p>
   </body>
  </html>
EOF
}
close (FILE);

This program, saved as addBook.pl, is invoked by a form submitted when the supplier enters a new book. The script defines the file to write to, and then assigns the request parameter values to local variables:

$title = $query->param('title');
$author = $query->param('author');
$subject = $query->param('subject');
$publisher = $query->param('publisher');
$isbn = $query->param('isbn');
$price = $query->param('price');
$numPages = $query->param('numPages');
$description = $query->param('description');

Once these values are easily accessible, the script opens the file defined earlier in append mode (signified by >> preceding the filename) and writes raw XML-formatted information about the entered book to the end of the file:

  print FILE "<book subject=\"" . $subject . "\">\n";
  print FILE " <title><![CDATA[" . $title . "]]></title>\n";
  print FILE " <author><![CDATA[" . $author . "]]></author>\n";
  print FILE " <publisher><![CDATA[" . $publisher . "]]></publisher>\n";
  print FILE " <numPages>" . $numPages . "</numPages>\n";
  print FILE " <saleDetails>\n";
  print FILE "  <isbn>" . $isbn . "</isbn>\n";
  print FILE "  <price>" . $price . "</price>\n";
  print FILE " </saleDetails>\n";
  print FILE " <description>";
  print FILE "<![CDATA[" . $description . "]]>";
  print FILE "</description>\n";
  print FILE "</book>\n\n";

The subject is used as an attribute on the enclosing element, book, and the rest of the information is entered in as elements. Because a book's title, author, description, and publisher may include quotation marks, apostrophes, ampersands, and other characters that would have to be escaped, the code encloses that data within a CDATA section so as not to have to worry about escaping the data.
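
One caveat is worth keeping in mind: a CDATA section ends at the first ]]> it contains, and the subject attribute is not protected by CDATA at all. The following Java sketch shows one way a program producing the same format could guard against both cases. The class and method names (XmlText, cdata, attribute) are hypothetical and are not part of the library's Perl script; this is an illustration under those assumptions, not a replacement for addBook.pl.

```java
/**
 * Hypothetical helper (not part of the library's application) showing how
 * text can be made safe for CDATA sections and attribute values.
 */
final class XmlText {
    private XmlText() {}

    /** Wraps text in a CDATA section, splitting any "]]>" so it cannot end the section early. */
    static String cdata(String text) {
        // "]]>" becomes "]]" + end of section + start of new section + ">"
        return "<![CDATA[" + text.replace("]]>", "]]]]><![CDATA[>") + "]]>";
    }

    /** Escapes the characters that are unsafe inside a double-quoted attribute value. */
    static String attribute(String value) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            switch (c) {
                case '&': out.append("&amp;"); break;
                case '<': out.append("&lt;"); break;
                case '"': out.append("&quot;"); break;
                default: out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Sample values chosen only to exercise the awkward cases
        System.out.println("<book subject=\"" + attribute("Science & \"Industry\"") + "\">");
        System.out.println(" <title>" + cdata("Tricky ]]> title") + "</title>");
        System.out.println("</book>");
    }
}
```

Because the subject comes from a fixed drop-down list in the form, the attribute case is unlikely to arise in normal use, but a CGI parameter can carry any value a client chooses to send.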

Additionally, you should notice that no XML declaration or root element is created, as multiple books will exist in a single file. Adding them would take extra work: the script would have to check whether the file already exists, write the declaration and root element if the file is new, and then overwrite the root element's closing tag at each new entry. To avoid that, the file is left as an XML document fragment. For example, here is what the file might look like after two books have been entered:

<book subject="Computers">
 <title><![CDATA[Java Servlet Programming]]></title>
 <author><![CDATA[Jason Hunter]]></author>
 <publisher><![CDATA[O'Reilly & Associates]]></publisher>
 <numPages>753</numPages>
 <saleDetails>
  <isbn>0596000405</isbn>
  <price>44.95</price>
 </saleDetails>
 <description><![CDATA[This book is a superb introduction to Java 
  servlets and their various communications mechanisms.]]></description>
</book>

<book subject="Fiction">
 <title><![CDATA[Second Foundation]]></title>
 <author><![CDATA[Isaac Asimov]]></author>
 <publisher><![CDATA[Bantam Books]]></publisher>
 <numPages>279</numPages>
 <saleDetails>
  <isbn>0553293362</isbn>
  <price>5.59</price>
 </saleDetails>
 <description><![CDATA[After the First Foundation was taken over by the 
  Mule, only the Second Foundation stood between order and the utter 
  destruction the Mule would bring.]]></description>
</book>

Although not a complete XML document, this fragment is well-formed and could be inserted into an XML document with the header and root element already set. In fact, when I look at providing a listing of books in the next section, that is precisely how I'll handle output of the fragment.
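
If you want to convince yourself of this, a few lines of Java are enough to wrap a fragment in a declaration and a root element and hand the result to a parser. The sketch below uses the standard JAXP DocumentBuilder; the class name FragmentCheck and its methods are hypothetical, and the books root element simply anticipates the one used in the next section.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

/** Hypothetical check that a stored fragment becomes a well-formed document once wrapped. */
final class FragmentCheck {
    private FragmentCheck() {}

    /** Adds an XML declaration and a books root element, then parses the result. */
    static Document wrapAndParse(String fragment) {
        String xml = "<?xml version=\"1.0\"?>\n<books>\n" + fragment + "</books>\n";
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            return builder.parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            // The parser rejects anything that is not well-formed
            throw new IllegalArgumentException("Fragment is not well-formed", e);
        }
    }

    /** Counts the book elements found in the wrapped fragment. */
    static int countBooks(String fragment) {
        return wrapAndParse(fragment).getElementsByTagName("book").getLength();
    }

    public static void main(String[] args) {
        // Abbreviated sample entries in the format written by addBook.pl
        String fragment =
            "<book subject=\"Computers\">\n"
          + " <title><![CDATA[Java Servlet Programming]]></title>\n"
          + "</book>\n\n"
          + "<book subject=\"Fiction\">\n"
          + " <title><![CDATA[Second Foundation]]></title>\n"
          + "</book>\n\n";
        System.out.println("Books found: " + countBooks(fragment));
    }
}
```

A fragment with a missing or mismatched tag causes wrapAndParse( ) to throw an exception, which makes this a cheap sanity check on the stored file.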

The rest of the script outputs HTML indicating whether the book was successfully added or if errors occurred. Once a book has been added to the XML storage, the supplier receives the simple confirmation message shown in Figure 14-2.

Figure 14-2. Confirmation message when a book is added

Now that there is an XML document fragment with information about new books, you'll need to take that file and provide it to requestors.

14.1.3. Providing a Listing of Available Books

We again can use Perl as a mechanism to provide clients and customers with an XML listing of new books. I'm making the assumption that some other portion of the library's application periodically reads the XML data and updates the library's catalog; at this point, that application component is responsible for removing the entries within the file (or the file itself) so that the books within it are no longer regarded as new entries. With this assumption, all a second Perl script has to do is read the XML fragment and add the data within it to an XML document that is output to the screen. As I already mentioned, the script also needs to add an XML declaration and a root element to surround the content within the new books file. This new script, shown in Example 14-3, reads the file created by the addBook.pl script and outputs the content within an XML document when it is requested over HTTP.

Example 14-3. Perl CGI script to output XML document with new book listings

#!/usr/local/bin/perl

# This should be the directory the book file is read from
$baseDir = "/home/bmclaugh/javaxml/foobar/books/";

# This should be the filename to use
$filename = "books.txt";

$bookFile = $baseDir . $filename;

# First open the file
open(FILE, $bookFile) || die "Could not open $bookFile.\n";

# Let browser know what is coming
print "Content-type: text/plain\n\n";

# Print out XML header and root element
print "<?xml version=\"1.0\"?>\n";
print "<books>\n";

# Print out books
while (<FILE>) {
  print "$_";
}

# Close root element
print "</books>\n";

close(FILE);

This script, saved as supplyBooks.pl, reads the file created by addBook.pl and outputs its contents as an XML document whenever it is requested over HTTP. The result of requesting this script in a web browser (with several books added) is shown in Figure 14-3.

Figure 14-3. XML output from supplyBooks.pl

As you can see, this easily turned the library's simple Perl-based application into a component capable of supplying useful information to its clients, including the mytechbooks.com technical bookstore. Additionally, we were able to accomplish this without installing new software, changing the architecture of the library's system or application, or even writing a line of Java!
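
To give a feel for how a client might consume this listing, here is a minimal Java sketch that parses the document and prints each book's title and subject. The class name NewBooksClient is hypothetical, and the address of supplyBooks.pl is passed on the command line because it depends on how your web server exposes its CGI directory. Treat this as an illustration of reading the format, not as the code mytechbooks.com ends up using.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

/** Hypothetical client that reads the new-book listing produced by supplyBooks.pl. */
final class NewBooksClient {
    private NewBooksClient() {}

    /** Returns "title (subject)" for each book element in the listing. */
    static List<String> describe(InputStream listing) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                                                 .newDocumentBuilder()
                                                 .parse(listing);
            NodeList books = doc.getElementsByTagName("book");
            List<String> result = new ArrayList<>();
            for (int i = 0; i < books.getLength(); i++) {
                Element book = (Element) books.item(i);
                // getTextContent() returns the text held in the CDATA section
                String title =
                    book.getElementsByTagName("title").item(0).getTextContent();
                result.add(title + " (" + book.getAttribute("subject") + ")");
            }
            return result;
        } catch (Exception e) {
            throw new IllegalStateException("Could not read book listing", e);
        }
    }

    /** Convenience overload for a listing already held in memory. */
    static List<String> describe(String xml) {
        return describe(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 0) {
            System.err.println("usage: java NewBooksClient <URL of supplyBooks.pl>");
            return;
        }
        // The URL is supplied by the caller; no particular address is assumed
        try (InputStream in = URI.create(args[0]).toURL().openStream()) {
            describe(in).forEach(System.out::println);
        }
    }
}
```

Nothing here depends on Perl or on the library's internals; the client needs only HTTP access and an XML parser, which is exactly the point of exposing the data this way.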
