XML-Journal
www.XML-Journal.com

On the surface, JDOM appears to be just another API for XML document manipulation, like the Simple API for XML (SAX) and the Document Object Model (DOM). However, JDOM takes document manipulation to the next level by providing a Java-centric, object-oriented approach.

JDOM was formally submitted to and recently accepted by the Java Community Process (JCP) as a Java Specification Request (JSR-102). JSRs define formal Java specifications. JDOM's acceptance as a JSR opens the door for JDOM to be incorporated into the core Java platform and should accelerate corporate adoption of the API.

In the first of this two-part series on JDOM, we introduce JDOM and explore the basic components of a typical JDOM document. Part 2 will detail how to use JDOM.

JDOM: A Closer Look
While JDOM defines an API for Java, the name JDOM does not reflect the name Java, which is trademarked by Sun Microsystems. JDOM is not an acronym and therefore also is not tied to the name DOM, which is the Document Object Model defined by the World Wide Web Consortium (W3C).

So what is JDOM? JDOM was designed by Jason Hunter and Brett McLaughlin, with the help of James Duncan Davidson, as an open source API for XML document manipulation. Developers can use JDOM in their own products without having to release their products as open-source software. This licensing concept is similar to that of the Apache Web server. The only restriction is that developers adhere to the standard terms of the license agreement. One advantage of JDOM being open source is that subscribers to the JDOM-interest mailing list contribute to JDOM's success by submitting bug fixes and upgrades to the API.

According to the JDOM Web site, JDOM was designed to behave like Java, using the Java Collections API. The goal was to make JDOM a natural API for Java developers, and provide a low-cost alternative for using XML.

To build a JDOM document object, developers have two options. First, they can build it from scratch using JDOM objects. Alternatively, they can build it indirectly, either by using a SAX parser to read XML input (a file or data stream) or by converting an existing DOM document.
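
As a rough illustration of the first option, the sketch below builds a small document from scratch and then walks its children through the standard java.util.List interface. It assumes a JDOM 1.x JAR (package org.jdom) is on the classpath; the class name BuildFromScratch and the library and book element names are invented for the example, and method names may differ in other JDOM releases.

```java
import java.util.List;

import org.jdom.Document;
import org.jdom.Element;

// Sketch only: assumes a JDOM 1.x JAR (package org.jdom) on the classpath.
class BuildFromScratch {

    // Builds <library><book>...</book><book>...</book></library> in memory.
    static Document buildLibrary() {
        Element root = new Element("library");
        root.addContent(new Element("book").setText("First title"));
        root.addContent(new Element("book").setText("Second title"));
        return new Document(root);
    }

    public static void main(String[] args) {
        Document doc = buildLibrary();

        // getChildren() returns a java.util.List, so ordinary Collections
        // idioms apply. JDOM 1.x predates generics, hence the cast.
        List<?> books = doc.getRootElement().getChildren("book");
        for (Object item : books) {
            Element book = (Element) item;
            System.out.println(book.getText());
        }
    }
}
```

No parser is involved here: the tree is assembled entirely from JDOM objects, which is why this route suits documents created from scratch.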

JDOM vs SAX and DOM
JDOM fills in the gaps left by SAX and DOM. All three - JDOM, SAX, and DOM - are capable of manipulating XML documents; however, JDOM eliminates some of the weaknesses of the other two APIs.

JDOM interoperates effectively with existing APIs, such as SAX and DOM; however, JDOM is not an abstraction layer over, or an enhancement of, those standards. Instead, JDOM provides a solid yet "lightweight means of reading and writing XML data," without the baggage SAX and DOM bring with them. (DOM is known to be memory-intensive, while SAX has a reputation for being difficult to use.)

This is not to say that SAX and DOM don't have a place in the XML arena. SAX offers a fast and powerful API if your goal is parsing and processing XML documents in an event-driven manner. It is also the fastest way to parse an XML file and generate a JDOM document from it. DOM provides an API that allows developers to use a common document object model across different programming languages; however, this, too, comes at a price. DOM requires that the entire XML document be held in memory for as long as the application is working with it, has a slow start-up time, and has a somewhat cumbersome process for manipulating documents.
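
To make the contrast concrete, here is a minimal sketch that counts book elements twice, once with SAX and once with DOM, using only the JAXP classes bundled with the JDK (no JDOM required). The class name SaxVersusDom, the sample XML, and the book element are assumptions made for illustration.

```java
import java.io.StringReader;

import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;

import org.w3c.dom.Document;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class SaxVersusDom {

    static final String XML = "<library><book/><book/><book/></library>";

    // SAX: the parser pushes events to a handler; no tree is kept in memory.
    static int countWithSax(String xml) throws Exception {
        final int[] count = {0};
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attributes) {
                if ("book".equals(qName)) {
                    count[0]++;
                }
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return count[0];
    }

    // DOM: the whole document is loaded into a tree before it can be queried.
    static int countWithDom(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        return doc.getElementsByTagName("book").getLength();
    }

    public static void main(String[] args) throws Exception {
        System.out.println("SAX count: " + countWithSax(XML));
        System.out.println("DOM count: " + countWithDom(XML));
    }
}
```

The SAX version needs a callback and external state even for a simple count, while the DOM version is shorter but builds the full tree first. This is the trade-off described above.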

JDOM's open source nature benefits other XML document manipulation APIs, such as DOM and SAX, because JDOM is able to quickly adopt new standards for working with these tools.

Developers often use JDOM with SAX and/or DOM; however, JDOM also stands on its own, with no support from either. To create a JDOM document directly, developers can use the JDOM classes. This approach is typically used if developers are creating an XML document from scratch.

JDOM complements SAX and DOM by simplifying XML document manipulation. It also provides easy conversion from a JDOM document into a DOM document or set of SAX events. JDOM's design also supports the runtime plug-in of any DOM or SAX parser. Therefore you can input an XML document from any SAX or DOM source, manipulate it with ease using the JDOM API, and then output it to any destination.

Downloading and Installing JDOM
Minimum Requirements
JDOM requires that you have a recent version of the Java Development Kit - Java 2, version 1.2 or 1.3, preferably. JDK 1.1 also will work, but if you use JDK 1.1, you will need to download and install the collections.jar from Sun's Web site.

Downloading JDOM
You can obtain the JDOM distribution from the JDOM Web site, either as a Zip file or directly via Concurrent Versions System (CVS). The simplest way to get JDOM for the first time is to download the latest source distribution as a Zip file.

Alternatively, using CVS, not only do you get the latest version of the code, but version control is automatic, and changes to the same file from multiple sources are easily merged. The JDOM Web site provides instructions for accessing the JDOM source code from CVS.

Following the CVS instructions on the JDOM Web site copies the JDOM module from the JDOM CVS server to your local computer.

Installing JDOM
After downloading and uncompressing the JDOM source files, you must build the JDOM Java archive (JAR) file, for which build scripts are available courtesy of the JDOM team. The build process uses Ant, a platform-independent build tool from the Apache Software Foundation that ensures the environment is initialized properly. Build instructions are available in the build.xml file in the root JDOM directory. Since path conventions differ between UNIX and Microsoft Windows, the JDOM developers have provided a build script for each platform.

Installing Under Windows
To compile the JDOM package under Windows:

  • Change into the root JDOM directory. Assuming this is c:\jdom, then, at the command prompt, type:

    c:
    cd \jdom

  • To execute the build script, type:

    build

    This should create the jdom.jar file in the build subdirectory. In our example, this would be located at c:\jdom\build\jdom.jar.

Installing Under UNIX
To compile the JDOM package on a UNIX platform:

  • Change into the root JDOM directory. Assuming this is /opt/jdom, then, at the command prompt, type:

    cd /opt/jdom

  • To execute the build script, type:

    ./build.sh

    This should create the jdom.jar file in the build subdirectory. In our example, this would be located at /opt/jdom/build/jdom.jar.

    Other Build Options
    To see a complete list of build options, run the build command as described previously, but with the single command-line option usage. Other options allow you to build the JDOM API documentation (using javadoc) and perform a clean build.

    JDOM Packages
    Java developers should find JDOM very user-friendly; ease of use was one of its original design goals. It requires only a basic understanding of XML.

    There are four packages that make up JDOM:

  • org.jdom
  • org.jdom.input
  • org.jdom.output
  • org.jdom.adapters

    Let's take a closer look at each of these packages in turn.

    org.jdom
    This is the main JDOM package. It contains the primary JDOM classes, which represent the main constructs in an XML document and are used to construct a JDOM document from scratch or to manipulate an existing one.

    The most widely used classes in org.jdom are Document, DocType, Namespace, Element, and Attribute.

    org.jdom.input
    JDOM's input package contains helper classes used to construct a JDOM document from an existing XML source. It contains just two main classes - SAXBuilder and DOMBuilder - that allow developers to build a JDOM document using a SAX or DOM parser.

    To create a JDOM document object indirectly from a DOM document, use the DOMBuilder class. This is a useful approach when integrating JDOM with another application that has already constructed a DOM document object. To create a JDOM document object using SAX, developers use the SAXBuilder class. SAX's speed makes this approach ideal for reading and parsing an XML file to create a JDOM document.
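
The two input routes might look roughly like the following sketch. It assumes JDOM 1.x (package org.jdom) on the classpath; the class name InputBuilders, the helper parseDom, and the sample XML are hypothetical, and the DOM document is produced here with the JDK's own parser only as a stand-in for one handed over by another application.

```java
import java.io.IOException;
import java.io.StringReader;

import javax.xml.parsers.DocumentBuilderFactory;

import org.jdom.Document;
import org.jdom.JDOMException;
import org.jdom.input.DOMBuilder;
import org.jdom.input.SAXBuilder;
import org.xml.sax.InputSource;

// Sketch only: assumes a JDOM 1.x JAR (package org.jdom) on the classpath.
class InputBuilders {

    static final String XML = "<greeting lang=\"en\">Hello</greeting>";

    // SAXBuilder drives a SAX parser and assembles the JDOM tree from its events.
    static Document fromSax(String xml) throws JDOMException, IOException {
        return new SAXBuilder().build(new StringReader(xml));
    }

    // DOMBuilder converts a DOM tree that some other component already built.
    static Document fromDom(org.w3c.dom.Document domDoc) {
        return new DOMBuilder().build(domDoc);
    }

    // Hypothetical helper: stands in for the application that owns the DOM tree.
    static org.w3c.dom.Document parseDom(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        Document viaSax = fromSax(XML);
        Document viaDom = fromDom(parseDom(XML));

        System.out.println(viaSax.getRootElement().getText());
        System.out.println(viaDom.getRootElement().getAttributeValue("lang"));
    }
}
```

Either way the result is an ordinary org.jdom.Document, so the rest of the application does not need to know which builder produced it.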

    Details on using SAXBuilder and DOMBuilder follow in a later section.

    org.jdom.output
    The output package consists of helper classes for converting a JDOM document into another representation of an XML document: XMLOutputter, DOMOutputter, and SAXOutputter. These classes can be used to output an XML document, output a DOM document, and generate a series of SAX events, respectively.

    The XMLOutputter class allows for the output of individual JDOM components, such as Element, Comment, CDATA, Entity, and ProcessingInstruction objects, or an entire JDOM document in XML. Details of these components follow in a later section.

    Outputting an entire JDOM document using any one of these output classes requires that developers instantiate an object of the required output class, then invoke one of the output() methods.
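
The instantiate-then-output pattern can be sketched as follows for two of the three output classes. It assumes JDOM 1.x (package org.jdom) on the classpath; the class name OutputDemo and the greeting element are invented for the example, and SAXOutputter is left out because it additionally needs a SAX ContentHandler to receive the events.

```java
import java.io.IOException;

import org.jdom.Document;
import org.jdom.Element;
import org.jdom.JDOMException;
import org.jdom.output.DOMOutputter;
import org.jdom.output.XMLOutputter;

// Sketch only: assumes a JDOM 1.x JAR (package org.jdom) on the classpath.
class OutputDemo {

    // A one-element document to have something to output.
    static Document sample() {
        return new Document(new Element("greeting").setText("Hello"));
    }

    // XMLOutputter serializes the document as XML text.
    static String toXml(Document doc) {
        return new XMLOutputter().outputString(doc);
    }

    // DOMOutputter converts the JDOM tree into a W3C DOM tree.
    static org.w3c.dom.Document toDom(Document doc) throws JDOMException {
        return new DOMOutputter().output(doc);
    }

    public static void main(String[] args) throws IOException, JDOMException {
        Document doc = sample();

        // output() can also write straight to a stream.
        new XMLOutputter().output(doc, System.out);

        org.w3c.dom.Document dom = toDom(doc);
        System.out.println(dom.getDocumentElement().getTagName());
    }
}
```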

    org.jdom.adapters
    The JDOM adapters package provides a common interface to some of the more widely used DOM implementations. These include Apache's Xerces, Apache's Crimson DOM parser, Sun's Project X, and Oracle's Version 1 and Version 2 DOM parsers.

    It is unlikely that you'll ever directly need to use any of the classes in this package. About the only time you may need to do so is if you wish to use a currently unsupported DOM parser. This package contains a class named AbstractDOMAdapter. It acts as a base class to all of the DOM adapters defined in the package. It also should be used as a base class if you need to define your own DOM adapter class.

    JDOM Document Structure
    Several components, all part of the org.jdom package, are used to make up a JDOM document - DocType, ProcessingInstruction, Comment, and a single Element referred to as the "root" Element. The root Element, as you can see from Figure 1, can contain CDATA, Comment, Element, Entity, and ProcessingInstruction objects, as well as text.

    Let's explore each of the JDOM components below (see Figure 1).

    DocType
    The document type associated with an XML document, specified with the DOCTYPE declaration in an XML file, is modeled in JDOM by the DocType class. The DocType object is associated with a document when the document object is first constructed, or it can be set later using the Document.setDocType method. It can be retrieved with the Document.getDocType method. A DocType object has three properties that match the component parts of an XML document type - the name of the constrained element, the public ID, and the system ID.
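
A minimal sketch of these three properties, assuming JDOM 1.x (package org.jdom) on the classpath. The class name DocTypeDemo is hypothetical, and the XHTML 1.0 Strict identifiers are used only as a familiar example of a public ID and system ID.

```java
import org.jdom.DocType;
import org.jdom.Document;
import org.jdom.Element;

// Sketch only: assumes a JDOM 1.x JAR (package org.jdom) on the classpath.
class DocTypeDemo {

    // Attaches the document type when the document is constructed.
    static Document xhtmlDocument() {
        DocType docType = new DocType(
                "html",                                                // constrained element
                "-//W3C//DTD XHTML 1.0 Strict//EN",                    // public ID
                "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd");  // system ID
        return new Document(new Element("html"), docType);
    }

    public static void main(String[] args) {
        Document doc = xhtmlDocument();

        DocType docType = doc.getDocType();
        System.out.println(docType.getElementName());
        System.out.println(docType.getPublicID());
        System.out.println(docType.getSystemID());

        // The document type can also be replaced after construction.
        doc.setDocType(new DocType("html"));
    }
}
```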

    Element
    The main XML element contained within an XML document is referred to as the root element. In a JDOM document the root element is modeled in just the same way as any other element. The root element encloses the main body of the XML document. It can contain other XML constructs, such as subelements, comments, and CDATA sections, and it can carry attributes.

    The JDOM Element class defines the behavior of an XML element, allowing developers to get and set the element's textual content, attributes, and child elements.

    Attribute
    The Attribute class models the attributes, or properties, of an element. An attribute is simply a name-value pair; when the document has a DTD, the valid attribute names are declared there.

    The Element.getAttribute() and Element.getAttributes() methods return an attribute object or a list of attributes (respectively).
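
The following sketch pulls the Element and Attribute methods together. It assumes JDOM 1.x (package org.jdom) on the classpath; the class name ElementDemo, the book and title elements, and the attribute names and values are all made up for illustration.

```java
import java.util.List;

import org.jdom.Attribute;
import org.jdom.Element;

// Sketch only: assumes a JDOM 1.x JAR (package org.jdom) on the classpath.
class ElementDemo {

    // Builds <book isbn="..." lang="en"><title>Example Title</title></book>.
    static Element book() {
        Element book = new Element("book");
        book.setAttribute("isbn", "0000000000"); // placeholder value
        book.setAttribute("lang", "en");
        book.addContent(new Element("title").setText("Example Title"));
        return book;
    }

    public static void main(String[] args) {
        Element book = book();

        // Child elements and their text.
        System.out.println(book.getChild("title").getText());

        // A single attribute, looked up by name.
        Attribute isbn = book.getAttribute("isbn");
        System.out.println(isbn.getName() + "=" + isbn.getValue());

        // All attributes, as a java.util.List (raw in JDOM 1.x, hence the cast).
        List<?> attributes = book.getAttributes();
        for (Object item : attributes) {
            Attribute attribute = (Attribute) item;
            System.out.println(attribute.getName() + "=" + attribute.getValue());
        }
    }
}
```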

    Comment
    The Comment class models an XML comment in JDOM. The text of the Comment can be retrieved using the Comment.getText() method. The Comment.setText() method allows developers to programmatically change the text in the comment.
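
A short sketch of reading and changing a comment's text, again assuming JDOM 1.x (package org.jdom) on the classpath. The class name CommentDemo, the notes element, and the comment text are hypothetical.

```java
import org.jdom.Comment;
import org.jdom.Element;
import org.jdom.output.XMLOutputter;

// Sketch only: assumes a JDOM 1.x JAR (package org.jdom) on the classpath.
class CommentDemo {

    // Creates a comment, then changes its text programmatically.
    static Comment revisedComment() {
        Comment comment = new Comment("draft");
        comment.setText("reviewed");
        return comment;
    }

    public static void main(String[] args) {
        Comment comment = revisedComment();
        System.out.println(comment.getText());

        // A comment is added to an element like any other piece of content.
        Element notes = new Element("notes");
        notes.addContent(comment);
        System.out.println(new XMLOutputter().outputString(notes));
    }
}
```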

    Other JDOM Components
    Other XML components modeled by JDOM include entities, namespaces, processing instructions, and CDATA sections. These XML components are modeled by the JDOM classes org.jdom.Entity, org.jdom.Namespace, org.jdom.ProcessingInstruction, and org.jdom.CDATA, respectively. While the type of operation you'd expect to be able to perform on each of these components is supported by JDOM, details of each are beyond the scope of this introductory article.

    Summary
    This first part of a two-part series on JDOM defined what JDOM is and what it is not. JDOM takes document manipulation to the next level by providing a Java-centric, object-oriented approach. We also learned that it smooths over inconsistencies among XML parser APIs via adapters, making it more than just another XML API for document manipulation. This article introduced the important packages and classes of the JDOM API and explained the basics of how to download and install the JDOM distribution. Part 2 examines creating JDOM documents, outputting and inputting them, and manipulating children.

    Acknowledgments
    Special thanks to Steven Gould for sharing his expertise in JDOM and working so diligently with me on this series of articles.

    Resources

    1. JDOM Web site: http://www.jdom.org/
    2. JDOM discussion lists: www.jdom.org/involved/lists.html
    3. The Collections API for JDK 1.1 can be downloaded from Sun's Web site at: www.java.sun.com/products/javabeans/infobus/
    4. Apache Software Foundation: http://www.apache.org/
    5. Ant home page: http://jakarta.apache.org/ant/


    Figure 1

    All Rights Reserved
    Copyright © 2001 SYS-CON Media, Inc.
    E-mail: info@sys-con.com