JSRs: Java Specification Requests
JSR 5: XML Parsing Specification
Section 1: Identification
David Brownell and Nancy K. Lee,
Section 2: Request
The intended specification will address the need for a complete set of implementation-independent portable APIs supporting XML 1.0. The XML specification is available on-line at: http://www.w3.org/TR/REC-xml.
XML is a platform-independent data representation, which may be viewed as a simplified web-aware version of SGML. It is serving as a foundation for a new generation of web technologies. Today, it is used in web application servers as part of dynamic content generation systems and as part of messaging systems (e.g. for business-to-business web commerce and workflow) uniting system components written in many programming languages.
Existing specifications for Java™ APIs for XML do not address the full set of requirements for complete applications (see below for more information). In brief, the accepted portable XML APIs (SAX and DOM) have portability limitations in basic functionality, such as validation, constructing DOM trees from input documents, writing out well-formed XML, and working with XML namespaces.
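To make two of these gaps concrete (building a DOM tree from an input document, and working with namespaces), the following is a minimal sketch of what a portable entry point can look like. It assumes the `javax.xml.parsers` classes found in current JDKs, which are not defined by this request; the class name `DomBuildSketch`, the method `parseRoot`, and the `urn:example:po` namespace are hypothetical and used only for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical illustration class; not part of any specification.
public class DomBuildSketch {

    /** Parses the given XML text into a DOM tree and returns its root element. */
    public static Element parseRoot(String xml) throws Exception {
        // The factory hides which parser implementation is actually in use.
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        // Namespace processing is off by default and must be requested.
        factory.setNamespaceAware(true);
        DocumentBuilder builder = factory.newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xml)));
        return doc.getDocumentElement();
    }

    public static void main(String[] args) throws Exception {
        Element root = parseRoot(
            "<po:order xmlns:po='urn:example:po'><po:item/></po:order>");
        // With namespace awareness on, the prefix is resolved to its URI.
        System.out.println(root.getLocalName() + " in " + root.getNamespaceURI());
    }
}
```

The point of the sketch is that application code names no parser vendor: the implementation is located by the factory, so the same code can run against different conforming parsers.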
B. Scope and Content
We propose to develop a set of modular library APIs, a 100% Pure Java™ Reference Implementation (RI), and a Compatibility Test Suite (CTS), addressing at least the issues noted below.
This targets the desktop and enterprise versions of the Java Platform, based on the JDK 1.1 API set. The APIs can also support the Personal Java platform with little or no extra effort.
It is recognized that there is much work going on in the area of XML at this time. This proposal will provide a set of core features that will form the building blocks for fully-functional XML-based applications.
We propose that this core should include:
(1) Event-based parsing of XML.
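As a rough illustration of event-based parsing, the sketch below registers a handler whose callback fires once per start tag, so no document tree is built. It assumes the SAX handler classes and the `javax.xml.parsers.SAXParserFactory` entry point found in current JDKs, which this request does not itself define; the class `EventParsingSketch` and method `elementNames` are hypothetical names.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical illustration class; not part of any specification.
public class EventParsingSketch {

    /** Returns element names in the order their start-tag events are reported. */
    public static List<String> elementNames(String xml) throws Exception {
        final List<String> names = new ArrayList<>();
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        parser.parse(new InputSource(new StringReader(xml)), new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attributes) {
                // Called by the parser each time a start tag is encountered.
                names.add(qName);
            }
        });
        return names;
    }

    public static void main(String[] args) throws Exception {
        // prints [order, item, item]
        System.out.println(elementNames("<order><item/><item/></order>"));
    }
}
```

Because the parser pushes events to the handler as it reads, memory use does not grow with document size, which is the usual reason to choose this style over a tree-building parser.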
We anticipate that extensions not already defined by an external organization would be in a javax.xml package (or sub-package).
This technology has no direct security implications. However, it may be used in security-sensitive contexts, such as web commerce messages.
XML technology is targeted at internationalized systems. It has been defined in terms of the Unicode character set, supports a wide (and extensible) variety of character encodings, and has direct support for representing text in multiple languages within the same XML document. For example, a common use of XML in web-based systems requires support for multiple languages concurrently in the same Java Virtual Machine (as in the case where one client uses English while another uses Japanese).
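The encoding point can be sketched as follows: the same Japanese greeting is serialized once as UTF-8 and once as UTF-16 (the two encodings every XML 1.0 processor must accept), and the parser is handed raw bytes so that it selects the decoder itself. The example assumes the `javax.xml.parsers` classes of current JDKs rather than anything defined by this request; `EncodingSketch` and `textOf` are hypothetical names.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Hypothetical illustration class; not part of any specification.
public class EncodingSketch {

    /** Parses raw bytes and returns the text content of the root element. */
    public static String textOf(byte[] xmlBytes) throws Exception {
        // Passing a byte stream (not a Reader) lets the parser detect the
        // encoding from the byte order mark and the XML declaration.
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xmlBytes));
        return doc.getDocumentElement().getTextContent();
    }

    /** Builds a one-element document declaring the given encoding name. */
    public static String greetingDocument(String encodingName, String text) {
        return "<?xml version=\"1.0\" encoding=\"" + encodingName + "\"?>"
                + "<greeting xml:lang=\"ja\">" + text + "</greeting>";
    }

    public static void main(String[] args) throws Exception {
        // "konnichiwa" in hiragana, written with escapes to keep the source ASCII.
        String greeting = "\u3053\u3093\u306b\u3061\u306f";
        byte[] utf8 = greetingDocument("UTF-8", greeting)
                .getBytes(StandardCharsets.UTF_8);
        byte[] utf16 = greetingDocument("UTF-16", greeting)
                .getBytes(StandardCharsets.UTF_16);
        // Both byte sequences should decode to the same Unicode string.
        System.out.println(textOf(utf8).equals(textOf(utf16)));
    }
}
```

Once parsed, both documents yield ordinary Java strings, so text in different languages from different clients can be handled side by side in one virtual machine.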
In terms of localization, the reference implementation of this level of XML technology will require diagnostic messages and documentation to be translatable into local languages.
The risk of not providing a specification as outlined in this JSR is that fragmentation will exist in core XML APIs. Also, there will be wide variations in compatibility between different implementations. There is no particular difficulty in providing an RI or CTS, although XML conformance work (as noted below) will be required.
D. Existing Specifications
David Megginson, coordinating input from members of the "XML-DEV" mailing list, has produced the SAX 1.0 specification. The reference is specified in Java, but the intent is that this API not be specific to Java; for example, Python bindings exist.
In terms of XML conformance, an OASIS/NIST working group is producing a set of accepted XML conformance tests and associated infrastructure.
In all of these cases, it is the desire of this process not to preempt the work being done there, but rather to collaborate as appropriate to achieve the intended results. In particular, this process will provide a focus on Java Platform integration issues, which are not the primary goal of any of those existing efforts.
Section 3: Contributions
Sun Microsystems has a highly conformant implementation of the basic standards identified above. This is accessible from http://java.sun.com/jdc/earlyAccess/xml at this time. This is one of several 100% Pure Java implementations of those standards, and is well advanced in conformance testing and performance tuning.
In conjunction with the above, Sun Microsystems has developed a set of SAX and XML conformance tests. These tests build on top of well-accepted tests that are freely available from James Clark, called XMLTEST, and add validation and more complete coverage of all the testable statements in the XML 1.0 specification.