JSRs: Java Specification Requests
JSR 73: Data Mining API

Java Data Mining Public Page on java.net

Expert Group Private Page on java.net

Patent Notifications on java.net

Updates to the Java Specification Request (JSR)

Comments from the JSR Review resulted in the following updated responses to several questions in the original JSR:

2.1 Please describe the proposed Specification:

The JDMAPI specification will address the need for a pure Java™ API that supports building data mining models, scoring data with those models, and creating, storing, accessing, and maintaining the data and metadata that support data mining results and select data transformations.

2.3 What need of the Java community will be addressed by the proposed specification?

The Java community needs a standard way to create, store, access, and maintain data and metadata supporting data mining models, data scoring, and data mining results serving J2EE-compliant application servers. Currently, there is no widely agreed-upon standard API for data mining. By using JDMAPI, implementers of data mining applications can expose a single, standard API that will be understood by a wide variety of client applications and components running on the J2EE Platform.

Similarly, data mining clients can be coded against a single API that is independent of the underlying data mining system. The ultimate goal of JDMAPI is to provide for data mining systems what JDBC™ did for relational databases.

A sister JSR, JSR-000069, which supports an API for OLAP, will share a common basis in the OMG CWM metamodel, noted below. As such, there will be some overlap in concepts to be resolved. We plan to work with JSR-000069 to minimize overlap and leverage common infrastructure.

To clarify the distinction between OLAP and data mining, consider the following: OLAP follows a deductive (query-oriented) strategy of analyzing data. Users formulate hypotheses and execute queries to gain an understanding of the underlying data. Data mining follows an inductive strategy of analyzing data, in which users apply machine learning algorithms to gain non-obvious knowledge from the data.
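The deductive/inductive distinction can be illustrated with a small sketch. Everything here is hypothetical: the toy customer data, the class name `OlapVsMiningDemo`, and both methods are assumptions made for illustration and are not part of JDMAPI or JSR-000069. The first method answers a question the user already formulated (an age cutoff chosen by the user); the second searches the data for the cutoff that best predicts the outcome.

```java
/** Toy illustration only; data and names are assumptions, not part of any JSR. */
final class OlapVsMiningDemo {
    // Parallel arrays: customer age and whether that customer churned.
    static final int[] AGE = {22, 25, 28, 35, 41, 52, 60};
    static final boolean[] CHURNED = {true, true, true, false, false, false, false};

    /** Deductive (query-oriented): the user supplies the cutoff and asks for the churn rate below it. */
    static double churnRateBelow(int cutoff) {
        int n = 0;
        int churned = 0;
        for (int i = 0; i < AGE.length; i++) {
            if (AGE[i] < cutoff) {
                n++;
                if (CHURNED[i]) churned++;
            }
        }
        return n == 0 ? 0.0 : (double) churned / n;
    }

    /** Inductive: find the cutoff for which the rule "age < cutoff" predicts churn most accurately. */
    static int learnCutoff() {
        int best = AGE[0];
        int bestCorrect = -1;
        for (int candidate : AGE) {
            int correct = 0;
            for (int i = 0; i < AGE.length; i++) {
                boolean predicted = AGE[i] < candidate;
                if (predicted == CHURNED[i]) correct++;
            }
            if (correct > bestCorrect) {
                bestCorrect = correct;
                best = candidate;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        System.out.println("Churn rate under 30 (user hypothesis): " + churnRateBelow(30));
        System.out.println("Cutoff learned from the data: " + learnCutoff());
    }
}
```

In the first call the knowledge (the cutoff of 30) comes from the user; in the second it comes from the data. A real data mining system would use far more capable algorithms than this single-threshold search.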

2.6 Is there a proposed package name for the API Specification? (i.e., javapi.something, org.something, etc.)

The following are proposed as JDMAPI standard extension packages:

  • javax.datamining
  • javax.datamining.settings
  • javax.datamining.models
  • javax.datamining.transformations
  • javax.datamining.results


Original Java Specification Request (JSR)


Section 1. Identification

Submitting Member: Oracle Corporation

Name of Contact Person: Mark F. Hornick

E-Mail Address: mhornick@us.oracle.com

Telephone Number: +1 781 684 7564

Fax Number: +1 781 684 7564


Specification Lead: Mark F. Hornick

E-Mail Address: mhornick@us.oracle.com

Telephone Number: +1 781 684 7564

Fax Number: +1 781 684 7564


Projected Expert Group will include Experts from:

  • Data mining tool vendors
  • Business intelligence/analytics application vendors
  • Data warehousing system and tool vendors


Section 2: Request

2.1 Please describe the proposed Specification:

(NOTE that this response has been updated since the original.)

The JDMAPI specification will address the need for a pure Java™ API that supports the creation, storage, access, and maintenance of data and metadata supporting data mining models, data scoring, data mining results, and data transformations.

2.2 What is the target Java platform? (i.e., desktop, server, personal, embedded, card, etc.)

JDMAPI is targeted for the Java 2™ Platform, Enterprise Edition (J2EE™).

2.3 What need of the Java community will be addressed by the proposed specification?

(NOTE that this response has been updated since the original.)

The Java community needs a standard way to create, store, access, and maintain data and metadata supporting data mining models, data scoring, and data mining results serving J2EE-compliant application servers. Currently, there is no widely agreed-upon standard API for data mining. By using JDMAPI, implementers of data mining applications can expose a single, standard API that will be understood by a wide variety of client applications and components running on the J2EE Platform.

Similarly, data mining clients can be coded against a single API that is independent of the underlying data mining system. The ultimate goal of JDMAPI is to provide for data mining systems what JDBC™ did for relational databases.

2.4 Why isn't this need met by existing specifications?

Currently, no existing Java platform specification provides a standard API for data mining systems. Existing APIs are generally vendor-proprietary.

2.5 Please give a short description of the underlying technology or technologies:

JDMAPI will be based on a highly generalized, object-oriented data mining conceptual model leveraging emerging data mining standards such as OMG's CWM, SQL/MM for Data Mining, and DMG's PMML. The JDMAPI model will support four conceptual areas that are generally of key interest to users of data mining systems: settings, models, transformations, and results. The object model provides a core layer of services and interfaces that are available to all clients. Clients consistently see the same interfaces and semantics and are coded to these interfaces. A particular deployment of the object model may not necessarily support all interfaces and services defined by JDMAPI. However, JDMAPI will provide mechanisms for client discovery of supported interfaces, capabilities, and constraints.
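As a rough sketch of what client discovery of supported capabilities could look like, consider the following. This JSR does not define the discovery mechanism, so every name here (`MiningFunction`, `MiningProvider`, `ClusteringOnlyProvider`, `DiscoveryDemo`) is a hypothetical placeholder and should not be read as the JDMAPI interface.

```java
import java.util.EnumSet;
import java.util.Set;

/** Hypothetical mining functions that a deployment may or may not support. */
enum MiningFunction { CLASSIFICATION, CLUSTERING, ASSOCIATION_RULES }

/** Hypothetical discovery interface; not taken from the JDMAPI specification. */
interface MiningProvider {
    String name();

    Set<MiningFunction> supportedFunctions();

    /** Lets a client ask about one capability before trying to use it. */
    default boolean supports(MiningFunction f) {
        return supportedFunctions().contains(f);
    }
}

/** A toy deployment that supports only a subset of the defined functions. */
final class ClusteringOnlyProvider implements MiningProvider {
    @Override public String name() { return "clustering-only"; }

    @Override public Set<MiningFunction> supportedFunctions() {
        return EnumSet.of(MiningFunction.CLUSTERING);
    }
}

final class DiscoveryDemo {
    /** Returns the first requested function the provider supports, or null if none is supported. */
    static MiningFunction firstSupported(MiningProvider p, MiningFunction... wanted) {
        for (MiningFunction f : wanted) {
            if (p.supports(f)) return f;
        }
        return null;
    }

    public static void main(String[] args) {
        MiningProvider p = new ClusteringOnlyProvider();
        MiningFunction f = firstSupported(p, MiningFunction.CLASSIFICATION, MiningFunction.CLUSTERING);
        System.out.println(p.name() + " will run: " + f); // prints "clustering-only will run: CLUSTERING"
    }
}
```

The client is written once against the common interface and adapts at run time to whatever the deployment reports, which is the behavior the paragraph above describes.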

It is up to each vendor to decide how to implement JDMAPI. Some vendors may decide to implement JDMAPI as the native API of their product. Others may opt to develop a driver/adapter that mediates between a core JDMAPI layer and multiple vendor products. The JDMAPI specification does not prescribe any particular implementation strategy.
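The driver/adapter option can be sketched as follows. This is a minimal illustration under stated assumptions: `Model`, `MiningEngine`, `VendorMeanLib`, and `VendorMeanAdapter` are invented names, the "vendor library" is a stand-in that only computes a mean, and none of it reflects an actual JDMAPI interface or vendor product.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical vendor-neutral model interface; illustrative only. */
interface Model {
    double score(double[] row);
}

/** Hypothetical vendor-neutral engine interface that clients code against. */
interface MiningEngine {
    Model build(List<double[]> rows);
}

/** Stand-in for a proprietary vendor library with its own method names. */
final class VendorMeanLib {
    /** Returns the mean of the first column of the training rows (0.0 if there are none). */
    double trainMean(List<double[]> rows) {
        double sum = 0.0;
        for (double[] r : rows) sum += r[0];
        return rows.isEmpty() ? 0.0 : sum / rows.size();
    }
}

/** Adapter that exposes the vendor library through the common interface. */
final class VendorMeanAdapter implements MiningEngine {
    private final VendorMeanLib lib = new VendorMeanLib();

    @Override public Model build(List<double[]> rows) {
        final double mean = lib.trainMean(rows);
        // The toy "model" scores a row as the deviation of its first value from the training mean.
        return row -> row[0] - mean;
    }
}

final class AdapterDemo {
    public static void main(String[] args) {
        MiningEngine engine = new VendorMeanAdapter(); // the client sees only MiningEngine
        Model m = engine.build(Arrays.asList(new double[]{1.0}, new double[]{3.0}));
        System.out.println(m.score(new double[]{5.0})); // prints 3.0 (5.0 minus the mean 2.0)
    }
}
```

A vendor taking the native-API route would instead implement `MiningEngine` directly inside the product; the client code in `AdapterDemo` would not change in either case.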

To ensure J2EE compatibility and eliminate duplication of effort, JDMAPI will leverage existing specifications. In particular, JDMAPI will rely on the J2EE Connector Architecture (JSR-000016) to provide resource management, transaction management, security, record mapping, and result set management. JDMAPI will also leverage the forthcoming Java Metadata Interface (JSR-000040) for core metadata management (i.e., JDMAPI metadata interfaces will most likely extend core JMI interfaces to represent data mining metadata concepts, such as model and settings).

2.6 Is there a proposed package name for the API Specification? (i.e., javapi.something, org.something, etc.)

(NOTE that this response has been updated since the original.)

The following are proposed as JDMAPI standard extension packages:

  • javax.dmapi
  • javax.dmapi.settings
  • javax.dmapi.models
  • javax.dmapi.transformations
  • javax.dmapi.results

2.7 Does the proposed specification have any dependencies on specific operating systems, CPUs, or I/O devices that you know of?

JDMAPI has no specific operating system or hardware dependencies.

2.8 Are there any security issues that cannot be addressed by the current security model?

JDMAPI will exploit the existing security mechanisms of both J2EE (JSR-000016 in particular) and those of the underlying data mining systems.

2.9 Are there any internationalization or localization issues?

JDMAPI uses the I18N support in the Java 2 Platform, Standard Edition (J2SE™).

2.10 Are there any existing specifications that might be rendered obsolete, deprecated, or in need of revision as a result of this work?

There are no existing specifications or specification requests pending that would be rendered obsolete by the JDMAPI specification. There are no existing specifications that would require revision as a result of JDMAPI.

2.11 Please describe the anticipated schedule for the development of this specification.

We plan a community draft before the end of 2000.





Section 3: Contributions

3.1 Please list any existing documents, specifications, or implementations that describe the technology. Please include links to the documents if they are publicly available.

The following specifications serve (in part) as design references for JDMAPI:

  • Common Warehouse Metamodel (CWM)

    http://www.omg.org/techprocess/faxvotes/CWMI_RFP.html



  • CWM Specification, Volume 1 (ad/2000-01-01)

    CWM Specification, Volume 1, Chapter 14, Data Mining provides a sense of the overall structure of the metadata that the metadata-oriented interfaces of JDMAPI will support.

  • CWM Specification, Volume 2 (ad/2000-01-02)

    CWM Specification, Volume 2, Section 2.14, DataMining.idl, provides a general idea of how the metadata-oriented interfaces of JDMAPI might be structured (once again, generally extending the appropriate JSR-000040 interfaces).

  • DMG PMML

    http://www.dmg.org

    PMML provides an XML-based representation for mining models and facilitates interchange among vendors for model results.

  • ISO SQL/MM Part 6, Data Mining

    SQL/MM Part 6, Data Mining, provides a standard interface to RDBMSs for performing data mining. Concepts from this approach may prove useful in the overall JDMAPI design.

3.2 Explanation of how these items might be used as a starting point for the work.

The above sources generally serve (in part) as design references for JDMAPI.



Section 4: Additional Information

4.1 This section contains any additional information that the submitting Member wishes to include in the JSR.

The availability of a J2EE-compliant data mining API will provide great benefit to both vendors and users of tools and applications in the areas of business intelligence/business analytics, data mining systems, and data warehousing. It will provide a standard API for creating, storing, accessing, and managing all metadata and data related to data mining systems, and greatly simplify client logic by providing a common data mining interface. Clients coded to these interfaces will be capable of connecting to a diverse set of data mining systems provided by different vendors. Similarly, data mining systems supporting JDMAPI will be capable of offering their services to a wide range of clients that can immediately connect to them without re-coding or using adapters.

Furthermore, JDMAPI's close alignment with JSR-000040 and the CWM Data Mining metamodels means that it directly supports the construction and deployment of data warehousing and business intelligence applications, tools, and platforms based on OMG open standards for metadata and system specification (i.e., MOF, UML, XMI, CWM) and the forthcoming Java metadata standard (JSR-000040).