Printed: May 3, 2024
From: http://www.jcp.org/en/jsr/detail?id=73
Updates to the Java Specification Request (JSR)
Comments from the JSR Review resulted in the following updated responses to several questions in the original JSR:
2.1 Please describe the proposed Specification:
The JDMAPI specification will address the need for a pure Java™ API that supports the building of data mining models, the scoring of data using models, as well as the creation, storage, access and maintenance of data and metadata supporting data mining results, and select data transformations.
The Java community needs a standard way to create, store, access and maintain data and metadata supporting data mining models, data scoring, and data mining results serving J2EE-compliant application servers. Currently, there is no widely agreed-upon standard API for data mining. By using JDMAPI, implementers of data mining applications can expose a single, standard API that will be understood by a wide variety of client applications and components running on the J2EE Platform.
Similarly, data mining clients can be coded against a single API that is independent of the underlying data mining system. The ultimate goal of JDMAPI is to provide for data mining systems what JDBC™ did for relational databases.
A sister JSR, JSR-000069, which supports an API for OLAP, will share a common basis in the OMG CWM meta-model, noted below. As such, there will be some overlap in concepts to be resolved. We plan to work with JSR-000069 to minimize overlap and leverage common infrastructure.
To clarify the distinction between OLAP and Data Mining, consider the following: OLAP follows a deductive (query-oriented) strategy of analyzing data. Users formulate hypotheses, and execute queries to gain understanding of the underlying data. Data Mining follows an inductive strategy of analyzing data where users apply machine learning algorithms to gain non-obvious knowledge from the data.
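The contrast between the two strategies can be made concrete with a toy sketch. This is illustrative only: the data, the class name `DeductiveVsInductive`, and the method names are invented for this example and are not part of JDMAPI or any OLAP API. In the deductive case the user supplies the hypothesis (an age threshold) and a query measures it; in the inductive case a simple search picks the threshold from the data.

```java
// Toy illustration only; all names and data are hypothetical.
class DeductiveVsInductive {
    // Invented sample: customer ages and their monthly spend.
    static final int[] AGES = {22, 25, 31, 38, 45, 52};
    static final double[] SPEND = {10, 12, 11, 30, 32, 31};

    // Mean spend of customers with age >= threshold (atOrAbove = true)
    // or age < threshold (atOrAbove = false). NaN if the group is empty.
    static double meanSpend(int threshold, boolean atOrAbove) {
        double sum = 0;
        int count = 0;
        for (int i = 0; i < AGES.length; i++) {
            if ((AGES[i] >= threshold) == atOrAbove) {
                sum += SPEND[i];
                count++;
            }
        }
        return count == 0 ? Double.NaN : sum / count;
    }

    // Deductive (query-oriented): the user proposes the threshold and the
    // query reports the difference in mean spend between the two groups.
    static double queryGap(int userChosenThreshold) {
        return meanSpend(userChosenThreshold, true)
             - meanSpend(userChosenThreshold, false);
    }

    // Inductive: a search over candidate thresholds finds the split that
    // separates spend the most, without a user-supplied hypothesis.
    static int learnThreshold() {
        int best = AGES[0];
        double bestGap = -1;
        for (int candidate : AGES) {
            double gap = Math.abs(queryGap(candidate));
            if (!Double.isNaN(gap) && gap > bestGap) {
                bestGap = gap;
                best = candidate;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        System.out.println("Gap at user-chosen threshold 40: " + queryGap(40));
        System.out.println("Threshold found from data: " + learnThreshold());
    }
}
```

A real data mining system would use far richer algorithms than this single-threshold search; the point is only who supplies the hypothesis.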
The following are proposed as JDMAPI standard extension packages:
Section 1. Identification
Submitting Member: Oracle Corporation
Name of Contact Person: Mark F. Hornick
E-Mail Address: mhornick@us.oracle.com
Telephone Number: +1 781 684 7564
Fax Number: +1 781 684 7564
Specification Lead: Mark F. Hornick
E-Mail Address: mhornick@us.oracle.com
Telephone Number: +1 781 684 7564
Fax Number: +1 781 684 7564
Projected Expert Group will include Experts from:
Section 2: Request
2.1 Please describe the proposed Specification:
(NOTE that this response has been updated since the original.)
The JDMAPI specification will address the need for a pure Java™ API that supports the creation, storage, access and maintenance of data and metadata supporting data mining models, data scoring, data mining results, and data transformations.
2.2 What is the target Java platform? (i.e., desktop, server, personal, embedded, card, etc.)
JDMAPI is targeted for the Java™ 2 Platform, Enterprise Edition (J2EE™).
(NOTE that this response has been updated since the original.)
The Java community needs a standard way to create, store, access and maintain data and metadata supporting data mining models, data scoring, and data mining results serving J2EE-compliant application servers. Currently, there is no widely agreed-upon standard API for data mining. By using JDMAPI, implementers of data mining applications can expose a single, standard API that will be understood by a wide variety of client applications and components running on the J2EE Platform.
Similarly, data mining clients can be coded against a single API that is independent of the underlying data mining system. The ultimate goal of JDMAPI is to provide for data mining systems what JDBC™ did for relational databases.
Currently, no existing Java platform specification provides a standard API for data mining systems. Existing APIs are generally vendor-proprietary.
JDMAPI will be based on a highly generalized, object-oriented data mining conceptual model leveraging emerging data mining standards such as OMG's CWM, SQL/MM for Data Mining, and DMG's PMML. The JDMAPI model will support four conceptual areas that are generally of key interest to users of data mining systems: settings, models, transformations, and results. The object model provides a core layer of services and interfaces that are available to all clients. Clients consistently see the same interfaces and semantics and are coded to these interfaces. A particular deployment of the object model may not necessarily support all interfaces and services defined by JDMAPI. However, JDMAPI will provide mechanisms for client discovery of supported interfaces, capabilities, and constraints.
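As a rough sketch of what client discovery of supported capabilities could look like, consider the following. Every name here (`MiningCapability`, `MiningProvider`, `ScoringOnlyProvider`, `describe`) is hypothetical and chosen for illustration; the proposal does not define these interfaces. The only element taken from the text is the set of four conceptual areas.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical sketch; none of these types come from the JDMAPI specification.
class CapabilityDiscoverySketch {
    // The four conceptual areas named in the proposal.
    enum MiningCapability { SETTINGS, MODELS, TRANSFORMATIONS, RESULTS }

    // A deployment reports which areas it actually supports.
    interface MiningProvider {
        Set<MiningCapability> supportedCapabilities();

        default boolean supports(MiningCapability capability) {
            return supportedCapabilities().contains(capability);
        }
    }

    // Example deployment that implements only part of the object model.
    static final class ScoringOnlyProvider implements MiningProvider {
        @Override
        public Set<MiningCapability> supportedCapabilities() {
            return EnumSet.of(MiningCapability.MODELS, MiningCapability.RESULTS);
        }
    }

    // Client code checks a capability before relying on it.
    static String describe(MiningProvider provider) {
        return provider.supports(MiningCapability.TRANSFORMATIONS)
                ? "transformations available"
                : "transformations not supported; prepare data externally";
    }

    public static void main(String[] args) {
        System.out.println(describe(new ScoringOnlyProvider()));
    }
}
```

The client is written once against `MiningProvider` and adapts at run time, which is one way to reconcile a common interface with partial deployments.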
It is up to each vendor to decide how to implement JDMAPI. Some vendors may decide to implement JDMAPI as the native API of their product. Others may opt to develop a driver/adapter that mediates between a core JDMAPI layer and multiple vendor products. The JDMAPI specification does not prescribe any particular implementation strategy.
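The driver/adapter option can be pictured with a minimal sketch, loosely analogous to how JDBC drivers sit behind a common interface. All names below (`MiningEngine`, `VendorXNativeApi`, `VendorXAdapter`, `EngineRegistry`) are invented for illustration and do not reflect any actual JDMAPI or vendor API.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the driver/adapter strategy; not a real API.
class AdapterSketch {
    // Vendor-neutral interface that clients code against.
    interface MiningEngine {
        String buildModel(String datasetName);
    }

    // Stand-in for a vendor's proprietary API with its own method names.
    static final class VendorXNativeApi {
        String trainOn(String table) {
            return "vendorx-model-for-" + table;
        }
    }

    // Adapter translating the neutral call into the vendor-specific call.
    static final class VendorXAdapter implements MiningEngine {
        private final VendorXNativeApi nativeApi = new VendorXNativeApi();

        @Override
        public String buildModel(String datasetName) {
            return nativeApi.trainOn(datasetName);
        }
    }

    // Core layer that mediates between clients and registered adapters.
    static final class EngineRegistry {
        private final Map<String, MiningEngine> engines = new HashMap<>();

        void register(String vendor, MiningEngine engine) {
            engines.put(vendor, engine);
        }

        MiningEngine lookup(String vendor) {
            MiningEngine engine = engines.get(vendor);
            if (engine == null) {
                throw new IllegalArgumentException("No adapter registered for " + vendor);
            }
            return engine;
        }
    }

    public static void main(String[] args) {
        EngineRegistry registry = new EngineRegistry();
        registry.register("vendorx", new VendorXAdapter());
        System.out.println(registry.lookup("vendorx").buildModel("CUSTOMERS"));
    }
}
```

A vendor taking the other route would implement the neutral interface directly in its product, with no adapter class in between.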
To ensure J2EE compatibility and eliminate duplication of effort, JDMAPI will leverage existing specifications. In particular, JDMAPI will rely on the J2EE Connector Architecture (JSR-000016) to provide resource management, transaction management, security, record mapping, and result set management. JDMAPI will also leverage the forthcoming Java Metadata Interface (JSR-000040) for core metadata management (i.e., JDMAPI metadata interfaces will most likely extend core JMI interfaces to represent data mining metadata concepts, such as model and settings).
(NOTE that this response has been updated since the original.)
The following are proposed as JDMAPI standard extension packages:
JDMAPI has no specific operating system or hardware dependencies.
JDMAPI will exploit the existing security mechanisms of both J2EE (JSR-000016 in particular) and the underlying data mining systems.
JDMAPI uses the I18N support in the Java™ 2 Platform, Standard Edition (J2SE™).
There are no existing specifications or specification requests pending that would be rendered obsolete by the JDMAPI specification. There are no existing specifications that would require revision as a result of JDMAPI.
We plan a community draft before the end of 2000.
Section 3: Contributions
The following specifications serve (in part) as design references for JDMAPI:
CWM Specification, Volume 1, Chapter 14, Data Mining, provides a sense of the overall structure of the metadata that the metadata-oriented interfaces of JDMAPI will support.
CWM Specification, Volume 2, Section 2.14, DataMining.idl, provides a general idea of how the metadata-oriented interfaces of JDMAPI might be structured (once again, generally extending the appropriate JSR-000040 interfaces).
PMML provides an XML-based representation for mining models and facilitates interchange among vendors for model results.
SQL/MM Part 6: Data Mining provides a standard interface to RDBMSs for performing data mining. Concepts from this approach may prove useful in the overall JDMAPI design.
The above sources generally serve (in part) as design references for JDMAPI.
Section 4: Additional Information
The availability of a J2EE-compliant data mining API will provide great benefit to both vendors and users of tools and applications in the areas of business intelligence/business analytics, data mining systems, and data warehousing. It will provide a standard API for creating, storing, accessing, and managing all metadata and data related to data mining systems, and greatly simplify client logic by providing a common data mining interface. Clients coded to these interfaces will be capable of connecting to a diverse set of data mining systems provided by different vendors. Similarly, data mining systems supporting JDMAPI will be capable of offering their services to a wide range of clients that can immediately connect to them without re-coding or using adapters.
Furthermore, JDMAPI's close alignment with JSR-000040 and the CWM Data Mining metamodels means that it directly supports the construction and deployment of data warehousing and business intelligence applications, tools, and platforms based on OMG open standards for metadata and system specification (i.e., MOF, UML, XMI, CWM) and the forthcoming Java metadata standard (JSR-000040).