JSRs: Java Specification Requests
JSR 381: Visual Recognition (VisRec) Specification
JCP version in use: 2.10
Java Specification Participation Agreement version in use: 2.0
Java APIs for detecting, recognizing and annotating images, with a focus on image content, facial detection, facial emotions, image tagging, specifying image classifiers and training on visual data.
Expert Group Transparency:
Public Project Page
Section 1. Identification
Submitting Member: Zoran Sevarac
Name of Contact Person: Zoran Sevarac
E-Mail Address: sevarac
Telephone Number: +381 63 8427533
Fax Number: +381 63 8427533
Specification Lead Members: Zoran Sevarac, Frank Greco, IBM
Specification Leads: Zoran Sevarac, Frank Greco, Sandhya Kapoor
E-Mail Addresses: sevarac
Telephone Numbers: +381 63 8427533, +1 732 259 9020, +1 512 527 4251
Fax Number: +381 63 8427533, -, -
Initial Expert Group Membership:
Supporting this JSR:
* Crossroads Technologies
Section 2: Request
2.1 Please describe the proposed Specification:
To simplify and standardize a set of Java APIs for detecting, recognizing and annotating images.
Machine Learning (ML) is a huge industry trend and is projected to remain one for the next 5+ years. Machine learning capabilities have wide business implications for all applications across many types of devices. Visual Recognition (VisRec) is an important subset of ML. Right now the primary language for ML is Python. We feel Java needs to play a major role in ML, starting with VisRec.
2.2 What is the target Java platform? (i.e., desktop, server, personal, embedded, card, etc.)
Desktop, server, embedded, mobile
2.3 The Executive Committees would like to ensure JSR submitters think about how their proposed technology relates to all of the Java platform editions. Please provide details here for which platform editions are being targeted by this JSR, and how this JSR has considered the relationship with the other platform editions.
Our Visual Recognition Specification would apply to all Java platform editions. Typically the API will be used in conjunction with an ML engine, package or set of libraries. Typically this would execute on a server or set of servers (as most ML applications do) and be callable either from the server side or remotely from a distributed client (JavaFX, web, command-line).
2.4 What need of the Java community will be addressed by the proposed specification?
There is a need for a standard and flexible set of APIs for Java application developers. A set of well-defined APIs is essential for robust system architecture, ease of development and portability. Since there is still much innovation in the underlying ML engines, particularly in ML algorithms, these APIs will offer high-level abstractions that support sustainable development of products and protect developers from lower-level changes. If developers need access to the foundation libraries or services, there will be hooks allowing lower-level access. There are many ML toolkits, libraries and frameworks available today; some are OSS and some are commercial. Their programmatic levels vary from extremely low-level APIs that require significant expertise to very high-level APIs with pre-defined behaviors. There are no standards for easy-to-use, flexible high-level APIs that developers can rely on.
The proposed set of high-level APIs would also offer hooks to the underlying ML engines (e.g., TensorFlow, Watson, etc.). Note that other than a bitmap abstraction, we are not standardizing any lower-level APIs; there is still much innovation occurring at that level. Our proposal targets the needs of Java application developers who build custom Image Classifiers (not just use pre-trained Classifiers).
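As an illustration only, the idea of a high-level abstraction with a hook for lower-level access could be sketched as follows. Every name here (`ImageClassifier`, `unwrap`, `BrightnessEngine`, `BrightnessClassifier`) is hypothetical and is not part of the proposed specification; the "engine" is a trivial stand-in for a real ML engine.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical high-level abstraction; NOT the JSR 381 API.
interface ImageClassifier {
    // Returns label -> score for a grayscale bitmap with pixel values 0..255.
    Map<String, Double> classify(int[][] pixels);

    // Hook for lower-level access: exposes the underlying engine object
    // when it is an instance of the requested type.
    <T> T unwrap(Class<T> engineType);
}

// Trivial stand-in for a vendor-specific ML engine (assumption for this sketch).
class BrightnessEngine {
    double meanBrightness(int[][] pixels) {
        long sum = 0;
        int count = 0;
        for (int[] row : pixels) {
            for (int p : row) {
                sum += p;
                count++;
            }
        }
        return count == 0 ? 0.0 : (double) sum / count;
    }
}

// Adapter that implements the high-level API on top of the engine.
class BrightnessClassifier implements ImageClassifier {
    private final BrightnessEngine engine = new BrightnessEngine();

    @Override
    public Map<String, Double> classify(int[][] pixels) {
        double light = engine.meanBrightness(pixels) / 255.0;
        Map<String, Double> scores = new LinkedHashMap<>();
        scores.put("light", light);
        scores.put("dark", 1.0 - light);
        return scores;
    }

    @Override
    public <T> T unwrap(Class<T> engineType) {
        if (engineType.isInstance(engine)) {
            return engineType.cast(engine);
        }
        throw new IllegalArgumentException(
                "Unsupported engine type: " + engineType.getName());
    }
}

class VisRecSketch {
    public static void main(String[] args) {
        ImageClassifier classifier = new BrightnessClassifier();
        int[][] image = {{200, 220}, {240, 255}};
        System.out.println(classifier.classify(image));
        // Drop down to the engine only when the high-level API is not enough.
        BrightnessEngine engine = classifier.unwrap(BrightnessEngine.class);
        System.out.println(engine.meanBrightness(image));
    }
}
```

Application code in this sketch depends only on `ImageClassifier`, so the engine behind it could change without affecting callers, while `unwrap` remains available for engine-specific features.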
2.5 Why isn't this need met by existing specifications?
There are no Java standards for easy-to-use, flexible high-level APIs that developers can rely on. There is a wide, disparate collection of open-source and proprietary ML engines, toolkits and packages, each with their own APIs. None are standardized.
2.6 Please give a short description of the underlying technology or technologies:
Machine learning is essentially parsing data, learning patterns and then predicting. Visual Recognition applies this model to images for classification, annotation, recognition, etc. Our API relies on underlying machine learning engines from OSS and commercial offerings:
IBM Watson, Google TensorFlow, Deeplearning4j, Azure ML, Weka, RapidMiner, etc. Each "engine" has its own API, and many require a deep understanding of data science and image processing even for simple applications. Some of these offer a high-level API, but typically it is very restrictive (e.g., use only pre-built classifiers) or proprietary.
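To make the "learn patterns, then predict" cycle concrete, here is a minimal sketch of a nearest-centroid classifier over numeric feature vectors (for example, flattened pixel intensities). It is a deliberately simple technique chosen for illustration, not the method used by any of the engines named above, and the class name `NearestCentroidClassifier` is hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative only: a nearest-centroid classifier, not any engine's algorithm.
class NearestCentroidClassifier {
    private final Map<String, double[]> sums = new HashMap<>();
    private final Map<String, Integer> counts = new HashMap<>();

    // "Learning patterns": accumulate a per-label sum of the feature vectors.
    void train(String label, double[] features) {
        double[] sum = sums.computeIfAbsent(label, k -> new double[features.length]);
        if (sum.length != features.length) {
            throw new IllegalArgumentException("Feature length mismatch");
        }
        for (int i = 0; i < features.length; i++) {
            sum[i] += features[i];
        }
        counts.merge(label, 1, Integer::sum);
    }

    // "Predicting": return the label whose centroid (mean vector) is closest
    // to the input, using squared Euclidean distance.
    String predict(double[] features) {
        String best = null;
        double bestDist = Double.POSITIVE_INFINITY;
        for (Map.Entry<String, double[]> entry : sums.entrySet()) {
            double[] sum = entry.getValue();
            if (sum.length != features.length) {
                throw new IllegalArgumentException("Feature length mismatch");
            }
            int n = counts.get(entry.getKey());
            double dist = 0.0;
            for (int i = 0; i < features.length; i++) {
                double d = features[i] - sum[i] / n;
                dist += d * d;
            }
            if (dist < bestDist) {
                bestDist = dist;
                best = entry.getKey();
            }
        }
        if (best == null) {
            throw new IllegalStateException("Classifier has not been trained");
        }
        return best;
    }
}
```

A caller would invoke `train` once per labeled example and then `predict` on unseen vectors; real engines replace the centroid step with far more capable models, but the train-then-predict shape is the same.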
2.7 Is there a proposed package name for the API Specification? (i.e., javapi.something, org.something, etc.)
2.8 Does the proposed specification have any dependencies on specific operating systems, CPUs, or I/O devices that you know of?
No. The VisRec API only has a dependency on an underlying ML engine.
2.9 Are there any security issues that cannot be addressed by the current security model?
2.10 Are there any internationalization or localization issues?
2.11 Are there any existing specifications that might be rendered obsolete, deprecated, or in need of revision as a result of this work?
None that we are aware of.
2.12 Please describe the anticipated schedule for the development of this specification.
* JSR submittal April 2017
2.13 Please describe the anticipated working model for the Expert Group working on developing this specification.
Weekly Skype meetings, online collaboration
2.14 Provide detailed answers to the transparency checklist, making sure to include URLs as appropriate:
Q: Is the schedule for the JSR publicly available, current, and updated regularly?
Q: Can the public read and/or write to a wiki for the JSR?
Q: Is there a publicly accessible discussion board for the JSR that you read and respond to regularly?
Q: Have you spoken at conferences and events about the JSR recently?
Q: Are you using open-source processes for the development of the RI and/or the TCK?
Q: What is the location of your publicly-accessible Issue list? In order to enable EC members to judge whether Issues have been adequately addressed, the list must make a clear distinction between Issues that are still open, Issues that have been deferred, and those that are closed, and must indicate the reason for any change of state.
Q: What is the mechanism for the public to provide feedback on your JSR?
Q: Where is the publicly-accessible document archive for your Expert Group?
Q: Does the Community tab for my JSR have links to and information about all public communication mechanisms and sites for the development of my JSR?
Q: Do you have a Twitter account or other social networking feed which people can follow for updates on your JSR?
Q: Which specific areas of feedback should interested community members (such as the Adopt-a-JSR program) provide to improve the JSR (please also post this to your Community tab)?
2.15 Please describe how the RI and TCK will be delivered, i.e. as part of a profile or platform edition, or stand-alone, or both. Include version information for the profile or platform in your answer.
2.16 Please state the rationale if previous versions are available stand-alone and you are now proposing in 2.13 to only deliver RI and TCK as part of a profile or platform edition (See sections 1.1.5 and 1.1.6 of the JCP 2 document).
2.17 Please provide a description of the business terms for the Specification, RI and TCK that will apply when this JSR is final.
The specification will be licensed using the standard specification license (see http://jcp.org/aboutJava/communityprocess/speclead/final-license.txt).
The RI and TCK will be licensed via the Apache License, Version 2.0, January 2004 (see http://www.apache.org/licenses/LICENSE-2.0.html).
2.18 Please describe the communications channel you have established for the public to observe Expert Group deliberations, provide feedback, and view archives of all Expert Group communications.
2.19 What is the URL of the Issue Tracker that the public can read, and how does the public log issues in the Issue Tracker?
2.20 Please provide the location of the publicly accessible document archive you have created for the Expert Group.
Section 3: Contributions
3.1 Please list any existing documents, specifications, or implementations that describe the technology. Please include links to the documents if they are publicly available.
3.2 Explanation of how these items might be used as a starting point for the work.
We have created an initial architecture and a set of interfaces and abstract classes, and provided pathways for creating specific implementations using existing open source solutions in that domain. The presentation provides our vision and general guidelines. The Google document contains all the materials that we discussed during the several months of preparation for the JSR proposal.
Section 4: Additional Information
4.1 This section contains any additional information that the submitting Member wishes to include in the JSR.
Visual recognition is a driving force for machine learning development for several reasons:
* It is a high-dimensional, complex and computationally demanding problem
* There are many large, open data sets available
* Improvements achieved in that domain can easily be applied in other application domains (where neural network based classifiers are used)
We believe that our VisRec JSR will help Java developers build innovative Visual Recognition applications based on Java (and the JVM in general). Since ML (and AI in general) will remain a powerful trend for many years to come, this will foster more Java focus on machine learning in general.