Agility Case Study: Charles Hemphill
By Susan Mitchell
Java Specification Requests (JSRs) vary widely in how long they take to complete. For a variety of reasons, JSR 113, Java Speech API 2.0, took longer to complete than many JSRs in the Java Community Process (JCP) program. Even so, by moving forward at a steady, measured pace it reached the desired goal and sailed through the final vote by the Executive Committee (EC) for the Java Micro Edition (ME) Platform.
Charles Hemphill was the co-Spec Lead for JSR 113, and he is now its Maintenance Lead. As president and CTO of EverSpeech, he is involved in developing speech technology for mobile devices. By leading JSR 113 while working with Conversational Computing Corporation as senior speech scientist, Charles helped set a new Java standard for speech interfaces, enabling faster, easier, and potentially safer interactions with hardware and software. The 2.0 specification was originally based on the version 1.0 standard that Sun Microsystems released in 1998.
Subsequent versions of specifications tend to be more ambitious in scope than their parents, and Java Speech API 2.0 was clearly a sophisticated undertaking. On the Final Approval Ballot, Nokia Corporation's representative voted yes, then commented, "We think that the API is well designed and has very comprehensive functions. However, it is therefore highly complex and requires fairly advanced speech recognition and synthesis features. It also assumes a high level of speech recognition understanding from the application developer. It might not be feasible in many Java ME devices in the near term, but can provide good features in those high end platforms where applicable."
In approaching such a complicated task, Charles says, "We were able to focus on open, efficient communication during each milestone for our JSR, but we took a serial approach to the specification followed by implementation and test cases." This approach was in line with early recommendations from the JCP program, but differs from the Wikipedia definition Charles references for agility: "Agile software development is a group of software development methodologies based on iterative and incremental development, where requirements and solutions evolve through collaboration between self-organizing, cross-functional teams." This definition has caught on more in recent years, but work on JSR 113 began well before such an approach to software development was well known or accepted. "We might look at things very differently if we were starting today," Charles notes. Still, the JSR 113 Expert Group relied on frequent and open communication and collaboration to get the job done with a high degree of quality.
Collaborating with Contributors
Optimally, a Spec Lead will anticipate and deflect problems so they do not occur, but confront and work through them if they happen anyway. Fostering collaboration among the various contributors is one way to keep problems at bay. This requires excellent communication mechanisms, a lot of transparency, and a measure of compromise. "People need to know that we listen and respond to feedback," says Charles.
To maintain momentum, the Expert Group held "fairly efficient weekly discussions" up until the last several months, when they dropped down to every other week and focused on completion of the test cases. Most communication was written up as "agenda notes" before the meetings and then discussed through e-mail between meetings.
The downloadable Public Review, Proposed Final Draft, and Proposed Final Draft 2 were posted on the regular jcp.org page. Even now, anyone can ask a question or make a suggestion about the JSR by emailing the address listed on the webpage, firstname.lastname@example.org. During the project's active phase, a collaboration website held additional information, such as a free trial Reference Implementation (RI) for the PC and a developer forum open to anyone, with no charge for registering to participate. A second website at www.jsapi2.org currently remains password-protected to allow access only to Expert Group members. Most of the working information was posted here, including the latest specification version under consideration (Javadoc), previous meeting minutes, upcoming agenda, a Bugzilla to track change requests, working examples, RI and Technology Compatibility Kit (TCK) documentation, and maintenance ideas.
As with any project that involves people and competing interests, the JSR 113 effort encountered some speed bumps. There was some miscommunication involving the original JSR goals, and the resulting shift in expectations took extra time to recognize, address, and clarify. For example, one important member of the Expert Group made feature change requests that could have resulted in fragmentation. In response, Charles says, "We went to great lengths to develop a consensus solution. The response was not necessarily agile, but perhaps the best we could do under the circumstances."
In general, Charles was highly pleased with the level of collaboration among his Expert Group and greatly appreciates their help. On a few occasions when people paused their work on the JSR project to deal with tasks outside of it, the Expert Group would redirect attention to a different productive task. "We did have a few replacement Experts from the Expert Group companies as we proceeded," says Charles, and they were readily incorporated as part of the team.
Collaborating with the Industry
There is a risk in relying on entities outside of the Expert Group, but sometimes the benefits can far outweigh any potential problems. To avoid duplication of effort, the Expert Group adopted standards from the World Wide Web Consortium (W3C), so that organization effectively became a contributor to the JSR. Charles says, "By design, we committed to using the Speech Synthesis Markup Language (SSML) and the Speech Recognition Grammar Specification (SRGS). These took longer than anticipated for the W3C to complete, and there were changes along the way as well." During what could have been downtime, the Expert Group simply focused on resolving other aspects of the work.
Charles likes having industry pressure, and he wishes his group had received more of it, both from inside and outside the Expert Group. "We devoted some time evangelizing the JSR, trying to get feedback and promoting adoption. Examples include our JavaOne presentations, booths featuring the JSR, and numerous customer calls and visits," he says.
JSR 113 was originally motivated by technology, in anticipation of market demands. In Charles' view, that's not an ideal situation. It's much better if the market already wants a specific technology that can be provided. "Our initial task, per the original Java Specification Request, was to incorporate Java Speech API 1.0 functionality into the Java ME platform. After we finished necessary specification changes, the result was deemed too big for the platform and too demanding of embedded speech engines. We undertook a paring down process and gave up on backward compatibility," says Charles.
This is the kind of willingness to compromise and adapt that marks an agile effort. The community signaled its approval of those changes: only a small amount of public feedback came in between the Proposed Final Draft and the Final Draft. "We seemed to do a fairly good job of anticipating public wishes," Charles concludes.
He advises new Spec Leads to focus on market demands first, if possible. He says, "Try to work with companies that will adopt the JSR, and drive the specification toward that." However, Charles calls this a "tricky proposition" because once there is a newly desired feature boosting market demand, companies may rush forward with their own solutions and not wait for released standards. The JSR 113 Expert Group found it helpful to make an early-release RI available.
Wrapping Up and Looking Ahead
A technology team may work with great efficiency all the way through development only to flounder during the release cycle. The JSR 113 Expert Group certainly didn't flounder at this final stage, but progress slowed for several reasons.
The relatively large scope of the API became even more apparent at this stage as it impacted the release time. Instead of working fast and loose, the group took time to ensure the quality of the specification with sufficient assertions for the TCK. "We wanted to confirm that everything was ready and worked together. It is a highly event-driven API, including real-time input and output, encompassing speech recognition and synthesis technologies. Additionally, we needed to address the impact of different spoken languages," says Charles. "The RI and TCK were initially for American English, but we needed to make certain that they worked well for other languages, too, such as French, Italian, German, Spanish, and so on."
Another factor in the release speed was the limited ability of the Expert Group to delegate outside of the Spec Lead's company for the final release materials. Many Expert Groups get help from larger companies that have created RIs and TCKs before. But in this case, the RI and TCK depended on company-internal speech recognition and synthesis engines. Expert Group members also had limited hours available to dedicate to these final, time-consuming tasks.
Finalizing the licensing terms also took somewhat longer than anticipated. More recent JSRs enjoy a process that requires those terms up front, so that this is rarely a problem anymore.
Now that the JSR is final, Charles' main focus has shifted to adoption. Any issues related to maintenance will be prioritized based on market demand. JSR 113 has received some adoption, but much more is expected due to recent improvements in conditions. For example, hardware now includes faster CPUs and more memory, demand for downloaded applications continues to increase dramatically, and acceptance of speech technology has grown along with improvements in the underlying technology.