JSR-282 SI 17.1: Reduce use of Immortal Memory - class loading
---------------------------------------------------------------

This is one of several possible SIs on the use of immortal memory.

Last Updated: 3/2/2006
-----------------------

Summary
-------

The proposal reduces the use of immortal memory by defining how class
loading should work in the RTSJ, so that not all Class objects need to go
in immortal memory, not all static initialization needs to occur in
immortal memory, and class unloading can occur. An overview of how class
loading works in J2SE is at the end of this document.

Specification References
-----------------------

Chapter 4: Standard Java Classes [PERHAPS]
Chapter 7: Memory Management

Problem being addressed
------------------------------

The 1.0.x version of the RTSJ requires that all Class objects be allocated
in ImmortalMemory and that all static initialization occur in the context
of ImmortalMemory. However, no mention is made of ClassLoader objects or,
more generally, of how, or whether, the class loading architecture is
impacted by the RTSJ.

Background note: These requirements were put in place so that use of the
same classes by heap and no-heap schedulable objects would not be subject
to race conditions based on which schedulable object happened to trigger
class loading and/or class initialization.

Class unloading can occur when the defining class loader becomes
reclaimable by the garbage collector (JLS 12.7). As every object maintains
an implicit reference to its Class object and every Class object maintains
a reference to its ClassLoader object, it follows that a ClassLoader is
only reclaimable when all classes it has loaded and all instances of those
classes are themselves reclaimable. Classes loaded by the bootstrap loader
may not be unloaded.

With the existing specification there is no prohibition on creating
ClassLoader objects in heap or even in a scoped memory area.
However, the requirement that all Class objects be placed in
ImmortalMemory means that class unloading becomes "impossible" as the
Class objects always exist.

Now the above is not technically correct. Even objects in immortal memory
can become unreachable and reclaimable, so it is still possible to perform
class unloading - but it requires some form of reference tracking in
immortal memory. The RTSJ even allows for reclaiming of space in immortal
memory as long as no-heap entities can't be delayed by something akin to
garbage collection for immortal memory. That aside, it would be much
simpler if, for example, all classes associated with a scope-allocated
class loader could be unloaded when the scope was cleared.

Further, the requirement to run all static initializers in immortal memory
can lead to excessive use of immortal memory (through creation of
temporary objects) and potential leakage if unreachable objects are never
reclaimed. While there are implementation techniques that can be used to
address this (e.g. using a form of escape analysis so that only objects
that will be accessible externally are created in immortal memory, with
all others created in a special "scratch pad" location), it would be
better if the allocation context for static initialization were related to
the class (and hence class loader) being initialized.

Proposed Solution Summary
--------------------------------------

In RTSJ 1.1 it is proposed that class loading and initialization follow
these rules:

1. A Class object defined by a given class loader is created in the same
   memory area in which the ClassLoader object was created.
2. When a class is initialized, the current allocation context when
   running the static initializers is the memory area in which the Class
   object was created (which will always be on the current scope stack).
3. By default the bootstrap, extensions and application loaders would all
   be created in immortal memory.
4.
   The use of immortal memory for the bootstrap, extensions and
   application loaders could be changed to heap memory using an option
   (system property).
5. Application-defined class loaders would be created in the current
   allocation context, whatever that may be.
6. Instances of a class would continue to be allocated in the current
   allocation context.
7. JLS 12.7 is refined so that the reclaiming of a ClassLoader object can
   be done through scoped memory reclamation, not just through garbage
   collection.

This allows greater flexibility for the application and avoids polluting
immortal memory with the entire application. However, the application must
take great care if it causes no-heap SOs to initiate any class loading, as
any heap-allocated class loader in the delegation hierarchy would cause a
MemoryAccessError even if that loader would not actually be used for
defining the class in question.

EG NOTE: Implementors would have to look very carefully at the library
code to make sure it was safe with respect to MemoryAccessError being
thrown.

Semantics of Proposed Solution
----------------------------------------

As summarized above. Points 1 to 7 will probably become the basis of a new
semantics sub-section in Ch 7, replacing some of the existing references
to class objects and static initialization.

Issues:
-------

1. What happens if we attempt to create an instance of a class in a memory
   area that has a longer lifetime than the area in which the Class (and
   ClassLoader) exist?

If the area of the Class object is a scope then quite evidently we must
prohibit this as it would break referential integrity. However, this would
introduce a new failure mode for object allocation - perhaps requiring a
new exception type. Application code would have to be very aware of which
class loader was responsible for which classes.
For example, a simple use of executeInArea could fail:

    // currently in scopeB with parent scopeA
    // this class was loaded by Loader1 which itself is in scopeA
    final List l = new CustomList();  // CustomList loaded by Loader1
    ...;                              // populate list with data
    Runnable r = new Runnable() {
        public void run() {
            Iterator iter = l.iterator();  // Uh oh! allocates in immortal
            while (iter.hasNext()) {
                Object item = iter.next();
                // transfer item to more permanent storage
            }
        }
    };
    ImmortalMemory.instance().executeInArea(r);

In the above, when executing in immortal memory the call to iterator()
will most likely create a new object whose class is a nested class of
CustomList. If that class has been loaded by Loader1 then we are creating
an immortal object whose class is scope allocated. This must be prevented
for referential integrity, but we also need to consider the implications
for the programmer, as the code fragment is "innocent" enough and is
idiomatic of the way to use executeInArea.

If the area of the Class object is heap and the allocation context is
immortal then allowing the instantiation of the object to go ahead would
in essence prevent the unloading of all classes for the associated
defining ClassLoader. This could either be prohibited - again introducing
a new failure mode for object creation - or allowed, with the onus being
on the application to do sensible things. We could note that doing such an
allocation might prevent class unloading, with the "might" arising from
the fact that it is allowable to detect unreachable objects in immortal
memory and treat them as logically reclaimed (even if not physically
reclaimed). It would then become a quality-of-implementation issue whether
a particular VM could unload in these cases.

2. Are there any issues with the VM having to keep references to
   scope-allocated class loaders or class objects?

I think this is just an engineering issue. The VM has to track the
initiating loaders and defining loader of all classes, and enforce the
loader constraints specified by the JLS/JVMS.
Any references to scope-allocated objects would be safe until the scope
was reclaimed, but at that time the class loader is unreachable and so all
associated classes can be unloaded, at which point all VM data structures
are updated with the about-to-be-deallocated objects removed.

Background: The Class Loading Architecture
------------------------------------------

Note: the term "class" in class-loading is meant as a generic reference to
classes and interfaces (and as of Java 5 enums and annotations).

In Java a type is defined as a name and class-loader pair. Two types are
the same only if they have the same fully-qualified name and they were
loaded by the same class loader. Thus you can have multiple types of the
same name, provided they are loaded by different class loaders. In a Java
program the class loader associated with a type name in the code is the
class loader that loaded the class in which that code is defined. The VM
enforces special rules to ensure that whenever two types of the same name
can interact, they are in fact the same type.

Java defines a hierarchical delegation-based class loading model. Every
class loader (except the root) has a parent, and every class loader is
supposed to delegate loading to its parent (and so to its grandparent etc.
up to the root) before trying to load the class itself. This is to ensure
that all parts of the system that should refer to the same type for a
given type name do indeed see the same type. (The VM thwarts attempts to
violate this by class loaders that don't delegate correctly.)

At the root of the hierarchy is the bootstrap loader. This loader is used
to load all the "core" system classes, like java.lang.Object,
java.lang.Class etc. In practice this loader will load everything that is
available on the boot classpath.

An application is loaded by the application or "system" class loader. This
class loader is created to load the class passed to the Java runtime
execution engine.
It will typically load all classes available on the application classpath.

In J2SE there is an intermediate class loader between the application
loader and the bootstrap loader, known as the "extensions" loader. The
Java extensions mechanism allows third-party frameworks and libraries to
be installed with special security privileges (they act as if loaded by
the bootstrap loader for security purposes) without having to force them
onto the boot classpath.

So in a default J2SE environment the application class loader delegates to
the extensions class loader, which delegates to the bootstrap class
loader.

Applications and libraries can install additional class loaders that can
be parented by any of the existing class loaders, but typically a new
class loader is installed as a leaf. Otherwise requests to load classes it
is not responsible for might fail if the responsible loader (the one that
has the class definition in its search path) is not an ancestor.

Classes are loaded either implicitly, through execution of code that
refers to them, such as "Foo f = new Foo();", or explicitly, through
either the Class or ClassLoader APIs. Any attempt to load a class causes
the current class loader to be marked as an initiating loader for that
type. The loader that actually defines a class (i.e. finds the .class file
and causes it to be loaded into the VM) is known as the defining loader
for the type. A type can have many initiating loaders but only one
defining loader.

Additionally, every thread has a notion of a "context class loader". The
context class loader can be set and retrieved via methods of class Thread.
The context loader is used to inform library code that it should attempt
to load a class starting with a specific class loader.
For example, many service provider frameworks (such as an encryption
library) allow for pluggable service providers - when the framework is
initialized you pass it the name of the class that will provide the
service, and the framework typically loads the class and creates an
instance of it. If the framework is installed as an extension then the
only loaders available to load the service are the extensions loader and
the bootstrap loader. As the actual service provider class is meant to be
definable by the application, this means that the framework would be
unable to load the service provider class - neither the extensions nor the
bootstrap loader would know where to find it. To deal with this the
framework instead tries to use the context loader to load the class. The
requirement is that the application set the context loader to the correct
loader prior to initializing the framework.
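
The context loader pattern described above can be sketched in plain J2SE
code. This is only an illustration, not part of the proposal: the class
name ContextLoaderSketch and the method loadProvider are hypothetical,
java.util.ArrayList merely stands in for a real service provider class,
and no RTSJ memory areas are involved.

```java
import java.net.URL;
import java.net.URLClassLoader;

public class ContextLoaderSketch {

    // Hypothetical framework entry point: loads the named provider class,
    // starting the search at the calling thread's context class loader
    // rather than at the loader that defined the framework itself.
    static Object loadProvider(String className) {
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        if (loader == null) {
            // No context loader set: fall back to the application loader.
            loader = ClassLoader.getSystemClassLoader();
        }
        try {
            Class<?> providerClass = Class.forName(className, true, loader);
            return providerClass.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("cannot load " + className, e);
        }
    }

    public static void main(String[] args) {
        // Assumption: an application-defined leaf loader, parented by the
        // loader that defined this class. Its search path is empty here, so
        // every request ends up being delegated to its ancestors.
        ClassLoader appLoader = new URLClassLoader(
                new URL[0], ContextLoaderSketch.class.getClassLoader());

        Thread current = Thread.currentThread();
        ClassLoader previous = current.getContextClassLoader();
        current.setContextClassLoader(appLoader);  // set before "initializing"
        try {
            Object provider = loadProvider("java.util.ArrayList");
            // appLoader is only an initiating loader for this type; the
            // defining loader is whichever ancestor found the class.
            System.out.println("provider class:  " + provider.getClass().getName());
            System.out.println("defining loader: " + provider.getClass().getClassLoader());
        } finally {
            current.setContextClassLoader(previous);  // restore for other code
        }
    }
}
```

Restoring the previous context loader in the finally block keeps the
change local to the framework initialization. Under the rules proposed
above, the memory area in which a loader such as appLoader is created
would also determine where the Class objects it defines are allocated.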