JavaParty
A distributed companion to Java
Current release 1.9.5
Bernhard Haumacher, Thomas Moschny and Michael Philippsen
Transparent remote objects
JavaParty allows easy porting of multi-threaded Java programs to distributed environments such as clusters. Regular Java already supports parallel applications with threads and synchronization mechanisms. While multi-threaded Java programs are limited to a single address space, JavaParty extends the capabilities of Java to distributed computing environments.
The usual way of porting a parallel application to a distributed environment is to use a communication library. Java's Remote Method Invocation (RMI) renders the implementation of communication protocols unnecessary, but still leads to increased program complexity. The reasons for the increased complexity are the limited capabilities of RMI and the additional functionality that must be implemented for creating and accessing remote objects.
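To make the extra work concrete, here is a minimal sketch in plain Java RMI, not JavaParty code. The names `RmiCounterDemo`, `Counter`, `CounterImpl` and `callThroughStub` are hypothetical and chosen only for illustration. For brevity the sketch runs in a single JVM and skips the RMI registry that a real deployment would typically also need.

```java
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.rmi.server.UnicastRemoteObject;

class RmiCounterDemo {
    /** Remotely callable methods must be declared in an interface extending Remote. */
    public interface Counter extends Remote {
        int increment() throws RemoteException; // every remote method declares this
    }

    /** The implementation lives in a separate class behind that interface. */
    static class CounterImpl implements Counter {
        private int value;

        @Override
        public synchronized int increment() {
            return ++value;
        }
    }

    /** Exports a counter, calls it n times through its stub, returns the last result. */
    static int callThroughStub(int n) throws RemoteException {
        // Keep the demo on the loopback interface (standard RMI system property).
        System.setProperty("java.rmi.server.hostname", "127.0.0.1");
        CounterImpl impl = new CounterImpl();
        // Explicit export step; port 0 lets the runtime pick a free port.
        Counter stub = (Counter) UnicastRemoteObject.exportObject(impl, 0);
        try {
            int last = 0;
            for (int i = 0; i < n; i++) {
                last = stub.increment(); // invoked via the stub, not directly on impl
            }
            return last;
        } finally {
            // Explicit cleanup so the JVM can exit.
            UnicastRemoteObject.unexportObject(impl, true);
        }
    }

    public static void main(String[] args) throws RemoteException {
        System.out.println(callThroughStub(3));
    }
}
```

Even in this small case the remote interface, the checked `RemoteException`, and the export and unexport steps are all code that a purely multi-threaded version of the same counter would not need.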
The JavaParty approach is different. JavaParty classes can be declared as remote. While regular Java classes are limited to one Java virtual machine, remote classes and their instances are visible and accessible anywhere in the distributed JavaParty environment. As far as remote classes are concerned, the JavaParty environment can be viewed as a Java virtual machine that is distributed over several computers.
Creating and accessing remote classes is syntactically indistinguishable from creating and accessing regular Java classes.
Location transparency ends where performance considerations come into play. Even in state-of-the-art distributed systems (KaRMI, uka.transport, ParaStation), the access latency to a remote object is orders of magnitude higher than to a local object. A parallel program executed in a distributed JavaParty environment can therefore utilize the full power of the parallel machine only if communication is minimized. Object migration is one way of adapting the distribution layout to changing locality requirements of the application. Manual object placement is also possible. For details see the JavaParty syntax section later on.
Unless remote objects are declared to be resident, they can migrate from one node to another within the distributed JavaParty environment. This is an important capability even though remote objects are syntactically indistinguishable from regular Java objects and their location within the distributed environment does not influence the semantics of a JavaParty program.
Besides optimizing parallel distributed read access, collective replicated objects are a new object-oriented way of expressing data-parallel operations in the bulk-synchronous model. Data-parallel algorithms are widespread in high-performance computing applications, because they can reach a high degree of parallelism and are able to process large amounts of data. Collective replication is a new form of object replication, which allows a seamless integration of control and data parallelism in an object-oriented language. This is an important contribution, since control parallelism was believed to match object-orientation, while data parallelism was restricted to array structures in procedural languages with explicit message passing. Collective replication integrates data parallelism into an object-oriented language without conflicting with inheritance, modularization or encapsulation.
All language extensions are automatically transformed back to pure Java. This ensures full portability of the generated code. In the extended language, all concepts like easy object orientation, garbage collection, built-in parallelism and coordination, which made Java popular, are still usable in the distributed environment. This enables easy programming of cluster computers without abstaining from the comfort of an object-oriented language. A prototype shows that the extensions make porting of a parallel application to a distributed cluster environment particularly easy. If there is a data-parallel decomposition of the problem, collective replication even eases the distributed parallelization of sequential programs. With collective replication, the parallelization does not require modifications to the algorithm itself. Guided by an annotation, the data structures are automatically transformed to provide operations for reestablishing consistency after a data-parallel modification. There is no need for any additional coding. The transformation is based on a library for extended remote method invocation for cluster computers, which provides communication primitives for collective replication and transparent remote access. Both enable the efficient execution of computing-intensive applications in clusters.
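The bulk-synchronous pattern of a data-parallel modification followed by a step that reestablishes consistency can be sketched in plain Java as a single-JVM analogy. This is not the JavaParty API or its generated code: the class `BulkSynchronousSketch`, the method `run`, and the use of threads with a `CyclicBarrier` in place of cluster nodes and collective communication are all assumptions made for illustration. Each worker holds its own replica of the data and modifies only its own slice. At the end of every superstep, the barrier action copies each slice into all other replicas.

```java
import java.util.Arrays;
import java.util.concurrent.CyclicBarrier;

class BulkSynchronousSketch {
    /** Runs `steps` supersteps; in each one every worker doubles its own slice. */
    static int[] run(int[] initial, int workers, int steps) throws InterruptedException {
        int n = initial.length;
        // One full replica of the data per worker (stand-in for one copy per node).
        int[][] replicas = new int[workers][];
        for (int w = 0; w < workers; w++) {
            replicas[w] = initial.clone();
        }

        // Barrier action: runs once per superstep after all workers have arrived
        // and copies each owner's slice into every other replica.
        CyclicBarrier barrier = new CyclicBarrier(workers, () -> {
            for (int owner = 0; owner < workers; owner++) {
                int from = owner * n / workers;
                int to = (owner + 1) * n / workers;
                for (int w = 0; w < workers; w++) {
                    if (w != owner) {
                        System.arraycopy(replicas[owner], from, replicas[w], from, to - from);
                    }
                }
            }
        });

        Thread[] threads = new Thread[workers];
        for (int w = 0; w < workers; w++) {
            final int id = w;
            threads[w] = new Thread(() -> {
                int from = id * n / workers;
                int to = (id + 1) * n / workers;
                try {
                    for (int s = 0; s < steps; s++) {
                        // Data-parallel phase: touch only the locally owned slice.
                        for (int i = from; i < to; i++) {
                            replicas[id][i] *= 2;
                        }
                        barrier.await(); // end of superstep, replicas are merged
                    }
                } catch (Exception e) {
                    throw new IllegalStateException(e);
                }
            });
            threads[w].start();
        }
        for (Thread t : threads) {
            t.join();
        }
        return replicas[0]; // all replicas are identical after the last barrier
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(Arrays.toString(run(new int[] {1, 2, 3, 4, 5}, 2, 3)));
    }
}
```

In this sketch the merge step is written by hand. The text above describes the corresponding consistency operations as being generated automatically from an annotation.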
For comments and bug reports please use the JavaParty users mailing list.
Page design & maintenance: Bernhard Haumacher.
Last update: Fri Mar 30 18:46:00 GMT+01:00 2007
Java is a trademark of Sun Microsystems.