An architecture for self-healing autonomous object groups
Original version
In 4th International Conference on Autonomic and Trusted Computing, volume 4610 of Lecture Notes in Computer Science, pages 156-168, Hong Kong, China, July 2007. The final publication is available at www.springerlink.com
Abstract
Jgroup/ARM is a middleware for developing and operating
dependable distributed Java applications. Jgroup integrates the distributed
object model of Java RMI with the object group paradigm, enabling
construction of replicated servers that offer dependable services
to clients. ARM aims to improve the dependability characteristics of
systems through fault treatment, focusing on operational aspects where
the gain in terms of improved dependability is likely to be the greatest.
ARM offers two core mechanisms: recovery from node, object and network
failures and distribution of replicas. ARM identifies failures and
reconfigures the system according to its dependability requirements.
This paper proposes an enhancement of the ARM framework in which
replica placement is performed in a distributed manner, eliminating the
need for a centralized manager with global information about all object
groups. Instead, each autonomous object group handles its own replica
placement based on information from nodes. Assuming that multiple object
groups are deployed in the system, this constitutes a distributed
replica placement scheme. This scheme enables the implementation of
self-healing object groups that can perform fault treatment on themselves.
The approach has three advantages: (a) there is no need to maintain global
information about all object groups, which is costly and limits scalability,
(b) infrastructure complexity is reduced, and (c) communication overhead is lower.
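
To make the idea of group-local replica placement concrete, the following is a minimal Java sketch, not the Jgroup/ARM implementation. The class `SelfHealingGroup`, its methods, the integer load values, and the least-loaded placement policy are all assumptions made for illustration; the abstract states only that each group places its own replicas based on information from nodes.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical sketch: one object group deciding its own replica placement. */
public class SelfHealingGroup {
    private final String name;
    private final int targetReplicas;                  // assumed dependability requirement
    private final Set<String> replicaNodes = new TreeSet<>();

    public SelfHealingGroup(String name, int targetReplicas) {
        this.name = name;
        this.targetReplicas = targetReplicas;
    }

    /** Assumed policy: least-loaded node not already hosting a replica; ties broken by name. */
    public Optional<String> chooseNode(Map<String, Integer> nodeLoads) {
        String best = null;
        for (Map.Entry<String, Integer> e : nodeLoads.entrySet()) {
            String node = e.getKey();
            if (replicaNodes.contains(node)) {
                continue;                              // at most one replica per node
            }
            if (best == null
                    || e.getValue() < nodeLoads.get(best)
                    || (e.getValue().equals(nodeLoads.get(best)) && node.compareTo(best) < 0)) {
                best = node;
            }
        }
        return Optional.ofNullable(best);
    }

    /** Add replicas until the target is met or no candidate node remains. */
    public void heal(Map<String, Integer> nodeLoads) {
        while (replicaNodes.size() < targetReplicas) {
            Optional<String> node = chooseNode(nodeLoads);
            if (!node.isPresent()) {
                break;                                 // not enough nodes; stay degraded
            }
            replicaNodes.add(node.get());
        }
    }

    /** The group reacts to a failure itself, using only node-reported load information. */
    public void onNodeFailure(String failedNode, Map<String, Integer> nodeLoads) {
        replicaNodes.remove(failedNode);
        Map<String, Integer> live = new HashMap<>(nodeLoads);
        live.remove(failedNode);                       // never place on the failed node
        heal(live);
    }

    public Set<String> replicas() {
        return new TreeSet<>(replicaNodes);
    }

    public static void main(String[] args) {
        Map<String, Integer> loads = new HashMap<>();
        loads.put("n1", 3);
        loads.put("n2", 1);
        loads.put("n3", 2);
        loads.put("n4", 0);

        SelfHealingGroup group = new SelfHealingGroup("g1", 3);
        group.heal(loads);
        System.out.println(group.replicas());          // prints [n2, n3, n4]

        group.onNodeFailure("n4", loads);
        System.out.println(group.replicas());          // prints [n1, n2, n3]
    }
}
```

In this sketch no component holds a table of all groups: each group instance keeps only its own replica set and consults node load data when it needs to act, which mirrors the scalability argument made above.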