Department of Computer Science

Automating Service Management

Dr. Tiina Niklander (University of Helsinki, Finland)

To improve the availability and cost-effectiveness of computer-based services, we need to find ways to automate the maintenance and management of the systems that provide them. To raise availability above current levels, the entire system may need to react to changes within milliseconds, which is faster than any human expert could respond. Besides shortening reaction times, removing maintenance personnel from the loop would also help to reduce costs. More urgently still, even if cost were not an issue, there is a global shortage of expert maintenance personnel.

IBM's autonomic computing initiative gives us a target: a system that has all kinds of self-* properties. Such a system is self-aware and, based on this awareness, is able to optimise and even reconfigure itself. The eventual goal is a system that is capable of managing itself. The maintenance personnel would still need to give guidelines for organisation and management, but should not need to perform basic maintenance tasks.

During 2007, we ran a project that created a simple proof-of-concept prototype to demonstrate an automatic service deployment mechanism in a distributed environment. The prototype architecture had a "gateway" machine for client connections. The gateway forwarded client requests to the management node, which handled deploying and starting the services. Once everything was ready, the client received the access information via the gateway node.
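
The request flow described above can be illustrated with a minimal sketch. This is not the prototype's actual code: the class names `Gateway` and `ManagementNode`, the host names, and the "fewest services" placement rule are all assumptions made for illustration only.

```python
from dataclasses import dataclass, field


@dataclass
class ManagementNode:
    """Hypothetical management node: deploys and starts services on hosts."""
    hosts: list[str]
    running: dict[str, str] = field(default_factory=dict)  # service -> host

    def deploy(self, service: str) -> str:
        # Reuse the existing deployment if the service is already running.
        if service not in self.running:
            # Assumed placement rule: choose the host running the fewest services.
            load = {h: 0 for h in self.hosts}
            for h in self.running.values():
                load[h] += 1
            self.running[service] = min(self.hosts, key=lambda h: load[h])
        return self.running[service]


@dataclass
class Gateway:
    """Hypothetical gateway: the only machine that clients connect to."""
    manager: ManagementNode

    def request(self, service: str) -> dict[str, str]:
        # Forward the client request to the management node ...
        host = self.manager.deploy(service)
        # ... and hand the access information back to the client.
        return {"service": service, "host": host}
```

A client would call `Gateway(ManagementNode(["node-a", "node-b"])).request("db")` and receive a small record naming the host on which the service was started. In a real deployment the two classes would run on separate machines and communicate over the network.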

The management node is the key component of this architecture. It makes autonomous decisions about where the requested services are placed. It must also monitor the system to detect failures and inconsistencies. The goal is naturally to improve the availability of the services. If intelligent enough, the management node could make these decisions entirely without human intervention; otherwise it can serve simply as an alerting mechanism for the maintenance personnel.