System management issues

There are a number of important issues related to the run-time management of the control system.

The main points are as follows:

The whole control system must be started and stopped from a centralized procedure [RD01 - 5.1.2 Procedures]. This procedure must be able to:

Start/stop processes in a predefined order, handling dependencies between processes

Verify the health of the system after startup/shutdown
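The ordering and health-check requirements above can be illustrated with a minimal sketch. It assumes that dependencies are declared as a simple mapping and uses Python's standard graphlib module to derive a start order; the process names and the helper functions startup_order, shutdown_order and unhealthy are hypothetical and are not part of the ESO Common Software.

```python
from graphlib import TopologicalSorter

# Hypothetical process names; each one maps to the processes it depends on.
DEPENDENCIES = {
    "naming_service": set(),
    "logging": {"naming_service"},
    "database": {"naming_service"},
    "device_control": {"logging", "database"},
}


def startup_order(dependencies):
    """Return process names so that each process follows its dependencies."""
    # static_order() raises graphlib.CycleError if the dependencies are cyclic.
    return list(TopologicalSorter(dependencies).static_order())


def shutdown_order(dependencies):
    """Shutdown is the reverse of startup: dependents are stopped first."""
    return list(reversed(startup_order(dependencies)))


def unhealthy(processes, is_running, should_run):
    """Return the processes whose state differs from the expected one.

    is_running is a caller-supplied check (an assumption of this sketch);
    should_run is True after startup and False after shutdown.
    """
    return [name for name in processes if is_running(name) != should_run]
```

A centralized procedure would start the processes in the order returned by startup_order, stop them in the order returned by shutdown_order, and treat a non-empty result from unhealthy as a failed startup or shutdown.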

At run time, a management application must monitor the system in order to:

Detect processes that terminate (possibly restarting them automatically) and notify other interested processes of the termination

Identify locked processes

Identify lack of resources
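As an illustration of the first monitoring task only, the following sketch supervises child processes, reports each termination to subscribed callbacks, and restarts a terminated process a limited number of times. The class ProcessMonitor, its methods and the max_restarts parameter are hypothetical names chosen for this example. Detecting locked processes or a lack of resources would need additional mechanisms (for example heartbeats or operating-system metrics) that are not shown here.

```python
import subprocess
import sys


class ProcessMonitor:
    """Hypothetical supervisor: detects terminated children and restarts them."""

    def __init__(self, commands, max_restarts=1):
        # commands maps a process name to its argument list.
        self.commands = dict(commands)
        self.max_restarts = max_restarts
        self.restarts = {name: 0 for name in self.commands}
        self.listeners = []
        self.children = {
            name: subprocess.Popen(cmd) for name, cmd in self.commands.items()
        }

    def subscribe(self, callback):
        """Register callback(name, exit_code) to be told about terminations."""
        self.listeners.append(callback)

    def poll_once(self):
        """Check every child once; notify listeners and restart if allowed."""
        for name, child in list(self.children.items()):
            exit_code = child.poll()
            if exit_code is None:
                continue  # still running
            for notify in self.listeners:
                notify(name, exit_code)
            if self.restarts[name] < self.max_restarts:
                self.restarts[name] += 1
                self.children[name] = subprocess.Popen(self.commands[name])
            else:
                del self.children[name]  # give up on this process


if __name__ == "__main__":
    # Usage example: a child that exits immediately with status 3.
    monitor = ProcessMonitor({"worker": [sys.executable, "-c", "raise SystemExit(3)"]})
    monitor.subscribe(lambda name, code: print(f"{name} terminated with {code}"))
    monitor.children["worker"].wait()
    monitor.poll_once()
```

In a real management application poll_once would be called periodically, and the notification would go to other processes through the messaging facility of the control system rather than to in-process callbacks.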

The mechanism used by the ESO Common Software is very well tested and satisfies all requirements (see next paragraph).

Alternatively, another environment for network and distributed system management could be investigated. Candidates for implementing these concepts include SNMP, Unicenter TNG [RD20] and Tivoli [RD21].

Management of the control system should be possible without requiring root access.