Fault tolerance in this context means that a UTM application remains operational even when errors in individual program units force openUTM to abort a transaction. In such cases, openUTM terminates and reloads the application program so that the error cannot spread and adversely affect other users of the application or their data.
With regard to the error behavior of openUTM, a distinction is made between:
Internal UTM errors and errors in the system environment
These errors cause the application to terminate abnormally, in the same way as the administration command KDCSHUT KILL or a KDCADMI call with operation code KC_SHUTDOWN and subcode KC_KILL.
openUTM creates a UTM dump for each process of the application. The UTM dump is edited using the UTM tool KDCDUMP. A description of this procedure can be found in the openUTM manual “Messages, Debugging and Diagnostics on BS2000 Systems”.
Errors in the application program
These are errors in program units. They can be divided into two groups:
errors that lead to the reloading of the application
errors that permit the program to continue.
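The classification above can be summarized as a small lookup table. The following Python sketch is purely illustrative and is not part of openUTM: the names ErrorClass, Reaction, REACTIONS, reaction_for and application_stays_available are hypothetical and exist only to model the three cases described in this section.

```python
from enum import Enum, auto


class ErrorClass(Enum):
    """Hypothetical labels for the error classes described in the text."""
    INTERNAL_OR_ENVIRONMENT = auto()   # internal UTM error or system environment error
    PROGRAM_UNIT_RELOAD = auto()       # program unit error that leads to a reload
    PROGRAM_UNIT_CONTINUE = auto()     # program unit error that permits continuation


class Reaction(Enum):
    """Hypothetical labels for the reactions described in the text."""
    ABNORMAL_TERMINATION_WITH_DUMP = auto()  # application ends, one UTM dump per process
    RELOAD = auto()                          # terminated and reloaded
    CONTINUE = auto()                        # program continues


# Mapping taken directly from the classification in this section.
REACTIONS = {
    ErrorClass.INTERNAL_OR_ENVIRONMENT: Reaction.ABNORMAL_TERMINATION_WITH_DUMP,
    ErrorClass.PROGRAM_UNIT_RELOAD: Reaction.RELOAD,
    ErrorClass.PROGRAM_UNIT_CONTINUE: Reaction.CONTINUE,
}


def reaction_for(error_class: ErrorClass) -> Reaction:
    """Return the modeled reaction for a given error class."""
    return REACTIONS[error_class]


def application_stays_available(error_class: ErrorClass) -> bool:
    """True if, in this model, the application as a whole keeps running."""
    return reaction_for(error_class) is not Reaction.ABNORMAL_TERMINATION_WITH_DUMP
```

In this model, only the first error class ends the application as a whole; both groups of program unit errors leave it available to other users.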