Actions performed by the node applications if a failure is detected

A node application is assumed to have failed if both of the following apply: the monitored application does not respond to the messages within the configured reply time, taking the configured number of retries into account, and the KDCFILE of the monitored application then shows that the application is no longer running but was not terminated normally.
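The condition above combines three checks. The following minimal sketch models it as a predicate; the class and field names are hypothetical and only illustrate the logic, they are not part of openUTM.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MonitoringObservation:
    """Hypothetical summary of what the monitoring node application observed."""

    # True if a reply arrived within the reply time, retries included.
    replied_within_retries: bool
    # True if the KDCFILE indicates that the application is still running.
    running_per_kdcfile: bool
    # True if the KDCFILE indicates a normal termination.
    terminated_normally: bool


def is_assumed_failed(obs: MonitoringObservation) -> bool:
    """Return True only if all three conditions for a failure are met."""
    return (
        not obs.replied_within_retries
        and not obs.running_per_kdcfile
        and not obs.terminated_normally
    )
```

In this model, a missing reply alone is not treated as a failure: the state recorded in the KDCFILE must also show that the application has stopped without terminating normally.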

If failure or abnormal termination of the monitored node application is detected, openUTM proceeds as follows:

  • The node application is flagged as failed in the cluster configuration file and removed from the monitoring relationships.

  • If you have specified a failure script during UTM generation, the monitoring node application starts this script on its own computer. The following data about the failed application is passed to the failure script:

    • the application name

    • the base name of the node application

    • the host name

    • the virtual host name or blanks

    • the reference name of the node application

    • the error code of the UTM dump (Term Application Reason)

    openUTM manual “Generating Applications”, CLUSTER statement
    To configure the failure script, specify the operand FAILURE-CMD. This operand passes a command string containing a command to be executed and any arguments.

  • The monitoring node application starts a restart monitoring timer if you have configured one.

    openUTM manual “Generating Applications”, CLUSTER statement
    To configure the restart monitoring timer, specify the operand RESTART-TIMER-SEC. This specifies the maximum time in seconds that a node application requires for a warm start after a failure.

  • If you have specified an emergency script during UTM generation, the monitoring node application starts this script if the failed node application has not become available again when the restart monitoring timer expires. The following data about the failed application is passed to the emergency script:

    • the application name

    • the base name of the node application

    • the host name

    • the virtual host name or blanks

    • the reference name of the node application

    • the error code of the UTM dump (Term Application Reason)

    openUTM manual “Generating Applications”, CLUSTER statement
    To configure the emergency script, specify the operand EMERGENCY-CMD. This operand passes a command string containing a command to be executed and any arguments.
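The order of the actions described above can be summarized in a small model. This is a sketch only: the function and parameter names are hypothetical, and it assumes that the emergency script is not started when no restart monitoring timer is configured, which the text above does not state explicitly.

```python
from typing import Callable, List, Optional


def handle_node_failure(
    failure_cmd: Optional[str],
    emergency_cmd: Optional[str],
    restart_timer_sec: Optional[int],
    node_available_after_timer: Callable[[], bool],
) -> List[str]:
    """Return the ordered actions for one detected failure (illustrative model)."""
    actions = [
        "flag node as failed",
        "remove node from monitoring relationships",
    ]
    # The failure script is started as soon as the failure is detected.
    if failure_cmd:
        actions.append(f"run failure script: {failure_cmd}")
    # Assumption: the emergency script depends on a configured timer.
    if restart_timer_sec:
        actions.append(f"start restart monitoring timer: {restart_timer_sec}s")
        # The callback stands in for the availability check at timer expiry.
        if emergency_cmd and not node_available_after_timer():
            actions.append(f"run emergency script: {emergency_cmd}")
    return actions
```

For example, calling `handle_node_failure("failure.sh", "emergency.sh", 60, lambda: False)` yields all five actions in order, whereas passing `lambda: True` omits the emergency script because the node application became available again in time. The script names here are placeholders.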

Sample script on detection of a failure

Sample failure and emergency scripts are supplied with openUTM. These examples output the parameters passed when they are called. If you wish to use the samples in a live environment, you must adapt them to suit the requirements of the relevant cluster.
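To illustrate what such a script does, the following sketch prints the passed parameters with labels. It is not one of the supplied samples. It assumes that the six values arrive as positional arguments in the order in which they are listed above; check this order against the supplied sample scripts before relying on it.

```python
import sys
from typing import List, Sequence

# Labels follow the order of the parameter list in this section.
# Treating that order as the positional order is an assumption.
PARAMETER_LABELS = (
    "application name",
    "base name of the node application",
    "host name",
    "virtual host name",
    "reference name of the node application",
    "error code of the UTM dump",
)


def describe_parameters(argv: Sequence[str]) -> List[str]:
    """Pair each expected parameter with its value, marking blank or missing ones."""
    lines = []
    for index, label in enumerate(PARAMETER_LABELS):
        value = argv[index] if index < len(argv) else ""
        # The virtual host name may be passed as blanks.
        shown = value if value.strip() else "(blank)"
        lines.append(f"{label}: {shown}")
    return lines


if __name__ == "__main__":
    for line in describe_parameters(sys.argv[1:]):
        print(line)
```

A live script would replace the output with cluster-specific actions, for example notifying an operator.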

Unix and Linux systems

The following sample scripts are supplied in the directory utmpath/shsc:

  • utm-c.emergency

  • utm-c.failure

Windows systems

The following sample scripts are supplied in the directory utmpath\shsc:

  • utm-c.emergency.cmd

  • utm-c.failure.cmd