Summary of failure scenarios

| Failure | Condition | Reported with | Operation interrupted? | Impact | Response (measure) | Data inconsistency on later switch to targets? |
|---|---|---|---|---|---|---|
| Source or target unit | Protected | NJD0012 (1) | No | - | Customer support (measure A) | No |
| Source or target unit | Unprotected, ON-ERR=*CONT | NJD0012 (1), NDE0020 | No | Performance, if source unit is affected | Customer support; note inconsistency (measure B) | Yes, if target unit is affected |
| Source or target unit | Unprotected, ON-ERR=*HOLD | NJD0012 (1), NDE0020 | REMOUNT NKVD014 | Applications wait | Measure A or continue with remaining unit | No / Yes |
| Single remote link | | NJD0012 (1), NDE0010 | No | Write performance | Customer support | No |
| Last remote link | ON-ERR=*CONT | NJD0012 (1), NDE0010, NDE0012 | No | - | Customer support; note inconsistency | Yes |
| Last remote link | ON-ERR=*HOLD | NJD0012 (1), NDE0010, NDE0020, NDE0012 | REMOUNT NKVD014 | Applications wait for response | Measure A or continue with remaining unit | No / Yes |
| Storage system with source units | | PGER message | Yes | | Measure A | Possible (2) |
| Local system | - | .. | Yes | - | Restart | No |
| Complete failure (3) | | .. | Yes | | Measure A | Possible (2) |
| Failback to local storage system | | - | Yes | | Measure B | |

(1) NJD0012 messages are not supported for x86 servers.

(2) Data inconsistency on a later switch to the targets is possible if neither synchronous nor asynchronous (SRDF/A) processing mode is set, or if errors have already occurred on remote links or target units.

(3) Complete failure: failure of the local storage system with source units and failure of the local system.
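
The effect of the ON-ERR setting can be restated as a small lookup. The following Python sketch is only an illustration: the names `OnErrOutcome`, `LAST_LINK_FAILURE` and `describe` are hypothetical and not part of any product interface, and the values are simply transcribed from the "Last remote link" rows of the summary above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OnErrOutcome:
    """One row of the failure summary (hypothetical helper type)."""
    messages: tuple
    operation_interrupted: str
    impact: str
    response: str
    inconsistency_on_switch: str


# Values transcribed from the "Last remote link" rows of the summary.
LAST_LINK_FAILURE = {
    "*CONT": OnErrOutcome(
        messages=("NJD0012", "NDE0010", "NDE0012"),
        operation_interrupted="No",
        impact="-",
        response="Customer support; note inconsistency",
        inconsistency_on_switch="Yes",
    ),
    "*HOLD": OnErrOutcome(
        messages=("NJD0012", "NDE0010", "NDE0020", "NDE0012"),
        operation_interrupted="REMOUNT NKVD014",
        impact="Applications wait for response",
        response="Measure A or continue with remaining unit",
        inconsistency_on_switch="No / Yes",
    ),
}


def describe(on_err: str) -> str:
    """Return a one-line summary for the given ON-ERR setting."""
    outcome = LAST_LINK_FAILURE[on_err]
    return (
        f"ON-ERR={on_err}: interrupted={outcome.operation_interrupted}; "
        f"response={outcome.response}"
    )
```

Calling `describe("*HOLD")` shows the trade-off at a glance: with *HOLD the applications wait for a response, whereas with *CONT operation continues and an inconsistency on a later switch to the targets has to be expected.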

Failure recovery measures

| Measure | Description | Condition | Action | Command |
|---|---|---|---|---|
| A | Switch to target unit, local system affected | | Start standby host, attach target units | /ATTACH-DEVICE |
| A | | Source and target units were synchronized | Make target units available | /SET-REMOTE-COPY-ACCESS TARGET-ACCESS=*DIRECT |
| A | | Source and target units were not synchronized, inconsistencies acceptable (or reset to last synchronization point) | Make target units available | /SET-REMOTE-COPY-ACCESS TARGET-ACCESS=*DIRECT(PEND-UPD-ALLOWED=*YES) |
| B | Failback to the local storage system, operation on standby system | | Terminate use of target units | /EXPORT-PUBSET |
| B | | | Make target units unavailable | /SET-REMOTE-COPY-ACCESS TARGET-ACCESS=*BY-SOURCE |
| B | | | Disable all channels and remote links on the local storage system | (Service) |
| B | | | Start local storage system | (Service) |
| B | | Local storage system OK (Service will check) | Attach and enable remote links | (Service) |
| B | | Comparison OK / automatic synchronization begun? | Attach channels | (Service) |
| B | | | Start local system | |
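
The choice between the two /SET-REMOTE-COPY-ACCESS variants in measure A depends only on whether the source and target units were synchronized. A minimal Python sketch of that decision follows. The function `measure_a_commands` and its parameters are hypothetical; the command strings are copied from the measures above, and the unit operands are deliberately left out because they are not given here.

```python
def measure_a_commands(units_synchronized: bool,
                       accept_inconsistency: bool = False) -> list:
    """Return the measure A command sequence (without unit operands).

    Hypothetical helper: it only mirrors the conditions of measure A.
    """
    # Step 1: start the standby host and attach the target units.
    commands = ["/ATTACH-DEVICE"]

    # Step 2: make the target units available.
    if units_synchronized:
        commands.append("/SET-REMOTE-COPY-ACCESS TARGET-ACCESS=*DIRECT")
    elif accept_inconsistency:
        # Units were not synchronized; pending updates must be allowed explicitly.
        commands.append(
            "/SET-REMOTE-COPY-ACCESS TARGET-ACCESS=*DIRECT(PEND-UPD-ALLOWED=*YES)"
        )
    else:
        raise ValueError(
            "units not synchronized and inconsistencies not accepted: "
            "measure A cannot make the target units available"
        )
    return commands
```

Raising an error in the last branch is a design choice of this sketch: it forces the operator to decide deliberately whether inconsistent data on the target units is acceptable before they are released for direct access.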


Special information on failure scenarios with SRDF/A

SRDF/A always builds on an existing SRDF replication (see "SRDF/A configurations"). Restarting SRDF/A after a failure therefore takes two steps: SRDF replication must be restarted first (as described in the sections above), and only then can the SRDF/A session be reactivated.

If a failure occurs, the following should be noted with regard to SRDF/A replication.

  • SRDF link failure

    • Temporary failure:
      SRDF/A can compensate for temporary failures of SRDF links. A tolerance interval of 0 to 10 seconds can be configured in the storage system; SRDF/A tolerates an SRDF link failure for this long. If the links are reestablished within this interval, the application is not affected. Once the interval expires, the failure is treated as a permanent failure.

    • Permanent failure:
      The SRDF/A session is automatically terminated in the event of a permanent failure. The data on the target side is consistent. Once the links are reestablished, SRDF operation can be resumed using normal SRDF recovery procedures and a new SRDF/A session can be activated.

  • Available cache for SRDF/A in the local storage system is full

    If the I/O load for the local storage system, the available bandwidth for SRDF/A replication and the cache size of the storage system are not (or no longer) correctly configured, the entire write cache for SRDF/A in the local storage system may be used up.

    In this case, customer support can configure one of two alternative behaviors:

    • The application is slowed down to the transmission speed of the SRDF links. This means that during this period performance is poorer than with synchronous SRDF mode in the same configuration.

    • The SRDF/A session is terminated immediately and automatically. Termination can be delayed by a configurable time interval (the default setting is 0 seconds). The application is slowed during this interval. If the bottleneck is cleared within this time interval, the SRDF/A session is continued; otherwise it is terminated.

  • Disaster recovery, failback procedure on the target side

    Data on the target side is consistent in the event of a failure. The failback procedure is the same as that for SRDF. After a failback, SRDF/A can be reactivated as soon as the application is available again on the local server.
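
The timing rules for SRDF/A described above (link failure tolerance and delayed termination when the cache is full) can be sketched in a few lines of Python. This is an illustrative model only: the names `LinkFailure`, `classify_link_failure` and `session_survives_cache_full` are hypothetical, and treating an outage that lasts exactly as long as the configured interval as still tolerated is an assumption of this sketch.

```python
from enum import Enum


class LinkFailure(Enum):
    TEMPORARY = "temporary"   # no impact on the application
    PERMANENT = "permanent"   # SRDF/A session is terminated automatically


def classify_link_failure(outage_seconds: float,
                          tolerance_seconds: float) -> LinkFailure:
    """Classify an SRDF link outage against the configured tolerance interval."""
    if not 0 <= tolerance_seconds <= 10:
        raise ValueError("tolerance interval must be between 0 and 10 seconds")
    # Assumption: an outage equal to the interval still counts as tolerated.
    if outage_seconds <= tolerance_seconds:
        return LinkFailure.TEMPORARY
    return LinkFailure.PERMANENT


def session_survives_cache_full(bottleneck_seconds: float,
                                delay_seconds: float = 0.0) -> bool:
    """Model the second cache-full alternative (delayed termination).

    The session continues only if the bottleneck clears within the
    configured delay; the default delay of 0 seconds means immediate
    termination for any bottleneck of positive duration.
    """
    return bottleneck_seconds <= delay_seconds
```

For example, with a tolerance interval of 10 seconds a 3-second link outage is classified as temporary, while an 11-second outage is permanent and requires the normal SRDF recovery procedure followed by activation of a new SRDF/A session.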