One failure scenario you must prepare for is the loss of the communication link between the database nodes, in which case each database server might assume that the other one is down. This can lead to a dual-Primary situation (split brain), and you might lose transactions when the databases are later synchronized. To prevent HAC from making a wrong decision, you can use a network reference device, called an External Reference Entity (ERE), to check the health of the network. For example, if a network adapter fails in one computer, HAC can detect this and set the correct database node to continue as the Primary database, while the other one continues as the Secondary.
If ERE is used, HAC checks the status of the physical link between the HotStandby node and the ERE device by pinging the ERE. If the physical link to the nearest ERE is not operational, the local HAC sets the local server to the SECONDARY ALONE state. If the nearest physical link is operational but no connection is available to the other server, the local HAC concludes that the local server is the one to continue offering the service, and sets it to PRIMARY ALONE. Consequently, a HotStandby node that loses its connection both to the opposite HotStandby node and to the nearest ERE becomes the Secondary. In this way, a dual-Primary (split-brain) situation is prevented in the case of a network failure.
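The decision rule described above can be summarized in a short sketch. This is only an illustration of the logic as stated in this section, not HAC's actual implementation; the function name decide_local_state and its two boolean inputs are hypothetical, and the sketch assumes the results of the ERE ping and the peer connection check are already known.

```python
from typing import Optional

SECONDARY_ALONE = "SECONDARY ALONE"
PRIMARY_ALONE = "PRIMARY ALONE"


def decide_local_state(ere_reachable: bool, peer_reachable: bool) -> Optional[str]:
    """Return the state the local HAC would set, or None if no change is implied.

    Hypothetical helper: both inputs are assumed to come from checks made
    elsewhere (the ERE ping and the connection to the other server).
    """
    if not ere_reachable:
        # The physical link to the nearest ERE is down: the local node
        # cannot trust its own view of the network, so it steps back.
        return SECONDARY_ALONE
    if not peer_reachable:
        # The local link is healthy but the other server is unreachable:
        # the local node continues offering the service.
        return PRIMARY_ALONE
    # Both the ERE and the other server are reachable: this section does
    # not describe any state change for that case.
    return None


# Example: the network adapter of node A fails, node B is healthy.
node_a = decide_local_state(ere_reachable=False, peer_reachable=False)
node_b = decide_local_state(ere_reachable=True, peer_reachable=False)
print(node_a, "/", node_b)
```

In the example, node A can reach neither the ERE nor node B, while node B can still reach the ERE, so only node B continues as the Primary and a dual-Primary situation is avoided.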
The figure above depicts two possible locations of ERE:
▪ The cluster switch
▪ Any computer in the network outside the cluster. If a redundant network (that is, duplicate network controllers, cables and switches) is used in a cluster, define ERE outside the cluster.
Important: If the HotStandby link is considered unreliable (including all the cases where ERE is used), the following HotStandby server parameter must be set to its factory value:
HotStandby.AutoPrimaryAlone=no
ERE must use the same HSB link that the keepalive messages do.