Amer Networks E5Web GUI User Manual

3. The inactive (slave) unit reconfigures to activate the new database files.

4. The active (master) unit now reconfigures to activate the new database files, causing a
failover to the slave unit. The slave is now the active unit.

5. After reconfiguration of the master is complete, failover occurs again so that the master
once again becomes the active unit.
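
The role changes in steps 3 to 5 can be traced with a small Python sketch. This is only an illustrative model, not cOS Core behavior captured from a device: the function name and event labels are hypothetical, and it assumes the master is the active unit when step 3 begins, as the step text implies.

```python
from typing import List, Tuple


def reconfigure_sequence() -> List[Tuple[str, str]]:
    """Return (event, active unit after the event) for steps 3 to 5."""
    active = "master"  # assumption: the master is active before step 3
    events: List[Tuple[str, str]] = []

    # Step 3: the inactive slave reconfigures; the active role does not move.
    events.append(("slave reconfigures", active))

    # Step 4: the master reconfigures, which causes a failover to the slave.
    active = "slave"
    events.append(("master reconfigures", active))

    # Step 5: the master finishes, and a second failover restores its role.
    active = "master"
    events.append(("master reconfiguration complete", active))

    return events


for event, active_unit in reconfigure_sequence():
    print(f"{event}: active unit is {active_unit}")
```

The sketch makes the point that two failovers occur during the procedure and that the cluster ends with the same unit active as at the start.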

Dealing with Sync Failure

In an HA cluster, the sync connection between the master and slave can occasionally fail, with
the result that the inactive unit no longer receives heartbeats and state updates.

If such a failure occurs, both units continue to function but lose synchronization with each
other. In other words, the inactive unit no longer has a correct copy of the state of the active
unit. A failover will not occur in this situation because the inactive unit detects that
synchronization has been lost.

Failure of the sync interface causes the active unit to generate hasync_connection_failed_timeout
log messages. However, this log message is also generated whenever the inactive unit appears not
to be working, such as during a software upgrade.

Failure of the sync interface can be confirmed by comparing the output of certain CLI
commands on each unit. The number of connections can be compared using the stats
command. If IPsec tunnels are heavily used, the ipsecglobalstat -verbose command can be used
instead: significant differences in the numbers of IPsec SAs, IKE SAs, active users and IP pool
statistics indicate a failure to synchronize. Even when the sync interface is functioning
correctly, there may be small differences in the statistics from each cluster unit, but these are
minor compared with the differences seen when synchronization has failed.
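
The comparison described above can be sketched in Python once the figures from each unit have been read off by hand. This is a minimal illustration only: it does not parse the output of stats or ipsecglobalstat, and the counter names, sample values and the 10% tolerance are all assumptions chosen for the example rather than values taken from cOS Core.

```python
from typing import Dict, List


def sync_suspects(active: Dict[str, int], inactive: Dict[str, int],
                  tolerance: float = 0.10) -> List[str]:
    """Return counters whose relative difference exceeds the tolerance."""
    suspects: List[str] = []
    for name in sorted(set(active) | set(inactive)):
        a = active.get(name, 0)
        b = inactive.get(name, 0)
        largest = max(a, b)
        if largest == 0:
            continue  # both units report zero, nothing to compare
        if abs(a - b) / largest > tolerance:
            suspects.append(name)
    return suspects


# Hypothetical figures copied manually from each unit's CLI output.
active_unit = {"connections": 12000, "ipsec_sas": 400, "ike_sas": 200}
inactive_unit = {"connections": 11950, "ipsec_sas": 12, "ike_sas": 5}

print(sync_suspects(active_unit, inactive_unit))
```

A relative tolerance is used so that the small differences expected between healthy units are ignored, while the large gaps that follow a sync failure are flagged. The right threshold depends on the traffic in a given installation.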

Once the broken sync interface is fixed, perhaps by replacing the connecting cable,
synchronization between active and inactive units will not take place automatically. Instead, the
unsynchronized inactive unit must be restarted after which the following takes place:

1. During startup, the inactive unit sends a message to the active unit to flag that its state has
been initialized and it requires the entire state of the active unit to be sent.

2. The active unit then sends a copy of its entire state to the inactive unit.

3. The inactive unit then becomes synchronized, after which a failover can take place
successfully if there is a system failure.
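
The resynchronization exchange above can be modeled with a short Python sketch. It is a simplified illustration under stated assumptions: the Unit class, its fields and the restart_and_resync function are hypothetical names, the state is reduced to a dictionary, and the real message exchange over the sync interface is represented by plain assignments.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Unit:
    name: str
    state: Dict[str, int] = field(default_factory=dict)
    synchronized: bool = False


def restart_and_resync(inactive: Unit, active: Unit) -> None:
    """Model a restart of the inactive unit followed by a full state copy."""
    # The restart initializes the inactive unit's state.
    inactive.state = {}
    inactive.synchronized = False

    # The inactive unit flags that it needs the entire state, and the
    # active unit answers with a copy (not a shared reference).
    full_state = dict(active.state)

    # With the copy installed, the inactive unit is synchronized again.
    inactive.state = full_state
    inactive.synchronized = True


active = Unit("master", {"connections": 12000, "ipsec_sas": 400}, True)
inactive = Unit("slave", {"connections": 300})  # stale state after sync loss

restart_and_resync(inactive, active)
print(inactive.synchronized, inactive.state == active.state)
```

The model shows why fixing the cable alone is not enough: in this sketch, as in the description above, the full state copy only happens as part of the inactive unit's startup.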

Note: An inactive unit restart is required for resynchronization

A restart of cOS Core on the inactive unit is the only time when the entire state of the
active unit is sent to the inactive unit and this is the reason why a restart is required for
resynchronization. This is achieved using the CLI command:

Device:/> shutdown

Chapter 11: High Availability
