High availability mechanisms – D-Link DFL-2500 User Manual
11.2. High Availability Mechanisms
D-Link HA provides a redundant, state-synchronized hardware configuration. The state of the active
unit, such as the connection table and other vital information, is continuously copied to the inactive
unit via the sync interface. When cluster failover occurs, the inactive unit knows which connections
are active, and traffic can continue to flow.
The inactive system detects that the active system is no longer operational when it no longer detects
sufficient Cluster Heartbeats. Heartbeats are sent over the sync interface as well as all other
interfaces. NetDefendOS sends 5 heartbeats per second from the active system, and when three
heartbeats are missed (that is to say, after 0.6 seconds) a failover is initiated. By sending
heartbeats over all interfaces, the inactive unit gets an overall view of the active unit's health. Even
if sync is deliberately disconnected, failover may not result if the inactive unit receives enough
heartbeats from other interfaces via a shared switch. However, the sync interface sends twice as many
heartbeats as any of the normal interfaces. The administrator can disable heartbeat sending on any of
the interfaces.
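The timing above can be sketched as a simple timeout check. This is only an illustration of the arithmetic (5 heartbeats per second, failover after three missed heartbeats), not NetDefendOS code: the class and method names are hypothetical, and the sketch treats a heartbeat from any interface the same way, ignoring how the real system weighs heartbeats from different interfaces.

```python
from dataclasses import dataclass

HEARTBEATS_PER_SECOND = 5       # rate stated in the text
MISSED_BEFORE_FAILOVER = 3      # threshold stated in the text

# Integer milliseconds avoid floating-point rounding in the comparison.
HEARTBEAT_INTERVAL_MS = 1000 // HEARTBEATS_PER_SECOND
FAILOVER_TIMEOUT_MS = MISSED_BEFORE_FAILOVER * HEARTBEAT_INTERVAL_MS


@dataclass
class HeartbeatMonitor:
    """Hypothetical monitor run by the inactive unit (not a NetDefendOS API)."""

    last_seen_ms: int = 0

    def on_heartbeat(self, now_ms: int) -> None:
        # Record the arrival time of the most recent heartbeat.
        self.last_seen_ms = now_ms

    def should_fail_over(self, now_ms: int) -> bool:
        # Three consecutive 200 ms intervals without a heartbeat.
        return now_ms - self.last_seen_ms >= FAILOVER_TIMEOUT_MS


monitor = HeartbeatMonitor()
monitor.on_heartbeat(1000)
print(FAILOVER_TIMEOUT_MS)             # 600
print(monitor.should_fail_over(1400))  # False
print(monitor.should_fail_over(1600))  # True
```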
Heartbeats are not sent at shorter intervals because delays of that length can occur during normal
operation. An operation such as opening a file could delay heartbeats long enough to cause the
inactive system to go active even though the other unit is still active.
Cluster heartbeats have the following characteristics:
• The source IP is the interface address of the sending firewall.
• The destination IP is the shared IP address.
• The IP TTL is always 255. If NetDefendOS receives a cluster heartbeat with any other TTL, it is
assumed that the packet has traversed a router and hence cannot be trusted.
• It is a UDP packet, sent from port 999 to port 999.
• The destination MAC address is the Ethernet multicast address corresponding to the shared
hardware address, in other words 11-00-00-C1-4A-nn. Link-level multicasts are used instead of
normal unicast packets for security: with unicast packets, a local attacker could
fool switches into forwarding heartbeats somewhere else so that the inactive system never receives them.
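The characteristics listed above can be expressed as a single check on a received packet. The following is a minimal sketch, assuming the header fields have already been parsed. The HeartbeatHeader type and the looks_like_heartbeat function are hypothetical names for illustration and do not correspond to anything in NetDefendOS.

```python
from dataclasses import dataclass

HEARTBEAT_PORT = 999                             # UDP source and destination port
HEARTBEAT_TTL = 255                              # any other TTL implies a routed packet
MULTICAST_PREFIX = bytes.fromhex("110000c14a")   # 11-00-00-C1-4A, last octet is nn


@dataclass(frozen=True)
class HeartbeatHeader:
    """Hypothetical container for already-parsed header fields."""

    dst_mac: bytes
    src_ip: str
    dst_ip: str
    ttl: int
    src_port: int
    dst_port: int


def looks_like_heartbeat(h: HeartbeatHeader, shared_ip: str) -> bool:
    """Return True if the header matches every characteristic in the list."""
    return (
        len(h.dst_mac) == 6
        and h.dst_mac[:5] == MULTICAST_PREFIX
        and h.dst_ip == shared_ip
        and h.ttl == HEARTBEAT_TTL
        and h.src_port == HEARTBEAT_PORT
        and h.dst_port == HEARTBEAT_PORT
    )


# Example values are illustrative only (192.0.2.0/24 is a documentation range).
good = HeartbeatHeader(bytes.fromhex("110000c14a07"), "192.0.2.2", "192.0.2.1", 255, 999, 999)
routed = HeartbeatHeader(bytes.fromhex("110000c14a07"), "192.0.2.2", "192.0.2.1", 254, 999, 999)
print(looks_like_heartbeat(good, "192.0.2.1"))    # True
print(looks_like_heartbeat(routed, "192.0.2.1"))  # False
```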
The time for failover is typically about one second, which means that clients may experience a
failover as a slight burst of packet loss. In the case of TCP, the failover time is well within the range
of normal retransmit timeouts, so TCP will retransmit the lost packets within a very short space of
time and continue communication. UDP itself does not retransmit lost packets since it is inherently an
unreliable protocol.
Both master and slave know about the shared IP address. ARP queries for the shared IP address, or
any other IP address published via the ARP configuration section or through Proxy ARP, are
answered by the active system. The hardware address used for the shared IP address and other published
addresses is not related to the actual hardware addresses of the interfaces. Instead, the MAC address
is constructed by NetDefendOS from the Cluster ID in the following form: 10-00-00-C1-4A-nn,
where nn comes from combining the Cluster ID configured in the Advanced Settings section with
the hardware bus/slot/port of the interface. The Cluster ID must be unique for each cluster in a
network.
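As an illustration of the address format, the sketch below builds the shared hardware address from a given nn value and derives its multicast counterpart. The manual does not say exactly how the Cluster ID and the bus/slot/port are combined into nn, so nn is simply taken as an input here. Setting the low-order bit of the first octet is the standard way an Ethernet address is marked as multicast, which is consistent with the 11-00-00-C1-4A-nn heartbeat destination described earlier in this section. All function names are hypothetical.

```python
SHARED_PREFIX = bytes.fromhex("100000c14a")  # 10-00-00-C1-4A, last octet is nn


def shared_mac(nn: int) -> bytes:
    """Build the shared hardware address for a given nn (0 to 255)."""
    if not 0 <= nn <= 0xFF:
        raise ValueError("nn must fit in one octet")
    return SHARED_PREFIX + bytes([nn])


def multicast_counterpart(mac: bytes) -> bytes:
    """Set the group (multicast) bit: the low-order bit of the first octet."""
    return bytes([mac[0] | 0x01]) + mac[1:]


def format_mac(mac: bytes) -> str:
    """Format an address in the hyphenated style used in this manual."""
    return "-".join(f"{octet:02X}" for octet in mac)


mac = shared_mac(0x2A)  # 0x2A is an arbitrary example value for nn
print(format_mac(mac))                         # 10-00-00-C1-4A-2A
print(format_mac(multicast_counterpart(mac)))  # 11-00-00-C1-4A-2A
```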
As the shared IP address always has the same hardware address, the ARP caches of units attached to
the same LAN as the cluster do not need to be updated when failover occurs, so no latency is introduced.
When a cluster member discovers that its peer is not operational, it broadcasts gratuitous ARP
queries on all interfaces using the shared hardware address as the sender address. This allows
switches to re-learn within milliseconds where to send packets destined for the shared address. The
only delay in failover, therefore, is the time taken to detect that the active unit is down.
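To make the gratuitous ARP step concrete, the sketch below assembles a broadcast ARP request whose sender hardware address is the shared hardware address and whose sender and target IP are both the shared IP. It follows the standard Ethernet and ARP layouts. The exact frame that NetDefendOS emits is not documented here, so treat the field choices (for example, the all-zero target hardware address) as assumptions. The function name and example addresses are hypothetical.

```python
import socket
import struct

BROADCAST_MAC = b"\xff" * 6
ETHERTYPE_ARP = 0x0806
ARP_REQUEST = 1


def gratuitous_arp(shared_mac: bytes, shared_ip: str) -> bytes:
    """Build an Ethernet frame carrying a gratuitous ARP request."""
    ip = socket.inet_aton(shared_ip)
    # Ethernet header: destination, source, EtherType.
    ethernet = BROADCAST_MAC + shared_mac + struct.pack("!H", ETHERTYPE_ARP)
    # ARP header: Ethernet (1), IPv4 (0x0800), address lengths 6 and 4, opcode.
    arp = struct.pack("!HHBBH", 1, 0x0800, 6, 4, ARP_REQUEST)
    arp += shared_mac + ip    # sender hardware and protocol address
    arp += b"\x00" * 6 + ip   # target hardware (unset) and protocol address
    return ethernet + arp


frame = gratuitous_arp(bytes.fromhex("100000c14a2a"), "192.0.2.1")
print(len(frame))  # 42
```

Because the frame is sent to the broadcast address with the shared hardware address as its source, every switch on the segment sees that source address arrive on the port of the newly active unit.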
ARP queries are also broadcast periodically to ensure that switches don't forget where to send