Difference between revisions of "System Failover"

Line 1: Line 1:
 +
__FORCETOC__
 +
=== System Failover ===
 
[[File:failover.png|400px|thumb|Failover Flow]]
 
  
Line 7: Line 9:
 
A passive system sends a heartbeat about every 10 seconds to an active system. If the passive does not receive a response from an active system for the [[CLI - Configuring System Failover | Deadtime]], the passive system switches the mode to "active".
 
  
This system failover enables continuous service and you can connect the management page without changing the URI because a virtual IP is automatically configured in a active system by the [[ImRAD services(daemons) | failover service]].
+
This system failover enables continuous service and you can connect the management page without changing the URI because a virtual IP is automatically configured in an active system by the [[ImRAD services(daemons) | failover service]].
  
 
The failover service in an active system listens on UDP port 6010 to receive a heartbeat from a passive system.
 
Line 23: Line 25:
 
| logexp || running  || running but only saves its Syslog.
 
 
|-
 
|-
| failover || Configuring a Virtual IP || Monitoring a active system and replicating the database
+
| failover || Configuring a Virtual IP || Monitoring an active system and replicating the database
 
|-
 
|-
 
| Database || Master || Slave
 
Line 29: Line 31:
 
|}
 
|}
  
Note that, If an active system is recovered from a fault after another system has switched to the active mode, the recovered system switches to passive mode. In other words, The failback<ref>https://en.wikipedia.org/wiki/Failover</ref> does not occur.
+
''Note that if an active system recovers from a fault after another system has switched to active mode, the recovered system switches to passive mode. In other words, failback<ref>https://en.wikipedia.org/wiki/Failover</ref> does not occur.''
 +
You can configure the System Failover via the [[CLI - Configuring System Failover | CLI]].
 +
 
 +
=== System Failover Switch-Over===
 +
The following table shows when a system switches its mode. The case numbers identify each switch-over condition, and you can see them while monitoring the [[CLI - Log | logs]] of the failover service.
 +
''Note that you need to enable the [[CLI - Services(daemons) | "runtime log"]] for the failover service to display logs.''
 +
 
 +
{| class="wikitable"
 +
! Init mode !! Peer Response !! Current mode(switched mode) !!  Case Number
 +
|-
 +
| rowspan='4' | active || no-response || active || C7
 +
|-
 +
| zero(initializing) || active || C1, C5
 +
|-
 +
| passive || active || C5
 +
|-
 +
| active || passive || C4
 +
|-
 +
| rowspan='4' | passive || no-response || active || C6
 +
|-
 +
| zero(initializing) || passive || C1, C3
 +
|-
 +
| passive || active || C4
 +
|-
 +
| active || passive || C5
 +
|-
 +
|}
 +
 
 +
===== System Failover Case Numbers =====
 +
{| class="wikitable"
 +
! Case Number !! Description
 +
|-
 +
| C1 || If DEVICE#1 is in an initialization state, and it gets a response from DEVICE#2 that is also initializing, DEVICE#1 switches its mode to the Initial Mode configured.
 +
|-
 +
| C2 || If DEVICE#1 is in an initialization state, and it gets a response from DEVICE#2 that is in either an "active" or "passive" state, DEVICE#1 switches to the opposite mode from DEVICE#2.
 +
|-
 +
| C3 ||  If DEVICE#1 is in a "passive" state, and it gets a response from DEVICE#2 that is initializing, DEVICE#1 keeps its current state.
 +
|-
 +
| C4 || If DEVICE#1 is in either an "active" or "passive" state, and it gets a response from DEVICE#2 that is in the same state as it is, DEVICE#1 switches to the opposite mode from DEVICE#2.<br>
 +
This case is rare, but it can result from misconfiguring system failover (e.g. configuring the same initial mode on both devices).
 +
|-
 +
| C5 || If DEVICE#1 is in either an "active" or "passive" state, and it gets a response from DEVICE#2 that is in the opposite state, DEVICE#1 keeps its current state.
 +
|-
 +
| C6 || If DEVICE#1 is in an initialization state, the initial mode is "passive", and it does not get a response from DEVICE#2, DEVICE#1 tries again to connect to DEVICE#2 without switching its mode.
 +
If DEVICE#2 does not respond during the INIT-DEADTIME(30 seconds), DEVICE#1 switches its mode to "active".
 +
|-
 +
| C7 || If DEVICE#1 is in an initialization state, the initial mode is "active", and it does not get a response from DEVICE#2, DEVICE#1 switches its mode to "active".
 +
|-
 +
| C8 || If DEVICE#1 is in a "passive" state, and it does not get a response from DEVICE#2, DEVICE#1 tries again to connect to DEVICE#2 without switching its mode.
 +
If DEVICE#2 does not respond during the deadtime configured, DEVICE#1 switches its mode to "active".
 +
|-
 +
| C9 || AAA
 +
|-
 +
| C10 || AAA
 +
|-
 +
| C11 || AAA
 +
|-
 +
| C12 || AAA
 +
|-
 +
| C13 || AAA
 +
|-
 +
|}
  
You can configure the System Failover via the [[CLI - Configuring System Failover | CLI]].
 
  
 
=== References ===
 

Revision as of 16:32, 28 April 2021

System Failover

Failover Flow

System failover is the process of switching service to a passive system when the active system fails (e.g. hardware fault, network problem).

A passive system uses Database Replication[1] to synchronize all data from the master database running on the active system, and it also monitors the active system.

A passive system sends a heartbeat to the active system about every 10 seconds. If the passive system does not receive a response from the active system for the duration of the Deadtime, it switches its mode to "active".

System failover keeps the service running continuously. Because the failover service automatically configures a virtual IP on the active system, you can connect to the management page without changing the URI.

The failover service in an active system listens on UDP port 6010 to receive a heartbeat from a passive system.
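
The heartbeat and Deadtime behavior described above can be sketched from the passive system's point of view. This is only a minimal illustration, not the failover service's actual implementation: the class `PassiveMonitor`, its method names, and the 30-second Deadtime used in the example are assumptions. Only the roughly 10-second heartbeat interval and UDP port 6010 come from the text.

```python
HEARTBEAT_INTERVAL = 10   # seconds between heartbeats, per the description above
HEARTBEAT_PORT = 6010     # UDP port the active system's failover service listens on


class PassiveMonitor:
    """Tracks heartbeat responses on a passive system (hypothetical helper)."""

    def __init__(self, deadtime, now=0.0):
        self.deadtime = deadtime      # configured Deadtime, in seconds
        self.mode = "passive"
        self.last_response = now      # time of the last response from the active peer

    def on_response(self, now):
        """Record that the active system answered a heartbeat."""
        self.last_response = now

    def tick(self, now):
        """Call once per heartbeat interval; returns the current mode."""
        if self.mode == "passive" and now - self.last_response >= self.deadtime:
            self.mode = "active"      # the peer stayed silent for the whole Deadtime
        return self.mode


# Example run: the peer answers once at t=10 and then stops responding.
monitor = PassiveMonitor(deadtime=30)   # 30 s is an example value only
monitor.on_response(now=10)
modes = [monitor.tick(now=t) for t in range(20, 50, HEARTBEAT_INTERVAL)]
```

In this run the monitor stays passive at t=20 and t=30 and switches to "active" at t=40, once 30 seconds have passed without a response.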

All services in the active and passive devices work as shown in the table below.

{| class="wikitable"
! Service !! Active System !! Passive System
|-
| dhcpv4 || running || running
|-
| dhcpv6 || running || running
|-
| radius || running || running
|-
| logexp || running || running but only saves its Syslog.
|-
| failover || Configuring a Virtual IP || Monitoring an active system and replicating the database
|-
| Database || Master || Slave
|}

Note that if an active system recovers from a fault after another system has switched to active mode, the recovered system switches to passive mode. In other words, failback[2] does not occur. You can configure System Failover via the CLI.

System Failover Switch-Over

The following table shows when a system switches its mode. The case numbers identify each switch-over condition, and you can see them while monitoring the logs of the failover service. Note that you need to enable the "runtime log" for the failover service to display logs.

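{| class="wikitable"
! Init mode !! Peer Response !! Current mode (switched mode) !! Case Number
|-
| rowspan='4' | active || no-response || active || C7
|-
| zero(initializing) || active || C1, C5
|-
| passive || active || C5
|-
| active || passive || C4
|-
| rowspan='4' | passive || no-response || active || C6
|-
| zero(initializing) || passive || C1, C3
|-
| passive || active || C4
|-
| active || passive || C5
|}
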
System Failover Case Numbers

{| class="wikitable"
! Case Number !! Description
|-
| C1 || If DEVICE#1 is in an initialization state, and it gets a response from DEVICE#2 that is also initializing, DEVICE#1 switches its mode to the configured Initial Mode.
|-
| C2 || If DEVICE#1 is in an initialization state, and it gets a response from DEVICE#2 that is in either an "active" or "passive" state, DEVICE#1 switches to the opposite mode from DEVICE#2.
|-
| C3 || If DEVICE#1 is in a "passive" state, and it gets a response from DEVICE#2 that is initializing, DEVICE#1 keeps its current state.
|-
| C4 || If DEVICE#1 is in either an "active" or "passive" state, and it gets a response from DEVICE#2 that is in the same state, DEVICE#1 switches to the opposite mode from DEVICE#2.<br>
This case is rare, but it can result from misconfiguring system failover (e.g. configuring the same initial mode on both devices).
|-
| C5 || If DEVICE#1 is in either an "active" or "passive" state, and it gets a response from DEVICE#2 that is in the opposite state, DEVICE#1 keeps its current state.
|-
| C6 || If DEVICE#1 is in an initialization state, the initial mode is "passive", and it does not get a response from DEVICE#2, DEVICE#1 tries again to connect to DEVICE#2 without switching its mode.
If DEVICE#2 does not respond during the INIT-DEADTIME (30 seconds), DEVICE#1 switches its mode to "active".
|-
| C7 || If DEVICE#1 is in an initialization state, the initial mode is "active", and it does not get a response from DEVICE#2, DEVICE#1 switches its mode to "active".
|-
| C8 || If DEVICE#1 is in a "passive" state, and it does not get a response from DEVICE#2, DEVICE#1 tries again to connect to DEVICE#2 without switching its mode.
If DEVICE#2 does not respond during the configured deadtime, DEVICE#1 switches its mode to "active".
|-
| C9 || AAA
|-
| C10 || AAA
|-
| C11 || AAA
|-
| C12 || AAA
|-
| C13 || AAA
|}
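
The documented cases C1 through C8 can be read as a small decision function for DEVICE#1. The sketch below is one possible reading of the case descriptions, not the failover service's code: the function `next_mode`, its parameters, and the default deadtime value are hypothetical. Cases C9 through C13 are not described, so they are not modeled, and combinations that the descriptions do not cover return no case number.

```python
INIT_DEADTIME = 30  # seconds, as stated for C6


def next_mode(state, initial_mode, peer, silent_for=0, deadtime=30):
    """Return (new_state, case) for DEVICE#1 (hypothetical helper).

    state        -- "init", "active" or "passive"
    initial_mode -- configured initial mode, "active" or "passive"
    peer         -- state reported by DEVICE#2, or None if it did not respond
    silent_for   -- seconds since DEVICE#2 last responded
    deadtime     -- configured deadtime in seconds (example default)
    """
    opposite = {"active": "passive", "passive": "active"}

    if state == "init":
        if peer == "init":
            return initial_mode, "C1"
        if peer in opposite:
            return opposite[peer], "C2"
        # No response from DEVICE#2 while initializing.
        if initial_mode == "active":
            return "active", "C7"
        return ("active" if silent_for >= INIT_DEADTIME else "init"), "C6"

    if peer is None:
        if state == "passive":
            return ("active" if silent_for >= deadtime else "passive"), "C8"
        return state, None  # active with a silent peer: not covered by C1-C8

    if peer == "init":
        # Only the passive variant is described (C3); keeping the state for
        # an active device is an assumption.
        return state, ("C3" if state == "passive" else None)

    if peer == state:
        return opposite[peer], "C4"
    return state, "C5"
```

For example, `next_mode("passive", "passive", "passive")` returns `("active", "C4")`, which matches the misconfiguration scenario where both devices start with the same initial mode.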


References