solidDB consensus algorithm
Note: Check the solidDB Release Notes for any limitations that are associated with using a grid in the current release.
solidDB Grid uses a consensus algorithm to implement leader-based synchronous replication of grid metadata. The solidDB consensus algorithm is based on the Raft consensus algorithm (see https://raft.github.io/).
Because the grid must always have a leader on which the Grid Availability Manager (GAM) can run, the consensus algorithm provides an automated mechanism for choosing the grid leader whenever a majority of nodes is available.
The consensus algorithm guarantees that, when the grid leader fails, a new grid leader is elected, provided that a majority of nodes remains available. The duration of an election depends partly on the user configuration but, typically, a new grid leader is elected in less than twice the election timeout period. The algorithm ensures that one of the nodes is eventually elected.
The consensus algorithm makes decisions about grid leadership autonomously, so no human intervention is needed to handle grid leader failures. Instead, the nodes use the consensus algorithm to elect a new grid leader from the remaining grid nodes. As soon as a grid node is elected as grid leader, it starts to run the GAM and to monitor the availability of the other nodes.
If half or more of the nodes fail, or if the network splits in such a way that every part contains fewer than a majority of the nodes, a new grid leader is not elected automatically. When a majority of nodes recover to the point that they are able to receive and send heartbeat messages, a new grid leader is elected.
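The majority rule described above can be sketched as a small calculation. This Python sketch is illustrative only: the function names `majority` and `can_elect_leader` are hypothetical and are not part of solidDB.

```python
def majority(total_nodes: int) -> int:
    """Smallest number of nodes that forms a strict majority of the grid."""
    if total_nodes < 1:
        raise ValueError("a grid needs at least one node")
    return total_nodes // 2 + 1


def can_elect_leader(total_nodes: int, reachable_nodes: int) -> bool:
    """True if a group of mutually reachable nodes is large enough to elect a leader."""
    return reachable_nodes >= majority(total_nodes)


if __name__ == "__main__":
    # A 4-node grid split 2/2: neither half holds a majority.
    print(can_elect_leader(4, 2))
    # A 5-node grid with two failed nodes: the remaining three can elect a leader.
    print(can_elect_leader(5, 3))
```

Under this rule, a grid with an even number of nodes that splits into two equal halves cannot elect a leader on either side until connectivity is restored.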
The states of grid nodes can be monitored by using the following command:
ADMIN COMMAND 'grid nodeinfo'
For each node, the command returns the consensus algorithm state (LEADER, FOLLOWER, or CANDIDATE) and the grid membership state (MEMBER_ONLINE, MEMBER_OFFLINE, or MEMBER_FAILED). For details, see GRID.
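If the result of this command is post-processed in a script, the two kinds of state can be modelled as separate enumerations. The following Python sketch assumes that the result has already been fetched and parsed into (node name, consensus state, membership state) tuples. The actual column layout of the result set is not described in this topic, and the names `NodeInfo` and `summarize` are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class ConsensusState(Enum):
    LEADER = "LEADER"
    FOLLOWER = "FOLLOWER"
    CANDIDATE = "CANDIDATE"


class MembershipState(Enum):
    MEMBER_ONLINE = "MEMBER_ONLINE"
    MEMBER_OFFLINE = "MEMBER_OFFLINE"
    MEMBER_FAILED = "MEMBER_FAILED"


@dataclass(frozen=True)
class NodeInfo:
    name: str
    consensus: ConsensusState
    membership: MembershipState


def summarize(rows):
    """Return (names of leaders, names of nodes that are not online)."""
    nodes = [NodeInfo(n, ConsensusState(c), MembershipState(m)) for n, c, m in rows]
    leaders = [n.name for n in nodes if n.consensus is ConsensusState.LEADER]
    not_online = [n.name for n in nodes
                  if n.membership is not MembershipState.MEMBER_ONLINE]
    return leaders, not_online


if __name__ == "__main__":
    # Made-up sample rows, for illustration only (not real command output).
    sample = [
        ("node1", "LEADER", "MEMBER_ONLINE"),
        ("node2", "FOLLOWER", "MEMBER_ONLINE"),
        ("node3", "FOLLOWER", "MEMBER_FAILED"),
    ]
    print(summarize(sample))
```

Converting the strings to enumeration members makes an unexpected state value fail immediately with a `ValueError` instead of being silently ignored.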
In certain overload or error situations, the consensus algorithm might not receive heartbeat acknowledgments from a node for a period of time, in which case the status of the node is set to MEMBER_FAILED. The algorithm continues to send heartbeat messages and, if the node starts to respond again, the node can usually be reconnected without operator intervention. If the node remains unresponsive, you might have to perform some manual steps. For details, see What to do if a grid node remains unresponsive.
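The failure detection described above can be sketched as a simple bookkeeping loop over heartbeat acknowledgments. This is a minimal Python sketch, not the solidDB implementation: the class names, the `failure_threshold` value, and the immediate return to MEMBER_ONLINE on the next acknowledgment are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class PeerStatus:
    last_ack: float                 # time of the most recent heartbeat acknowledgment
    state: str = "MEMBER_ONLINE"


class HeartbeatMonitor:
    def __init__(self, failure_threshold: float):
        # failure_threshold is an assumed tuning value, in seconds.
        self.failure_threshold = failure_threshold
        self.peers = {}             # node name -> PeerStatus

    def add_peer(self, name: str, now: float) -> None:
        self.peers[name] = PeerStatus(last_ack=now)

    def on_ack(self, name: str, now: float) -> None:
        """An acknowledgment arrived, so the node is treated as responsive again."""
        peer = self.peers[name]
        peer.last_ack = now
        peer.state = "MEMBER_ONLINE"

    def check(self, now: float) -> None:
        """Mark nodes as failed if they have been silent for too long."""
        for peer in self.peers.values():
            if now - peer.last_ack > self.failure_threshold:
                peer.state = "MEMBER_FAILED"


if __name__ == "__main__":
    monitor = HeartbeatMonitor(failure_threshold=3.0)
    monitor.add_peer("node2", now=0.0)
    monitor.check(now=4.0)          # no acknowledgment for 4 seconds
    print(monitor.peers["node2"].state)
    monitor.on_ack("node2", now=5.0)
    print(monitor.peers["node2"].state)
```

Heartbeats keep being sent regardless of the recorded state, which is what allows a node that was only temporarily overloaded to be picked up again without manual steps.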
The following consensus algorithm concepts are used in solidDB Grid environments:
 
Term
Explanation
Leader
The node that performs the following tasks:
- executes synchronously replicated transactions, including grid configuration changes and schema changes
- sends heartbeat messages to the other grid nodes and monitors the responsiveness of the grid nodes
- runs the GAM and balances the total number of replication units for each partition across the grid
Follower
A node that performs the following tasks:
- receives transactions from the leader
- responds to heartbeat messages from the leader
- initiates the election of a new leader if the election timeout is exceeded
Candidate
A node that stands for election as grid leader by initiating a new leadership term and sending vote requests to the other grid nodes.
Heartbeat
A regular message sent by the leader to inform followers that the leader is still active.
Election timeout
The maximum time interval between heartbeat messages from the grid leader to a node. If the election timeout is exceeded, the node assumes that the leader has failed and initiates the election of a new leader (see Pre-voting).
You set the election timeout by configuring the Grid.RaftElectionTimeout parameter. For details, see RaftElectionTimeout.
Leadership term
The period of time that starts with a leadership election and ends either when the elected leader fails or when the election fails to produce a leader.
Pre-voting
If pre-voting is enabled, a follower whose election timeout has expired sends a pre-vote request to all grid nodes to check whether the other nodes would allow it to become the new leader. If the majority of nodes respond positively, the follower changes state to CANDIDATE, starts a new leadership term, and sends out vote requests.
Unlike vote requests, nodes can accept pre-vote requests from multiple followers during a single leadership term.
Note: If pre-voting is not enabled, the follower changes state to CANDIDATE immediately after the election timeout.
You enable pre-voting by configuring the Grid.RaftPreVoteEnabled parameter. For details, see Grid section.
Voting
During an election, nodes receive vote requests from candidates.
Before a node accepts a vote request, the node checks that the leadership term that the candidate has started is not earlier than the current leadership term.
A node can accept only one vote request in any leadership term.
Leader stickiness
If leader ‘stickiness’ is enabled, a node does not accept any vote requests (or pre-vote requests) if its own election timeout has not expired. This prevents the grid from changing the leader if the current leader is performing properly for the majority of nodes.
You enable leader stickiness by configuring the Grid.RaftLeaderStickinessEnabled parameter. For details, see Grid section.
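The pre-voting, voting, and leader stickiness rules in the table above can be sketched as follows. This Python sketch models only the rules stated in this topic and is not the solidDB implementation. The names `VoterNode` and `state_after_election_timeout` are hypothetical, the term check applied to pre-vote requests is an assumption, and the count of positive pre-vote responses is assumed to include the follower's own.

```python
class VoterNode:
    """Decides whether to accept pre-vote and vote requests."""

    def __init__(self, leader_stickiness: bool = False):
        self.current_term = 0
        self.voted_terms = set()    # leadership terms in which a vote was granted
        self.leader_stickiness = leader_stickiness
        self.election_timeout_expired = False

    def _sticks_to_leader(self) -> bool:
        # With stickiness enabled, requests are refused while this node's
        # own election timeout has not expired.
        return self.leader_stickiness and not self.election_timeout_expired

    def handle_pre_vote(self, proposed_term: int) -> bool:
        if self._sticks_to_leader():
            return False
        # Nothing is recorded here, so several followers can receive a
        # positive answer during the same leadership term.
        return proposed_term >= self.current_term

    def handle_vote(self, candidate_term: int) -> bool:
        if self._sticks_to_leader():
            return False
        if candidate_term < self.current_term:
            return False            # candidate's term is earlier than the current term
        if candidate_term in self.voted_terms:
            return False            # only one vote per leadership term
        self.current_term = candidate_term
        self.voted_terms.add(candidate_term)
        return True


def state_after_election_timeout(pre_vote_enabled: bool,
                                 positive_pre_votes: int,
                                 total_nodes: int) -> str:
    """State of a follower after its election timeout has expired."""
    if not pre_vote_enabled:
        return "CANDIDATE"
    has_majority = positive_pre_votes >= total_nodes // 2 + 1
    return "CANDIDATE" if has_majority else "FOLLOWER"


if __name__ == "__main__":
    node = VoterNode()
    print(node.handle_pre_vote(1), node.handle_pre_vote(1))  # both accepted
    print(node.handle_vote(1), node.handle_vote(1))          # second vote refused
```

The difference between the two handlers is the point of pre-voting: a pre-vote leaves the receiving node's term and vote record unchanged, so a follower that cannot win a majority does not start a new leadership term.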