The term becomes 0 when a new leader is elected

Khang_Nguyen_Phuc · October 27, 2023, 3:30pm

Hello everyone,

I have a HA cluster running normally, but due to an incorrect operation, a leader election occurred, and a new leader was elected. Everything seems to be fine, but the term has reset to 0; previously, it was around 1550. The index and commit both appear to be normal. Is my cluster operating incorrectly?

Some lines from oned.log on the follower, just before it became the current leader:

Fri Oct 27 10:23:17 2023 [Z0][RRM][E]: Failed to get heartbeat from leader. Starting election proccess
Fri Oct 27 10:23:17 2023 [Z0][RCM][I]: Vote not granted from follower 2: [one.zone.voterequest] Candidate's term is outdated
Fri Oct 27 10:23:17 2023 [Z0][RCM][I]: Follower 2 is in term 4294967295 current term is 1576. Turning into follower
Fri Oct 27 10:23:17 2023 [Z0][RCM][I]: oned is set to follower mode
Fri Oct 27 10:23:17 2023 [Z0][RRM][E]: Failed to get heartbeat from leader. Starting election proccess
Fri Oct 27 10:23:17 2023 [Z0][RCM][I]: Vote not granted from follower 2: [one.zone.voterequest] Candidate's term is outdated
Fri Oct 27 10:23:17 2023 [Z0][RCM][I]: Follower 2 is in term 4294967295 current term is 0. Turning into follower
Fri Oct 27 10:23:17 2023 [Z0][RCM][I]: oned is set to follower mode
Fri Oct 27 10:23:17 2023 [Z0][RRM][E]: Failed to get heartbeat from leader. Starting election proccess
Fri Oct 27 10:23:17 2023 [Z0][RCM][I]: Vote not granted from follower 2: [one.zone.voterequest] Candidate's term is outdated
Fri Oct 27 10:23:17 2023 [Z0][RCM][I]: Follower 2 is in term 4294967295 current term is 0. Turning into follower
Fri Oct 27 10:23:17 2023 [Z0][RCM][I]: oned is set to follower mode
Fri Oct 27 10:23:17 2023 [Z0][RRM][E]: Failed to get heartbeat from leader. Starting election proccess
Fri Oct 27 10:23:17 2023 [Z0][RCM][I]: Vote not granted from follower 2: [one.zone.voterequest] Candidate's term is outdated
Fri Oct 27 10:23:17 2023 [Z0][RCM][I]: Follower 2 is in term 4294967295 current term is 0. Turning into follower
Fri Oct 27 10:23:17 2023 [Z0][RCM][I]: oned is set to follower mode
Fri Oct 27 10:23:17 2023 [Z0][RRM][E]: Failed to get heartbeat from leader. Starting election proccess
Fri Oct 27 10:23:17 2023 [Z0][RCM][I]: Vote not granted from follower 2: [one.zone.voterequest] Candidate's term is outdated
Fri Oct 27 10:23:17 2023 [Z0][RCM][I]: Follower 2 is in term 4294967295 current term is 0. Turning into follower
Fri Oct 27 10:23:17 2023 [Z0][RCM][I]: oned is set to follower mode
Fri Oct 27 10:23:17 2023 [Z0][RRM][E]: Failed to get heartbeat from leader. Starting election proccess
Fri Oct 27 10:23:17 2023 [Z0][RCM][I]: Vote not granted from follower 2: [one.zone.voterequest] Candidate's term is outdated
Fri Oct 27 10:23:17 2023 [Z0][RCM][I]: Follower 2 is in term 4294967295 current term is 0. Turning into follower

Output of onezone show:

ZONE 0 INFORMATION
ID                : 0
NAME              : opennebula.zone
STATE             : ENABLED


ZONE SERVERS
ID NAME            ENDPOINT
 2 server-2        http://192.168.0.3:2633/RPC2
 3 server-0        http://192.168.0.1:2633/RPC2
 4 server-1        http://192.168.0.2:2633/RPC2

HA & FEDERATION SYNC STATUS
ID NAME            STATE      TERM       INDEX      COMMIT     VOTE  FED_INDEX 
 2 server-2        follower   0          708728     708728     4     -1
 3 server-0        follower   0          708728     708728     -1    -1
 4 server-1        leader     0          708728     708728     4     -1

ZONE TEMPLATE
ENDPOINT="http://localhost:2633/RPC2"

I use OpenNebula Frontend CE 6.6.3, and the servers' OS is Ubuntu 22.04.
Thank you.

vpalma · October 30, 2023, 3:58pm

Hi @Khang_Nguyen_Phuc

I think the issue with the TERM attribute comes from the fact that the maximum size of the variable has been reached. To fix the problem, please try to follow the steps below:

1. Stop all the frontends.
2. Create a backup of each frontend database.
3. Update the TERM attribute value in your database. A command like this should work:
   UPDATE logdb SET sqlcmd = "<TEMPLATE><TERM><![CDATA[0]]></TERM><VOTEDFOR><![CDATA[-1]]></VOTEDFOR></TEMPLATE>" WHERE log_index = -1;
4. Restart all the frontends.
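
For context on the "maximum size" remark: the value 4294967295 in the log is exactly 2^32 - 1, the largest number an unsigned 32-bit integer can hold. The sketch below is only an illustration of the arithmetic, not OpenNebula's actual code. It assumes the term behaves like a 32-bit unsigned counter (an assumption, not confirmed in this thread), and the function names are hypothetical. It shows that incrementing such a counter past its maximum wraps to 0, and that the term check in a Raft vote request rejects any candidate whose term is behind the voter's.

```python
# Illustration only: assumes the term is a 32-bit unsigned counter.
UINT32_MAX = 2**32 - 1  # 4294967295, the term reported for follower 2


def next_term(term: int) -> int:
    """Increment a term the way a 32-bit unsigned counter would (assumed)."""
    return (term + 1) % 2**32


def term_is_acceptable(candidate_term: int, voter_term: int) -> bool:
    """Term check of a Raft vote request: an outdated candidate is rejected.

    This models only the term comparison, not the log or voted-for checks.
    """
    return candidate_term >= voter_term


if __name__ == "__main__":
    print(UINT32_MAX)                            # largest 32-bit unsigned value
    print(next_term(UINT32_MAX))                 # wraps around to 0
    print(term_is_acceptable(1576, UINT32_MAX))  # candidate at 1576 is outdated
    print(term_is_acceptable(0, UINT32_MAX))     # candidate at 0 is outdated too
```

Under that assumption, a candidate at term 1576 or 0 can never win a vote from a peer that reports 4294967295, which matches the repeated "Candidate's term is outdated" lines in the log.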

Khang_Nguyen_Phuc · October 31, 2023, 7:40am

Thank you very much for your response. The cluster is currently running smoothly with term = 0. Is this actually a “problem,” and do I need to fix it soon? In the log file I sent, I also noticed “Follower 2 is in term 4294967295,” so I suspect there is an issue with Follower 2. Everything else seems to be fine; only the leader experienced issues, which triggered an election. Why would that cause the term on Follower 2 to increase to the maximum?
Thank you,
