All hosts in follower status

The leader server had a disk problem. After repairing the disk and rebooting the server, all 8 servers are in the follower state, and the log shows the error "Candidate’s term is outdated". Can we manually update the term?

Sat Apr 16 10:10:14 2022 [Z0][ReM][E]: Req:2720 UID:0 one.zone.voterequest result FAILURE [one.zone.voterequest] Candidate’s term is outdated
Sat Apr 16 10:10:16 2022 [Z0][RRM][E]: Failed to get heartbeat from leader. Starting election proccess
Sat Apr 16 10:10:25 2022 [Z0][ReM][E]: Req:6128 UID:0 one.zone.voterequest result FAILURE [one.zone.voterequest] Candidate’s term is outdated
Sat Apr 16 10:10:27 2022 [Z0][RRM][E]: Failed to get heartbeat from leader. Starting election proccess
Sat Apr 16 10:10:35 2022 [Z0][ReM][E]: Req:1648 UID:0 one.zone.voterequest result FAILURE [one.zone.voterequest] Candidate’s term is outdated
Sat Apr 16 10:10:37 2022 [Z0][RRM][E]: Failed to get heartbeat from leader. Starting election proccess
Sat Apr 16 10:10:45 2022 [Z0][ReM][E]: Req:272 UID:0 one.zone.voterequest result FAILURE [one.zone.voterequest] Candidate’s term is outdated
Sat Apr 16 10:10:47 2022 [Z0][RRM][E]: Failed to get heartbeat from leader. Starting election proccess

I tried several ways to resolve the problem, and I am now sure that it lies in the term number in Raft. Because of the failure, the version in the database ended up lower than the version in Raft. I need a way to manually correct the term number in the database or in Raft.
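For context, the "Candidate’s term is outdated" error matches the standard Raft rule that a server refuses to vote for a candidate whose term is lower than its own. The sketch below illustrates that rule only. It is not OpenNebula's code: the `Follower` class and `handle_vote_request` method are hypothetical names, and the log comparison that full Raft also performs is left out.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Follower:
    """Hypothetical, minimal Raft voter state (illustration only)."""
    current_term: int
    voted_for: Optional[int] = None

    def handle_vote_request(self, candidate_id: int, candidate_term: int) -> Tuple[bool, str]:
        # Standard Raft rule: reject any candidate whose term is behind ours.
        if candidate_term < self.current_term:
            return False, "Candidate's term is outdated"
        # A newer term resets our vote for that term.
        if candidate_term > self.current_term:
            self.current_term = candidate_term
            self.voted_for = None
        # Grant at most one vote per term (log up-to-date check omitted here).
        if self.voted_for is None or self.voted_for == candidate_id:
            self.voted_for = candidate_id
            return True, "vote granted"
        return False, "already voted in this term"
```

If every server keeps a higher term than the one each candidate sends, every vote request fails this check and no leader can be elected, which would match all servers staying in the follower state.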

It seems that the TERM value has reached the UINT_MAX limit. This is probably caused by a problem electing a new leader in the cluster. The best way to proceed is probably to select the latest leader node (which should have the latest changes) and rebuild the HA cluster from there.
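To illustrate why a term at UINT_MAX could block elections, here is a small sketch. It assumes the term is stored as a 32-bit unsigned integer that wraps to 0 on overflow. How OpenNebula actually stores and increments the term is not confirmed here, and `next_term` and `vote_would_be_rejected` are hypothetical helper names.

```python
# Assumption: the term counter is a 32-bit unsigned integer.
UINT_MAX = 2**32 - 1  # 4294967295


def next_term(term: int) -> int:
    """Increment a term as a 32-bit unsigned value, wrapping on overflow."""
    return (term + 1) & UINT_MAX


def vote_would_be_rejected(candidate_term: int, follower_term: int) -> bool:
    """Raft rejects a vote request whose term is lower than the voter's term."""
    return candidate_term < follower_term


# A candidate that starts an election from UINT_MAX ends up with term 0,
# which every follower still holding UINT_MAX treats as outdated.
wrapped = next_term(UINT_MAX)
print(wrapped, vote_would_be_rejected(wrapped, UINT_MAX))
```

Under these assumptions each new election only produces another outdated term, so the election loop seen in the log would repeat indefinitely.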

I fixed the problem: I took a backup of the database, manually changed the term number in the first row of the logdb table, then restored the database on all servers and started OpenNebula.
