Hello all,
I attempted to upgrade my preproduction Raft cluster from 5.4.6 to 5.4.13, and I am now seeing a lot of these errors:
Fri Jun 8 11:56:13 2018 [Z0][RCM][I]: Error requesting vote from follower 1:RPC call timed out and aborted
Fri Jun 8 11:56:14 2018 [Z0][RCM][I]: Error requesting vote from follower 1:RPC call timed out and aborted
Fri Jun 8 11:56:15 2018 [Z0][RCM][I]: Error requesting vote from follower 1:RPC call timed out and aborted
Fri Jun 8 11:56:15 2018 [Z0][RCM][I]: Error requesting vote from follower 1:RPC call timed out and aborted
And these on the other side:
Fri Jun 8 12:00:06 2018 [Z0][RCM][I]: Detetected error condition on follower 0. Last error was: Error replicating log entry 0 on follower 0: RPC call timed out and aborted
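For anyone who wants to see how the timeouts are distributed across servers, here is a rough sketch that tallies them per follower. It assumes the log lines look exactly like the excerpts above; the regular expression, the function name `count_timeouts`, and the sample list are only illustrative and are not part of OpenNebula.

```python
import re
from collections import Counter

# Assumption: lines follow the layout of the excerpts above, i.e.
# "[Z<zone>][RCM][<level>]: ... follower <id>: RPC call timed out ...".
TIMEOUT_RE = re.compile(
    r"\[Z\d+\]\[RCM\]\[\w\]: .*?follower (?P<follower>\d+)\s*:\s*RPC call timed out"
)


def count_timeouts(lines):
    """Return a Counter mapping follower ID to number of RPC timeout lines."""
    counts = Counter()
    for line in lines:
        match = TIMEOUT_RE.search(line)
        if match:
            counts[int(match.group("follower"))] += 1
    return counts


# Sample lines copied from the excerpts above.
SAMPLE = [
    "Fri Jun 8 11:56:13 2018 [Z0][RCM][I]: Error requesting vote from follower 1:RPC call timed out and aborted",
    "Fri Jun 8 11:56:14 2018 [Z0][RCM][I]: Error requesting vote from follower 1:RPC call timed out and aborted",
    "Fri Jun 8 12:00:06 2018 [Z0][RCM][I]: Detetected error condition on follower 0. Last error was: Error replicating log entry 0 on follower 0: RPC call timed out and aborted",
]

if __name__ == "__main__":
    for follower, total in sorted(count_timeouts(SAMPLE).items()):
        print(f"follower {follower}: {total} timeout(s)")
```

To run it against a real log, pass an open file object instead of `SAMPLE`; the log path depends on the installation.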
I am running CentOS 7.5, fully updated, on each of the 3 servers. These servers were working beautifully as a 3-node Raft cluster with 5.4.6, and I made sure to run /usr/share/one/install_gems as instructed in the upgrade documentation.
I have the Raft election and XML-RPC timeouts (ELECTION_TIMEOUT_MS and XMLRPC_TIMEOUT_MS) set to 0.
Here is the raft configuration from SERVER_ID 0:
FEDERATION = [
MODE = "STANDALONE",
ZONE_ID = 0,
SERVER_ID = 0,
MASTER_ONED = ""
]
RAFT = [
LOG_RETENTION = 500000,
LOG_PURGE_TIMEOUT = 600,
ELECTION_TIMEOUT_MS = 0,
BROADCAST_TIMEOUT_MS = 500,
XMLRPC_TIMEOUT_MS = 0
]
RAFT_LEADER_HOOK = [
COMMAND = "raft/vip.sh",
ARGUMENTS = "leader team0 10.xx.0.xxx/24"
]
RAFT_FOLLOWER_HOOK = [
COMMAND = "raft/vip.sh",
ARGUMENTS = "follower team0 10.xx.0.xxx/24"
]
Any thoughts?