Front-End HA and consistent OpenNebula DB

I have installed the OpenNebula 6.2.1 front-end on three servers using the HA setup. Since there are three separate OpenNebula (MySQL) databases, one on each server, I wonder which updates are applied to all three databases and which are not. For example, I can see that updates to the MySQL table host_pool on the leader are applied to the corresponding tables on the followers. However, the same does not happen for the MySQL table host_monitoring, which has different entries on each server. As far as I can see, the host_monitoring table of a server is updated only while that server is the leader, and those updates are not passed on to the followers.
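For anyone who wants to repeat this comparison, here is a minimal sketch of how the check could be scripted. It only compares row counts of the two tables named above, which is a crude indicator (equal counts do not prove equal contents). The function names and the in-memory SQLite stand-in are my own assumptions for illustration; against the real front-ends you would pass a connection from whatever MySQL DB-API driver you use.

```python
import sqlite3


def table_row_counts(conn, tables):
    """Return {table: row count} for any DB-API style connection."""
    counts = {}
    cur = conn.cursor()
    for table in tables:
        # Table names come from a fixed list in this script, not from user input.
        cur.execute(f"SELECT COUNT(*) FROM {table}")
        counts[table] = cur.fetchone()[0]
    cur.close()
    return counts


def diverging_tables(counts_by_server):
    """Return the sorted names of tables whose row count differs between servers."""
    tables = set()
    for counts in counts_by_server.values():
        tables.update(counts)
    return sorted(
        t for t in tables
        if len({counts.get(t) for counts in counts_by_server.values()}) > 1
    )


def demo_connection(host_rows, monitoring_rows):
    """Hypothetical stand-in for one front-end database (in-memory SQLite).

    The single 'id' column is made up; it is not the real OpenNebula schema.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE host_pool (id INTEGER)")
    conn.execute("CREATE TABLE host_monitoring (id INTEGER)")
    conn.executemany("INSERT INTO host_pool VALUES (?)",
                     [(i,) for i in range(host_rows)])
    conn.executemany("INSERT INTO host_monitoring VALUES (?)",
                     [(i,) for i in range(monitoring_rows)])
    return conn
```

Usage would look roughly like collecting `table_row_counts(conn, ["host_pool", "host_monitoring"])` once per front-end into a dict keyed by server name and passing that dict to `diverging_tables`.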

I’m just guessing, but it seems only logical that the host_monitoring table would not be synchronized: in case of a failover, the new leader will check host availability itself instead of relying on monitoring data from the failed front-end.

Alright, I follow your reasoning. However, I would expect the host monitoring data to be synchronized between all the front-end nodes of an HA setup. For example, the following picture shows the monitoring history of a host's CPU activity, which is currently stored in the host_monitoring table of the current leader. If this leader fails and another front-end node takes its place, the monitoring history of the hosts is empty on the new leader and the corresponding CPU activity graph starts over from the time the new leader was elected. As a result, you lose the activity history of your hypervisor hosts after a front-end failover.

[Image: Screenshot from 2022-02-17 18-36-43]


In the initial implementation, all tables and operations were replicated across all followers. When it was released, installations with a large number of hosts and VMs had to replicate a huge amount of data (with the associated API calls of the Raft algorithm). Considering also that the monitoring data has a retention period, we opted not to replicate monitoring data, accepting that a change of leader will reset monitoring.

I understand that this may be confusing after a leader change; however, it should not affect any other functional parts…
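To make the behaviour concrete, here is a toy model of the split described above. It is only an illustrative sketch, not OpenNebula's actual code: the class names, the method names, and the in-memory dict and list are all assumptions made for the example. Writes to host_pool are applied on every front-end, while monitoring samples are stored only on whichever node is currently the leader.

```python
from dataclasses import dataclass, field


@dataclass
class FrontEnd:
    name: str
    host_pool: dict = field(default_factory=dict)        # replicated state
    host_monitoring: list = field(default_factory=list)  # local-only state


class Cluster:
    """Toy model of three front-ends; the first one starts as leader."""

    def __init__(self, names):
        self.nodes = [FrontEnd(n) for n in names]
        self.leader = self.nodes[0]

    def update_host(self, host_id, state):
        # Replicated write: every front-end ends up with the same row.
        for node in self.nodes:
            node.host_pool[host_id] = state

    def record_monitoring(self, host_id, timestamp, cpu):
        # Non-replicated write: only the current leader stores the sample.
        self.leader.host_monitoring.append((host_id, timestamp, cpu))

    def fail_over(self, new_leader_name):
        # Promote another node; its monitoring history starts out empty.
        self.leader = next(n for n in self.nodes if n.name == new_leader_name)
```

In this model, after `fail_over` the new leader still has the full host_pool but an empty host_monitoring list, which matches the graph restarting at the time of the election.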



Thanks Rubén!