Upgrade MySQL to MySQL cluster

Hi,

is there a general guide on how to migrate the MySQL backend when upgrading OpenNebula 5.0.2 to 5.4?
The documentation ( https://docs.opennebula.org/5.4/intro_release_notes/upgrades/upgrade_50.html ) mentions as a first step "Stop the MySQL replication in all the slaves", but I only have a single MySQL instance on the node running Sunstone; the other nodes just have the shared storage connected and the opennebula-node package installed.

I would like to upgrade my setup to 5.4 (latest), migrate the MySQL backend to HA, and start using HA in OpenNebula.

Thanks for suggestions.

hi Martin,

ONE 5.4 already has a new feature for that, which makes the ONE master HA using Raft.
(See this cool animation on how it works: https://raft.github.io/raftscope/index.html)

Using this method, your master server, with oned, Sunstone and MySQL, will exist 3x or 5x (or 7x, etc.).
So you will need to install ALL packages, settings and config files on those servers.
When one of them dies, the others can take over, including running MySQL.

Here is the documentation about setting up Raft for opennebula:
http://docs.opennebula.org/5.4/advanced_components/ha/frontend_ha_setup.html

This is, in my opinion, way simpler, and not dependent on MySQL with master/slave replication.
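To give an idea, the core of that guide boils down to registering all servers in zone 0 and giving each one its own SERVER_ID in /etc/one/oned.conf. A rough sketch, using my three servers below as the example (follow the linked guide for the full procedure, including seeding the DB and the floating IP hooks):

# on the first server (the initial leader), register every server in zone 0
onezone server-add 0 --name server000 --rpc http://10.1.16.200:2633/RPC2
onezone server-add 0 --name server001 --rpc http://10.1.16.201:2633/RPC2
onezone server-add 0 --name server002 --rpc http://10.1.16.202:2633/RPC2

# in /etc/one/oned.conf on EACH server, set that server's own SERVER_ID
FEDERATION = [
    MODE      = "STANDALONE",
    ZONE_ID   = 0,
    SERVER_ID = 0    # 1 on server001, 2 on server002
]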

If I understand correctly, there is no clustered MySQL (i.e. Galera cluster)? Each node runs only a single instance of the MySQL server without replication, and Raft takes care of the sync?

Thanks.

Exactly, every “candidate master” has its own MySQL DB, and Raft takes care of replicating changes to the database. When it needs to switch, an election starts, someone wins, and the new leader talks to its local DB, which holds the same content.
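To get from your current single MySQL instance to that, you basically seed the local DB on each new server once with a copy of the existing one before enabling Raft. A rough sketch (DB name, user and hostnames are placeholders for whatever your setup uses):

# on the current front-end, with opennebula stopped
mysqldump -u oneadmin -p opennebula > /tmp/opennebula.sql
scp /tmp/opennebula.sql server001:/tmp/
scp /tmp/opennebula.sql server002:/tmp/

# on each new server, restore into its local MySQL before starting oned
mysql -u oneadmin -p opennebula < /tmp/opennebula.sql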

oneadmin@server000:~$ onezone show 0
ZONE 0 INFORMATION
ID : 0
NAME : OpenNebula

ZONE SERVERS
ID NAME ENDPOINT
0 server000 http://10.1.16.200:2633/RPC2
1 server001 http://10.1.16.201:2633/RPC2
2 server002 http://10.1.16.202:2633/RPC2

HA & FEDERATION SYNC STATUS
ID NAME STATE TERM INDEX COMMIT VOTE FED_INDEX
0 server000 leader 236 266124 266124 0 -1
1 server001 follower 236 266124 266124 -1 -1
2 server002 follower 236 266124 266124 -1 -1

ZONE TEMPLATE
ENDPOINT="http://localhost:2633/RPC2"

This is the running situation at the moment. When I kill OpenNebula on server000 (the current master), it will switch and show:

oneadmin@server001:~$ onezone show 0
ZONE 0 INFORMATION
ID : 0
NAME : OpenNebula

ZONE SERVERS
ID NAME ENDPOINT
0 server000 http://10.1.16.200:2633/RPC2
1 server001 http://10.1.16.201:2633/RPC2
2 server002 http://10.1.16.202:2633/RPC2

HA & FEDERATION SYNC STATUS
ID NAME STATE TERM INDEX COMMIT VOTE FED_INDEX
0 server000 error - - - - -
1 server001 leader 237 266149 266149 1 -1
2 server002 follower 237 266149 266149 -1 -1

ZONE TEMPLATE
ENDPOINT="http://localhost:2633/RPC2"

Now server001 is the leader and talks to its own database locally. The INDEX and COMMIT values show how up-to-date the servers are, and the TERM number has increased by 1 to show there has been a new election, which resulted in server001 becoming leader.
All you need to add is a floating IP, available on all 3 servers; this enables the web interface to also move to server001. So this IP becomes active on 000/001/002, depending on which server “wins” the election.
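The floating IP itself is handled by the Raft hooks in /etc/one/oned.conf: the leader brings the address up, a follower removes it. Something like this (a sketch based on the HA guide; the interface name and floating address are placeholders):

RAFT_LEADER_HOOK = [
    COMMAND   = "raft/vip.sh",
    ARGUMENTS = "leader eth0 10.1.16.250/24"
]

RAFT_FOLLOWER_HOOK = [
    COMMAND   = "raft/vip.sh",
    ARGUMENTS = "follower eth0 10.1.16.250/24"
]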

If I then enable opennebula on server000 again, it shows:

oneadmin@server000:~$ onezone show 0
ZONE 0 INFORMATION
ID : 0
NAME : OpenNebula

ZONE SERVERS
ID NAME ENDPOINT
0 server000 http://10.1.16.200:2633/RPC2
1 server001 http://10.1.16.201:2633/RPC2
2 server002 http://10.1.16.202:2633/RPC2

HA & FEDERATION SYNC STATUS
ID NAME STATE TERM INDEX COMMIT VOTE FED_INDEX
0 server000 follower 237 266218 266218 -1 -1
1 server001 leader 237 266218 266218 1 -1
2 server002 follower 237 266218 266218 -1 -1

ZONE TEMPLATE
ENDPOINT="http://localhost:2633/RPC2"

Now there is no need for a new election, because the master (001) is still up. So when server000 becomes active again, it is brought up to date and joins as a follower. The TERM stays at 237, because no new election was necessary, and the master is still server001, together with the floating IP and the web interface.

NOTE: remember to apply any changes (EDIT: like changes to /etc/one/oned.conf - thx Martin) 3 times though, as the configuration of OpenNebula MUST stay identical across the 3 servers!!
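A quick way to check the three copies have not drifted apart (a hypothetical helper; the only difference you should expect is each server's own SERVER_ID in the FEDERATION section):

# run on server000 as oneadmin
for srv in server001 server002; do
    ssh "$srv" cat /etc/one/oned.conf | diff -u /etc/one/oned.conf - || echo "$srv differs from server000!"
done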


In the note I suppose you mean changes to oned.conf in the /etc/one folder and so on.

Thanks a lot for the explanation, I am sure it will help people searching for how Raft works. It’s crystal clear now.

Also, I wanted to ask if there is some hook in the Raft configuration that provides “stonith” functionality when one of the nodes in the OpenNebula cluster fails, so that the other running and connected nodes can fence the “error” host/node.

BR,
Martin

I think you mean VM HA?
So a host dies, and all VMs need to be restarted on a host that does work?
ONE is monitoring all hosts already in its “default” configuration, and a failed host will not be selected for a VM deployment until it is out of error status. All VMs on that broken host can be redeployed, if you want, using this method:

http://docs.opennebula.org/5.4/advanced_components/ha/ftguide.html

You can configure a hook to take action for a broken VM once it detects a problem.
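The hook from that guide goes into /etc/one/oned.conf and calls the host_error.rb script when a host enters the ERROR state; roughly like this (sketch from memory, check the guide for the exact arguments):

# restart/migrate the VMs of a host that went into ERROR
HOST_HOOK = [
    NAME      = "error",
    ON        = "ERROR",
    COMMAND   = "ft/host_error.rb",
    ARGUMENTS = "$ID -m -p 5",
    REMOTE    = "no"
]

With -m the VMs are resubmitted on another host, and -p 5 skips that if the host comes back within 5 monitoring cycles.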

No I don’t, VM HA is the next step after fencing/stonith configuration. For example:
If I have a shared FS between, let’s say, 3 nodes as you mentioned above, and 1 of the 3 nodes fails in Raft, the other 2 nodes can continue to deliver services. They can do “VM HA” as you mentioned, but only after they make sure that the failed node is really down, so that starting the failed VMs on the remaining 2 nodes will not damage the shared filesystem.

Imagine that the SAN keeps working but the OpenNebula connection between nodes fails. OpenNebula will restart (VM HA) the VMs on the remaining nodes, but they will also still be running on the failed node on the shared FS, which can create serious situations. So I thought it would be best to shoot the failed node down with IPMI or DRAC/iLO so you are safe from conflicts.

EDIT: on your link http://docs.opennebula.org/5.4/advanced_components/ha/ftguide.html
there is fencing mentioned:
Note that spurious network errors may lead to a VM started twice in different hosts and possibly contend on shared resources. The previous script needs to fence the error host to prevent split brain VMs. You may use any fencing mechanism for the host and invoke it within the error hook.

I don’t think that needs any work, OpenNebula already monitors all physical hosts constantly. If it finds that it can’t reach one of them, the host goes to “error” state and is basically fenced already.
So when you, or a user, start a new VM, the OpenNebula scheduler will search for a suitable host, and the broken host will never come up as a candidate, because of its error state.

That “serious situation” happened to us already: two VMs using the same disk (NFS + qcow2), but it still worked :slight_smile:
When the broken host came back up, we killed the VM processes on it, and users never noticed anything happened (besides a VM reboot). I’d rather fix a split-brain situation during office hours than have to wake up at night to manually restart VMs, but that’s a personal preference, I guess ^^

This is exactly the situation that I am talking about. I want to prevent this kind of split-brain situation, because this time you were perhaps saved by the NFS locking system, but next time it could be a totally damaged qcow2 image. I use GlusterFS (libgfapi) and I want to assure my customers/users that this situation will not happen. I will take a look at the hooks; there should be a possibility to implement fencing and prevent split-brain altogether. VM HA is also good and I will implement it after the upgrade as well. As you said, I’d rather fix a split-brain situation during office hours than have to wake up at night to manually restart VMs.
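Something along these lines is what I have in mind: point the HOST_HOOK at a small wrapper that first powers the broken node off over IPMI and only then lets host_error.rb restart the VMs elsewhere. Purely a sketch; the script name, IPMI naming/credentials and paths are hypothetical and will differ on a real install:

#!/bin/bash
# fence_then_recover.sh - hypothetical wrapper used as the HOST_HOOK command
# $1 is the ID of the host that went into ERROR

HOST_ID=$1
HOST_NAME=$(onehost show "$HOST_ID" | awk -F: '/^NAME/ {gsub(/ /,"",$2); print $2}')

# fence: power the broken node off via its IPMI/iLO/DRAC interface
ipmitool -I lanplus -H "${HOST_NAME}-ipmi" -U admin -P 'secret' chassis power off || exit 1

# only if fencing succeeded, let the stock hook restart the VMs elsewhere
exec /var/lib/one/remotes/hooks/ft/host_error.rb "$HOST_ID" -m -p 5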