Does anyone have experience with changing the monitors in their Ceph cluster transparently to running VMs?
We recently bought new hardware to replace our old Ceph monitors, and I have replaced two of the three monitors in the Ceph cluster (one at a time, adding one and then removing one). I have updated the CEPH_HOST property of our datastore in OpenNebula, and newly launched VMs pick this up and use the right mons. However, VMs instantiated before I made this change still have the old mons in CEPH_HOST. In practice they are all talking to the one original mon remaining, as the others are no longer in the Ceph cluster and are now shut down. I would very much like to migrate these remaining VMs to the new mons, so that I can remove the last old one (and it is precarious having them rely on a single host).
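For reference, the sequence I followed was roughly this; the datastore ID and addresses below are placeholders, and deploying the new mon daemon itself is a separate step (ceph-deploy, by hand, etc., depending on your setup):

    # add the new monitor to the monmap and wait for it to join quorum
    ceph mon add new-mon1 192.0.2.11
    ceph -s
    # once quorum is healthy again, drop one of the old monitors
    ceph mon remove old-mon1

    # point the Ceph datastore at the new mons (opens the template in $EDITOR)
    onedatastore update 101
    # e.g. set: CEPH_HOST="192.0.2.11 192.0.2.12 192.0.2.13"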
I can't think of a clean way to make that work. Maybe, just maybe, you could perform a migration (or live migration) and, on the new host, use iptables to route the connection to the proper host (-t mangle?). So the deployment file would still have the old IPs, but iptables would forward the connections to the new ones. Not sure if that would work, and I believe it would be quite dangerous…
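To make that concrete, something like the following on the destination hypervisor is what I had in mind, although the DNAT target actually lives in the nat table rather than mangle; the addresses are placeholders, 6789 is the default msgr v1 mon port, and I haven't tested any of this:

    # rewrite connections the hypervisor makes to the old mon so they land on a new one
    # (add an equivalent rule for port 3300 if you are using msgr v2)
    iptables -t nat -A OUTPUT -p tcp -d 10.0.0.1 --dport 6789 \
             -j DNAT --to-destination 10.0.0.101:6789

As far as I know librados only uses the configured mon addresses for the initial contact and follows the live monmap afterwards, so the rule may only matter while the client bootstraps. Still, quite dangerous.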
Another possibility would be to stop OpenNebula and edit the database to replace the IPs.
This makes a strong case for using domain names instead of IPs for the Ceph mons. I hadn't thought of that…
We've also recently moved to new hardware and have run into a problem migrating machines that have a stale CEPH_HOST list. I am able to run update vm_pool set body = … and this does work, but only if the change is made while oned is stopped. I assume the VM description is held in memory by oned while it's running, (potentially?) for performance reasons, but it seems odd that this in-memory copy would clobber any changes made to the database while the VM is running.
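For the record, this is roughly what worked for me; the database name, credentials and addresses are placeholders (use whatever your oned.conf points at), and take a dump before touching anything:

    systemctl stop opennebula                        # or: service opennebula stop
    mysqldump opennebula > opennebula-backup.sql     # backup before editing anything
    mysql opennebula -e "UPDATE vm_pool SET body = REPLACE(body, '10.0.0.1', '10.0.0.101');"
    systemctl start opennebula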
I'm sure this is just a misunderstanding of how OpenNebula works, but I'm curious about the design decision.