Restrict live migration to the same CPU generation with host-passthrough

Hi all,

we recently added some new nodes to our ONe cluster. The new ones are EPYC 75F3 (they are FAST!!!), and the old ones are two generations older, EPYC 7351. I discovered that I can live-migrate VMs between them without a problem, unless they use CPU host-passthrough. I would prefer not to restrict users to one CPU type (e.g. by moving the new hosts to a dedicated cluster), nor do I want to restrict myself to live-migrating all VMs (including those which don’t use CPU host-passthrough) only between hosts of the same CPU generation during system upgrades and maintenance.

Can I somehow say “when the VM CPU type is host-passthrough, don’t live-migrate to a different CPU generation”, and have this restriction applied automatically, e.g. during “onehost flush” or “onevm resched”?

Thanks!

-Yenya


I’d say the best way to proceed is to add the new nodes into a different cluster. You can add the datastores and networks from the old cluster that are accessible to the new nodes, basically sharing them across clusters.
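
Roughly something along these lines (the IDs are just placeholders for your own datastores, networks and hosts):

$ onecluster create new-epyc
$ onecluster addhost new-epyc <new_host_id>
$ onecluster adddatastore new-epyc <shared_datastore_id>
$ onecluster addvnet new-epyc <shared_vnet_id>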


Hello,

Today, while trying to gradually reboot the cluster nodes, I discovered that the above does not work. I have two clusters - one with the older 7351 CPUs, and the other one with the 75F3 CPUs. When I ran onehost flush <newer_node>, I discovered that some machines got migrated to nodes from a different cluster.

So creating two clusters for the two CPU sub-architectures does not work. Do you have any other way to restrict live migration? Of course, it should be something that would work even for VMs created by ordinary users in the future.

Thanks!

Hi @Yenya,

Could you share your cluster configuration and the *_SCHED_REQUIREMENTS of the VMs which were migrated to a host in the wrong cluster?

Is it possible that these VMs were already running before you configured the clusters? If that’s the case, these VMs probably won’t have the right scheduling requirements.
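
For instance, the output of something like the following would help (VM and cluster IDs are placeholders):

$ onevm show <vmid> | grep REQUIREMENTS
$ onecluster show <clusterid>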

Hello, @cgonzales,

thanks for the reply - I am sorry that I apparently overlooked it. I don’t think I explicitly set any *SCHED_REQUIREMENTS for the newly created VMs. One of my recently created VMs has the following in its onevm show output:

AUTOMATIC_DS_REQUIREMENTS="(\"CLUSTERS/ID\" @> 101 | \"CLUSTERS/ID\" @> 104) & (TM_MAD = \"ceph\")"
AUTOMATIC_NIC_REQUIREMENTS="(\"CLUSTERS/ID\" @> 101 | \"CLUSTERS/ID\" @> 104)"
AUTOMATIC_REQUIREMENTS="(CLUSTER_ID = 101 | CLUSTER_ID = 104) & !(PUBLIC_CLOUD = YES) & !(PIN_POLICY = PINNED)"

Cluster 101 is the original cluster with older CPUs, and 104 is the one with new CPUs.

-Yenya

Hi @Yenya,

Yes, those requirements are automatically filled in by the OpenNebula service based on the compatibility of the resources (i.e. images, networks, …) used by the VM. If you want to enforce that VMs are deployed/migrated only to a specific cluster, you need to manage that manually (e.g. by setting the corresponding SCHED_REQUIREMENTS in the VM Template).
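
For example, something along these lines should do it (the template ID 42 is just a placeholder):

$ cat > sched_req.txt <<'EOF'
SCHED_REQUIREMENTS = "CLUSTER_ID = 104"
EOF
$ onetemplate update 42 sched_req.txt --append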

@cgonzales:

if I were both the infrastructure maintainer and the sole template creator, I would do this. However, I am only the infrastructure maintainer - my users (workgroups) create their own templates and VMs. And I also want to provide generic templates such as “Fedora 37” or “AlmaLinux 9” without any scheduling requirements, usable on all my hosts for various purposes.

From the infrastructure maintainer’s point of view, I need to be able to safely run onehost flush in order to bring a host down for a kernel upgrade, without explaining the gory details of CPU steppings to all the template creators and hoping they write their scheduling requirements correctly. Isn’t the point of the whole “cloud” thing to hide these details from users?

Also, your suggestion presumes that I know beforehand on which hosts a particular VM should run. In fact, I need it to run anywhere there are free resources, but to stay on a host with the same CPU stepping once it is deployed.

Is it possible to pin the VM to a particular cluster once it is deployed? Or should I, for example, attempt to write a script which would catch the VMs as they are created and set the scheduling requirements ex post (see the sketch below)? This would still leave a short time window between such a script scanning all the VMs and me running onehost flush, during which a new VM could be deployed and then migrated to a different CPU stepping.
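
Something like this untested sketch is what I have in mind - it assumes xmllint is available and that the CID of the last HISTORY record is the cluster the VM currently runs in:

#!/bin/bash
# Pin every VM that has no explicit SCHED_REQUIREMENTS
# to the cluster it is currently deployed in.
for id in $(onevm list | awk 'NR>1 {print $1}'); do
    # skip VMs which already carry an explicit requirement
    onevm show "$id" | grep -q '^SCHED_REQUIREMENTS=' && continue
    # cluster ID of the host the VM currently runs on (last history record)
    cid=$(onevm show -x "$id" | xmllint --xpath 'string(//HISTORY[last()]/CID)' -)
    [ -n "$cid" ] || continue
    tmp=$(mktemp)
    echo "SCHED_REQUIREMENTS = \"CLUSTER_ID = $cid\"" > "$tmp"
    onevm update "$id" "$tmp" --append
    rm -f "$tmp"
done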

So if I understand it correctly, OpenNebula currently does not have a better (automatic) solution for this. Is that so? Thanks,

-Yenya

So, I opened the VM page (sunstone_url)/#vms-tab/1234, and in the Info card I tried to add a new attribute in the Attributes section. I entered the name SCHED_REQUIREMENTS and the value CLUSTER_ID=101, and then clicked on the (+) icon on the right side of the value. Both input fields became empty and there was no indication of a new attribute being set.

However, from the command line it seems that the attribute got set successfully:

$ onevm show 1234 | grep SCHED_REQUIREMENTS
SCHED_REQUIREMENTS="CLUSTER_ID = 101"

Is this a bug in Sunstone? My system runs ONe 6.2 CE.

Running onevm resched 1234 every 10 seconds, it seems that the attribute is interpreted correctly, and VM 1234 does not leave cluster #101.
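
For the record, this is roughly the loop I used for testing (the HOST column of onevm list shows where the VM currently runs):

$ while true; do onevm resched 1234; onevm list | grep -w 1234; sleep 10; done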