I have been testing the ARM build of OpenNebula 6.10 on a Turing Pi ARM cluster board. My configuration is 2x RK1 (32 GB RAM, 8 ARM cores) and 2x Raspberry Pi Compute Module 4 (8 GB RAM, 4 ARM cores). I encountered some issues and worked around them. Currently I have the OpenNebula core and FireEdge running and can boot and run VMs on both the RK1 and CM4 nodes.
These were the issues that I encountered:
- /var/lib/one/remotes/im/kvm-probes.d/host/system/cpu.sh
  - The kernels on both the RK1 and the CM4 report the CPU model under a ‘cpu model’ label in /proc/cpuinfo.
  - Modified the script to handle this case (sketch below).
  - Note that mainline kernels on the RK1 do not report a CPU model name in /proc/cpuinfo at all; lscpu shows Cortex-A76 for the performance cores and Cortex-A55 for the efficiency cores.
  - The CM4 reports correctly as ‘Cortex-A72’.
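
  A minimal sketch of the kind of fallback added for aarch64 (the field names, the lscpu parsing, and the MODELNAME output key are illustrative, not the exact patch):

  ```bash
  #!/usr/bin/env bash
  # Fallback CPU model detection for aarch64 hosts whose /proc/cpuinfo has no
  # usable model field. Sketch only, not the exact cpu.sh change.

  MODELNAME=$(grep -m1 -E 'model name|cpu model' /proc/cpuinfo | cut -d: -f2- | sed 's/^[[:space:]]*//')

  if [ -z "$MODELNAME" ]; then
      # Recent lscpu decodes the arm64 CPU part ids into names such as
      # Cortex-A76 / Cortex-A55; take the first model name it reports.
      MODELNAME=$(lscpu | awk -F: '/Model name/ {gsub(/^[ \t]+/, "", $2); print $2; exit}')
  fi

  echo "MODELNAME=\"${MODELNAME:-unknown}\""
  ```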
- /var/lib/one/remotes/im/kvm-probes.d/host/system/cpu_features.sh
  - The libvirt capabilities XML does not report a CPU model on the RK1 or the CM4, which means “virsh cpu-baseline <(virsh capabilities) …” does not work.
  - Added a workaround to pull the CPU flags from /proc/cpuinfo on aarch64 (sketch below).
  - I have heard claims that this can be inaccurate, but in my case the flags are accurate, as validated against direct reads of the ARM CPU configuration registers.
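
  A minimal sketch of the aarch64 fallback (the CPU_FEATURES output key is illustrative, not necessarily what the real probe emits):

  ```bash
  #!/usr/bin/env bash
  # aarch64 fallback for cpu_features.sh: when libvirt cannot report or
  # baseline the host CPU model, take the flags straight from /proc/cpuinfo.

  if [ "$(uname -m)" = "aarch64" ]; then
      # Every core normally exposes the same Features line; use the first one
      FEATURES=$(grep -m1 '^Features' /proc/cpuinfo | cut -d: -f2- | xargs)
      echo "CPU_FEATURES=\"${FEATURES}\""
  fi
  ```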
- /var/lib/one/remotes/im/kvm-probes.d/host/system/machines_models.rb
  - Added the ‘aarch64’ architecture to GUEST_ARCHS.
- /var/lib/one/remotes/im/kvm-probes.d/host/system/numa_host.rb
  - The standard RK1 and CM4 kernels are not built with the NUMA config option (CONFIG_NUMA), which causes host monitoring to fail with:

    ```
    Thu Apr 10 17:04:50 2025 [Z0][MDP][I]: from /var/tmp/one/im/kvm.d/../kvm-probes.d/host/system/numa_host.rb:97:in `foreach'
    Thu Apr 10 17:04:50 2025 [Z0][MDP][I]: from /var/tmp/one/im/kvm.d/../kvm-probes.d/host/system/numa_host.rb:97:in `<main>'
    Thu Apr 10 17:04:50 2025 [Z0][MDP][I]: </SYSTEM_HOST>
    Thu Apr 10 17:04:50 2025 [Z0][MDP][W]: Start monitor failed for host 0: Error executing monitord-client_control.sh: <MONITOR_MESSAGES><SYSTEM_HOST>Error executing numa_host.rb: <internal:dir>:98:in `open': No such file or directory @ dir_initialize - /sys/bus/node/devices/ (Errno::ENOENT) from /var/tmp/one/im/kvm.d/../kvm-probes.d/host/system/numa_host.rb:97:in `foreach' from /var/tmp/one/im/kvm.d/../kvm-probes.d/host/system/numa_host.rb:97:in `<main>' </SYSTEM_HOST>
    ```
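
    You can confirm on the host that the running kernel lacks NUMA support (sketch; /proc/config.gz is only present when the kernel enables CONFIG_IKCONFIG_PROC):

    ```bash
    # The missing /sys/bus/node/devices directory is what trips the probe above.
    ls /sys/bus/node/devices 2>/dev/null || echo "no NUMA node entries in sysfs"
    # CONFIG_NUMA should be set on kernels built with NUMA support.
    zgrep '^CONFIG_NUMA=' /proc/config.gz 2>/dev/null || echo "CONFIG_NUMA not set (or /proc/config.gz unavailable)"
    ```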
  - Added a feature in numa_common.rb and numa_host.rb to build a ‘dummy’ single-node NUMA config for hosts whose kernel has no NUMA support. This path runs only when /sys/bus/node/devices is missing; otherwise the original NUMA code is used.
  - For heterogeneous architectures (e.g. big.LITTLE) where cores have different performance characteristics, this change separates the cores into clusters based on the ‘CPU part’ attribute (on the RK3588 in the RK1, the efficiency cores are 0-3 and the performance cores are 4-7; see the illustration below). Note that you need to use ‘static’ placement in the domain XML to take advantage of this, but it works as-is where all cores have the same characteristics (CM4).
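
    To illustrate the grouping, the cluster split can be reproduced from the shell (this is only an illustration of the idea, not the actual Ruby change):

    ```bash
    # Group CPU indexes by their "CPU part" value from /proc/cpuinfo.
    # On the RK3588 this separates the Cortex-A55 cores (0-3) from the
    # Cortex-A76 cores (4-7); on the CM4 all cores land in one group.
    awk '
        /^processor/ { cpu = $3 }
        /^CPU part/  { cores[$4] = cores[$4] " " cpu }
        END { for (p in cores) printf "CPU part %s -> cores%s\n", p, cores[p] }
    ' /proc/cpuinfo
    ```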
  - Alternative workarounds:
    - RK1: use mainline kernel 6.14.8, which has NUMA support, and control core allocation using the NUMA pin policy.
    - CM4: set the pin policy to NONE; the NUMA scripts will build a dummy configuration (example below).
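
    For reference, a sketch of setting the pin policy on a VM template from the CLI (the template id is a placeholder; TOPOLOGY/PIN_POLICY as described in the OpenNebula NUMA and CPU pinning documentation):

    ```bash
    # Append a topology section with the desired pin policy to a VM template.
    # Use PIN_POLICY = "NONE" for the CM4 case above.
    TEMPLATE_ID=0   # placeholder, replace with the real template id
    cat > /tmp/topology.txt <<'EOF'
    TOPOLOGY = [ PIN_POLICY = "CORE" ]
    EOF
    onetemplate update "$TEMPLATE_ID" /tmp/topology.txt --append
    ```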
- big.LITTLE ARM architecture bug on kernels < 6.8 on the RK1: VMs fail to start with ‘error copying registers’ and go to BOOT_FAIL.
  - Use a mainline kernel (6.14+) and control core allocation with the pin policy.
- For a mixed set of x86_64 and aarch64 hypervisors, the scheduler does not appear to take the architecture of the template or image into account when scheduling a VM to a host. I think this can be worked around using affinity/anti_affinity (sketch below), but is there a better way?
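
  A sketch of the kind of per-template host affinity I have in mind, expressed as a scheduler requirement (the ARCH host attribute name is an assumption, check what your hosts actually report with ‘onehost show’; untested):

  ```bash
  # Restrict an aarch64 template to hosts that report ARCH=aarch64.
  TEMPLATE_ID=0   # placeholder, replace with the real template id
  cat > /tmp/arch_req.txt <<'EOF'
  SCHED_REQUIREMENTS = "ARCH = \"aarch64\""
  EOF
  onetemplate update "$TEMPLATE_ID" /tmp/arch_req.txt --append
  ```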