Hi all,
I have just upgraded from 6.8 to 6.10.0 (CE) and wanted to reboot all nodes. During onehost flush
I saw migrating VMs fail with the following message in the VM log:
Mon Feb 3 14:18:16 2025 [Z0][VMM][I]: Command execution fail (exit code: 1): cat << 'EOT' | /var/lib/one/tmp/vmm/kvm/migrate '0b62ee41-3530-459d-9f92-ab0de19d826a' 'node5' 'node4' 3853 node4
Mon Feb 3 14:18:16 2025 [Z0][VMM][I]: virsh --connect qemu:///system migrate --live 0b62ee41-3530-459d-9f92-ab0de19d826a qemu+ssh://node5/system (23.462960391s)
Mon Feb 3 14:18:16 2025 [Z0][VMM][I]: Error mirgating VM 0b62ee41-3530-459d-9f92-ab0de19d826a to host node5: undefined method `upcase' for nil:NilClass
Mon Feb 3 14:18:16 2025 [Z0][VMM][I]: ["/var/lib/one/tmp/vmm/kvm/migrate:255:in `<main>'"]
Mon Feb 3 14:18:16 2025 [Z0][VMM][I]: ExitCode: 1
Mon Feb 3 14:18:16 2025 [Z0][VMM][I]: Successfully execute transfer manager driver operation: tm_failmigrate.
Mon Feb 3 14:18:16 2025 [Z0][VMM][I]: Failed to execute virtualization driver operation: migrate.
Mon Feb 3 14:18:16 2025 [Z0][VMM][E]: MIGRATE: virsh --connect qemu:///system migrate --live 0b62ee41-3530-459d-9f92-ab0de19d826a qemu+ssh://node5/system (23.462960391s) Error mirgating VM 0b62ee41-3530-459d-9f92-ab0de19d826a to host node5: undefined method `upcase' for nil:NilClass ["/var/lib/one/tmp/vmm/kvm/migrate:255:in `<main>'"] ExitCode: 1
Mon Feb 3 14:18:16 2025 [Z0][VM][I]: New LCM state is RUNNING
Mon Feb 3 14:18:16 2025 [Z0][LCM][I]: Fail to live migrate VM. Assuming that the VM is still RUNNING.
Mon Feb 3 14:18:47 2025 [Z0][LCM][I]: VM running but monitor state is POWEROFF
Now the VM seems to be running on node5 (i.e. it migrated successfully), but OpenNebula reports that it is in POWEROFF state.
The fix seems to be simple:
--- /var/lib/one/remotes-6.10.0-1.el9-dist/vmm/kvm/migrate 2024-08-27 18:27:44.000000000 +0200
+++ /var/lib/one/remotes/vmm/kvm/migrate 2025-02-03 14:58:18.190160184 +0100
@@ -252,7 +252,7 @@
# Compact memory
# rubocop:disable Layout/LineLength
- if ENV['CLEANUP_MEMORY_ON_STOP'].upcase == 'YES'
+ if ENV['CLEANUP_MEMORY_ON_STOP'].to_s.upcase == 'YES'
`(sudo -l | grep -q sysctl) && sudo -n sysctl vm.drop_caches=3 vm.compact_memory=1 &>/dev/null &`
end
# rubocop:enable Layout/LineLength
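For context, here is a minimal sketch of why the one-character change works. When CLEANUP_MEMORY_ON_STOP is unset, ENV['CLEANUP_MEMORY_ON_STOP'] returns nil, and nil has no #upcase method; coercing with #to_s first turns nil into "" so the comparison just evaluates to false. (The helper name below is mine, for illustration; only the environment variable name comes from the script.)

```ruby
# Sketch of the nil-safe pattern used in the patch above.
# env['CLEANUP_MEMORY_ON_STOP'] may be nil when the variable is unset;
# nil.to_s == "", so .to_s.upcase never raises NoMethodError.
def cleanup_memory?(env = ENV)
  env['CLEANUP_MEMORY_ON_STOP'].to_s.upcase == 'YES'
end
```

With the original code, cleanup_memory? would raise "undefined method `upcase' for nil:NilClass" exactly as in the log whenever the variable is missing from the driver's environment.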
But how can I recover the VMs without disruption? As I said, they are running on the new hosts, so I just need to tell OpenNebula that. How can I do this? Thanks!
-Yenya