OpenNebula upgrade from 7.0.1 CE to 7.2 CE - All KVM nodes are in ERROR state

The upgrade of OpenNebula from 7.0.1 CE to 7.2 CE seemed to go well. When I log in to FireEdge, the networks, templates, VMs, images etc. are all there and everything looks fine except for the KVM nodes.
The controller node is also a KVM node, and I have two other KVM nodes joined to the default cluster. DEBUG_LEVEL is set to 4.

All nodes are in ERROR state.

SSH as oneadmin to all nodes is passwordless.

oned.log: does not have any errors.

monitor.log: is full of "Start monitor failed for host 1: Error executing monitord-client_control.sh: #<NoMethodError: undefined method `text' for nil:NilClass>" errors.

Thu Apr 30 13:02:48 2026 [Z0][MDP][I]: Command execution failed (exit code: 1): 'if [ -x "/var/tmp/one/im/run_monitord_client" ]; then /var/tmp/one/im/run_monitord_client kvm 0 one-mast-00; else exit 42; fi'
Thu Apr 30 13:02:48 2026 [Z0][MDP][I]: Error executing monitord-client_control.sh: #<NoMethodError: undefined method `text' for nil:NilClass>
Thu Apr 30 13:02:48 2026 [Z0][MDP][I]: Command execution failed (exit code: 1): 'if [ -x "/var/tmp/one/im/run_monitord_client" ]; then /var/tmp/one/im/run_monitord_client kvm 1 one-node-01; else exit 42; fi'
Thu Apr 30 13:02:48 2026 [Z0][MDP][I]: Error executing monitord-client_control.sh: #<NoMethodError: undefined method `text' for nil:NilClass>
Thu Apr 30 13:02:48 2026 [Z0][MDP][I]: Command execution failed (exit code: 1): 'if [ -x "/var/tmp/one/im/run_monitord_client" ]; then /var/tmp/one/im/run_monitord_client kvm 7 one-node-03; else exit 42; fi'
Thu Apr 30 13:02:48 2026 [Z0][MDP][I]: Error executing monitord-client_control.sh: #<NoMethodError: undefined method `text' for nil:NilClass>

Thu Apr 30 13:02:54 2026 [Z0][MDP][DD]: [13:2:54] Received START_MONITOR message from host 0:
Error executing monitord-client_control.sh: #<NoMethodError: undefined method `text' for nil:NilClass>
Thu Apr 30 13:02:54 2026 [Z0][MDP][W]: Start monitor failed for host 0: Error executing monitord-client_control.sh: #<NoMethodError: undefined method `text' for nil:NilClass>
Thu Apr 30 13:02:54 2026 [Z0][HMM][E]: Unable to monitor host id: 0
Thu Apr 30 13:02:54 2026 [Z0][MDP][I]: Command execution failed (exit code: 1): 'if [ -x "/var/tmp/one/im/run_monitord_client" ]; then /var/tmp/one/im/run_monitord_client kvm 1 one-node-01; else exit 42; fi'
Thu Apr 30 13:02:54 2026 [Z0][MDP][I]: Error executing monitord-client_control.sh: #<NoMethodError: undefined method `text' for nil:NilClass>
Thu Apr 30 13:02:54 2026 [Z0][MDP][DD]: [13:2:54] Received START_MONITOR message from host 1:
Error executing monitord-client_control.sh: #<NoMethodError: undefined method `text' for nil:NilClass>
Thu Apr 30 13:02:54 2026 [Z0][MDP][W]: Start monitor failed for host 1: Error executing monitord-client_control.sh: #<NoMethodError: undefined method `text' for nil:NilClass>
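
To see at a glance which hosts are affected, the warnings can be tallied per host ID. This is only a minimal sketch: it assumes the log is at /var/log/one/monitor.log (adjust if your install differs) and that the message format matches the lines quoted above. The function name count_monitor_failures is made up for illustration.

```shell
#!/bin/sh
# Count "Start monitor failed for host N" warnings per host ID.
# Assumption: the message format matches the excerpts quoted above,
# and the default log path is /var/log/one/monitor.log.
count_monitor_failures() {
    log_file="${1:-/var/log/one/monitor.log}"
    # Extract the host ID from each matching line, then tally per ID.
    sed -n 's/.*Start monitor failed for host \([0-9][0-9]*\):.*/\1/p' "$log_file" \
        | sort -n | uniq -c
}

# Example (commented out): count_monitor_failures /var/log/one/monitor.log
```

Each output line is a count followed by the host ID, so a host that never recovers stands out by a steadily growing count.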


Versions of the related components and OS (frontend, hypervisors, VMs):
Operating System: Ubuntu 22.04.5 LTS
Kernel: Linux 5.15.0-176-generic
Architecture: x86-64
Hardware Model: B660-ITX

Operating System: Ubuntu 22.04.5 LTS
Kernel: Linux 5.15.0-176-generic
Architecture: x86-64
Hardware Vendor: To Be Filled By O.E.M.
Hardware Model: To Be Filled By O.E.M.

Operating System: Ubuntu 22.04.5 LTS
Kernel: Linux 5.15.0-176-generic
Architecture: x86-64
Hardware Vendor: To Be Filled By O.E.M.
Hardware Model: B660-ITX

Steps to reproduce:

Upgrade from 7.0.1 CE to 7.2 CE

Current results:
Everything works fine except that the nodes are in ERROR state.
monitor.log: is full of "Start monitor failed for host 1: Error executing monitord-client_control.sh: #<NoMethodError: undefined method `text' for nil:NilClass>" errors.

Expected results:
All KVM nodes to be available and green.

Hello,

It looks like a problem with one of the monitoring probe scripts.

find /var/tmp/one/im/kvm-probes.d/host -type f -printf "---\n%h/%f\n" -executable -exec {} < /dev/null \; 

That may give a hint about which script is failing on the hosts.

Also, the directory /var/tmp/one/im/kvm-probes.d can be deleted on a host and recreated by running sudo -u oneadmin onehost sync --force on the frontend.
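
If the raw output is hard to read, a variant that also prints each probe's exit status may help. This is only a sketch of the same idea as the find command above, not an OpenNebula tool; the function name run_probes is made up, and the default directory is the one used in this thread.

```shell
#!/bin/sh
# Run every executable probe under a directory and report each exit status.
# stdin is redirected from /dev/null, as in the find command above.
run_probes() {
    probe_dir="${1:-/var/tmp/one/im/kvm-probes.d/host}"
    find "$probe_dir" -type f | sort | while IFS= read -r probe; do
        # Skip files that are not executable (helper or data files).
        [ -x "$probe" ] || continue
        printf '%s\n' '---' "$probe"
        "$probe" < /dev/null
        printf 'exit status: %s\n' "$?"
    done
}

# Example (commented out), run on a KVM host as oneadmin:
#   run_probes /var/tmp/one/im/kvm-probes.d/host
```

A probe that prints an error and a non-zero exit status is the first candidate to look at.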

Cheers!

Hi Bruno,

Thanks for getting back to me about my problem.

As requested, the output of your find command:
sudo find /var/tmp/one/im/kvm-probes.d/host -type f -printf "---\n%h/%f\n" -executable -exec {} < /dev/null \;

/var/tmp/one/im/kvm-probes.d/host/monitor/numa_usage.rb
HUGEPAGE = [ NODE_ID = "0", SIZE = "2048", FREE = "0" ]
HUGEPAGE = [ NODE_ID = "0", SIZE = "1048576", FREE = "0" ]
MEMORY_NODE = [ NODE_ID = "0", FREE = "59875048", USED = "5728184" ]

/var/tmp/one/im/kvm-probes.d/host/monitor/prediction.sh
-:2: parser error : Start tag expected, ‘<’ not found

^
usage: prediction.py [-h] --entity ENTITY --pythonpath PYTHONPATH
prediction.py: error: argument --entity: Invalid entity string format: host,0,/var/tmp/one_db

/var/tmp/one/im/kvm-probes.d/host/monitor/linux_usage.rb
HYPERVISOR=kvm
USEDMEMORY=1434616
FREEMEMORY=64168616
FREECPU=2400
USEDCPU=0
NETRX=2864460282
NETTX=633756729

/var/tmp/one/im/kvm-probes.d/host/system/cpu_features.sh
KVM_CPU_FEATURES=“3dnowprefetch,abm,acpi,adx,aes,apic,arat,arch-capabilities,arch-lbr,avx,avx-vnni,avx2,avx512-bf16,avx512bw,avx512cd,avx512dq,avx512f,avx512vl,avx512vnni,bmi1,bmi2,clflush,clflushopt,clwb,cmov,core-capability,cx16,cx8,de,ds,ds_cpl,dtes64,erms,est,f16c,fma,fpu,fsgsbase,fsrm,fxsr,gfni,hle,ht,ibrs-all,intel-pt,invpcid,invtsc,lahf_lm,lm,mca,mce,md-clear,mds-no,mmx,monitor,movbe,movdir64b,movdiri,msr,mtrr,nx,pae,pat,pbe,pcid,pclmuldq,pconfig,pdcm,pdpe1gb,pge,pks,pku,pni,popcnt,pschange-mc-no,pse,pse36,rdctl-no,rdpid,rdrand,rdseed,rdtscp,rtm,sep,serialize,sha-ni,skip-l1dfl-vmentry,smap,smep,smx,spec-ctrl,ss,ssbd,sse,sse2,sse4.1,sse4.2,ssse3,stibp,syscall,taa-no,tm,tm2,tsc,tsc-deadline,tsc_adjust,umip,vaes,vme,vmx,vpclmulqdq,waitpkg,x2apic,xgetbv1,xsave,xsavec,xsaveopt,xsaves,xtpr”

/var/tmp/one/im/kvm-probes.d/host/system/cpu.sh
MODELNAME="12th Gen Intel(R) Core(TM) i9-12900"

/var/tmp/one/im/kvm-probes.d/host/system/architecture.sh
ARCH=x86_64

/var/tmp/one/im/kvm-probes.d/host/system/name.sh
HOSTNAME=one-mast-00

/var/tmp/one/im/kvm-probes.d/host/system/clean_db.rb

/var/tmp/one/im/kvm-probes.d/host/system/linux_host.rb
HYPERVISOR=kvm
TOTALCPU=2400
CPUSPEED=0
TOTALMEMORY=65603232
CGROUPS_VERSION=2

/var/tmp/one/im/kvm-probes.d/host/system/version.sh

/var/tmp/one/im/kvm-probes.d/host/system/pci.rb

/var/tmp/one/im/kvm-probes.d/host/system/memory_encryption.rb
MEMORY_ENCRYPTION=NONE

/var/tmp/one/im/kvm-probes.d/host/system/wild_vm.rb


/var/tmp/one/im/kvm-probes.d/host/system/monitor_ds.rb
DS_LOCATION_USED_MB = 1465381
DS_LOCATION_TOTAL_MB = 5676950
DS_LOCATION_FREE_MB = 3925394
DS = [ ID = 0, USED_MB = 1465381,
TOTAL_MB = 5676950,
FREE_MB = 3925394
]

/var/tmp/one/im/kvm-probes.d/host/system/numa_host.rb
HUGEPAGE = [ NODE_ID = "0", SIZE = "2048", PAGES = "0" ]
HUGEPAGE = [ NODE_ID = "0", SIZE = "1048576", PAGES = "0" ]
CORE = [ NODE_ID = "0", ID = "37", CPUS = "21" ]
CORE = [ NODE_ID = "0", ID = "20", CPUS = "10-11" ]
CORE = [ NODE_ID = "0", ID = "16", CPUS = "8-9" ]
CORE = [ NODE_ID = "0", ID = "12", CPUS = "6-7" ]
CORE = [ NODE_ID = "0", ID = "34", CPUS = "18" ]
CORE = [ NODE_ID = "0", ID = "8", CPUS = "4-5" ]
CORE = [ NODE_ID = "0", ID = "32", CPUS = "16" ]
CORE = [ NODE_ID = "0", ID = "4", CPUS = "2-3" ]
CORE = [ NODE_ID = "0", ID = "28", CPUS = "14-15" ]
CORE = [ NODE_ID = "0", ID = "0", CPUS = "0-1" ]
CORE = [ NODE_ID = "0", ID = "38", CPUS = "22" ]
CORE = [ NODE_ID = "0", ID = "24", CPUS = "12-13" ]
CORE = [ NODE_ID = "0", ID = "36", CPUS = "20" ]
CORE = [ NODE_ID = "0", ID = "20", CPUS = "10-11" ]
CORE = [ NODE_ID = "0", ID = "16", CPUS = "8-9" ]
CORE = [ NODE_ID = "0", ID = "35", CPUS = "19" ]
CORE = [ NODE_ID = "0", ID = "12", CPUS = "6-7" ]
CORE = [ NODE_ID = "0", ID = "33", CPUS = "17" ]
CORE = [ NODE_ID = "0", ID = "8", CPUS = "4-5" ]
CORE = [ NODE_ID = "0", ID = "28", CPUS = "14-15" ]
CORE = [ NODE_ID = "0", ID = "4", CPUS = "2-3" ]
CORE = [ NODE_ID = "0", ID = "39", CPUS = "23" ]
CORE = [ NODE_ID = "0", ID = "24", CPUS = "12-13" ]
CORE = [ NODE_ID = "0", ID = "0", CPUS = "0-1" ]
MEMORY_NODE = [ NODE_ID = "0", TOTAL = "65603232", DISTANCE = "0" ]

/var/tmp/one/im/kvm-probes.d/host/system/machines_models.rb
KVM_MACHINES=“pc-i440fx-jammy ubuntu pc-i440fx-impish-hpb pc-q35-5.2 pc-i440fx-2.12 pc-i440fx-2.0 pc-i440fx-xenial pc-i440fx-6.2 pc pc-q35-4.2 pc-i440fx-2.5 pc-i440fx-4.2 pc-i440fx-focal pc-i440fx-hirsute pc-q35-xenial pc-i440fx-jammy-hpb pc-i440fx-5.2 pc-i440fx-1.5 pc-q35-2.7 pc-q35-eoan-hpb pc-i440fx-zesty pc-i440fx-disco-hpb pc-q35-groovy pc-i440fx-groovy pc-q35-artful pc-i440fx-trusty pc-i440fx-2.2 pc-i440fx-eoan-hpb pc-q35-focal-hpb pc-q35-jammy-maxcpus pc-q35-bionic-hpb pc-i440fx-artful pc-i440fx-2.7 pc-q35-6.1 pc-i440fx-jammy-maxcpus pc-i440fx-yakkety pc-q35-2.4 pc-q35-cosmic-hpb pc-q35-2.10 x-remote pc-i440fx-1.7 pc-q35-5.1 pc-q35-2.9 pc-i440fx-2.11 pc-i440fx-jammy-hpb-maxcpus pc-q35-3.1 pc-i440fx-6.1 pc-q35-4.1 pc-q35-jammy ubuntu-q35 pc-i440fx-2.4 pc-i440fx-4.1 pc-q35-eoan pc-q35-jammy-hpb pc-i440fx-5.1 pc-i440fx-2.9 pc-i440fx-bionic-hpb isapc pc-i440fx-1.4 pc-q35-cosmic pc-q35-2.6 pc-i440fx-3.1 pc-q35-bionic pc-q35-disco-hpb pc-i440fx-cosmic pc-q35-2.12 pc-i440fx-bionic pc-q35-groovy-hpb pc-q35-disco pc-i440fx-cosmic-hpb pc-i440fx-2.1 pc-i440fx-wily pc-q35-impish pc-q35-6.0 pc-i440fx-impish pc-i440fx-2.6 pc-q35-impish-hpb pc-q35-hirsute pc-q35-4.0.1 pc-q35-hirsute-hpb pc-i440fx-1.6 pc-q35-5.0 pc-q35-2.8 pc-i440fx-2.10 pc-q35-3.0 pc-i440fx-6.0 pc-q35-zesty pc-q35-4.0 pc-q35-focal microvm pc-i440fx-2.3 pc-q35-jammy-hpb-maxcpus pc-i440fx-focal-hpb pc-i440fx-disco pc-i440fx-4.0 pc-i440fx-groovy-hpb pc-i440fx-hirsute-hpb pc-i440fx-5.0 pc-i440fx-2.8 pc-q35-6.2 q35 pc-i440fx-eoan pc-q35-2.5 pc-i440fx-3.0 pc-q35-yakkety pc-q35-2.11”
KVM_CPU_MODELS=“qemu32 pentium3 pentium2 pentium n270 kvm64 kvm32 coreduo core2duo Westmere-IBRS Westmere Skylake-Client-noTSX-IBRS SandyBridge-IBRS SandyBridge Penryn Opteron_G1 Nehalem-IBRS Nehalem IvyBridge-IBRS IvyBridge Haswell-noTSX-IBRS Haswell-noTSX Conroe Broadwell-noTSX-IBRS Broadwell-noTSX 486”
KVM_CPU_MODEL="Skylake-Client-noTSX-IBRS"

/var/tmp/one/im/kvm-probes.d/host/beacon/date.sh
1777910412

/var/tmp/one/im/kvm-probes.d/host/beacon/monitord-client-shepherd.sh

I also deleted /var/tmp/one/im/kvm-probes.d and ran the sync command:

linux@one-mast-00:~$ sudo -u oneadmin onehost sync --force

  • Adding one-node-03 to upgrade
    nil versions are discouraged and will be deprecated in Rubygems 4
  • Adding one-node-01 to upgrade
  • Adding one-mast-00 to upgrade
    [========================================] 3/3 one-mast-00
    cannot delete non-empty directory: im/lib/python/pyoneai/__pycache__
    cannot delete non-empty directory: im/lib/python/pyoneai/core/__pycache__
    cannot delete non-empty directory: im/lib/python/pyoneai/core/tsnumpy/__pycache__
    cannot delete non-empty directory: im/lib/python/pyoneai/core/tsnumpy/index/__pycache__
    cannot delete non-empty directory: im/lib/python/pyoneai/ml/__pycache__

Failed to update the following hosts:

  • one-mast-00

The host where the kvm-probes.d directory was deleted is still in ERROR state, and onehost enable 0 fails.
See the errors in monitor.log:

Mon May 4 16:48:56 2026 [Z0][MDP][I]: Error executing monitord-client_control.sh: #<NoMethodError: undefined method `text' for nil:NilClass>
Mon May 4 16:48:56 2026 [Z0][MDP][DD]: [16:48:56] Received START_MONITOR message from host 1:
Error executing monitord-client_control.sh: #<NoMethodError: undefined method `text' for nil:NilClass>
Mon May 4 16:48:56 2026 [Z0][MDP][W]: Start monitor failed for host 1: Error executing monitord-client_control.sh: #<NoMethodError: undefined method `text' for nil:NilClass>
Mon May 4 16:48:56 2026 [Z0][HMM][E]: Unable to monitor host id: 1
Mon May 4 16:49:34 2026 [Z0][HMM][D]: Monitoring host one-mast-00(0)
Mon May 4 16:49:34 2026 [Z0][MDP][I]: Command execution failed (exit code: 23): exec 2>/dev/null; cd '/var/lib/one/remotes'/ && rsync -LRaz --delete . 'one-mast-00':'/var/tmp/one'/
Mon May 4 16:49:34 2026 [Z0][MDP][I]: Received UNDEFINED msg: cannot delete non-empty directory: im/lib/python/pyoneai/__pycache__

Mon May 4 16:49:34 2026 [Z0][MDP][I]: Received UNDEFINED msg: cannot delete non-empty directory: im/lib/python/pyoneai/core/__pycache__

Mon May 4 16:49:34 2026 [Z0][MDP][I]: Received UNDEFINED msg: cannot delete non-empty directory: im/lib/python/pyoneai/core/tsnumpy/__pycache__

Mon May 4 16:49:34 2026 [Z0][MDP][I]: Received UNDEFINED msg: cannot delete non-empty directory: im/lib/python/pyoneai/core/tsnumpy/index/__pycache__

Mon May 4 16:49:34 2026 [Z0][MDP][I]: Received UNDEFINED msg: cannot delete non-empty directory: im/lib/python/pyoneai/ml/__pycache__

Mon May 4 16:49:34 2026 [Z0][MDP][DD]: [16:49:34] Received START_MONITOR message from host 0:
Could not update remotes
Mon May 4 16:49:34 2026 [Z0][MDP][W]: Start monitor failed for host 0: Could not update remotes
Mon May 4 16:49:34 2026 [Z0][HMM][E]: Unable to monitor host id: 0

Hello,

I suspect that the problem is the /var/tmp/one/im/kvm-probes.d/host/monitor/prediction.sh script. It is used by predictive DRS to move resources from one host to another, but in this case it's failing.

On the frontend, you could comment out the last line of the file /var/lib/one/remotes/im/kvm-probes.d/host/monitor/prediction.sh (the line that executes the probe) and run sudo -u oneadmin onehost sync --force. Afterwards, disable and re-enable the host to confirm that this is the problem.

The __pycache__ directories hold Python's compiled bytecode for the pyoneai library, which the prediction probe uses to forecast the host's consumption. You can check with lsof whether any program is holding files open there. If necessary, kill that program and delete the directory.
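
The steps above could be sketched roughly as follows. The two helper functions are made up for illustration and are not part of OpenNebula. The backup is written outside the probes directory on purpose, on the assumption that an executable copy left next to the probes could itself be run as a probe after the next sync.

```shell
#!/bin/sh
# Prefix the last line of a script with '#', keeping a backup elsewhere.
# Assumption: the last line is the one that launches the probe, as
# described above. The backup goes to $2 (default /tmp), NOT next to the
# probe, so it is not synced to the hosts.
comment_out_last_line() {
    script="$1"
    backup_dir="${2:-/tmp}"
    backup="$backup_dir/$(basename "$script").orig"
    cp -p "$script" "$backup" || return 1
    # Rewrite in place via redirection so the file mode is preserved.
    sed '$ s/^/#/' "$backup" > "$script"
}

# List __pycache__ directories below a remotes tree (read-only check).
list_pycache_dirs() {
    find "${1:-/var/tmp/one}" -type d -name '__pycache__' | sort
}

# Intended use on the frontend (commented out; paths as quoted in this thread):
#   comment_out_last_line /var/lib/one/remotes/im/kvm-probes.d/host/monitor/prediction.sh
#   sudo -u oneadmin onehost sync --force
#   onehost disable 0 && onehost enable 0
# On a host that rsync could not clean:
#   list_pycache_dirs /var/tmp/one
#   lsof +D /var/tmp/one/im/lib/python/pyoneai
```

To revert the change, copy the saved .orig file back over prediction.sh and run the sync again.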

Cheers!

Hi Bruno,

You're a star. After following your instructions and then disabling and re-enabling the hosts, they came back online.
I also deleted /var/lib/one/remotes/im/lib/python/pyoneai.

Brilliant.

I still need to check more of the system, deploying VMs etc., but I'm off on holiday for two tomorrow, so it will have to wait.

Kind regards
Dennis

Another possible cause is the PROBES_PERIOD section of the monitord.conf file. Check that it contains a value for EXEC_VM:

PROBES_PERIOD = [
    BEACON_HOST    = 30,
    SYSTEM_HOST    = 600,
    MONITOR_HOST   = 120,
    STATE_VM       = 5,
    EXEC_VM        = 5,
    MONITOR_VM     = 30,
    SYNC_STATE_VM  = 180
]
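
A quick way to check this is to look for each of the keys listed above. This is only a sketch: it assumes the file is at /etc/one/monitord.conf and that each key sits on its own line as shown above. The function name missing_probe_keys is made up.

```shell
#!/bin/sh
# Print every expected PROBES_PERIOD key that is absent from monitord.conf.
# Assumptions: default path /etc/one/monitord.conf, and keys written one
# per line as "KEY = value", as in the block above.
missing_probe_keys() {
    conf="${1:-/etc/one/monitord.conf}"
    for key in BEACON_HOST SYSTEM_HOST MONITOR_HOST STATE_VM EXEC_VM \
               MONITOR_VM SYNC_STATE_VM; do
        # Lines starting with '#' do not match, so commented keys count as missing.
        grep -Eq "^[[:space:]]*${key}[[:space:]]*=" "$conf" || printf '%s\n' "$key"
    done
}

# Example (commented out): missing_probe_keys /etc/one/monitord.conf
```

No output means all seven keys are present. Any key printed is missing or commented out.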

Cheers!