First, we have a pre-production architecture with 1 master and 1 dom0 (KVM only).
I updated from 4.14 to 5.4; all fine, no problems.
MASTER
[root@pre one]# cat /etc/redhat-release
CentOS Linux release 7.2.1511 (Core)
DOM0
[root@compute-11-25 ~]# cat /etc/redhat-release
Scientific Linux release 6.7 (Carbon)
I know there are some problems with the dom0 running version 6.7, but I don't know if there is any solution for that.
Now, the problem.
After the update, some machines that are running switch to POWEROFF (even though the machines keep working fine), and in the log I can find this:
Mon Aug 14 13:50:23 2017 [Z0][LCM][I]: VM running but monitor state is POWEROFF
Mon Aug 14 13:50:23 2017 [Z0][VM][I]: New LCM state is SHUTDOWN_POWEROFF
Mon Aug 14 13:50:23 2017 [Z0][VM][I]: New state is POWEROFF
Mon Aug 14 13:50:23 2017 [Z0][VM][I]: New LCM state is LCM_INIT
In pre-production this is not a real problem; I think it is related to the outdated OS on the dom0... but if this happens in production, we do have a problem...
What is the problem? And is it possible to recover the state back to RUNNING? Because I can't update the dom0 while it has running machines, and I can't stop all the VMs just to update OpenNebula...
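For reference, this is how I am checking one of the affected VMs from the front-end (the VM ID 42 below is just a placeholder, and the log path is the default one):

$ onevm show 42 | grep -i state     # STATE / LCM_STATE as reported by oned
$ tail -n 20 /var/log/one/42.log    # per-VM log showing the transitions pasted above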
Have you issued onehost sync --force to update the remote scripts after the upgrade?
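Something along these lines on the front-end (standard CLI, nothing specific to your setup):

$ onehost sync --force   # push the updated remote scripts to the hypervisors
$ onehost list           # the host should show MONITORED once monitoring succeeds again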
You can try executing the IM probes manually on the remote machine to check whether the VMs are detected, or whether there is any error message. As oneadmin on the hypervisor:
$ cd /var/tmp/one/im/kvm-probes.d
$ ./poll.sh
Do you get info about VMs running there? Any error message?
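If poll.sh prints nothing, it may also help to check whether libvirt itself sees the domains for the oneadmin user (plain libvirt commands, independent of OpenNebula):

$ virsh -c qemu:///system list --all   # the affected domains should appear as "running"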
I’ve tested machines without network just in case and they are monitored correctly. Also the KVM IM configuration seems correct to me. Monitoring should be working or the host wouldn’t be in MONITORED state.
Can you take a look at oned.log? Maybe the problem is in some other probe and oned is not able to parse some of the data.
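For example, something like this on the front-end (default log location; adjust the path if your installation differs):

$ grep -iE 'error|poweroff' /var/log/one/oned.log | tail -n 50   # recent monitoring/parse errors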