'CLEANUP' after VM state is UNKNOWN?

dchebota · March 17, 2015, 12:45pm

Hi,

I’m seeing the following behavior, which I believe changed after I upgraded ONE from 4.8 to 4.10.
With version 4.8, whenever a VM went into the UNKNOWN state it was ‘found again’ on the next monitoring cycle and marked as RUNNING. Here is an example from a VM that was running both before and after the upgrade.
On version 4.8 (state changed UNKNOWN -> RUNNING, no action taken on the VM):

Sat Feb 7 08:12:10 2015 [Z0][LCM][I]: New VM state is UNKNOWN
Sat Feb 7 08:12:35 2015 [Z0][VMM][I]: VM found again, state is RUNNING
Sat Feb 7 09:08:35 2015 [Z0][LCM][I]: New VM state is UNKNOWN
Sat Feb 7 09:09:00 2015 [Z0][VMM][I]: VM found again, state is RUNNING
Sat Feb 7 10:05:05 2015 [Z0][LCM][I]: New VM state is UNKNOWN
Sat Feb 7 10:05:30 2015 [Z0][VMM][I]: VM found again, state is RUNNING
Sat Feb 7 11:01:30 2015 [Z0][LCM][I]: New VM state is UNKNOWN
Sat Feb 7 11:02:00 2015 [Z0][VMM][I]: VM found again, state is RUNNING
Sat Feb 7 11:58:05 2015 [Z0][LCM][I]: New VM state is UNKNOWN
Sat Feb 7 11:58:30 2015 [Z0][VMM][I]: VM found again, state is RUNNING

On version 4.10, when a VM goes UNKNOWN and then RUNNING again, the CLEANUP process kicks in 90 seconds after ‘VM found again, state is RUNNING’. I checked many VMs and see the same behavior: UNKNOWN -> RUNNING -> CLEANUP.

Thu Mar 5 22:04:13 2015 [Z0][LCM][I]: New VM state is RUNNING
Thu Mar 5 22:58:07 2015 [Z0][LCM][I]: New VM state is UNKNOWN
Thu Mar 5 22:58:38 2015 [Z0][VMM][I]: VM found again, state is RUNNING
Thu Mar 5 23:00:08 2015 [Z0][LCM][I]: New VM state is CLEANUP.
Thu Mar 5 23:00:08 2015 [Z0][VMM][I]: Driver command for 54290 cancelled
Thu Mar 5 23:00:17 2015 [Z0][VMM][I]: error: failed to get domain 'one-54290'
Thu Mar 5 23:00:17 2015 [Z0][VMM][I]: error: Domain not found: no domain with matching name 'one-54290'
Thu Mar 5 23:00:17 2015 [Z0][VMM][I]: ExitCode: 0
Thu Mar 5 23:00:17 2015 [Z0][VMM][I]: Successfully execute virtualization driver operation: cancel.
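
To check many VM logs for the same UNKNOWN -> RUNNING -> CLEANUP sequence, something like the following could help. It is only a rough sketch: it assumes every line follows the timestamp and `[Z0][MODULE][LEVEL]:` layout shown in the excerpt above, and the function names and the 120-second window are made up for illustration, not part of OpenNebula.

```python
import re
from datetime import datetime

# Assumed layout, taken from the excerpt above:
#   "Thu Mar 5 22:58:38 2015 [Z0][VMM][I]: VM found again, state is RUNNING"
LINE_RE = re.compile(
    r"^(\w{3}\s+\w{3}\s+\d{1,2}\s+\d{2}:\d{2}:\d{2}\s+\d{4})\s+"
    r"\[Z\d+\]\[(\w+)\]\[(\w)\]:\s+(.*)$"
)


def parse_line(line):
    """Return (timestamp, module, message), or None if the line does not match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    stamp, module, _level, message = match.groups()
    # Collapse repeated spaces so single-digit days parse consistently.
    ts = datetime.strptime(" ".join(stamp.split()), "%a %b %d %H:%M:%S %Y")
    return ts, module, message


def find_cleanup_after_recovery(lines, window_seconds=120):
    """List (found_again_ts, cleanup_ts, seconds) for each CLEANUP that
    follows a 'VM found again' message within window_seconds."""
    hits = []
    last_found = None
    for line in lines:
        parsed = parse_line(line)
        if parsed is None:
            continue
        ts, module, message = parsed
        if module == "VMM" and message.startswith("VM found again"):
            last_found = ts
        elif module == "LCM" and message.startswith("New VM state is CLEANUP"):
            if last_found is not None:
                delta = (ts - last_found).total_seconds()
                if 0 <= delta <= window_seconds:
                    hits.append((last_found, ts, delta))
            last_found = None
    return hits


SAMPLE = """\
Thu Mar 5 22:04:13 2015 [Z0][LCM][I]: New VM state is RUNNING
Thu Mar 5 22:58:07 2015 [Z0][LCM][I]: New VM state is UNKNOWN
Thu Mar 5 22:58:38 2015 [Z0][VMM][I]: VM found again, state is RUNNING
Thu Mar 5 23:00:08 2015 [Z0][LCM][I]: New VM state is CLEANUP.
"""

if __name__ == "__main__":
    for found, cleanup, delta in find_cleanup_after_recovery(SAMPLE.splitlines()):
        print(f"found again {found}, CLEANUP {cleanup}, gap {delta:.0f}s")
```

Run against the 4.10 excerpt it flags the one CLEANUP, 90 seconds after the VM was found again; the 4.8 excerpt has no CLEANUP lines, so it yields nothing. To scan a real log, pass the open file object instead of `SAMPLE.splitlines()`.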

I checked oned.conf for any VM_HOOK; there is none for UNKNOWN. Just to be sure, I removed the VM_HOOK on FAILED, but it didn’t help. Is there a default VM_HOOK for UNKNOWN? It almost looks like ONE runs ‘delete --recreate’ on the UNKNOWN state. Should I define a VM_HOOK on UNKNOWN that does nothing? Please help.

Thank you.

ruben · March 17, 2015, 5:03pm

Hi

No, there is no default hook for UNKNOWN.

It seems that the cleanup is triggered by a resubmit action on the VM from
an external program, probably a hook. There should be more info in oned.log
about triggered hooks…

Cheers

dchebota · March 27, 2015, 5:25pm

Thank you.

I’ve made a few changes in the oned.conf file:

- Switched HOST/VM monitoring from UDP_PUSH to TCP_PULL. For whatever reason UDP_PUSH gives too many monitoring errors, and I believe this is what triggers UNKNOWN for VMs in the first place. I don’t know whether TCP_PULL will do better under the same workload and number of VMs…
- Changed the arguments in the HOST_HOOK definition from "$ID -r -f -p 2" to the default "$ID -r".

I will keep monitoring the VM logs and the oned.log file…

Thank you.
