For rollback's sake I prepared a brand new ONE FE 6.0, restored the production DB into it, and upgraded it step by step to 6.4 without problems. After that I am no longer able to see VM logs: they are empty, and monitor.log is full of:
[Z0][MDP][W]: Failed to monitor VM state for host X: Error executing state.rb: database is locked.
All nodes (KVM) are updated to 6.4 and all hosts are force-synced. The FE is in stand-alone mode, OS Ubuntu 20.04.
“virsh …” can be called by oneadmin.
There is an SQLite database on the virtualization node; the monitoring information is cached there. There might be an issue with the DB file itself. It should look like this:
oneadmin@ubuntu2004-kvm-qcow2-6-4-rpZvf-2:~$ sqlite3 /var/tmp/one/im/status_kvm_1.db 'select * from states'
3320236b-56c8-4481-bd98-10602c2806bb|62|one-62|3320236b-56c8-4481-bd98-10602c2806bb|1654197356|0|RUNNING|kvm
having one row per VM. You can delete this database on the host and the monitoring service should recreate it on its own.
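A minimal reset sketch, assuming the same file name as above (the status_kvm_1.db name may differ on your host), run as oneadmin on the hypervisor node:

# Inspect the cached VM states; one row per VM is expected
sqlite3 /var/tmp/one/im/status_kvm_1.db 'select * from states'

# Remove the cache; the monitoring probes should recreate it on their own
rm -f /var/tmp/one/im/status_kvm_1.db

# A few seconds later the file should exist again
ls -l /var/tmp/one/im/status_kvm_1.db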
I can confirm that there is such a database: it can be read, contains one VM per row, and when I delete it, monitoring re-creates it immediately.
Unfortunately the result stays the same: still the same error message, with one change. Right after deleting the DB file, I got:
Failed to monitor VM state for host X: Error executing state.rb: attempt to write a readonly database
for a few moments, and then the “database is locked” error again.
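Both errors pointed at some other process holding the DB file open, or at wrong ownership. A quick check one can run on the hypervisor node (assuming lsof is installed; fuser -v works as well):

# Which processes currently have the DB file open?
lsof /var/tmp/one/im/status_kvm_1.db

# The probes run as oneadmin, so the file and its directory
# must be owned by and writable for oneadmin
ls -l /var/tmp/one/im/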
There were overlooked monitoring processes on the hosts accessing the same DB, but with a bad ONE FE address (a leftover from the old FE HA cluster). After killing all their process groups, the error messages stopped coming.
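For anyone hitting the same issue, this is roughly how the stale processes can be hunted down (the pgrep pattern is an assumption; adjust it to whatever your probe processes look like):

# List everything running out of the probe directory; stale
# monitor processes pointing at the old FE show up here too
pgrep -af '/var/tmp/one'

# Kill a whole process group so the respawning children die with
# the parent: look up the PGID of a stale process, then signal
# the negative PGID
PID=12345   # example: a stale probe PID found above
PGID=$(ps -o pgid= -p "$PID" | tr -d ' ')
kill -- "-$PGID"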
What steps have you performed? As a summary of this thread I recommend the following:
Delete the SQLite database located on the Hypervisor node in /var/tmp/one (you can delete the entire folder if you wish; this folder will be recreated at sync time).
Make sure to kill all existing monitor processes on the Hypervisor node. Multiple processes running at the same time may be due to a recent upgrade or an improper restart of the service. A combined sketch of both steps follows below.
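Roughly, and assuming default paths (the pkill pattern is an assumption; double-check the onehost invocation against your version's CLI):

# On the hypervisor node: stop any running probe processes
pkill -f '/var/tmp/one'

# Remove the probe directory, including the SQLite cache;
# it is recreated at sync time
rm -rf /var/tmp/one

# On the Front-end: push the probes back to the host
# (replace 0 with your host ID or name from `onehost list`)
onehost sync 0 --force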
Hi, indeed, right after deleting the sqlite3 file it gets recreated.
I tried restarting OpenNebula and the running processes many times; the result stayed the same. I actually pinpointed that the state.rb probe is the one causing the errors in oned.log, but I don’t know why.
By “multiple running processes at the same time”, do you mean I get the “database is locked” error because my processes are trying to read from the database simultaneously?
How can I kill the state.rb process, when something starts the state.rb script as a new process every 3-4 seconds?