Problems with "terminate VM" if the VM is an LXC container

Hello,

My OpenNebula environment has two KVM hypervisors and one LXC hypervisor. When I instantiate an LXC image, it runs OK, but when I terminate it, I get errors that prevent the “deletion” process from completing. The log reports the following:

[…]

Mon Oct 13 09:33:26 2025 [Z0][VMM][I]: Running command sudo lxc-info -n ‘one-5886’ -s
Mon Oct 13 09:33:26 2025 [Z0][VMM][I]: Running command sudo lxc-info -n ‘one-5886’ -s
Mon Oct 13 09:33:26 2025 [Z0][VMM][I]: ExitCode: 0
Mon Oct 13 09:33:26 2025 [Z0][VMM][I]: Successfully execute virtualization driver operation: deploy.
Mon Oct 13 09:33:26 2025 [Z0][VMM][I]: ExitCode: 0
Mon Oct 13 09:33:26 2025 [Z0][VMM][I]: Successfully execute network driver operation: post.
Mon Oct 13 09:33:26 2025 [Z0][VM][I]: New LCM state is RUNNING
Mon Oct 13 09:33:32 2025 [Z0][LCM][I]: VM running but monitor state is POWEROFF
Mon Oct 13 09:33:32 2025 [Z0][VM][I]: New LCM state is SHUTDOWN_POWEROFF
Mon Oct 13 09:33:32 2025 [Z0][VM][I]: New state is POWEROFF
Mon Oct 13 09:33:32 2025 [Z0][VM][I]: New LCM state is LCM_INIT
Mon Oct 13 09:34:39 2025 [Z0][VM][I]: New state is ACTIVE
Mon Oct 13 09:34:39 2025 [Z0][VM][I]: New LCM state is EPILOG
Mon Oct 13 09:34:53 2025 [Z0][TrM][I]: Command execution failed (exit code: 1): /var/lib/one/remotes/tm/qcow2/delete nebula-4:/var/lib/one//datastores/0/5886 5886 0
Mon Oct 13 09:34:53 2025 [Z0][TrM][I]: delete: Deleting /var/lib/one/datastores/0/5886
Mon Oct 13 09:34:53 2025 [Z0][TrM][E]: delete: Command “[ -e “/var/lib/one/datastores/0/5886” ] || exit 0
Mon Oct 13 09:34:53 2025 [Z0][TrM][I]:
Mon Oct 13 09:34:53 2025 [Z0][TrM][I]: times=10
Mon Oct 13 09:34:53 2025 [Z0][TrM][I]: function=“rm -rf /var/lib/one/datastores/0/5886”
Mon Oct 13 09:34:53 2025 [Z0][TrM][I]:
Mon Oct 13 09:34:53 2025 [Z0][TrM][I]: count=1
Mon Oct 13 09:34:53 2025 [Z0][TrM][I]:
Mon Oct 13 09:34:53 2025 [Z0][TrM][I]: ret=$($function)
Mon Oct 13 09:34:53 2025 [Z0][TrM][I]: error=$?
Mon Oct 13 09:34:53 2025 [Z0][TrM][I]:
Mon Oct 13 09:34:53 2025 [Z0][TrM][I]: while [ $count -lt $times -a “$error” != “0” ]; do
Mon Oct 13 09:34:53 2025 [Z0][TrM][I]: sleep 1
Mon Oct 13 09:34:53 2025 [Z0][TrM][I]: count=$(( $count + 1 ))
Mon Oct 13 09:34:53 2025 [Z0][TrM][I]: ret=$($function)
Mon Oct 13 09:34:53 2025 [Z0][TrM][I]: error=$?
Mon Oct 13 09:34:53 2025 [Z0][TrM][I]: done
Mon Oct 13 09:34:53 2025 [Z0][TrM][I]:
Mon Oct 13 09:34:53 2025 [Z0][TrM][I]: [ “x$error” = “x0” ]” failed: rm: cannot remove ‘/var/lib/one/datastores/0/5886/mapper/disk.0/run/adduser’: Permission denied
Mon Oct 13 09:34:53 2025 [Z0][TrM][I]: rm: cannot remove ‘/var/lib/one/datastores/0/5886/mapper/disk.0/bin.usr-is-merged’: Permission denied
Mon Oct 13 09:34:53 2025 [Z0][TrM][I]: rm: cannot remove ‘/var/lib/one/datastores/0/5886/mapper/disk.0/opt’: Permission denied
Mon Oct 13 09:34:53 2025 [Z0][TrM][I]: rm: cannot remove ‘/var/lib/one/datastores/0/5886/mapper/disk.0/lost+found’: Permission denied
Mon Oct 13 09:34:53 2025 [Z0][TrM][I]: rm: cannot remove ‘/var/lib/one/datastores/0/5886/mapper/disk.0/sbin’: Permission denied
Mon Oct 13 09:34:53 2025 [Z0][TrM][I]: rm: cannot remove ‘/var/lib/one/datastores/0/5886/mapper/disk.0/usr/local/share/ca-certificates’: Permission denied
Mon Oct 13 09:34:53 2025 [Z0][TrM][I]: rm: cannot remove ‘/var/lib/one/datastores/0/5886/mapper/disk.0/usr/local/share/man’: Permission denied
Mon Oct 13 09:34:53 2025 [Z0][TrM][I]: rm: cannot remove ‘/var/lib/one/datastores/0/5886/mapper/disk.0/usr/local/games’: Permission denied
Mon Oct 13 09:34:53 2025 [Z0][TrM][I]: rm: cannot remove ‘/var/lib/one/datastores/0/5886/mapper/disk.0/usr/local/src’: Permission denied
Mon Oct 13 09:34:53 2025 [Z0][TrM][I]: rm: cannot remove ‘/var/lib/one/datastores/0/5886/mapper/disk.0/usr/local/sbin’: Permission denied
Mon Oct 13 09:34:53 2025 [Z0][TrM][I]: rm: cannot remove ‘/var/lib/one/datastores/0/5886/mapper/disk.0/usr/local/bin’: Permission denied
Mon Oct 13 09:34:53 2025 [Z0][TrM][I]: rm: cannot remove ‘/var/lib/one/datastores/0/5886/mapper/disk.0/usr/local/lib/python3.12/dist-packages’: Permission denied

[…]

These errors are reported in /var/log/one/VMID.log on the OpenNebula scheduler/front-end server, while the LXC hypervisor node logs nothing relevant to the error:

  • /var/log/auth.log:

[…]
Oct 13 09:56:36 nebulas-4 sudo: oneadmin : PWD=/var/tmp/one/im/lxc.d ; USER=root ; COMMAND=/usr/bin/lxc-info -n one-5886 -H
Oct 13 09:56:37 nebula-4 sudo: oneadmin : PWD=/var/tmp/one/im/lxc.d ; USER=root ; COMMAND=/usr/bin/lxc-info -n one-5886 -H
Oct 13 09:56:37 nebula-4 sudo: oneadmin : PWD=/var/tmp/one/im/lxc.d ; USER=root ; COMMAND=/usr/bin/lxc-info -n one-5886 -H
Oct 13 09:56:42 nebula-4 sudo: oneadmin : PWD=/var/tmp/one/im/lxc.d ; USER=root ; COMMAND=/usr/bin/lxc-info -n one-5886 -H
[…]

  • /var/log/kern.log:

[…]
Oct 13 09:33:25 nebula-4 kernel: [14430989.675487] br0: port 2(one-5886-0) entered blocking state
Oct 13 09:33:25 nebula-4 kernel: [14430989.675492] br0: port 2(one-5886-0) entered disabled state
Oct 13 09:33:25 nebula-4 kernel: [14430989.675574] device one-5886-0 entered promiscuous mode
Oct 13 09:33:26 nebula-4 kernel: [14430989.800301] br0: port 2(one-5886-0) entered disabled state
Oct 13 09:33:26 nebula-4 kernel: [14430989.800502] device one-5886-0 left promiscuous mode
Oct 13 09:33:26 nebula-4 kernel: [14430989.800507] br0: port 2(one-5886-0) entered disabled state
Oct 13 09:41:30 nebula-4 kernel: [14431474.758864] loop15: detected capacity change from 0 to 728
[…]

  • /var/log/syslog:

[…]
Oct 13 09:33:25 nebula-4 systemd-networkd[2963]: one-5886-0: Link UP
Oct 13 09:33:25 nebula-4 kernel: [14430989.675487] br0: port 2(one-5886-0) entered blocking state
Oct 13 09:33:25 nebula-4 kernel: [14430989.675492] br0: port 2(one-5886-0) entered disabled state
Oct 13 09:33:25 nebula-4 kernel: [14430989.675574] device one-5886-0 entered promiscuous mode
Oct 13 09:33:25 nebula-4 systemd-networkd[2963]: one-5886-0: Link DOWN
Oct 13 09:33:26 nebula-4 kernel: [14430989.800301] br0: port 2(one-5886-0) entered disabled state
Oct 13 09:33:26 nebula-4 kernel: [14430989.800502] device one-5886-0 left promiscuous mode
Oct 13 09:33:26 nebula-4 kernel: [14430989.800507] br0: port 2(one-5886-0) entered disabled state
Oct 13 09:41:30 nebula-4 kernel: [14431474.758864] loop15: detected capacity change from 0 to 728
Oct 13 09:46:25 nebula-4 systemd[1]: var-lib-lxc\x2done-5886-disk.1.mount: Deactivated successfully.
Oct 13 09:46:27 nebula-4 systemd[1]: var-lib-lxc\x2done-5886-disk.0.mount: Deactivated successfully.
[…]

  • /var/lib/one/datastores/0/5886:

root@nebula-4:/var/lib# ls -hlart one/datastores/0/5886/
total 0
drwxrwxr-x 4 oneadmin oneadmin 46 oct 13 09:33 mapper
drwxrwxr-x 3 oneadmin oneadmin 28 oct 13 09:34 .
drwxrwxr-x 10 oneadmin oneadmin 138 oct 13 09:42 ..
root@nebula-4:/var/lib# ls -hlart one/datastores/0/5886/mapper/
total 6,0K
drwxr-xr-x 2 oneadmin oneadmin 2,0K oct 13 09:33 disk.1
drwxrwxr-x 4 oneadmin oneadmin 46 oct 13 09:33 .
drwxr-xr-x 22 root root 4,0K oct 13 09:33 disk.0
drwxrwxr-x 3 oneadmin oneadmin 28 oct 13 09:34 ..

  • /var/lib/lxc/one-5886:

root@nebula-4:/var/lib# ls -lhart lxc/one-5886
total 8,0K
-rw-r----- 1 root root 725 oct 13 09:33 config
drwxrwx--- 2 600100001 600100001 28 oct 13 09:33 .
drwx-----x 17 root root 4,0K oct 13 09:49 ..

  • /var/lib/lxc-one/5886:

root@nebula-4:/var/lib# ls -lhart lxc-one/5886
total 0
drwxrwxr-x 2 oneadmin oneadmin 10 oct 13 09:33 disk.0
drwxrwxr-x 2 oneadmin oneadmin 10 oct 13 09:33 disk.1
drwxrwxr-x 4 oneadmin oneadmin 46 oct 13 09:33 .
drwxr-x--x 9 oneadmin oneadmin 122 oct 13 09:49 ..

After that, the VM information in Sunstone shows this error:

Driver Error
Mon Oct 13 09:55:45 2025: Error executing image transfer script: INFO: delete: Deleting /var/lib/one/datastores/0/5890 ERROR: … see more details in VM log

The VM remains in the system in FAILURE state, and all of its mount points remain on the LXC hypervisor:

/var/lib/one/datastores/0/5886/mapper/disk.0 on /var/lib/lxc-one/5886/disk.0 type fuse (rw,nodev,relatime,user_id=0,group_id=0,default_permissions,allow_other)
/var/lib/one/datastores/0/5886/mapper/disk.1 on /var/lib/lxc-one/5886/disk.1 type fuse (rw,nodev,relatime,user_id=0,group_id=0,default_permissions,allow_other)

Where is the misconfiguration in my OpenNebula environment? The LXC and KVM hypervisors mount datastores 1 and 2 via NFS, so only datastore 0 (where VMs are created) is local to the system.
With this configuration, KVM VMs run perfectly (normally 300 at the same time). The problem occurs only with LXC VMs.

Thanks.

The problem stems from a race condition, which should be fixed in 7.0.1. The container is supposed to be RUNNING, but during the deployment phase the monitoring reported it as POWEROFF because the deploy logic had not yet finished starting the container. As a consequence, when you terminate a container that OpenNebula considers powered off, the unmounting and unmapping logic is not run, which leads to the errors you see.
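
Since the unmount step was skipped, the leftover mounts have to be released by hand before the datastore directory can be removed. A minimal sketch of a dry-run helper for this follows. The function name `list_vm_umounts` is hypothetical, and it assumes mount paths without spaces and that every mount point of the VM sits under a directory named after the VM ID, as in the `mount` output above. It only prints `umount` commands, in reverse mount order, and never runs them.

```shell
#!/bin/sh
# Dry-run helper (hypothetical name): reads `mount`-style lines on stdin,
#   SRC on TARGET type FSTYPE (OPTS)
# and prints one "sudo umount TARGET" per mount point that belongs to the VM.
# Assumptions: no spaces in paths; the VM's mount points contain "/<vmid>/".
list_vm_umounts() {
    awk -v id="$1" '
        # Collect targets whose path contains the VM ID as a full component
        $2 == "on" && $4 == "type" && index($3, "/" id "/") > 0 {
            targets[++n] = $3
        }
        # Print in reverse order, so the most recent mount is released first
        END {
            for (i = n; i >= 1; i--) print "sudo umount " targets[i]
        }
    '
}
```

A possible use is `mount | list_vm_umounts 5886`, reviewing the printed commands before running any of them. This does not cover the unmapping of the underlying disk device, which depends on how the image was mapped. Once the mounts are gone, retrying the failed operation with `onevm recover 5886 --retry` may let the delete finish, though that has not been verified for this case.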

Hi,

My OpenNebula environment is running version 6.8.0-1, and upgrading is IMPOSSIBLE for me because the systems are in continuous use by university students. Would it be possible to apply a patch?

Thanks.

6.8 has been EOL for a long time now. If it is a must, you can take a look at the commit diff and apply it to the 6.8 branch in order to build a custom binary. Since the patch is C++ code, you'll need to replace oned with the custom-built oned.

I can modify the source files src/im/InformationManager.cc and src/lcm/LifeCycleStates.cc, but how can I compile the whole source tree to get a new “oned*”?

Thanks.
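
For reference, here is a rough sketch of the steps suggested above: check out the 6.8 sources, apply the fix, build with scons, and swap in the new oned. Almost everything in it is an assumption to verify locally: the tag name `release-6.8.0`, the build output path `src/nebula/oned`, the installed path `/usr/bin/oned`, and the `opennebula` service name. The fix commit is not identified in this thread, so `FIX_COMMIT` is left as a placeholder. The function name `build_patched_oned` is hypothetical, and by default it only prints the commands.

```shell
#!/bin/sh
# Sketch only: prints the build steps unless DRY_RUN=0 is set.
# Tag, paths and service name below are assumptions, not taken from the thread.
build_patched_oned() {
    tag="${ONE_TAG:-release-6.8.0}"         # assumed upstream tag for 6.8.0
    commit="${FIX_COMMIT:-<fix-commit>}"    # placeholder: commit not given here
    dry="${DRY_RUN:-1}"

    # Refuse a real run while the commit is still the placeholder
    if [ "$dry" != "1" ] && [ "$commit" = "<fix-commit>" ]; then
        echo "set FIX_COMMIT to the real commit first" >&2
        return 1
    fi

    # Either print the command or execute it, stopping at the first failure
    run() {
        if [ "$dry" = "1" ]; then printf '+ %s\n' "$*"; else "$@" || exit 1; fi
    }

    run git clone https://github.com/OpenNebula/one.git one-src
    run cd one-src
    run git checkout -b one-6.8-patched "$tag"
    run git cherry-pick -x "$commit"        # may need manual conflict fixes
    run scons -j2                           # add the build options your setup needs
    run sudo systemctl stop opennebula
    run sudo cp -a /usr/bin/oned /usr/bin/oned.orig   # keep the original binary
    run sudo install -m 0755 src/nebula/oned /usr/bin/oned
    run sudo systemctl start opennebula
}
```

Calling `build_patched_oned` with no variables set just lists the commands. The build itself also needs the OpenNebula build dependencies installed, and because the fix targets 7.0.1, the cherry-pick may not apply cleanly to 6.8 and could need adapting by hand.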