Hello,
I’ve installed OpenNebula 5.12.0.3 together with a Ceph cluster for personal use, and I am facing a confusing issue.
- Created ceph data stores (image:103 and system:102)
- Downloaded a kvm image from MarketPlaces into the image datastore
- Tried to instantiate the associated template into the ceph_system data store - no success
- To narrow down the issue, tried to instantiate into system data store - no success
VM log:
Sun Oct 25 12:13:55 2020 [Z0][VM][I]: New state is ACTIVE
Sun Oct 25 12:13:55 2020 [Z0][VM][I]: New LCM state is PROLOG
Sun Oct 25 12:14:13 2020 [Z0][VM][I]: New LCM state is BOOT
Sun Oct 25 12:14:13 2020 [Z0][VMM][I]: Generating deployment file: /var/lib/one/vms/16/deployment.0
Sun Oct 25 12:14:17 2020 [Z0][VMM][I]: Successfully execute transfer manager driver operation: tm_context.
Sun Oct 25 12:14:19 2020 [Z0][VMM][I]: ExitCode: 0
Sun Oct 25 12:14:19 2020 [Z0][VMM][I]: Successfully execute network driver operation: pre.
Sun Oct 25 12:14:20 2020 [Z0][VMM][I]: Command execution fail: cat << EOT | /var/tmp/one/vmm/kvm/deploy '/var/lib/one//datastores/0/16/deployment.0' 'node8-kvm' 16 node8-kvm
Sun Oct 25 12:14:20 2020 [Z0][VMM][I]: error: Failed to create domain from /var/lib/one//datastores/0/16/deployment.0
Sun Oct 25 12:14:20 2020 [Z0][VMM][I]: error: internal error: process exited while connecting to monitor: 2020-10-25T11:14:20.216694Z qemu-system-x86_64: -drive file=rbd:one/one-3-16-0:id=libvirt:auth_supported=cephx;none:mon_host=10.20.0.11:6789;10.20.0.12:6789;10.20.0.13:6789,file.password-secret=virtio-disk0-secret0,format=raw,if=none,id=drive-virtio-disk0: error connecting: Permission denied
Sun Oct 25 12:14:20 2020 [Z0][VMM][E]: Could not create domain from /var/lib/one//datastores/0/16/deployment.0
Sun Oct 25 12:14:20 2020 [Z0][VMM][I]: ExitCode: 255
Sun Oct 25 12:14:20 2020 [Z0][VMM][I]: Failed to execute virtualization driver operation: deploy.
Sun Oct 25 12:14:20 2020 [Z0][VMM][E]: Error deploying virtual machine: Could not create domain from /var/lib/one//datastores/0/16/deployment.0
Sun Oct 25 12:14:20 2020 [Z0][VM][I]: New LCM state is BOOT_FAILURE
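To make the failing drive easier to read, here is a minimal sketch that pulls the pool, image and Ceph client id out of the `-drive` argument in the error above. The variable names are only illustrative; the drive string is copied from the log.

```shell
# Drive specification copied from the qemu error line in the VM log.
drive='file=rbd:one/one-3-16-0:id=libvirt:auth_supported=cephx;none:mon_host=10.20.0.11:6789;10.20.0.12:6789;10.20.0.13:6789,file.password-secret=virtio-disk0-secret0,format=raw,if=none,id=drive-virtio-disk0'

# Pool and image sit between "rbd:" and the next ":".
rbd_spec=$(printf '%s\n' "$drive" | sed -n 's/^file=rbd:\([^:]*\):.*/\1/p')
pool=${rbd_spec%%/*}
image=${rbd_spec#*/}

# The Ceph client id follows "rbd:<pool>/<image>:id=".
ceph_id=$(printf '%s\n' "$drive" | sed -n 's/^file=rbd:[^:]*:id=\([^:]*\):.*/\1/p')

echo "pool=$pool image=$image ceph_id=$ceph_id"
```

Read this way, the drive that fails with "Permission denied" is the RBD image `one/one-3-16-0`, opened with the Ceph id `libvirt`.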
I noticed that OpenNebula created some files on the compute node:
oneadmin@node8:~/datastores/0/16$ ls -l
total 1163636
-rw-rw-r-- 1 oneadmin oneadmin 1776 Oct 25 12:14 deployment.0
-rw-r--r-- 1 oneadmin oneadmin 2361393152 Oct 25 12:14 disk.0
-rw-r--r-- 1 oneadmin oneadmin 372736 Oct 25 12:14 disk.1
oneadmin@node8:~/datastores/0/16$ qemu-img info disk.0
image: disk.0
file format: raw
virtual size: 2.2G (2361393152 bytes)
disk size: 1.1G
oneadmin@node8:~/datastores/0/16$ qemu-img info disk.1
image: disk.1
file format: raw
virtual size: 364K (372736 bytes)
disk size: 364K
oneadmin@node8:~/datastores/0/16$ file -L -s disk.0
disk.0: DOS/MBR boot sector, extended partition table (last)
oneadmin@node8:~/datastores/0/16$ file -L -s disk.1
disk.1: ISO 9660 CD-ROM filesystem data 'CONTEXT'
Tried to create the VM by hand:
oneadmin@node8:~/datastores/0/16$ virsh -c qemu:///system create deployment.0
error: Failed to create domain from deployment.0
error: internal error: process exited while connecting to monitor: 2020-10-25T11:18:19.784321Z qemu-system-x86_64: -drive file=rbd:one/one-3-16-0:id=libvirt:auth_supported=cephx;none:mon_host=10.20.0.11:6789;10.20.0.12:6789;10.20.0.13:6789,file.password-secret=virtio-disk0-secret0,format=raw,if=none,id=drive-virtio-disk0: error connecting: Permission denied
Noticing that only the owner was allowed to write to disk.0, I changed the file permissions by hand:
oneadmin@node8:~/datastores/0/16$ ls -l
total 1163636
-rw-rw-r-- 1 oneadmin oneadmin 1776 Oct 25 12:14 deployment.0
-rw-rw-rw- 1 oneadmin oneadmin 2361393152 Oct 25 12:14 disk.0
-rw-r--r-- 1 oneadmin oneadmin 372736 Oct 25 12:14 disk.1
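For reference, a minimal sketch of an equivalent permission change, run on a scratch file rather than on disk.0. The exact command used is not recorded in this post, so `chmod a+w` is an assumption; `stat -c` assumes GNU coreutils.

```shell
# Scratch file standing in for disk.0 (the real file is not touched here).
f=$(mktemp)

# Start from the mode OpenNebula left on disk.0: owner read/write, others read-only.
chmod 644 "$f"
before=$(stat -c '%A' "$f")

# Grant write access to group and others as well.
chmod a+w "$f"
after=$(stat -c '%A' "$f")

echo "$before -> $after"
rm -f "$f"
```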
Retrying the VM creation, either by hand or via the OpenNebula GUI (Recover->Retry), produced the same result as above.
oneadmin@node8:~/datastores/0/16$ virsh -c qemu:///system create deployment.0
error: Failed to create domain from deployment.0
error: internal error: process exited while connecting to monitor: 2020-10-25T11:20:17.967780Z qemu-system-x86_64: -drive file=rbd:one/one-3-16-0:id=libvirt:auth_supported=cephx;none:mon_host=10.20.0.11:6789;10.20.0.12:6789;10.20.0.13:6789,file.password-secret=virtio-disk0-secret0,format=raw,if=none,id=drive-virtio-disk0: error connecting: Permission denied
oneadmin@node8:~/datastores/0/16$ virsh list
Id Name State
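Since the error comes from the RBD drive, one possible next check is whether the Ceph identity named in the `-drive` argument can open the image from this node at all. This is only a sketch: the pool, image and id are taken from the error message, the function names are made up for illustration, and it assumes the `rbd` and `virsh` client tools and a keyring for `client.libvirt` are present on the node.

```shell
# Values taken from the -drive argument in the error message.
CEPH_ID=libvirt
POOL=one
IMAGE=one-3-16-0

# Try to read the image metadata with the same Ceph identity QEMU uses.
check_rbd_access() {
    rbd --id "$CEPH_ID" info "$POOL/$IMAGE"
}

# List the secrets libvirt knows about on this node.
list_libvirt_secrets() {
    virsh -c qemu:///system secret-list
}

# Only run the checks where the client tools are installed.
if command -v rbd >/dev/null 2>&1; then
    check_rbd_access || echo "rbd access as client.$CEPH_ID failed"
fi
if command -v virsh >/dev/null 2>&1; then
    list_libvirt_secrets || echo "could not list libvirt secrets"
fi
```

If the first check fails with the same "Permission denied", the problem would be on the Ceph authentication side rather than in the local files under the datastore directory.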
Retry (Recover->Retry) in the OpenNebula GUI:
Sun Oct 25 12:14:20 2020 [Z0][VM][I]: New LCM state is BOOT_FAILURE
Sun Oct 25 12:22:11 2020 [Z0][VM][I]: New LCM state is BOOT
Sun Oct 25 12:22:11 2020 [Z0][VMM][I]: Generating deployment file: /var/lib/one/vms/16/deployment.0
Sun Oct 25 12:22:15 2020 [Z0][VMM][I]: Successfully execute transfer manager driver operation: tm_context.
Sun Oct 25 12:22:18 2020 [Z0][VMM][I]: ExitCode: 0
Sun Oct 25 12:22:18 2020 [Z0][VMM][I]: Successfully execute network driver operation: pre.
Sun Oct 25 12:22:18 2020 [Z0][VMM][I]: Command execution fail: cat << EOT | /var/tmp/one/vmm/kvm/deploy '/var/lib/one//datastores/0/16/deployment.0' 'node8-kvm' 16 node8-kvm
Sun Oct 25 12:22:18 2020 [Z0][VMM][I]: error: Failed to create domain from /var/lib/one//datastores/0/16/deployment.0
Sun Oct 25 12:22:18 2020 [Z0][VMM][I]: error: internal error: process exited while connecting to monitor: 2020-10-25T11:22:18.687631Z qemu-system-x86_64: -drive file=rbd:one/one-3-16-0:id=libvirt:auth_supported=cephx;none:mon_host=10.20.0.11:6789;10.20.0.12:6789;10.20.0.13:6789,file.password-secret=virtio-disk0-secret0,format=raw,if=none,id=drive-virtio-disk0: error connecting: Permission denied
Sun Oct 25 12:22:18 2020 [Z0][VMM][E]: Could not create domain from /var/lib/one//datastores/0/16/deployment.0
Sun Oct 25 12:22:18 2020 [Z0][VMM][I]: ExitCode: 255
Sun Oct 25 12:22:18 2020 [Z0][VMM][I]: Failed to execute virtualization driver operation: deploy.
Sun Oct 25 12:22:18 2020 [Z0][VMM][E]: Error deploying virtual machine: Could not create domain from /var/lib/one//datastores/0/16/deployment.0
Sun Oct 25 12:22:18 2020 [Z0][VM][I]: New LCM state is BOOT_FAILURE
At this point I decided to create a VM on node8 with virt-manager, using the same disk files that OpenNebula had created. To my great surprise, the VM creation succeeded: the VM booted and its console is accessible through the virt-manager GUI.
oneadmin@node8:~/datastores/0/16$ virsh list
Id Name State
16 one-16-copy running
My current conclusion is that every component that takes part in VM creation works properly, yet the end result is a failure. Moreover, I can create LXC containers on any compute node through OpenNebula using images from the MarketPlaces.
oneadmin@node8:~$ lxc list
+-------+---------+----------------------+------+------------+-----------+
| NAME  |  STATE  |         IPV4         | IPV6 |    TYPE    | SNAPSHOTS |
+-------+---------+----------------------+------+------------+-----------+
| one-6 | RUNNING | 192.168.1.101 (eth0) |      | PERSISTENT | 0         |
+-------+---------+----------------------+------+------------+-----------+
Any advice to solve my issue would be welcome.
Some detailed information follows.
The xml created by virt-manager:
oneadmin@node8:~/datastores/0/16$ sudo cat /etc/libvirt/qemu/one-16-copy.xml