Hi everyone!
I have a fresh install of OpenNebula 6.8 and a Ceph Quincy cluster.
For testing purposes I run the Ceph services on the same hosts as OpenNebula.
I configured the datastore as described in the docs, following every line.
In the end I was able to create the datastore (it is in MONITORED status and shows free space) and to download an image from the marketplace.
But then I ran into problems with VMs. When I start a VM from this template, it runs for about 5 minutes (status RUNNING) and then falls into the POWEROFF state with these logs:
Mon Apr 29 20:51:44 2024 [Z0][VM][I]: New state is ACTIVE
Mon Apr 29 20:51:44 2024 [Z0][VM][I]: New LCM state is PROLOG
Mon Apr 29 20:51:46 2024 [Z0][VM][I]: New LCM state is BOOT
Mon Apr 29 20:51:46 2024 [Z0][VMM][I]: Generating deployment file: /var/lib/one/vms/38/deployment.0
Mon Apr 29 20:51:47 2024 [Z0][VMM][I]: Successfully execute transfer manager driver operation: tm_context.
Mon Apr 29 20:51:47 2024 [Z0][VMM][I]: ExitCode: 0
Mon Apr 29 20:51:47 2024 [Z0][VMM][I]: Successfully execute network driver operation: pre.
Mon Apr 29 20:51:47 2024 [Z0][VMM][I]: ExitCode: 0
Mon Apr 29 20:51:47 2024 [Z0][VMM][I]: Successfully execute virtualization driver operation: /bin/mkdir -p.
Mon Apr 29 20:51:47 2024 [Z0][VMM][I]: ExitCode: 0
Mon Apr 29 20:51:47 2024 [Z0][VMM][I]: Successfully execute virtualization driver operation: /bin/cat - >/var/lib/one//datastores/121/38/vm.xml.
Mon Apr 29 20:51:47 2024 [Z0][VMM][I]: ExitCode: 0
Mon Apr 29 20:51:47 2024 [Z0][VMM][I]: Successfully execute virtualization driver operation: /bin/cat - >/var/lib/one//datastores/121/38/ds.xml.
Mon Apr 29 20:51:50 2024 [Z0][LCM][I]: VM reported RUNNING by the drivers
Mon Apr 29 20:51:50 2024 [Z0][VM][I]: New LCM state is RUNNING
Mon Apr 29 20:56:48 2024 [Z0][VMM][I]: Command execution fail (exit code: 255): cat << 'EOT' | /var/tmp/one/vmm/kvm/deploy '/var/lib/one//datastores/121/38/deployment.0' 'onenode-prg-03' 38 onenode-prg-03
Mon Apr 29 20:56:48 2024 [Z0][VMM][I]: XPath set is empty
Mon Apr 29 20:56:48 2024 [Z0][VMM][I]: error: Failed to create domain from /var/lib/one//datastores/121/38/deployment.0
Mon Apr 29 20:56:48 2024 [Z0][VMM][I]: error: internal error: process exited while connecting to monitor: 2024-04-29T20:56:48.469573Z qemu-kvm-one: -blockdev {"driver":"rbd","pool":"one","image":"one-21-38-0","server":[{"host":"onenode-prg-01","port":"6789"},{"host":"onenode-prg-02","port":"6789"},{"host":"onenode-prg-03","port":"6789"}],"user":"libvirt","auth-client-required":["cephx","none"],"key-secret":"libvirt-2-storage-auth-secret0","node-name":"libvirt-2-storage","cache":{"direct":true,"no-flush":false},"auto-read-only":true,"discard":"unmap"}: error connecting: Connection timed out
Mon Apr 29 20:56:48 2024 [Z0][VMM][I]: Could not create domain from /var/lib/one//datastores/121/38/deployment.0
Mon Apr 29 20:56:48 2024 [Z0][VMM][I]: ExitCode: 255
Mon Apr 29 20:56:48 2024 [Z0][VMM][I]: ExitCode: 0
Mon Apr 29 20:56:48 2024 [Z0][VMM][I]: Successfully execute network driver operation: clean.
Mon Apr 29 20:56:48 2024 [Z0][VMM][I]: Failed to execute virtualization driver operation: deploy.
Mon Apr 29 20:56:48 2024 [Z0][VMM][E]: DEPLOY: XPath set is empty error: Failed to create domain from /var/lib/one//datastores/121/38/deployment.0 error: internal error: process exited while connecting to monitor: 2024-04-29T20:56:48.469573Z qemu-kvm-one: -blockdev {"driver":"rbd","pool":"one","image":"one-21-38-0","server":[{"host":"onenode-prg-01","port":"6789"},{"host":"onenode-prg-02","port":"6789"},{"host":"onenode-prg-03","port":"6789"}],"user":"libvirt","auth-client-required":["cephx","none"],"key-secret":"libvirt-2-storage-auth-secret0","node-name":"libvirt-2-storage","cache":{"direct":true,"no-flush":false},"auto-read-only":true,"discard":"unmap"}: error connecting: Connection timed out Could not create domain from /var/lib/one//datastores/121/38/deployment.0 ExitCode: 255
Mon Apr 29 20:56:48 2024 [Z0][LCM][E]: deploy_failure_action, VM in a wrong state
Mon Apr 29 20:57:17 2024 [Z0][LCM][I]: VM running but monitor state is POWEROFF
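The failing step is QEMU's librbd connection to the monitors ("error connecting: Connection timed out"). As a first sanity check (a sketch only; the monitor host names are the ones from the -blockdev line in the log above), TCP reachability from the KVM node to the monitor ports can be probed like this:

```shell
#!/bin/bash
# Probe TCP reachability from the KVM node to each Ceph monitor, on both
# the msgr v1 port (6789, the one QEMU is told to use here) and the
# msgr v2 port (3300). Adjust the host list to your cluster.
probe() {  # probe <host> <port>
  if timeout 3 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null; then
    echo "$1:$2 open"
  else
    echo "$1:$2 closed or filtered"
  fi
}

for mon in onenode-prg-01 onenode-prg-02 onenode-prg-03; do
  probe "$mon" 6789
  probe "$mon" 3300
done
```

If a port shows as closed or filtered from the node but the `rbd` CLI still works, a firewall rule or a v1/v2 port mismatch between what QEMU is given and what the monitors actually listen on would be worth checking.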
OpenNebula itself is configured properly: I can connect from the frontend to the node and check libvirt:
oneadmin@hosting-cntl:/root$ ssh onenode-prg-03 virsh -c qemu:///system list --all
Id Name State
-----------------------
3 one-38 paused
On the node, in the libvirt logs, I see:
2024-04-29 21:11:18.086+0000: starting up libvirt version: 7.0.0, package: 3+deb11u2 (Guido Günther <agx@sigxcpu.org> Mon, 06 Feb 2023 17:50:14 +0100), qemu version: 5.2.0Debian 1:5.2+dfsg-11+deb11u3, kernel: 5.10.0-28-amd64, hostname: onenode-prg-03
LC_ALL=C \
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin \
HOME=/var/lib/libvirt/qemu/domain-2-one-38 \
XDG_DATA_HOME=/var/lib/libvirt/qemu/domain-2-one-38/.local/share \
XDG_CACHE_HOME=/var/lib/libvirt/qemu/domain-2-one-38/.cache \
XDG_CONFIG_HOME=/var/lib/libvirt/qemu/domain-2-one-38/.config \
QEMU_AUDIO_DRV=none \
/usr/bin/qemu-kvm-one \
-name guest=one-38,debug-threads=on \
-S \
-object secret,id=masterKey0,format=raw,file=/var/lib/libvirt/qemu/domain-2-one-38/master-key.aes \
-machine pc-i440fx-5.2,accel=kvm,usb=off,dump-guest-core=off,memory-backend=pc.ram \
-cpu qemu64 \
-m 768 \
-object memory-backend-ram,id=pc.ram,size=805306368 \
-overcommit mem-lock=off \
-smp 1,sockets=1,cores=1,threads=1 \
-uuid 37946566-66f0-47da-a606-478b60d31d17 \
-no-user-config \
-nodefaults \
-chardev socket,id=charmonitor,fd=34,server,nowait \
-mon chardev=charmonitor,id=monitor,mode=control \
-rtc base=utc \
-no-shutdown \
-boot strict=on \
-device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 \
-device virtio-scsi-pci,id=scsi0,num_queues=1,bus=pci.0,addr=0x4 \
-device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x5 \
-object secret,id=libvirt-2-storage-auth-secret0,data=[*MASKED*],keyid=masterKey0,iv=[*MASKED*],format=base64 \
-blockdev '{"driver":"rbd","pool":"one","image":"one-21-38-0","server":[{"host":"onenode-prg-01","port":"6789"},{"host":"onenode-prg-02","port":"6789"},{"host":"onenode-prg-03","port":"6789"}],"user":"libvirt","auth-client-required":["cephx","none"],"key-secret":"libvirt-2-storage-auth-secret0","node-name":"libvirt-2-storage","cache":{"direct":true,"no-flush":false},"auto-read-only":true,"discard":"unmap"}' \
-blockdev '{"node-name":"libvirt-2-format","read-only":false,"discard":"unmap","cache":{"direct":true,"no-flush":false},"driver":"raw","file":"libvirt-2-storage"}' \
-device virtio-blk-pci,bus=pci.0,addr=0x6,drive=libvirt-2-format,id=virtio-disk0,bootindex=1,write-cache=on \
-blockdev '{"driver":"file","filename":"/var/lib/one//datastores/121/38/disk.1","node-name":"libvirt-1-storage","auto-read-only":true,"discard":"unmap"}' \
-blockdev '{"node-name":"libvirt-1-format","read-only":true,"driver":"raw","file":"libvirt-1-storage"}' \
-device ide-cd,bus=ide.0,unit=0,drive=libvirt-1-format,id=ide0-0-0 \
-netdev tap,fd=36,id=hostnet0,vhost=on,vhostfd=37 \
-device virtio-net-pci,netdev=hostnet0,id=net0,mac=02:00:0a:24:fa:01,bus=pci.0,addr=0x3 \
-chardev socket,id=charchannel0,fd=38,server,nowait \
-device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=org.qemu.guest_agent.0 \
-vnc 0.0.0.0:38 \
-device cirrus-vga,id=video0,bus=pci.0,addr=0x2 \
-device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x7 \
-sandbox on,obsolete=deny,elevateprivileges=deny,spawn=deny,resourcecontrol=deny \
-msg timestamp=on
2024-04-29T21:16:18.214503Z qemu-kvm-one: -blockdev {"driver":"rbd","pool":"one","image":"one-21-38-0","server":[{"host":"onenode-prg-01","port":"6789"},{"host":"onenode-prg-02","port":"6789"},{"host":"onenode-prg-03","port":"6789"}],"user":"libvirt","auth-client-required":["cephx","none"],"key-secret":"libvirt-2-storage-auth-secret0","node-name":"libvirt-2-storage","cache":{"direct":true,"no-flush":false},"auto-read-only":true,"discard":"unmap"}: error connecting: Connection timed out
2024-04-29 21:16:18.311+0000: shutting down, reason=failed
At the same time, I can connect to Ceph using my libvirt user:
oneadmin@onenode-prg-03:~$ rbd ls -p one --id libvirt
one-21
one-21-38-0
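Since the `rbd` CLI works but QEMU's librbd connection times out, the same librbd path QEMU uses can be exercised outside libvirt, and the key held by the libvirt secret can be compared with the actual cephx key (a sketch; `<secret-uuid>` is a placeholder for the UUID printed by `virsh secret-list`):

```shell
# Open the image through librbd exactly as QEMU does (id=libvirt):
qemu-img info "rbd:one/one-21-38-0:id=libvirt:conf=/etc/ceph/ceph.conf"

# Compare the key stored in the libvirt secret with the real cephx key.
# <secret-uuid> is a placeholder for the UUID shown by `virsh secret-list`.
virsh -c qemu:///system secret-list
virsh -c qemu:///system secret-get-value <secret-uuid>
ceph auth get-key client.libvirt
```

If `qemu-img info` also times out while `rbd ls` succeeds, that would narrow the problem down to the configuration QEMU receives (mon list, ports, secret) rather than to the cluster itself.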
Versions of the related components and OS (frontend, hypervisors, VMs): Debian 11, OpenNebula v6.8.0
Steps to reproduce:
- Install fresh Debian 11
- Install OpenNebula following the official documentation
- Deploy Ceph Quincy with cephadm, following the official docs
- Configure OpenNebula's Ceph datastore following the official docs
Current results: the VM cannot start and ends up in POWEROFF
Expected results: the VM runs without failures