Failing to deploy LXD VM

After setting up OpenNebula (5.8.1 on Ubuntu 18.04) with Sunstone, I deployed two LXD nodes: the first on the same machine as OpenNebula and Sunstone, the second completely independent (in a different subnet). I use a shared filesystem over the network and distinct storage for the image and shared system datastores.

I can start and import Wild VMs in both. I can only deploy to the node with OpenNebula. Deploying to the 2nd node (either via shared or ssh system datastore) produces BOOT_FAILURE and this log:

[Z0][VM][I]: New state is ACTIVE
[Z0][VM][I]: New LCM state is PROLOG
[Z0][VM][I]: New LCM state is BOOT
[Z0][VMM][I]: Generating deployment file: /var/lib/one/vms/37/deployment.0
[Z0][VMM][I]: Successfully execute transfer manager driver operation: tm_context.
[Z0][VMM][I]: Successfully execute network driver operation: pre.
[Z0][VMM][I]: Command execution fail: cat << EOT | /var/tmp/one/vmm/lxd/deploy '/var/lib/one//datastores/115/37/deployment.0' '' 37
[Z0][VMM][I]: /usr/lib/ruby/2.5.0/open3.rb:199:in `spawn': No such file or directory - file (Errno::ENOENT)
[Z0][VMM][I]: from /usr/lib/ruby/2.5.0/open3.rb:199:in `popen_run'
[Z0][VMM][I]: from /usr/lib/ruby/2.5.0/open3.rb:95:in `popen3'
[Z0][VMM][I]: from /usr/lib/ruby/2.5.0/open3.rb:258:in `capture3'
[Z0][VMM][I]: from /var/tmp/one/vmm/lxd/command.rb:35:in `execute'
[Z0][VMM][I]: from /var/tmp/one/vmm/lxd/container.rb:459:in `new_disk_mapper'
[Z0][VMM][I]: from /var/tmp/one/vmm/lxd/container.rb:375:in `setup_disk'
[Z0][VMM][I]: from /var/tmp/one/vmm/lxd/container.rb:258:in `block in setup_storage'
[Z0][VMM][I]: from /var/tmp/one/vmm/lxd/container.rb:251:in `each'
[Z0][VMM][I]: from /var/tmp/one/vmm/lxd/container.rb:251:in `setup_storage'
[Z0][VMM][I]: from /var/tmp/one/vmm/lxd/deploy:71:in `<main>'
[Z0][VMM][I]: ExitCode: 1
[Z0][VMM][I]: Failed to execute virtualization driver operation: deploy.
[Z0][VMM][E]: Error deploying virtual machine
[Z0][VM][I]: New LCM state is BOOT_FAILURE

Any hints on where to search for a fix or a workaround would be greatly appreciated :slight_smile:

I can start and import Wild VMs in both. I can only deploy to the node with OpenNebula. Deploying to the 2nd node (either via shared or ssh system datastore) produces BOOT_FAILURE and this log:

Does this mean you can successfully import the wild containers on both nodes, but then you try, for example, to deploy a wild container from host A on host B?

Hi Daniel :slight_smile: Thank you for your answer.
No, I’m not trying to deploy a wild container to another host.
I tested and mentioned wild containers just to confirm that the node is operational and to isolate the problem to only deployment.

The problem I’m facing is occurring with the image downloaded from the marketplace: Ubuntu_bionic - LXD. I also tried LXD images for Alpine 3.8, 3.9 from the marketplace with the same result. :frowning:

It seems oneadmin cannot access the image file, whose path should be something like "#{sysds_path}/#{vm_id}/disk.#{disk_id}". Can you check, on the node where the VM will be deployed, that the image file exists (replacing the variables with the values from your setup)? If it does exist, run file -L -s on it as oneadmin.

It should be something like /var/lib/one/datastores/0/666/disk.0
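
For anyone following along, the check described above can be sketched as a small shell function. This is only an illustration: the name check_disk and its argument order are made up here and are not part of OpenNebula, and the path in the usage note is just the example from this thread.

```shell
#!/bin/sh
# Hypothetical helper (not part of OpenNebula): verify that a VM disk exists
# in the system datastore and, if so, inspect it with `file -L -s`.
# Arguments: <system_datastore_path> <vm_id> <disk_id>
check_disk() {
    disk="$1/$2/disk.$3"
    if [ ! -e "$disk" ]; then
        echo "missing: $disk"
        return 1
    fi
    if ! command -v file >/dev/null 2>&1; then
        echo "the 'file' command is not installed"
        return 2
    fi
    # -L follows symlinks, -s also reads block/character special files
    file -L -s "$disk"
}
```

Run it as oneadmin on the node the VM is scheduled to, for example: check_disk /var/lib/one/datastores/0 666 0. The separate exit codes make it easier to tell a missing disk apart from a missing tool.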

Yes, you’re spot on. Below are the contents of datastore 114 (on the shared fs) for four VMs: the two that succeeded have the “disk” files and the two that failed don’t.

oneadmin@Ubuntu-1804-bionic-64-minimal:~$ ls ~/datastores/114/4*

/var/lib/one/datastores/114/42:
context.sh  deployment.0  disk.0  disk.1  transfer.0.prolog

/var/lib/one/datastores/114/43:
context.sh  deployment.0  disk.0  disk.1  mapper  transfer.0.prolog

/var/lib/one/datastores/114/44:
context.sh  deployment.0  transfer.0.prolog

/var/lib/one/datastores/114/45:
context.sh  deployment.0  transfer.0.prolog

I’m guessing it’s something wrong with the deployment of the LXD VM. I tested a bunch of storage combinations but didn’t get anything meaningful. Any ideas on how to debug or fix this?
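
With more than a handful of VMs, the same comparison can be scripted instead of read off an ls listing. A minimal sketch, assuming each VM has its own subdirectory under the system datastore and its disks are named disk.N as in the listing above; the function name is made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: print the VM directories under a system datastore
# that contain no disk.* files.
# Argument: <system_datastore_path>, e.g. /var/lib/one/datastores/114
list_vms_without_disks() {
    for vm_dir in "$1"/*/; do
        [ -d "$vm_dir" ] || continue
        # ls exits non-zero when the disk.* glob matches nothing
        if ! ls "$vm_dir"disk.* >/dev/null 2>&1; then
            basename "$vm_dir"
        fi
    done
}
```

For the listing above, list_vms_without_disks /var/lib/one/datastores/114 would be expected to print 44 and 45.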

Even though those files are missing, I ran the command anyway (and am now looking for which package to install to get it working…):

oneadmin@Ubuntu-1804-bionic-64-minimal:~$ file -L -s
bash: file: command not found

Maybe this is the problem?

Yup :slight_smile: I was missing:

  apt-get install file libmagic1
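
To confirm the fix on a node, or to check other nodes before deploying to them, something like the following could be used. It is a sketch only: missing_cmds is a made-up name, and which commands the driver needs beyond file is not established in this thread.

```shell
#!/bin/sh
# Hypothetical helper: print every command from the argument list that
# cannot be found in PATH. Prints nothing if all of them are available.
missing_cmds() {
    for cmd in "$@"; do
        command -v "$cmd" >/dev/null 2>&1 || echo "$cmd"
    done
}
```

Running missing_cmds file as oneadmin should print nothing once the package is installed. If it prints file, the apt-get line above is the fix that worked here.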

Thank you Daniel @dann1 for your quick response and help :slight_smile:

Glad to help :slight_smile:

Hi,

I was running into a similar issue (see below). After installing the libmagic1 and file packages it worked again, but I don’t really get why the node needs those packages.

Could you explain it please?

Log

Sat May 9 19:02:15 2020 [Z0][VM][I]: New state is ACTIVE
Sat May 9 19:02:15 2020 [Z0][VM][I]: New LCM state is PROLOG
Sat May 9 19:04:59 2020 [Z0][VM][I]: New LCM state is BOOT
Sat May 9 19:04:59 2020 [Z0][VMM][I]: Generating deployment file: /var/lib/one/vms/1000/deployment.0
Sat May 9 19:05:01 2020 [Z0][VMM][I]: Successfully execute transfer manager driver operation: tm_context.
Sat May 9 19:05:01 2020 [Z0][VMM][I]: Successfully execute network driver operation: pre.
Sat May 9 19:05:02 2020 [Z0][VMM][I]: Command execution fail: cat << EOT | /var/tmp/one/vmm/lxd/deploy '/var/lib/one//datastores/104/1000/deployment.0' 'sat04lxd' 1000 sat04lxd
Sat May 9 19:05:02 2020 [Z0][VMM][I]: deploy: Processing disk 0
Sat May 9 19:05:02 2020 [Z0][VMM][I]: /usr/lib/ruby/2.5.0/open3.rb:199:in `spawn': No such file or directory - file (Errno::ENOENT)
Sat May 9 19:05:02 2020 [Z0][VMM][I]: from /usr/lib/ruby/2.5.0/open3.rb:199:in `popen_run'
Sat May 9 19:05:02 2020 [Z0][VMM][I]: from /usr/lib/ruby/2.5.0/open3.rb:95:in `popen3'
Sat May 9 19:05:02 2020 [Z0][VMM][I]: from /usr/lib/ruby/2.5.0/open3.rb:258:in `capture3'
Sat May 9 19:05:02 2020 [Z0][VMM][I]: from /var/tmp/one/vmm/lxd/command.rb:35:in `execute'
Sat May 9 19:05:02 2020 [Z0][VMM][I]: from /var/tmp/one/vmm/lxd/container.rb:526:in `new_disk_mapper'
Sat May 9 19:05:02 2020 [Z0][VMM][I]: from /var/tmp/one/vmm/lxd/container.rb:411:in `setup_disk'
Sat May 9 19:05:02 2020 [Z0][VMM][I]: from /var/tmp/one/vmm/lxd/container.rb:301:in `block in setup_storage'
Sat May 9 19:05:02 2020 [Z0][VMM][I]: from /var/tmp/one/vmm/lxd/container.rb:300:in `each'
Sat May 9 19:05:02 2020 [Z0][VMM][I]: from /var/tmp/one/vmm/lxd/deploy:61:in `<main>'
Sat May 9 19:05:02 2020 [Z0][VMM][I]: ExitCode: 1
Sat May 9 19:05:02 2020 [Z0][VMM][I]: Failed to execute virtualization driver operation: deploy.
Sat May 9 19:05:02 2020 [Z0][VMM][E]: Error deploying virtual machine
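
The trace itself hints at the reason. Ruby raises Errno::ENOENT from spawn when the program it was asked to run cannot be found, and the message names that program: here it is file. The frames below it show the call coming from new_disk_mapper in the LXD driver's container.rb, so the driver apparently shells out to the file command while setting up each disk, presumably to work out what kind of image it is dealing with. That last part is an inference from the trace and from the file -L -s suggestion earlier in the thread, not something confirmed here. libmagic1 is the library the file command is built on, which is why it gets installed alongside.

A small sketch for pulling the missing program name out of a VM log, in case the same error shows up with a different tool. The function name is made up, and the pattern assumes the message format seen in the logs above.

```shell
#!/bin/sh
# Hypothetical helper: read a VM log on stdin and print the name of any
# executable that Ruby reported as missing (Errno::ENOENT from spawn).
missing_executable() {
    sed -n 's/.*No such file or directory - \([^ ]*\) (Errno::ENOENT).*/\1/p'
}
```

For example, missing_executable < /var/log/one/1000.log on the frontend, assuming the VM log lives at that path in your installation.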