LXD - Guest failed to boot after Stop

Hello guys,

The LXC container cannot be started and shows the following errors in the logs:

Wed Jun 2 10:25:12 2021 [Z0][VMM][I]: Successfully execute transfer manager driver operation: tm_context.
Wed Jun 2 10:25:15 2021 [Z0][VMM][I]: ExitCode: 0
Wed Jun 2 10:25:15 2021 [Z0][VMM][I]: Successfully execute network driver operation: pre.
Wed Jun 2 10:25:17 2021 [Z0][VMM][I]: Command execution fail: cat << EOT | /var/tmp/one/vmm/lxd/deploy '/var/lib/one//datastores/107/2409/deployment.4' 'zdh-004' 2409 zdh-004
Wed Jun 2 10:25:17 2021 [Z0][VMM][I]: deploy: Overriding container
Wed Jun 2 10:25:17 2021 [Z0][VMM][I]: deploy: Processing disk 0
Wed Jun 2 10:25:17 2021 [Z0][VMM][I]: deploy: Using qcow2 mapper for /var/lib/one/datastores/107/2409/disk.0
Wed Jun 2 10:25:18 2021 [Z0][VMM][I]: deploy: Mapping disk at /var/snap/lxd/common/lxd/storage-pools/default/containers/one-2409/rootfs using device /dev/nbd7
Wed Jun 2 10:25:18 2021 [Z0][VMM][I]: deploy: Mounting /dev/nbd7p1 at /var/snap/lxd/common/lxd/storage-pools/default/containers/one-2409/rootfs
Wed Jun 2 10:25:18 2021 [Z0][VMM][I]: deploy: Mapping disk at /var/lib/one/datastores/107/2409/mapper/disk.1 using device /dev/loop14
Wed Jun 2 10:25:18 2021 [Z0][VMM][I]: deploy: Mounting /dev/loop14 at /var/lib/one/datastores/107/2409/mapper/disk.1
Wed Jun 2 10:25:18 2021 [Z0][VMM][I]: deploy: --- Starting container ---
Wed Jun 2 10:25:18 2021 [Z0][VMM][I]: deploy: Name: one-2409
Wed Jun 2 10:25:18 2021 [Z0][VMM][I]: Remote: unix://
Wed Jun 2 10:25:18 2021 [Z0][VMM][I]: Architecture: x86_64
Wed Jun 2 10:25:18 2021 [Z0][VMM][I]: Created: 2021/06/02 08:06 UTC
Wed Jun 2 10:25:18 2021 [Z0][VMM][I]: Status: Stopped
Wed Jun 2 10:25:18 2021 [Z0][VMM][I]: Type: persistent
Wed Jun 2 10:25:18 2021 [Z0][VMM][I]: Profiles: default
Wed Jun 2 10:25:18 2021 [Z0][VMM][I]:
Wed Jun 2 10:25:18 2021 [Z0][VMM][I]: Log:
Wed Jun 2 10:25:18 2021 [Z0][VMM][I]:
Wed Jun 2 10:25:18 2021 [Z0][VMM][I]: lxc one-2409 20210602082516.613 ERROR terminal - terminal.c:lxc_terminal_create:858 - No such file or directory - Failed to open terminal
Wed Jun 2 10:25:18 2021 [Z0][VMM][I]: lxc one-2409 20210602082516.613 ERROR start - start.c:lxc_init:900 - Failed to create console
Wed Jun 2 10:25:18 2021 [Z0][VMM][I]: lxc one-2409 20210602082516.613 ERROR start - start.c:__lxc_start:1971 - Failed to initialize container "one-2409"
Wed Jun 2 10:25:18 2021 [Z0][VMM][I]: lxc one-2409 20210602082516.655 ERROR lxccontainer - lxccontainer.c:wait_on_daemonized_start:842 - No such file or directory - Failed to receive the container state
Wed Jun 2 10:25:18 2021 [Z0][VMM][I]: lxc 20210602082516.655 WARN commands - commands.c:lxc_cmd_rsp_recv:135 - Connection reset by peer - Failed to receive response for command "get_state"
Wed Jun 2 10:25:18 2021 [Z0][VMM][I]:
Wed Jun 2 10:25:18 2021 [Z0][VMM][I]: /var/tmp/one/vmm/lxd/deploy:72:in `rescue in <main>': undefined local variable or method `e' for main:Object (NameError)
Wed Jun 2 10:25:18 2021 [Z0][VMM][I]: from /var/tmp/one/vmm/lxd/deploy:69:in `<main>'
Wed Jun 2 10:25:18 2021 [Z0][VMM][I]: /var/tmp/one/vmm/lxd/client.rb:102:in `wait': {"type"=>"sync", "status"=>"Success", "status_code"=>200, "operation"=>"", "error_code"=>0, "error"=>"", "metadata"=>{"id"=>"2d53fe41-e8a7-436a-9262-6b80cde4edb5", "class"=>"task", "description"=>"Starting container", "created_at"=>"2021-06-02T10:25:16.438107883+02:00", "updated_at"=>"2021-06-02T10:25:16.438107883+02:00", "status"=>"Failure", "status_code"=>400, "resources"=>{"containers"=>["/1.0/containers/one-2409"]}, "metadata"=>nil, "may_cancel"=>false, "err"=>"Failed to run: /snap/lxd/current/bin/lxd forkstart one-2409 /var/snap/lxd/common/lxd/containers /var/snap/lxd/common/lxd/logs/one-2409/lxc.conf: ", "location"=>"none"}} (LXDError)
Wed Jun 2 10:25:18 2021 [Z0][VMM][I]: from /var/tmp/one/vmm/lxd/container.rb:517:in `wait?'
Wed Jun 2 10:25:18 2021 [Z0][VMM][I]: from /var/tmp/one/vmm/lxd/container.rb:529:in `change_state'
Wed Jun 2 10:25:18 2021 [Z0][VMM][I]: from /var/tmp/one/vmm/lxd/container.rb:211:in `start'
Wed Jun 2 10:25:18 2021 [Z0][VMM][I]: from /var/tmp/one/vmm/lxd/deploy:70:in `<main>'
Wed Jun 2 10:25:18 2021 [Z0][VMM][I]: ExitCode: 1
Wed Jun 2 10:25:19 2021 [Z0][VMM][I]: ExitCode: 0
Wed Jun 2 10:25:19 2021 [Z0][VMM][I]: Successfully execute network driver operation: clean.
Wed Jun 2 10:25:19 2021 [Z0][VMM][I]: Failed to execute virtualization driver operation: deploy.
Wed Jun 2 10:25:19 2021 [Z0][VMM][E]: Error deploying virtual machine
Wed Jun 2 10:25:19 2021 [Z0][VM][I]: New state is POWEROFF
Wed Jun 2 10:25:19 2021 [Z0][VM][I]: New LCM state is LCM_INIT

Versions:
OpenNebula 5.12.0.3
lxc --version
3.0.4

What could be the issue? I researched this with no luck, and I also tried to unmount everything manually:

root@zdh-004:~# umount /var/snap/lxd/common/lxd/storage-pools/default/containers/one-2409/rootfs
root@zdh-004:~# umount /var/lib/one/datastores/107/2409/mapper/disk.1
root@zdh-004:~# losetup -d /dev/loop12
root@zdh-004:~# qemu-nbd -d /dev/nbd7p1

This is the second time I'm seeing this on production containers; all of them are Ubuntu 18.04 LTS guests.

Thank you!

Hello,

It seems that new LXC containers cannot be launched on this node either. The other two nodes in the cluster are fine.

Wed Jun 2 10:25:18 2021 [Z0][VMM][I]: lxc one-2409 20210602082516.613 ERROR terminal - terminal.c:lxc_terminal_create:858 - No such file or directory - Failed to open terminal
Wed Jun 2 10:25:18 2021 [Z0][VMM][I]: lxc one-2409 20210602082516.613 ERROR start - start.c:lxc_init:900 - Failed to create console
Wed Jun 2 10:25:18 2021 [Z0][VMM][I]: lxc one-2409 20210602082516.613 ERROR start - start.c:__lxc_start:1971 - Failed to initialize container "one-2409"
Wed Jun 2 10:25:18 2021 [Z0][VMM][I]: lxc one-2409 20210602082516.655 ERROR lxccontainer - lxccontainer.c:wait_on_daemonized_start:842 - No such file or directory - Failed to receive the container state

I'm not sure whether "No such file or directory" refers to missing files inside the image or to missing lxc/lxd application files on the node. Asking in the LXD forum may bring some clarity.

Try manually creating a container with the structure that the LXD Driver uses and troubleshoot from there. In a nutshell:

  • create a container
  • mount the images (context disk and disk.0) in the container rootfs
  • start the container
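
The steps above could be sketched roughly as follows. This is only a rough outline and not the driver's actual code. The container name test-2409, the ubuntu:18.04 image alias, and the reuse of /dev/nbd7 and of the datastore paths from the log are assumptions, so adjust them to your node. It also assumes the nbd kernel module is loaded and the nbd device is free. By default the script only prints the commands; set DRY_RUN=0 to actually run them as root.

```shell
#!/bin/sh
# Sketch: reproduce by hand what the LXD driver's deploy does.
# All names, paths and device numbers below are assumptions.
set -e

DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = execute them

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

manual_repro() {
    name="$1"      # hypothetical test container name
    disk0="$2"     # qcow2 root disk (disk.0)
    context="$3"   # context disk (disk.1 in the log above)
    rootfs="/var/snap/lxd/common/lxd/storage-pools/default/containers/$name/rootfs"
    ctxdir="/mnt/$name-context"   # hypothetical mount point for the context disk

    # 1. Create a container without starting it.
    run lxc init ubuntu:18.04 "$name"

    # 2. Map the qcow2 root disk and mount it over the container rootfs,
    #    then loop-mount the context disk.
    run qemu-nbd --connect=/dev/nbd7 "$disk0"
    run mount /dev/nbd7p1 "$rootfs"
    run mkdir -p "$ctxdir"
    run mount -o loop,ro "$context" "$ctxdir"

    # 3. Start the container and show the LXC log for it.
    run lxc start "$name"
    run lxc info "$name" --show-log
}

manual_repro "${1:-test-2409}" \
    "${2:-/var/lib/one/datastores/107/2409/disk.0}" \
    "${3:-/var/lib/one/datastores/107/2409/disk.1}"
```

If the plain `lxc init` plus `lxc start` already fails with the same "Failed to open terminal" error before any disk is mounted, the problem is more likely on the node (LXD/LXC installation) than in the image.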