Can't start Firecraker vm: VM running but monitor state is POWEROFF

Trying to run Firecracker using ThingsBoard image from article https://support.opennebula.pro/hc/en-us/articles/360045122532-How-to-Use-miniONE-to-Deploy-a-Firecracker-Edge-Cloud-for-Containers

Mon Jan 25 11:06:55 2021 [Z0][VM][I]: New state is ACTIVE
Mon Jan 25 11:06:55 2021 [Z0][VM][I]: New LCM state is PROLOG
Mon Jan 25 11:06:56 2021 [Z0][VM][I]: New LCM state is BOOT
Mon Jan 25 11:06:56 2021 [Z0][VMM][I]: Generating deployment file: /var/lib/one/vms/64/deployment.0
Mon Jan 25 11:06:57 2021 [Z0][VMM][I]: Successfully execute transfer manager driver operation: tm_context.
Mon Jan 25 11:06:57 2021 [Z0][VMM][I]: pre: Executed "sudo -n ip link add name onebr8 type bridge ".
Mon Jan 25 11:06:57 2021 [Z0][VMM][I]: pre: Executed “sudo -n ip link set onebr8 up”.
Mon Jan 25 11:06:57 2021 [Z0][VMM][I]: pre: Running /var/tmp/one/vnm/bridge/pre.d/firecracker
Mon Jan 25 11:06:57 2021 [Z0][VMM][I]: pre:
Mon Jan 25 11:06:57 2021 [Z0][VMM][I]: ExitCode: 0
Mon Jan 25 11:06:57 2021 [Z0][VMM][I]: Successfully execute network driver operation: pre.
Mon Jan 25 11:06:58 2021 [Z0][VMM][I]: ExitCode: 0
Mon Jan 25 11:06:58 2021 [Z0][VMM][I]: Successfully execute virtualization driver operation: deploy.
Mon Jan 25 11:06:59 2021 [Z0][VMM][I]: ExitCode: 0
Mon Jan 25 11:06:59 2021 [Z0][VMM][I]: Successfully execute network driver operation: post.
Mon Jan 25 11:06:59 2021 [Z0][VM][I]: New LCM state is RUNNING
Mon Jan 25 11:07:38 2021 [Z0][LCM][I]: VM running but monitor state is POWEROFF
Mon Jan 25 11:07:38 2021 [Z0][VM][I]: New LCM state is SHUTDOWN_POWEROFF
Mon Jan 25 11:07:38 2021 [Z0][VM][I]: New state is POWEROFF
Mon Jan 25 11:07:38 2021 [Z0][VM][I]: New LCM state is LCM_INIT

What’s wrong? Where I can see an exact error, in what log?


Versions of the related components and OS (frontend, hypervisors, VMs):
opennebula-node-firecracker-5.12.0.3-1.ce.el8.x86_64
CentOS Linux release 8.3.2011
Kernel: 4.18.0-240.1.1.el8_3.x86_64

It seems that the VM is starting to boot but it finish after that so the monitoring process cannot find it and switch it to POWEROFF state.

Please could you share the output of ps -ef | grep firecracker and the content of /var/lib/one/datastores/<sys_ds_id>/<vm_id>/logs.fifo?

cat /mnt/mfs/107/69/logs.fifo

Running Firecracker v0.21.1
2021-01-25T11:26:09.402310765 [one-69] Setting configuration for VMM from one single json failed: Invalid block device path!

ps -ef | grep firecracker

oneadmin 1777932 1 5 12:38 ? 00:00:00 ruby /var/tmp/one/im/firecracker.d/monitord-client.rb firecracker 24 on-node-7
oneadmin 1778233 1777932 7 12:38 ? 00:00:00 ruby /var/tmp/one/im/firecracker.d/…/firecracker-probes.d/host/monitor/linux_usage.rb firecracker 24 on-node-7
oneadmin 1778271 1777932 8 12:38 ? 00:00:00 ruby /var/tmp/one/im/firecracker.d/…/firecracker-probes.d/host/system/linux_host.rb firecracker 24 on-node-7
oneadmin 1778494 1777932 0 12:38 ? 00:00:00 /usr/bin/ruby /var/tmp/one/im/firecracker.d/…/firecracker-probes.d/vm/monitor/poll.rb firecracker 24 on-node-7

It seems there’s something wrong with the VM disks. Could you share the output of the commands below?

  • ls -lh /var/lib/one/datastores/<sys_ds>/<vm_id>
  • file /var/lib/one/datastores/<sys_ds>/<vm_id>/*

Also, it would be nice to see the content of the VM deployment file, it should be located also in the above path.

# ls -lh /var/lib/one/datastores/107/69/
total 2.1G
-rw-rw-r-- 1 oneadmin oneadmin  644 Jan 25 14:26 deployment.file
drwxr-xr-x 3 oneadmin oneadmin    1 Jan 25 14:26 dev
-rw-r--r-- 1 oneadmin oneadmin 2.0G Jan 25 11:24 disk.0
lrwxrwxrwx 1 oneadmin oneadmin   60 Jan 25 14:26 disk.1 -> /var/lib/one/datastores/105/634a80bbfa46afc6b85927d25d7a433e
-rw------- 1 oneadmin oneadmin 2.0M Jan 25 14:26 disk.2
-rwxr-xr-x 1 root     root     2.6M Jan 25 14:26 firecracker
-rw-r--r-- 1 oneadmin oneadmin  21M Dec 30 11:33 kernel
-rw-rw-r-- 1 oneadmin oneadmin  153 Jan 25 14:26 logs.fifo
-rw-rw-r-- 1 oneadmin oneadmin    0 Jan 25 14:26 metrics.fifo
drwxr-xr-x 2 oneadmin oneadmin    1 Jan 25 14:26 run

# file /var/lib/one/datastores/107/69/*
/var/lib/one/datastores/107/69/deployment.file: ASCII text, with very long lines, with no line terminators
/var/lib/one/datastores/107/69/dev:             directory
/var/lib/one/datastores/107/69/disk.0:          Linux rev 1.0 ext4 filesystem data, UUID=77a87059-ac4b-499d-baa2-a1e2ecdcb97a (extents) (64bit) (large files) (huge files)
/var/lib/one/datastores/107/69/disk.1:          symbolic link to /var/lib/one/datastores/105/634a80bbfa46afc6b85927d25d7a433e
/var/lib/one/datastores/107/69/disk.2:          Linux rev 1.0 ext2 filesystem data, UUID=ddff4a42-80c6-4a8b-b392-8302ee8d4fc7, volume name "CONTEXT" (extents) (64bit) (large files) (huge files)
/var/lib/one/datastores/107/69/firecracker:     ELF 64-bit LSB executable, x86-64, version 1 (SYSV), statically linked, BuildID[sha1]=1c36a25e62de58eebee1780179591b0bfdb0f919, with debug_info, not stripped
/var/lib/one/datastores/107/69/kernel:          ELF 64-bit LSB executable, x86-64, version 1 (SYSV), statically linked, BuildID[sha1]=7ed17dfc5162026e495f9ccc302831a2f2320e42, not stripped
/var/lib/one/datastores/107/69/logs.fifo:       ASCII text
/var/lib/one/datastores/107/69/metrics.fifo:    empty
/var/lib/one/datastores/107/69/run:             directory

# cat /var/lib/one/datastores/107/69/deployment.file | json_reformat 
{
    "boot-source": {
        "kernel_image_path": "kernel",
        "boot_args": "console=ttyS0 reboot=k panic=1 pci=off"
    },
    "drives": [
        {
            "drive_id": "disk.0",
            "path_on_host": "disk.0",
            "is_root_device": true,
            "is_read_only": false
        },
        {
            "drive_id": "disk.1",
            "path_on_host": "disk.1",
            "is_root_device": false,
            "is_read_only": false
        },
        {
            "drive_id": "disk.2",
            "path_on_host": "disk.2",
            "is_root_device": false,
            "is_read_only": true
        }
    ],
    "machine-config": {
        "mem_size_mib": 4096,
        "vcpu_count": 1,
        "ht_enabled": false
    },
    "network-interfaces": [
        {
            "iface_id": "eth0",
            "host_dev_name": "one-69-0",
            "guest_mac": "02:00:c3:02:44:0a",
            "allow_mmds_requests": true
        }
    ],
    "logger": {
        "log_fifo": "logs.fifo",
        "metrics_fifo": "metrics.fifo"
    }
}

Hello @OldBeaver, it seems that you’re using a persistent image in a shared/qcow2 datastore:

lrwxrwxrwx 1 oneadmin oneadmin   60 Jan 25 14:26 disk.1 -> /var/lib/one/datastores/105/634a80bbfa46afc6b85927d25d7a433e

This is not supported as Firecracker microVM process are isolated in the VM folder (via chroot) in order to increase the security. This is explained here in the documentation. Could you try using a non persistent image?

After switching to non-persistent image VM started. Thank you.

1 Like

I have the same problem but I don’t see that error about failed: Invalid block device path!. Basically, my microVM starts and stops. Any idea how to resolve it?

Hi @mehrdad-yousefi . This is the same problem you reference here. Since the microVM starts and stops, the monitoring systems detects the status has changed and even though it was RUNNING after successfully executing the instantiation, the hypervisor reports the VM as POWEROFF.