Lxdone + aarch64 + 'cannot contact server error.'

Hi all,
I have completed the LXD/OpenNebula configuration for pine64 (aarch64). Once the hosts were added, I added the lxdone image. The datastore is an NFS share and the image uploads successfully, but at the end of the upload operation I get the “Cannot contact server: is it running and reachable?” message. What does that error message mean? Did I miss something in the configuration?
Also, after I initialized the lxdone template and instantiated the VM, it ended up in the LCM state boot_failure.
I looked at oned.log but found nothing related to these errors. Could someone help me or give me a clue about what is happening with the OpenNebula/LXD configuration?
Thanks.

Hello

I have been seeing the “Cannot contact server: is it running and reachable?” message for a while now. It appears after the image has finished uploading to OpenNebula’s temp dir and before OpenNebula starts uploading it to the datastore. I’ve seen this error with the Pine64 setup and also in our datacenter, so don’t worry about it. After you get the error, the image will go into the LOCKED state and then become READY. I have no idea why this happens and would love to know, but the bottom line is that this is not an issue that will affect you.

If you got the boot_failure state, that’s a major problem. Could you send me the oned.log? Also, did you follow the “Short road” guide from the CloX website and upload our pre-created pine64 images, or did you follow “The not so short road” and install and compile everything on your own?

Thanks for the reply, Sergio. I followed the ‘long road’ and have attached the latest log file.
logfile

The long road is…well, actually really long, and there are several things that could go wrong. This is why we recommend following “The short road” and downloading our pre-created images for pine64/raspberry pi3. Unless you have a specific reason to compile everything from scratch, I would recommend the pine64 image.
About the log file: please send me the log file of the “virtual machine” you are trying to deploy. Your issue is reflected there; I can see nothing wrong in the oned.log file.
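For anyone reading a per-VM log like this, a small filter can help pull out the lines that matter. This is only a sketch in Python 3: the function name `error_lines` is made up for illustration, and the line format is assumed from the log excerpts posted in this thread (the per-VM files shown here live under /var/log/one/).

```python
import re

# Assumed line shape, taken from the logs in this thread:
#   Mon Mar 19 00:48:43 2018 [Z0][VMM][E]: Error deploying virtual machine
LOG_LINE = re.compile(
    r"^(?P<when>.+? \d{4}) "
    r"\[(?P<zone>\w+)\]\[(?P<module>\w+)\]\[(?P<level>\w)\]: ?(?P<msg>.*)$"
)

def error_lines(log_text):
    """Return messages tagged [E] or carrying an embedded ERROR marker."""
    hits = []
    for line in log_text.splitlines():
        m = LOG_LINE.match(line)
        if not m:
            continue  # wrapped or unrelated line
        if m.group("level") == "E" or " ERROR " in m.group("msg"):
            hits.append(m.group("msg"))
    return hits
```

Usage would be something like `error_lines(open("/var/log/one/8.log").read())`, where the VM ID 8 is just the one from this thread.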

Here is the actual log.

Mon Mar 19 00:48:25 2018 [Z0][VM][I]: New state is ACTIVE
Mon Mar 19 00:48:25 2018 [Z0][VM][I]: New LCM state is PROLOG
Mon Mar 19 00:48:29 2018 [Z0][VM][I]: New LCM state is BOOT
Mon Mar 19 00:48:29 2018 [Z0][VMM][I]: Generating deployment file: /var/lib/one/vms/8/deployment.0
Mon Mar 19 00:48:31 2018 [Z0][VMM][I]: Successfully execute transfer manager driver operation: tm_context.
Mon Mar 19 00:48:31 2018 [Z0][VMM][I]: ExitCode: 0
Mon Mar 19 00:48:31 2018 [Z0][VMM][I]: Successfully execute network driver operation: pre.
Mon Mar 19 00:48:43 2018 [Z0][VMM][I]: Command execution fail: cat << EOT | /var/tmp/one/vmm/lxd/deploy ‘/var/lib/one//datastores/100/8/deployment.0’ ‘pine1’ 8 pine1
Mon Mar 19 00:48:43 2018 [Z0][VMM][I]: deploy.py: ########################################
Mon Mar 19 00:48:43 2018 [Z0][VMM][E]: deploy.py: container: Error calling ‘lxd forkstart one-8 /var/lib/lxd/containers /var/log/lxd/one-8/lxc.conf’: err='Failed to run: /usr/bin/lxd forkstart one-8 /var/lib/lxd/containers /var/log/lxd/one-8/lxc.conf: ’
Mon Mar 19 00:48:43 2018 [Z0][VMM][I]: lxc 20180318214837.506 ERROR lxc_start - start.c:start:1443 - Exec format error - Failed to exec “/sbin/init”.
Mon Mar 19 00:48:43 2018 [Z0][VMM][I]: lxc 20180318214837.507 ERROR lxc_sync - sync.c:__sync_wait:57 - An error occurred in another process (expected sequence number 5)
Mon Mar 19 00:48:43 2018 [Z0][VMM][I]: lxc 20180318214837.507 ERROR lxc_start - start.c:__lxc_start:1358 - Failed to spawn container “one-8”.
Mon Mar 19 00:48:43 2018 [Z0][VMM][I]: lxc 20180318214838.958 ERROR lxc_conf - conf.c:run_buffer:416 - Script exited with status 1.
Mon Mar 19 00:48:43 2018 [Z0][VMM][I]: lxc 20180318214838.958 ERROR lxc_start - start.c:lxc_fini:546 - Failed to run lxc.hook.post-stop for container “one-8”.
Mon Mar 19 00:48:43 2018 [Z0][VMM][I]:
Mon Mar 19 00:48:43 2018 [Z0][VMM][I]: ExitCode: 1
Mon Mar 19 00:48:43 2018 [Z0][VMM][I]: Failed to execute virtualization driver operation: deploy.
Mon Mar 19 00:48:43 2018 [Z0][VMM][E]: Error deploying virtual machine
Mon Mar 19 00:48:43 2018 [Z0][VM][I]: New LCM state is BOOT_FAILURE
Wed Mar 21 00:00:03 2018 [Z0][VM][I]: New LCM state is BOOT
Wed Mar 21 00:00:03 2018 [Z0][VMM][I]: Generating deployment file: /var/lib/one/vms/8/deployment.0
Wed Mar 21 00:00:05 2018 [Z0][VMM][I]: Successfully execute transfer manager driver operation: tm_context.
Wed Mar 21 00:00:05 2018 [Z0][VMM][I]: ExitCode: 0
Wed Mar 21 00:00:05 2018 [Z0][VMM][I]: Successfully execute network driver operation: pre.
Wed Mar 21 00:00:07 2018 [Z0][VMM][I]: Command execution fail: cat << EOT | /var/tmp/one/vmm/lxd/deploy ‘/var/lib/one//datastores/100/8/deployment.0’ ‘pine1’ 8 pine1
Wed Mar 21 00:00:07 2018 [Z0][VMM][I]: deploy.py: ########################################
Wed Mar 21 00:00:07 2018 [Z0][VMM][I]: losetup: /var/lib/one/datastores/100/8/disk.0: failed to set up loop device: No such file or directory
Wed Mar 21 00:00:07 2018 [Z0][VMM][I]: Traceback (most recent call last):
Wed Mar 21 00:00:07 2018 [Z0][VMM][I]: File “/var/tmp/one/vmm/lxd/deploy.py”, line 139, in
Wed Mar 21 00:00:07 2018 [Z0][VMM][I]: apply_profile(profile, container)
Wed Mar 21 00:00:07 2018 [Z0][VMM][I]: File “/var/tmp/one/vmm/lxd/deploy.py”, line 97, in apply_profile
Wed Mar 21 00:00:07 2018 [Z0][VMM][I]: root_source = lc.storage_rootfs_mount(VM_ID, DISK_TYPE[0], DS_ID, DISK_SOURCE[0], DISK_CLONE[0])
Wed Mar 21 00:00:07 2018 [Z0][VMM][I]: File “/var/tmp/one/vmm/lxd/lxd_common.py”, line 185, in storage_rootfs_mount
Wed Mar 21 00:00:07 2018 [Z0][VMM][I]: source = storage_sysmap(‘0’, DISK_TYPE, DISK_SOURCE, VM_ID, DS_ID, DISK_CLONE)
Wed Mar 21 00:00:07 2018 [Z0][VMM][I]: File “/var/tmp/one/vmm/lxd/lxd_common.py”, line 155, in storage_sysmap
Wed Mar 21 00:00:07 2018 [Z0][VMM][I]: disk = storage_pre("losetup -f --show " + disk)
Wed Mar 21 00:00:07 2018 [Z0][VMM][I]: File “/var/tmp/one/vmm/lxd/lxd_common.py”, line 148, in storage_pre
Wed Mar 21 00:00:07 2018 [Z0][VMM][I]: blockdev = sp.check_output(command, shell=True)
Wed Mar 21 00:00:07 2018 [Z0][VMM][I]: File “/usr/lib/python2.7/subprocess.py”, line 574, in check_output
Wed Mar 21 00:00:07 2018 [Z0][VMM][I]: raise CalledProcessError(retcode, cmd, output=output)
Wed Mar 21 00:00:07 2018 [Z0][VMM][I]: subprocess.CalledProcessError: Command ‘losetup -f --show /var/lib/one/datastores/100/8/disk.0’ returned non-zero exit status 1
Wed Mar 21 00:00:07 2018 [Z0][VMM][I]: ExitCode: 1
Wed Mar 21 00:00:07 2018 [Z0][VMM][I]: Failed to execute virtualization driver operation: deploy.
Wed Mar 21 00:00:07 2018 [Z0][VMM][E]: Error deploying virtual machine
Wed Mar 21 00:00:07 2018 [Z0][VM][I]: New LCM state is BOOT_FAILURE

Wed Mar 21 00:00:07 2018 [Z0][VMM][I]: losetup: /var/lib/one/datastores/100/8/disk.0: failed to set up loop device: No such file or directory

It seems the container image wasn’t there for some reason. During PROLOG phase the image should be copied there. Try to find out why the image wasn’t there.

Most likely you have a bad NFS datastore configuration. Perhaps the NFS folder is not mounted?
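One way to check this on a node is to look the datastore path up in /proc/mounts. The following is a minimal Python 3 sketch; the helper names `mount_for` and `is_on_nfs` are hypothetical, and it assumes the usual /proc/mounts layout (device, mount point, filesystem type, options).

```python
def mount_for(path, mounts_text):
    """Return (mount_point, fstype) of the longest mount point containing path."""
    best = None
    wanted = path.rstrip("/") + "/"
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        mount_point, fstype = fields[1], fields[2]
        prefix = mount_point.rstrip("/") + "/"
        # Compare with a trailing slash so /a/b does not match /a/bc.
        if wanted.startswith(prefix):
            if best is None or len(mount_point) > len(best[0]):
                best = (mount_point, fstype)
    return best

def is_on_nfs(path, mounts_text):
    """True if the filesystem holding path is of type nfs or nfs4."""
    found = mount_for(path, mounts_text)
    return found is not None and found[1].startswith("nfs")
```

For example, `is_on_nfs("/var/lib/one/datastores/101", open("/proc/mounts").read())` should be true on every node if the nfs_image datastore (ID 101 in this thread) is really backed by the share.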

I checked the NFS configuration and changed the mount point to /var/lib/one. Here is the datastore output.

root@pine1:~# onedatastore list
  ID NAME        SIZE   AVAIL CLUSTERS IMAGES TYPE DS  TM     STAT
   0 system      -      -     0        0      sys  -   ssh    on
   1 default     119.9G 97%   0        0      img  fs  shared on
   2 files       119.9G 97%   0        0      fil  fs  ssh    on
 101 nfs_image   119.9G 97%   0        2      img  fs  shared on

After that, I deleted all the VMs and configured them again. The VMs are now in a failed state, but the log tells a different story.

Here are the onevm output and the log files.
root@pine1:~# onevm list
ID USER     GROUP    NAME           STAT UCPU UMEM HOST  TIME
10 oneadmin oneadmin Ubuntu 16.04 - fail 0    0K   pine3 0d 11h06
11 oneadmin oneadmin lxdone         fail 0    0K   pine2 0d 07h39

root@pine1:/var/log/one# more 10.log
Sat Mar 24 01:18:59 2018 [Z0][VM][I]: New state is ACTIVE
Sat Mar 24 01:18:59 2018 [Z0][VM][I]: New LCM state is PROLOG
Sat Mar 24 01:19:34 2018 [Z0][VM][I]: New LCM state is BOOT
Sat Mar 24 01:19:35 2018 [Z0][VMM][I]: Generating deployment file: /var/lib/one/vms/10/deployment.0
Sat Mar 24 01:19:37 2018 [Z0][VMM][I]: Successfully execute transfer manager driver operation: tm_context.
Sat Mar 24 01:19:38 2018 [Z0][VMM][I]: ExitCode: 0
Sat Mar 24 01:19:38 2018 [Z0][VMM][I]: Successfully execute network driver operation: pre.
Sat Mar 24 01:19:50 2018 [Z0][VMM][I]: Command execution fail: cat << EOT | /var/tmp/one/vmm/lxd/deploy ‘/var/lib/one//datastores/100/10/deployment.0’ ‘pine3’ 10 pine3
Sat Mar 24 01:19:50 2018 [Z0][VMM][I]: deploy.py: ########################################
Sat Mar 24 01:19:50 2018 [Z0][VMM][E]: deploy.py: container: Error calling ‘lxd forkstart one-10 /var/lib/lxd/containers /var/log/lxd/one-10/lxc.conf’: err='Failed to run: /usr/bin/lxd forkstart one-10 /var/lib/lxd/containers /var/log/lxd/one-10/lxc.conf: ’
Sat Mar 24 01:19:50 2018 [Z0][VMM][I]: lxc 20180323221944.314 ERROR lxc_start - start.c:start:1443 - Exec format error - Failed to exec “/sbin/init”.
Sat Mar 24 01:19:50 2018 [Z0][VMM][I]: lxc 20180323221944.314 ERROR lxc_sync - sync.c:__sync_wait:57 - An error occurred in another process (expected sequence number 5)
Sat Mar 24 01:19:50 2018 [Z0][VMM][I]: lxc 20180323221944.314 ERROR lxc_start - start.c:__lxc_start:1358 - Failed to spawn container “one-10”.
Sat Mar 24 01:19:50 2018 [Z0][VMM][I]: lxc 20180323221944.908 ERROR lxc_conf - conf.c:run_buffer:416 - Script exited with status 1.
Sat Mar 24 01:19:51 2018 [Z0][VMM][I]: lxc 20180323221944.908 ERROR lxc_start - start.c:lxc_fini:546 - Failed to run lxc.hook.post-stop for container “one-10”.
Sat Mar 24 01:19:51 2018 [Z0][VMM][I]:
Sat Mar 24 01:19:51 2018 [Z0][VMM][I]: ExitCode: 1
Sat Mar 24 01:19:51 2018 [Z0][VMM][I]: Failed to execute virtualization driver operation: deploy.
Sat Mar 24 01:19:51 2018 [Z0][VMM][E]: Error deploying virtual machine
Sat Mar 24 01:19:51 2018 [Z0][VM][I]: New LCM state is BOOT_FAILURE

root@pine1:/var/log/one# more 11.log
Sat Mar 24 04:46:22 2018 [Z0][VM][I]: New state is ACTIVE
Sat Mar 24 04:46:22 2018 [Z0][VM][I]: New LCM state is PROLOG
Sat Mar 24 04:46:26 2018 [Z0][VM][I]: New LCM state is BOOT
Sat Mar 24 04:46:26 2018 [Z0][VMM][I]: Generating deployment file: /var/lib/one/vms/11/deployment.0
Sat Mar 24 04:46:29 2018 [Z0][VMM][I]: Successfully execute transfer manager driver operation: tm_context.
Sat Mar 24 04:46:29 2018 [Z0][VMM][I]: ExitCode: 0
Sat Mar 24 04:46:29 2018 [Z0][VMM][I]: Successfully execute network driver operation: pre.
Sat Mar 24 04:46:41 2018 [Z0][VMM][I]: Command execution fail: cat << EOT | /var/tmp/one/vmm/lxd/deploy ‘/var/lib/one//datastores/0/11/deployment.0’ ‘pine2’ 11 pine2
Sat Mar 24 04:46:41 2018 [Z0][VMM][I]: deploy.py: ########################################
Sat Mar 24 04:46:41 2018 [Z0][VMM][E]: deploy.py: container: Error calling ‘lxd forkstart one-11 /var/lib/lxd/containers /var/log/lxd/one-11/lxc.conf’: err='Failed to run: /usr/bin/lxd forkstart one-11 /var/lib/lxd/containers /var/log/lxd/one-11/lxc.conf: ’
Sat Mar 24 04:46:41 2018 [Z0][VMM][I]: lxc 20180324014635.121 ERROR lxc_start - start.c:start:1443 - Exec format error - Failed to exec “/sbin/init”.
Sat Mar 24 04:46:41 2018 [Z0][VMM][I]: lxc 20180324014635.123 ERROR lxc_sync - sync.c:__sync_wait:57 - An error occurred in another process (expected sequence number 5)
Sat Mar 24 04:46:41 2018 [Z0][VMM][I]: lxc 20180324014635.123 ERROR lxc_start - start.c:__lxc_start:1358 - Failed to spawn container “one-11”.
Sat Mar 24 04:46:41 2018 [Z0][VMM][I]: lxc 20180324014635.721 ERROR lxc_conf - conf.c:run_buffer:416 - Script exited with status 1.
Sat Mar 24 04:46:41 2018 [Z0][VMM][I]: lxc 20180324014635.721 ERROR lxc_start - start.c:lxc_fini:546 - Failed to run lxc.hook.post-stop for container “one-11”.
Sat Mar 24 04:46:41 2018 [Z0][VMM][I]:
Sat Mar 24 04:46:41 2018 [Z0][VMM][I]: ExitCode: 1
Sat Mar 24 04:46:41 2018 [Z0][VMM][I]: Failed to execute virtualization driver operation: deploy.
Sat Mar 24 04:46:41 2018 [Z0][VMM][E]: Error deploying virtual machine
Sat Mar 24 04:46:41 2018 [Z0][VM][I]: New LCM state is BOOT_FAILURE

Hi, the mount point should never be /var/lib/one. Here it is explained where you should mount it.

Check in this output:
root@pine1:~# onedatastore list
  ID NAME        SIZE   AVAIL CLUSTERS IMAGES TYPE DS  TM     STAT
   0 system      -      -     0        0      sys  -   ssh    on
   1 default     119.9G 97%   0        0      img  fs  shared on
   2 files       119.9G 97%   0        0      fil  fs  ssh    on
 101 nfs_image   119.9G 97%   0        2      img  fs  shared on
that every datastore has the same size and availability. When done correctly, the NFS datastore should have different values.
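The same check can be done mechanically on the pasted `onedatastore list` text. This is a rough Python 3 sketch with a hypothetical function name, `same_capacity_groups`; it assumes datastore names contain no spaces and that SIZE and AVAIL are the third and fourth columns, as in the output above. Identical values are only a hint that the datastores sit on the same filesystem, not proof.

```python
from collections import defaultdict

def same_capacity_groups(listing):
    """Group datastore names that report identical (SIZE, AVAIL) values."""
    groups = defaultdict(list)
    for line in listing.splitlines():
        cols = line.split()
        # Skip the prompt, the header, blank lines and rows without a size.
        if len(cols) < 4 or not cols[0].isdigit() or cols[2] == "-":
            continue
        groups[(cols[2], cols[3])].append(cols[1])
    # Only groups with more than one member are interesting.
    return {key: names for key, names in groups.items() if len(names) > 1}
```

If the NFS datastore shows up in the same group as `default` and `files`, it is probably just a directory on the local disk rather than a separate mount.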

The container init process is not starting, for some reason. Apparently this isn’t related to LXDoNe but to lxd itself. Try to run a native lxd container and test it.
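One thing that may be worth ruling out while testing: “Exec format error” is the message the kernel returns when it cannot execute a binary, and a common cause is a binary built for a different CPU architecture than the host. That is an assumption about this case, not a diagnosis. A minimal Python 3 sketch for reading the architecture out of an ELF header (the helper name `elf_machine` is made up, and the table only covers a few machine codes from the ELF specification):

```python
import struct

# Subset of e_machine values from the ELF specification.
ELF_MACHINES = {3: "x86", 40: "arm", 62: "x86_64", 183: "aarch64"}

def elf_machine(header):
    """Return the architecture named in an ELF header, or None if not ELF."""
    if len(header) < 20 or header[:4] != b"\x7fELF":
        return None
    # Byte 5 (EI_DATA): 1 = little endian, 2 = big endian.
    endian = "<" if header[5] == 1 else ">"
    # e_machine is a 16-bit field at offset 18.
    (machine,) = struct.unpack_from(endian + "H", header, 18)
    return ELF_MACHINES.get(machine, "unknown (%d)" % machine)
```

You would feed it the first 20 bytes of the container's init binary, e.g. `elf_machine(open(path, "rb").read(20))`, and compare the result with `platform.machine()` on the pine64 host. The location of the container's root filesystem under /var/lib/lxd/containers is an assumption here, and note that /sbin/init is often an absolute symlink, so opening it from the host may follow the link into the host's own filesystem; resolve the link relative to the container root first.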