VM creation failure on XEN host

Hello,

I am very new to OpenNebula. I have set up a private cloud on OpenNebula with the Xen hypervisor. The problem is that I am unable to instantiate a VM. Below is the error log. Please help.

Tue May 31 14:52:02 2016 [Z0][DiM][I]: New VM state is ACTIVE.
Tue May 31 14:52:03 2016 [Z0][LCM][I]: New VM state is PROLOG.
Tue May 31 14:52:30 2016 [Z0][LCM][I]: New VM state is BOOT
Tue May 31 14:52:30 2016 [Z0][VMM][I]: Generating deployment file: /var/lib/one/vms/10/deployment.0
Tue May 31 14:52:31 2016 [Z0][VMM][I]: ExitCode: 0
Tue May 31 14:52:31 2016 [Z0][VMM][I]: Successfully execute network driver operation: pre.
Tue May 31 14:52:34 2016 [Z0][VMM][I]: Command execution fail: cat << EOT | /var/tmp/one/vmm/xen4/deploy '/var/lib/one//datastores/0/10/deployment.0' '192.168.100.4' 10 192.168.100.4
Tue May 31 14:52:34 2016 [Z0][VMM][I]: libxl: error: libxl_dm.c:717:libxl__build_device_model_args_new: qemu-xen doesn't support read-only disk drivers
Tue May 31 14:52:34 2016 [Z0][VMM][I]: libxl: error: libxl_dm.c:1393:device_model_spawn_outcome: (null): spawn failed (rc=-3)
Tue May 31 14:52:34 2016 [Z0][VMM][I]: libxl: error: libxl_create.c:1189:domcreate_devmodel_started: device model did not start: -3
Tue May 31 14:52:34 2016 [Z0][VMM][I]: libxl: error: libxl_dm.c:1489:kill_device_model: unable to find device model pid in /local/domain/6/image/device-model-pid
Tue May 31 14:52:34 2016 [Z0][VMM][I]: libxl: error: libxl.c:1421:libxl__destroy_domid: libxl__destroy_device_model failed for 6
Tue May 31 14:52:34 2016 [Z0][VMM][E]: Unable
Tue May 31 14:52:34 2016 [Z0][VMM][I]: ExitCode: 3
Tue May 31 14:52:34 2016 [Z0][VMM][I]: Failed to execute virtualization driver operation: deploy.
Tue May 31 14:52:34 2016 [Z0][VMM][E]: Error deploying virtual machine: Unable
Tue May 31 14:52:34 2016 [Z0][DiM][I]: New VM state is FAILED

Regards,
Arshad Shaikh

Hi,
well, I was just about to file the same forum post.
We are obviously experiencing the same issue.

Wed Jun 1 17:16:35 2016 [Z0][VMM][D]: Message received: LOG I 47 Command execution fail: cat << EOT | /var/tmp/one/vmm/xen4/deploy '/var/lib/one//datastores/0/47/deployment.0' 'stortest2.linbit' 47 stortest2.linbit
Wed Jun 1 17:16:35 2016 [Z0][VMM][D]: Message received: LOG I 47 libxl: error: libxl_dm.c:717:libxl__build_device_model_args_new: qemu-xen doesn't support read-only disk drivers
Wed Jun 1 17:16:35 2016 [Z0][VMM][D]: Message received: LOG I 47 libxl: error: libxl_dm.c:1393:device_model_spawn_outcome: (null): spawn failed (rc=-3)
Wed Jun 1 17:16:35 2016 [Z0][VMM][D]: Message received: LOG I 47 libxl: error: libxl_create.c:1189:domcreate_devmodel_started: device model did not start: -3
Wed Jun 1 17:16:35 2016 [Z0][VMM][D]: Message received: LOG I 47 libxl: error: libxl_dm.c:1489:kill_device_model: unable to find device model pid in /local/domain/15/image/device-model-pid
Wed Jun 1 17:16:35 2016 [Z0][VMM][D]: Message received: LOG I 47 libxl: error: libxl.c:1421:libxl__destroy_domid: libxl__destroy_device_model failed for 15
Wed Jun 1 17:16:35 2016 [Z0][VMM][D]: Message received: LOG E 47 Unable
Wed Jun 1 17:16:35 2016 [Z0][VMM][D]: Message received: LOG I 47 ExitCode: 3
Wed Jun 1 17:16:35 2016 [Z0][VMM][D]: Message received: LOG I 47 Failed to execute virtualization driver operation: deploy.

I did some experiments with the disk driver already. I tried to explicitly set the driver for the boot disk to

tap2:tapdisk:aio:

This is the full configuration of my template. Am I missing something?

oneadmin@stortest1:~$ onetemplate show 0
TEMPLATE 0 INFORMATION
ID : 0
NAME : CentOS-6.5-nfs-XEN
USER : oneadmin
GROUP : oneadmin
REGISTER TIME : 01/21 15:12:28

PERMISSIONS
OWNER : um-
GROUP : ---
OTHER : ---

TEMPLATE CONTENTS
CONTEXT=[
SSH_PUBLIC_KEY="$USER[SSH_PUBLIC_KEY]" ]
CPU="1"
DISK=[
DRIVER="tap2:tapdisk:aio:",
IMAGE="CentOS-6.5-nfs_x86_64",
IMAGE_UNAME="oneadmin" ]
GRAPHICS=[
LISTEN="0.0.0.0",
TYPE="vnc" ]
HYPERVISOR="xen"
MEMORY="512"
NIC=[
NETWORK="private 192.168",
NETWORK_UNAME="oneadmin" ]
OS=[
ARCH="x86_64",
BOOT="hd" ]
VCPU="1"

I followed this guide to configure my hypervisor host:

http://docs.opennebula.org/4.14/administration/virtualization/xeng.html

Thanks for any hints!
all the best
Jojo

Can this be the problem:


While it is true that qemu doesn't support readonly for emulated IDE disks

It does support readonly SCSI disks

Source:
http://lists.xen.org/archives/html/xen-devel/2015-11/msg01154.html

Is OpenNebula using IDE by default?

How could I actually look at the exact disk configuration that is handed to Xen?

Can I force the disk to be SCSI?

I forgot to mention before that my Xen HV is Ubuntu 14.04.

Is this a known issue with this distribution?

thanks
Jojo

Hi, I'm not a Xen user, so I can't tell if this will help, but it's worth a try.

In /etc/one/oned.conf there is a default setting for the device prefix for disks. I think the default is set to "hd".
If I understand correctly, prefix hd = IDE, prefix sd = SCSI/SATA and prefix vd = virtio (for KVM).
You can also set the prefix in the attributes of the image itself, or you can change the value in oned.conf so it becomes the default for all your images.

so prefix: hd / sd / vd
and target: hda / sdb / vda
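
For the per-image route, a rough sketch (image ID 0 is just an example here; check yours with oneimage list) would be:

oneimage update 0

and then add or change the attribute in the editor that opens, e.g.:

DEV_PREFIX="sd"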

hope this helps!

EDIT: here is the relevant part in /etc/one/oned.conf:

#*******************************************************************************
# DataStore Configuration
#*******************************************************************************
#  DATASTORE_LOCATION: Path for Datastores. It IS the same for all the hosts
#  and front-end. It defaults to /var/lib/one/datastores (in self-contained mode
#  defaults to $ONE_LOCATION/var/datastores). Each datastore has its own
#  directory (called BASE_PATH) in the form: $DATASTORE_LOCATION/<datastore_id>
#  You can symlink this directory to any other path if needed. BASE_PATH is
#  generated from this attribute each time oned is started.
#
#  DATASTORE_CAPACITY_CHECK: Checks that there is enough capacity before
#  creating a new image. Defaults to Yes
#
#  DEFAULT_IMAGE_TYPE: This can take values
#       OS        Image file holding an operating system
#       CDROM     Image file holding a CDROM
#       DATABLOCK Image file holding a datablock, created as an empty block
#
#  DEFAULT_DEVICE_PREFIX: This can be set to
#       hd        IDE prefix
#       sd        SCSI
#       vd        KVM virtual disk
#
#  DEFAULT_CDROM_DEVICE_PREFIX: Same as above but for CDROM devices.
#*******************************************************************************

#DATASTORE_LOCATION  = /var/lib/one/datastores

DATASTORE_CAPACITY_CHECK = "yes"

DEFAULT_IMAGE_TYPE    = "OS"
DEFAULT_DEVICE_PREFIX = "hd"

DEFAULT_CDROM_DEVICE_PREFIX = "hd"

Hi, thanks for the hint! Appreciated!

But I already did a lot of testing with different device prefixes and different drivers yesterday.

In my oned.conf, there is one more possible option in the comments.

# DEFAULT_DEVICE_PREFIX: This can be set to
# hd IDE prefix
# sd SCSI
# xvd XEN Virtual Disk
# vd KVM virtual disk

I actually did not set the default in oned.conf, but I set it in the image itself. The outcome should be the same. I tried sd and I tried xvda; always the same error. I also tried the default driver for Xen: raw.

oneadmin@stortest1:~$ oneimage show 0
IMAGE 0 INFORMATION
ID : 0
NAME : CentOS-6.5-nfs_x86_64
USER : oneadmin
GROUP : oneadmin
DATASTORE : image-nfs
TYPE : OS
REGISTER TIME : 01/21 15:11:40
PERSISTENT : No
SOURCE : /var/lib/one//datastores/1/10777a4812f3dd3ab1d0983454dcad70
PATH : http://appliances.c12g.com/CentOS-6.5/centos6.5.qcow2.gz
SIZE : 267M
STATE : rdy
RUNNING_VMS : 0

PERMISSIONS
OWNER : um-
GROUP : ---
OTHER : ---

IMAGE TEMPLATE
DEV_PREFIX="xvda"
DRIVER="tap2:tapdisk:aio:"

VIRTUAL MACHINES

Hi Arshad,
which Xen version and operating system are you actually using?

I am starting to think that it's an issue with the Xen version we are using, and I have filed a new forum post here:

Thanks for clarifying.

all the best
Jojo

Hi Jojo,

Thanks for the reply. The Xen version and the version of Ubuntu are the same ones that you mentioned above.

Regards,
Arshad Shaikh

Hi Arshad,
Only recently did I find a solution, yes :slight_smile: I definitely wanted to post it here but did not have the time yet.

The problem is that the Xen sd driver definitely cannot handle read-only devices; if you don't use a read-only device, it should just work! When you have the "context" checkboxes enabled in your template, a read-only disk is automatically attached to your VM. This is what causes the qemu-xen read-only error!

Try removing all the "context" checkboxes in the OpenNebula GUI! Then the read-only ISO device should be left out.

Instantiate your template, and then check what your Xen VM definition actually looks like. It should be in your system datastore; usually this is the path (134 is the VM ID):


root@stortest1:~# cat /var/lib/one//datastores/0/134/deployment.0
name = 'one-134'
#O CPU_CREDITS = 256
memory = '1024'
disk = [
'raw:/var/lib/one//datastores/0/134/disk.0,xvda,w',
'raw:/var/lib/one//datastores/0/134/disk.1,sda,r',
]
vif = [
' mac=00:16:3e:00:00:01,bridge=xenbr0',
]
vfb = ['type=vnc,vnclisten=0.0.0.0,vncunused=0,vncdisplay=134']


If you also want to use "contextualization", OpenNebula of course has to attach the read-only ISO file somehow.
I got it to work when using the following settings:

In my template, in the OS section, I set "pygrub" as the bootloader:
BOOTLOADER=pygrub

For the DISK I set:
DEV_PREFIX="xvd"
DRIVER="raw"
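
Put together, the OS and DISK sections of the template then look roughly like this (a sketch based on the template I posted earlier in this thread, so the image and user names are just examples):

OS=[
ARCH="x86_64",
BOOT="hd",
BOOTLOADER="pygrub" ]
DISK=[
DEV_PREFIX="xvd",
DRIVER="raw",
IMAGE="CentOS-6.5-nfs_x86_64",
IMAGE_UNAME="oneadmin" ]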

Try these two possible ways to get it to work and post here how it goes.
Hope that helps!
all the best
Jojo

Thanks for replying Jojo.

By context "checkboxes" do you mean these?

Regards,
Arshad

Yes, that's correct.

Make sure to monitor oned.log while provisioning your template.
If it throws an error, it will show you the path to the Xen configuration of the VM ('/var/lib/one//datastores/0/<VM-ID>/deployment.0').

I would suggest you post the contents of this file here; it will probably help to figure out the problem.
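
For example (assuming a standard installation where the daemon log lives under /var/log/one; adjust the paths if yours differ):

tail -f /var/log/one/oned.log
cat /var/lib/one//datastores/0/<VM-ID>/deployment.0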

all the best

The VM is up and running. Thanks for the help. :slight_smile:

The contents of the file are:
name = 'one-13'
#O CPU_CREDITS = 256
memory = '512'
builder = 'hvm'
disk = [
'file:/var/lib/one//datastores/0/13/disk.0,sda,w',
]
vif = [
]

Regards,
Arshad

you’re welcome :slight_smile:

I have to correct my second solution a little. I just did another test.

It is booting with the context checkboxes ON in this configuration:


root@stortest1:~# cat /var/lib/one//datastores/0/145/deployment.0
name = 'one-145'
#O CPU_CREDITS = 256
memory = '1024'
bootloader = 'pygrub'
disk = [
'raw:/var/lib/one//datastores/0/145/disk.0,sda,w',
'raw:/var/lib/one//datastores/0/145/disk.1,sdb,r',
]
vif = [
' mac=02:00:40:55:83:0d,bridge=xenbr0',
]
vfb = ['type=vnc,vnclisten=0.0.0.0,vncunused=0,vncdisplay=145']

So the thing that makes it work is to use "pygrub" as the bootloader!

If you find the time, please try this out and confirm. Just put the text "pygrub" in the BOOTLOADER box in your template!

thank you
all the best
Johannes

I am getting the following error when I set the bootloader as pygrub.

Thu Jun 23 19:51:52 2016 [Z0][DiM][I]: New VM state is ACTIVE.
Thu Jun 23 19:51:52 2016 [Z0][LCM][I]: New VM state is PROLOG.
Thu Jun 23 19:51:54 2016 [Z0][LCM][I]: New VM state is BOOT
Thu Jun 23 19:51:54 2016 [Z0][VMM][I]: Generating deployment file: /var/lib/one/vms/18/deployment.0
Thu Jun 23 19:51:54 2016 [Z0][VMM][I]: ExitCode: 0
Thu Jun 23 19:51:54 2016 [Z0][VMM][I]: Successfully execute network driver operation: pre.
Thu Jun 23 19:51:55 2016 [Z0][VMM][I]: Command execution fail: cat << EOT | /var/tmp/one/vmm/xen4/deploy '/var/lib/one//datastores/0/18/deployment.0' '192.168.100.4' 18 192.168.100.4
Thu Jun 23 19:51:55 2016 [Z0][VMM][I]: libxl: error: libxl_bootloader.c:628:bootloader_finished: bootloader failed - consult logfile /var/log/xen/bootloader.11.log
Thu Jun 23 19:51:55 2016 [Z0][VMM][I]: libxl: error: libxl_exec.c:118:libxl_report_child_exitstatus: bootloader [-1] exited with error status 1
Thu Jun 23 19:51:55 2016 [Z0][VMM][I]: libxl: error: libxl_create.c:1022:domcreate_rebuild_done: cannot (re-)build domain: -3
Thu Jun 23 19:51:55 2016 [Z0][VMM][E]: Unable
Thu Jun 23 19:51:55 2016 [Z0][VMM][I]: ExitCode: 3
Thu Jun 23 19:51:55 2016 [Z0][VMM][I]: Failed to execute virtualization driver operation: deploy.
Thu Jun 23 19:51:55 2016 [Z0][VMM][E]: Error deploying virtual machine: Unable
Thu Jun 23 19:51:57 2016 [Z0][DiM][I]: New VM state is FAILED

Could you tell me what pygrub does?

Thanks and Regards,
Arshad

Hi Arshad,
I’ve seen this error but can’t reproduce it at the moment.

As far as I understand, pygrub is a bootloader that sits "outside" of your Xen VM (on your Xen hypervisor, not like normal GRUB, which sits in the MBR or at the start of a partition of your virtual disk).
pygrub is able to load the kernel that lies "inside" the VM. More detailed explanations here:

http://wiki.xen.org/wiki/PyGrub

I suppose that in the past a typical way of booting Xen VMs was to have the bootloader AND the kernels outside of the VM, lying around on your hypervisor, which is tedious to handle when upgrading kernels, etc.

all the best
Johannes

Hi Jojo,

So when I check the log file it says:

Traceback (most recent call last):
File "/usr/lib/xen-4.4/bin/pygrub", line 867, in
raise RuntimeError, "Unable to find partition containing kernel"
RuntimeError: Unable to find partition containing kernel

Any idea what this means?

Also, do I need to make changes to my image?

Regards,
Arshad

Not really, but obviously pygrub does not find a kernel on your virtual drive :wink:
Should there be one? What exactly is on your drive? (You can also run pygrub by hand against the disk image; see the example after the list below.)

  • Post the corresponding deployment.0 file (aka the Xen VM definition file)
  • Try a Xen VM image from the OpenNebula Marketplace and see if that works
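
To run pygrub by hand, a rough sketch (the pygrub path is the one from your traceback above; the disk path and VM ID are just examples, adjust to your setup):

/usr/lib/xen-4.4/bin/pygrub /var/lib/one//datastores/0/<VM-ID>/disk.0

If it cannot find a bootable kernel inside the image, it should fail with the same "Unable to find partition containing kernel" error you are seeing.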

all the best
johannes

Here is the VM definition file:

name = 'one-20'
#O CPU_CREDITS = 256
memory = '512'
bootloader = 'pygrub'
disk = [
'file:/var/lib/one//datastores/0/20/disk.0,hda,r',
]
vif = [
]

Try using sd or xvd as the DEV_PREFIX in your image definition!
Also set DRIVER=raw.

Set these attributes in the image definition in the Sunstone GUI or by using "oneimage update <ID>":
DEV_PREFIX="xvd"
DRIVER="raw"
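
For example (a sketch; image ID 0 is the image shown earlier in this thread):

oneimage update 0

and then add or change these lines in the editor that opens:

DEV_PREFIX="xvd"
DRIVER="raw"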

hda and Xen is evil :wink:

Is this a predefined Xen VM from the Marketplace?

I used a predefined image for Xen from the OpenNebula Marketplace, configured it with the settings you suggested, and it worked.

Thanks :slight_smile:

Regards,
Arshad

I am just reading this thread, and I must add that when you use pygrub you are launching your machine in paravirtual (PV) mode, while when you do not specify it you are running in fully virtualized HVM mode. This is hinted at here: http://docs.opennebula.org/4.14/administration/virtualization/xeng.html#usage

Choosing PV or HVM has implications for the performance of the VM and also for how it interacts with the hypervisor below, so choose wisely…

And yes, this seems to be a problem introduced by the fix for http://xenbits.xen.org/xsa/advisory-142.html. It makes it difficult to run HVM mode with context disks on OpenNebula (which is pretty much the default). A workaround is to manually edit the deployment file to mark the context disk as 'w'[ritable], then create the VM manually, and finally recover it from the FAILED state.
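
A rough sketch of that workaround, assuming the xl toolstack and using <VM-ID> as a placeholder (the onevm recover options can differ between versions, so check onevm recover --help):

# edit the deployment file and change the context-disk line from ',r' (read-only) to ',w' (writable),
# e.g. a line like 'raw:/var/lib/one//datastores/0/<VM-ID>/disk.1,sdb,r' becomes '...,sdb,w'
vi /var/lib/one//datastores/0/<VM-ID>/deployment.0

# start the domain manually on the Xen host
xl create /var/lib/one//datastores/0/<VM-ID>/deployment.0

# then, on the front-end, tell OpenNebula the deployment actually succeeded
onevm recover --success <VM-ID>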