VM crashes randomly

Hi all,

My VM crashes randomly, as shown in the attached picture.
There is no error in oned.log, and the datastore monitor always returns SUCCESSFUL.
Has anyone seen the same situation?
Thank you very much.

Versions of the related components and OS (frontend, hypervisors, VMs):
5.6

Steps to reproduce:
Randomly
Current results:
Still crashes
Expected results:

This doesn’t seem to be an OpenNebula issue, which is why you can’t see any error in the OpenNebula logs. It is either an error in KVM (not likely) or the disk you are using is corrupted (more likely). You could manually run a file system check on that partition and look for errors. Are you using btrfs, or is that log only referring to the kernel module being loaded?

Hi Sergio,
Thank you for your reply.
The VM was deployed and ran for a couple of days, then suddenly got stuck in that state and stayed there, no matter how many hard reboots I did.
I use rbd with a Ceph backend for the datastore.

Hi, sorry, but I think I didn’t completely understand your last reply. Was the VM running until it froze, and did that screen show up after you rebooted it? Try to manually mount the rbd that OpenNebula is using and run an fsck on it; this might help.
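
As a rough illustration of the suggestion above, the following Python sketch only builds and prints the commands for a read-only check; it does not run them. The pool name `one`, the image name `one-42`, and the device path `/dev/rbd0` are hypothetical placeholders, not values from this thread. It also assumes an ext-family filesystem (where `fsck -n` reports problems without changing anything) and that the VM is shut down first. If the image contains a partition table, the fsck target would be a partition on the mapped device rather than the device itself.

```python
def build_check_commands(pool, image, device="/dev/rbd0"):
    """Return the argv lists for a read-only check of an RBD image."""
    spec = f"{pool}/{image}"
    return [
        ["rbd", "map", spec],      # prints the path of the mapped block device
        ["fsck", "-n", device],    # -n: report problems, change nothing
        ["rbd", "unmap", device],  # release the mapping afterwards
    ]


# Hypothetical pool and image names, for illustration only.
for argv in build_check_commands("one", "one-42"):
    print(" ".join(argv))
```

The device path should be taken from the output of `rbd map` instead of being assumed, since it depends on what is already mapped on that host.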

Hi Sergio,
Every time I reboot the VMs, they get stuck at that stage (waiting for udev random), even for days.
I have just discovered that the issue only happens with the “Ubuntu 18.04 - KVM” image; other VMs running “CentOS 7” or “Debian 9” work fine.
This needs a deeper investigation.
Thank you.

Hi,
I have the same issue. IMHO the issue appears with Ubuntu 18.04 and 16.04 images after a kernel upgrade to the latest version via “apt-get dist-upgrade”. I also tried launching the “problematic” image on a PC under “pure” KVM, and it boots up normally. It seems the problem is in the KVM command-line parameters that OpenNebula uses.

Hi Roman,
I also suspect this issue is related to the kernel version of the KVM hosts.
I’m currently running CentOS 7 (kernel 3.10) on the KVM hosts. I will update the OpenNebula host to kernel 4.x.
Thank you.

Hello,

This is caused by the GRUB overrides in /etc/default/grub.d/50-cloudimg-settings.cfg on (all) Ubuntu images, which bring back, e.g., the serial console when the GRUB configuration is refreshed (as a result of a kernel or distribution update). I guess the problem here is that the kernel is configured for a serial console while the VM has no (virtual) serial console.

Please remove /etc/default/grub.d/50-cloudimg-settings.cfg as the very first thing on your Ubuntu VMs.

Best regards,
Vlastimil Holer
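
For anyone who wants to script this inside the guest, here is a minimal Python sketch of a slightly more cautious variant: it renames the override file instead of deleting it. This relies on the assumption that the GRUB configuration generator on Ubuntu only reads files ending in `.cfg` from that directory. The function name and the `root` parameter are made up for illustration; `root` exists only so the function can be tried against a scratch directory before touching a real system.

```python
from pathlib import Path

# Path from the post above, relative to the filesystem root.
OVERRIDE = Path("etc/default/grub.d/50-cloudimg-settings.cfg")


def disable_cloudimg_override(root="/"):
    """Rename the cloud-image GRUB override so it no longer ends in .cfg.

    Returns True if the file was found and renamed, False if it was absent.
    """
    src = Path(root) / OVERRIDE
    if not src.exists():
        return False
    # Keep a copy next to the original in case it has to be restored.
    src.rename(src.with_name(src.name + ".disabled"))
    return True
```

When run as root with the default `root="/"` and the function returns True, the GRUB configuration still has to be regenerated (on Ubuntu, with `update-grub`) for the change to take effect on the next boot.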

Thank you for the information, Holer.