Can't run VM after OpenNebula upgrade. Need URGENT help!

Hello,

Yesterday I ran “yum update” on my CentOS 6 machine that runs OpenNebula 4.8. I noticed that the update included (among other things) a new kernel and a new libvirt. It is an all-in-one setup built by following the Quick Start Guide. Before the update I shut down all my VMs (they are all persistent) and deleted them from the “Virtual Machines” section. (I do this every time I shut down the host and have never had any problems.) After the CentOS update I restarted the machine, but now I cannot instantiate any VM. I get nasty timeouts, the VM stays in “PENDING”, and there are no logs!

I thought I needed to upgrade OpenNebula as well, so I upgraded to the latest version, 4.12. The upgrade went fine, except when I had to execute “onehost sync” (as advised by the documentation).

The onehost sync command hangs and then says: execution expired.
Also, in the Sunstone GUI, when I go to “Hosts”, my host is in status “INIT”. When I click on it to open it, the progress bar spins for a while and then I get an “execution expired” pop-up in the lower right corner.

PLEASE, tell me what to do or what additional info to provide, because the VMs are critical to our company infrastructure and I need to bring them up as soon as possible!

My setup:

  • CentOS release 6.6 + latest updates
  • OpenNebula 4.8 installed from packages and subsequently upgraded to 4.10 and then 4.12.

THANKS in advance!!

I would check whether the oneadmin user is able to log into the CNs via ssh without a password.

I don’t know what CNs are, but oneadmin CAN ssh to localhost passwordlessly, both by typing “localhost” and by typing the FQDN.

You should follow the upgrade documentation:

Most probably you have not run the onedb command, and your configuration files are not set up for your infrastructure.

That’s exactly what I followed. onedb upgrade went just fine.
Please note that the initial problem appeared BEFORE I upgraded OpenNebula. I decided to upgrade it BECAUSE of the problem, as a last resort, hoping that the upgrade would solve it.

UPDATE: I stopped opennebula services, deleted

one.db
config
.one/
remotes/
vms/

and reinstalled opennebula with
yum reinstall opennebula-server opennebula-sunstone opennebula-node-kvm

and started over.

I log in to the web GUI and go to Hosts in order to add a new host. I click on the + button and select
Type: KVM
Hostname: localhost
Cluster: default
Networking: dummy

The host ended up in ERROR state with a message something like “cannot connect to localhost”.

UPDATE: It appears that OpenNebula doesn’t play nice with the latest KVM from CentOS 6.6.
I regularly get errors like this on the console:

bash: fork: retry: Resource temporarily unavailable

and when I check the processes I see endless lines like this:

oneadmin  47407  47404  0 14:25 ?        00:00:00 /bin/bash ./run_probes kvm /var/lib/one//datastores 4124 20 3 localhost
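
A hedged aside on the error above: “fork: retry: Resource temporarily unavailable” usually means a process limit has been hit, and a pile of run_probes processes under oneadmin would fit that. Whether this is the cause here is an assumption; the thread never confirms it. A minimal sketch for counting the stuck probes (count_run_probes is a made-up helper name, not part of OpenNebula):

```shell
#!/bin/sh
# Count run_probes processes in "ps -ef"-style text read from stdin.
# count_run_probes is a hypothetical helper, not part of OpenNebula.
count_run_probes() {
    # The [r] bracket keeps a grep process from matching its own command line.
    # grep -c prints 0 and exits non-zero on no match, hence "|| true".
    grep -c '[r]un_probes' || true
}

# Usage on the affected host (not executed here):
#   ps -ef | count_run_probes   # number of monitoring probes currently alive
#   ulimit -u                   # max user processes for the current shell
```

If the count is close to the ulimit -u value for oneadmin, the limit is worth a look. On CentOS 6 the stock soft limit for regular users is commonly 1024, set in /etc/security/limits.d/90-nproc.conf.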

I also got the following error message in /var/log/one/sunstone.error

NoMethodError: undefined method `[]' for nil:NilClass
    /usr/lib/one/sunstone/views/index.erb:126:in `__tilt_6e45bebdc2155f4b08c75371e35ef6e8'
    /usr/lib/ruby/gems/1.8/gems/sinatra-1.0/lib/sinatra/tilt.rb:195:in `send'
    /usr/lib/ruby/gems/1.8/gems/sinatra-1.0/lib/sinatra/tilt.rb:195:in `evaluate'
    /usr/lib/ruby/gems/1.8/gems/sinatra-1.0/lib/sinatra/tilt.rb:131:in `render'
    /usr/lib/ruby/gems/1.8/gems/sinatra-1.0/lib/sinatra/base.rb:343:in `render'
    /usr/lib/ruby/gems/1.8/gems/sinatra-1.0/lib/sinatra/base.rb:302:in `erb'
    /usr/lib/one/sunstone/sunstone-server.rb:360:in `GET /'
    /usr/lib/ruby/gems/1.8/gems/sinatra-1.0/lib/sinatra/base.rb:863:in `call'
    /usr/lib/ruby/gems/1.8/gems/sinatra-1.0/lib/sinatra/base.rb:863:in `route'
    /usr/lib/ruby/gems/1.8/gems/sinatra-1.0/lib/sinatra/base.rb:521:in `instance_eval'
    /usr/lib/ruby/gems/1.8/gems/sinatra-1.0/lib/sinatra/base.rb:521:in `route_eval'
    /usr/lib/ruby/gems/1.8/gems/sinatra-1.0/lib/sinatra/base.rb:500:in `route!'
    /usr/lib/ruby/gems/1.8/gems/sinatra-1.0/lib/sinatra/base.rb:497:in `catch'
    /usr/lib/ruby/gems/1.8/gems/sinatra-1.0/lib/sinatra/base.rb:497:in `route!'
    /usr/lib/ruby/gems/1.8/gems/sinatra-1.0/lib/sinatra/base.rb:476:in `each'
    /usr/lib/ruby/gems/1.8/gems/sinatra-1.0/lib/sinatra/base.rb:476:in `route!'
    /usr/lib/ruby/gems/1.8/gems/sinatra-1.0/lib/sinatra/base.rb:601:in `dispatch!'
    /usr/lib/ruby/gems/1.8/gems/sinatra-1.0/lib/sinatra/base.rb:411:in `call!'
    /usr/lib/ruby/gems/1.8/gems/sinatra-1.0/lib/sinatra/base.rb:566:in `instance_eval'
    /usr/lib/ruby/gems/1.8/gems/sinatra-1.0/lib/sinatra/base.rb:566:in `invoke'
    /usr/lib/ruby/gems/1.8/gems/sinatra-1.0/lib/sinatra/base.rb:566:in `catch'
    /usr/lib/ruby/gems/1.8/gems/sinatra-1.0/lib/sinatra/base.rb:566:in `invoke'
    /usr/lib/ruby/gems/1.8/gems/sinatra-1.0/lib/sinatra/base.rb:411:in `call!'
    /usr/lib/ruby/gems/1.8/gems/sinatra-1.0/lib/sinatra/base.rb:399:in `call'
    /usr/lib/ruby/gems/1.8/gems/rack-1.1.0/lib/rack/commonlogger.rb:18:in `call'
    /usr/lib/ruby/gems/1.8/gems/rack-1.1.0/lib/rack/deflater.rb:13:in `call'
    /usr/lib/ruby/gems/1.8/gems/rack-1.1.0/lib/rack/session/abstract/id.rb:63:in `context'
    /usr/lib/ruby/gems/1.8/gems/rack-1.1.0/lib/rack/session/abstract/id.rb:58:in `call'
    /usr/lib/ruby/gems/1.8/gems/rack-1.1.0/lib/rack/showexceptions.rb:24:in `call'
    /usr/lib/ruby/gems/1.8/gems/rack-1.1.0/lib/rack/methodoverride.rb:24:in `call'
    /usr/lib/ruby/gems/1.8/gems/sinatra-1.0/lib/sinatra/base.rb:979:in `call'
    /usr/lib/ruby/gems/1.8/gems/sinatra-1.0/lib/sinatra/base.rb:1005:in `synchronize'
    /usr/lib/ruby/gems/1.8/gems/sinatra-1.0/lib/sinatra/base.rb:979:in `call'
    /usr/lib/ruby/gems/1.8/gems/rack-1.1.0/lib/rack/content_length.rb:13:in `call'
    /usr/lib/ruby/gems/1.8/gems/rack-1.1.0/lib/rack/chunked.rb:15:in `call'
    /usr/lib/ruby/gems/1.8/gems/thin-1.2.8/lib/thin/connection.rb:84:in `pre_process'
    /usr/lib/ruby/gems/1.8/gems/thin-1.2.8/lib/thin/connection.rb:82:in `catch'
    /usr/lib/ruby/gems/1.8/gems/thin-1.2.8/lib/thin/connection.rb:82:in `pre_process'
    /usr/lib/ruby/gems/1.8/gems/thin-1.2.8/lib/thin/connection.rb:57:in `process'
    /usr/lib/ruby/gems/1.8/gems/thin-1.2.8/lib/thin/connection.rb:42:in `receive_data'
    /usr/lib/ruby/gems/1.8/gems/eventmachine-0.12.10/lib/eventmachine.rb:256:in `run_machine'
    /usr/lib/ruby/gems/1.8/gems/eventmachine-0.12.10/lib/eventmachine.rb:256:in `run'
    /usr/lib/ruby/gems/1.8/gems/thin-1.2.8/lib/thin/backends/base.rb:61:in `start'
    /usr/lib/ruby/gems/1.8/gems/thin-1.2.8/lib/thin/server.rb:159:in `start'
    /usr/lib/ruby/gems/1.8/gems/rack-1.1.0/lib/rack/handler/thin.rb:14:in `run'
    /usr/lib/ruby/gems/1.8/gems/sinatra-1.0/lib/sinatra/base.rb:946:in `run!'
    /usr/lib/one/sunstone/sunstone-server.rb:627

I would appreciate any hints!
Thanks.

I don’t think deleting the database is a good idea. All the information about images, VMs, networks and so on is stored there. Recovering everything without the database can be time-consuming.

We have not found any problem with CentOS 6.6. That is the platform we use to test the packages.

  • Check /var/log/one/oned.log and check for errors
  • Make sure you can ssh using oneadmin to localhost. I think you’ve already done this but just double check
  • Make sure libvirt is running and oneadmin can execute commands:
$ virsh -c qemu:///system list
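
The checks in this list can be strung together in a small script. This is only a sketch: check and oned_errors are hypothetical helper names, and the “][E]” filter assumes that oned.log tags error lines with an [E] severity field, which should be verified against the actual log.

```shell
#!/bin/sh
# Run a command quietly and report whether it succeeded.
# "check" is a hypothetical helper for illustration.
check() {
    label=$1
    shift
    if "$@" >/dev/null 2>&1; then
        echo "OK   $label"
    else
        echo "FAIL $label"
    fi
}

# Print only error-level lines from oned.log-style text on stdin.
# Assumes error lines carry a "][E]" severity tag.
oned_errors() {
    grep -F '][E]' || true
}

# Usage as oneadmin on the front-end (not executed here):
#   check "ssh to localhost"  ssh -o BatchMode=yes -o ConnectTimeout=5 localhost true
#   check "libvirt reachable" virsh -c qemu:///system list
#   oned_errors < /var/log/one/oned.log | tail -n 20
```

BatchMode=yes makes ssh fail instead of prompting, so a missing key shows up as FAIL rather than as a hanging password prompt.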

Concerning the error in Sunstone: it seems that the Sunstone configuration files are wrong. You should use the new config files and reapply any changes you made in the previous version. That error is this line:

It seems that the view you are using no longer exists. Try clearing your browser cache and cookies, then log in again.

I don’t think it’s a good idea deleting the database.

I made a backup of the database. My goal was to see whether it would work if I started from a clean slate. It doesn’t. Something is very broken with my system.

Check /var/log/one/oned.log and check for errors

I’ve been monitoring that log the whole time. Nothing valuable there, even in debug mode.

Make sure you can ssh using oneadmin to localhost. I think you’ve already done this but just double check

check.

Make sure libvirt is running and oneadmin can execute commands:
$ virsh -c qemu:///system list

It returns an empty list.

I have now installed a clean CentOS 6.6 on another machine with OpenNebula 4.10 on it (because 4.12 doesn’t have a Quick Start Guide for CentOS 6). It works. It looks like my main system was messed up very badly by the last CentOS update, although it was minor and I had done such minor updates many times.

Anyway, I think I’m going to have to re-install the OS and everything. Please point me to some steps for restoring my VMs. All I have of them is their disk images, which are persistent, so everything is there (or IS IT?)

Other than that, I have detailed notes about the network setup and I can re-create the OpenNebula configuration pretty quickly. My main concern is how to import a pre-existing image when creating images in the new installation. The GUI doesn’t seem to have such an option. For example, my current datastores folder looks like this:

[oneadmin@saturn ~]$ ls -lath datastores/1/
total 238G
-rw-r--r-- 1 oneadmin oneadmin 830M  6 апр 20,38 8fd3c7c3f660d6b5bd0e7a539cba1c71
-rw-r--r-- 1 oneadmin oneadmin 8,2G  6 апр 19,37 58842014eed4089fcf2a44bada846b8c
-rw-r--r-- 1 oneadmin oneadmin 9,8G  6 апр 19,35 c03b8ce194e975bdcc4c01220e6f680d
-rw-r--r-- 1 oneadmin oneadmin 103G  6 апр 19,35 02c50b301cb36cc082d6941408953354
-rw-r--r-- 1 oneadmin oneadmin 2,0G  6 апр 19,31 6a9db6566c6a545bf617cd29df52329f
-rw-r--r-- 1 oneadmin oneadmin 9,6G  6 апр 19,30 9910c1789c30a1a272fa6133848fc7cd
-rw-r--r-- 1 oneadmin oneadmin  98G  6 апр 19,30 6dae21596c8e6b0ba4c6aec1dd5e193d
-rw-r--r-- 1 oneadmin oneadmin  42G  6 апр 19,23 a8ca53e71c599eff61cb0ffbdeea44f2
-rw-r--r-- 1 oneadmin oneadmin 830M 31 мар 12,38 bdfe99869ac7916512d85b6c5921bc34
-rw-r--r-- 1 oneadmin oneadmin  34G 11 мар 13,58 e1bd39cbdad22b61f330bb3d684fdcb8
drwxr-x--- 2 oneadmin oneadmin 4,0K 10 мар  1,43 .
drwxr-x--- 7 oneadmin oneadmin 4,0K 10 мар  1,43 ..
-rw-r--r-- 1 oneadmin oneadmin 5,0G 29 окт 12,49 1b7ea8c333d77e388c856d00b94dc231
-rw-r--r-- 1 oneadmin oneadmin 716M 27 окт 18,07 750b0fd912a4a0664a1c59e1cce0ca02
-rw-r--r-- 1 oneadmin oneadmin 9,8G 27 окт 15,15 e8a8fb380f9494c4ca2dd31a0101a580
-rw-r--r-- 1 oneadmin oneadmin 1,9G 24 окт 14,24 2394552db652a714466c331f4a4c6263
-rw-r--r-- 1 oneadmin oneadmin 2,6G 24 окт 12,45 ea25699b922a47c7bbe68ccaf2386a48
-rw-r--r-- 1 oneadmin oneadmin 566M 19 сеп  2014 35a09ce63724f7a58fb4cc8ddfca5a5a
-rw-r--r-- 1 oneadmin oneadmin 398M 19 сеп  2014 efa94df4fc1e6bd8b2989ba3c2030bf7

Can I re-use those disk images in any way? They are all I have of my precious VMs.

Thanks!
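
For anyone landing here with the same question: images can be registered from the command line with oneimage create and its --path option. A minimal dry-run sketch, assuming the files were first copied out of the datastore to a backup directory that oneadmin can read and that the datastore is allowed to import from. print_import_commands and the restored-<filename> image names are hypothetical, and the function only prints the commands so they can be reviewed before anything is run:

```shell
#!/bin/sh
# Print (do not run) one "oneimage create" command per regular file in a directory.
# print_import_commands is a hypothetical helper; image names are placeholders.
print_import_commands() {
    dir=$1
    datastore=${2:-default}
    for f in "$dir"/*; do
        # Skip subdirectories and the unexpanded glob of an empty directory.
        [ -f "$f" ] || continue
        printf "oneimage create --datastore %s --name restored-%s --path '%s' --persistent\n" \
            "$datastore" "$(basename "$f")" "$f"
    done
}

# Usage (hypothetical backup path):
#   print_import_commands /backup/datastores/1 default
```

Running qemu-img info on each file beforehand shows its format and virtual size, which helps match the hash-named files to the VMs they belonged to.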

Let me answer here for anyone who might have the same problem. I couldn’t figure out what went wrong with my CentOS 6 installation. I quickly installed a fresh one, updated it to the same point, and then installed the same version of OpenNebula on it, and it ran fine. Apparently it is not about package versions. Something else broke, but I guess I will never know.

Anyway, I went on and re-installed my host, this time with CentOS 7, and installed the latest OpenNebula on it. I kept my datastores folder in a backup location. I also kept all the contents of /var/lib/one, the home folder of user oneadmin. The one.db file had been upgraded properly before the failure, so I copied everything onto the new installation and that was it. OpenNebula starts and I can see all my settings there. VMs boot.

That’s it.
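
The backup step described above can be sketched like this. It is an illustration under assumptions, not the exact commands used: backup_dir is a hypothetical helper, the target path is made up, and OpenNebula should be stopped before copying so that one.db is not written to mid-copy.

```shell
#!/bin/sh
# Archive a directory into a gzip-compressed tarball.
# backup_dir is a hypothetical helper for illustration.
backup_dir() {
    src=$1   # directory to save, e.g. /var/lib/one
    out=$2   # tarball to create
    # -C switches to the parent so the archive holds a relative path.
    tar -C "$(dirname "$src")" -czf "$out" "$(basename "$src")"
}

# Usage (hypothetical target path, run as root with OpenNebula stopped):
#   backup_dir /var/lib/one /backup/var-lib-one.tar.gz
```

When restoring, extracting as root keeps the oneadmin ownership recorded in the archive, provided the oneadmin user already exists on the new host.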