VM Stuck in "SNAPSHOT" state

Hi,

I'm using OpenNebula 5.0 and have a VM stuck in the SNAPSHOT state. Here’s some brief information about my system:

[oneadmin@fe1 ~]$ onedatastore list
ID NAME SIZE AVAIL CLUSTERS IMAGES TYPE DS TM STAT
102 nfs_images 896G 98% 0 13 img fs qcow2 on
103 nfs_system 896G 98% 0 0 sys - shared on

[oneadmin@fe1 ~]$ onevm show 28
VIRTUAL MACHINE 28 INFORMATION
ID : 28
NAME : vm-centos6
USER : oneadmin
GROUP : oneadmin
STATE : ACTIVE
LCM_STATE : HOTPLUG_SNAPSHOT
RESCHED : No
HOST : comp1
CLUSTER ID : 0
CLUSTER : default
START TIME : 06/09 18:31:43
END TIME : -
DEPLOY ID : one-28

VIRTUAL MACHINE MONITORING
CPU : 0.0
MEMORY : 1024M

PERMISSIONS
OWNER : um-
GROUP : ---
OTHER : ---

VM DISKS
ID DATASTORE TARGET IMAGE SIZE TYPE SAVE
0 nfs_images hda vm-centos6-disk-0 1.4G/30G file YES
1 - hdb CONTEXT 1M/- - -

SNAPSHOTS
ID TIME NAME HYPERVISOR_ID
0 06/10 10:19 vm28-centos6

VIRTUAL MACHINE HISTORY
SEQ HOST ACTION DS START TIME PROLOG
0 comp2 live-migrate 103 06/09 18:35:56 0d 15h42m 0h00m01s
1 comp1 none 103 06/10 10:18:08 0d 00h32m 0h00m00s

USER TEMPLATE
HYPERVISOR="kvm"
LOGO="images/logos/linux.png"
SCHED_REQUIREMENTS="ID=\"3\" | ID=\"4\""

VIRTUAL MACHINE TEMPLATE
AUTOMATIC_DS_REQUIREMENTS="\"CLUSTERS/ID\" = 0"
AUTOMATIC_REQUIREMENTS="(CLUSTER_ID = 0) & !(PUBLIC_CLOUD = YES)"
CLONING_TEMPLATE_ID="24"
CONTEXT=[
DISK_ID="1",
NETWORK="YES",
SSH_PUBLIC_KEY="",
TARGET="hdb" ]
CPU="0.3"
GRAPHICS=[
LISTEN="0.0.0.0",
PORT="5928",
TYPE="VNC" ]
MEMORY="1024"
OS=[
BOOT="disk1" ]
TEMPLATE_ID="27"
VMID="28"

[oneadmin@fe1 ~]$ ll /var/lib/one/datastores/102/
total 8611202
-rw-r--r-- 1 oneadmin oneadmin 1485438976 Jun 9 12:04 0fcfa785f5f35267b6d8c88ecaeaa28a
drwxrwxr-x 2 oneadmin oneadmin 0 Jun 8 12:41 0fcfa785f5f35267b6d8c88ecaeaa28a.snap
-rw-r--r-- 1 oneadmin oneadmin 414187520 Jun 8 12:37 12ea73ddc561a53a36c40d3ca23ef68d
-rw-r--r-- 1 oneadmin oneadmin 197120 Jun 8 12:32 14958fb78c0a126e0f3e5ff97bab1790
-rw-r--r-- 1 oneadmin oneadmin 1102249984 Jun 9 14:25 1e1027f65ad292fadb84a11a8771928c
drwxrwxr-x 2 oneadmin oneadmin 0 Jun 9 12:33 1e1027f65ad292fadb84a11a8771928c.snap
-rw-r--r-- 1 oneadmin oneadmin 197120 Jun 8 12:38 3fe6ef100d06204cd68ac4332c868967
-rw-r--r-- 1 oneadmin oneadmin 1123352576 Jun 10 10:03 59fdd7f0ec980dcfec1f7f6effe4ff64
drwxrwxr-x 2 oneadmin oneadmin 0 Jun 9 18:36 59fdd7f0ec980dcfec1f7f6effe4ff64.snap
-rw-r--r-- 1 oneadmin oneadmin 414187520 Jun 8 12:40 5a4bb1fa3d6a77898bf00bafdbf408bb
-rw-r--r-- 1 oneadmin oneadmin 41943040 Jun 8 12:34 7b896584362209e209ef3aa6ea66410b
-rw-r--r-- 1 oneadmin oneadmin 632291328 Jun 8 18:21 95fcd4f61b0670a2b189ea0172f8f7df
-rw-r--r-- 1 oneadmin oneadmin 632291328 Jun 9 12:33 969e7e7e74035c1ddb887aa80ef6f20e
-rw-r--r-- 1 oneadmin oneadmin 197120 Jun 8 12:31 c1d2501d933512dd139cdccaf0fe80d9
-rw-r--r-- 1 oneadmin oneadmin 1485635584 Jun 9 14:19 e1c9f879d08a19f1e86fb9a5649fd885
drwxrwxr-x 2 oneadmin oneadmin 0 Jun 9 13:35 e1c9f879d08a19f1e86fb9a5649fd885.snap
-rw-r--r-- 1 oneadmin oneadmin 1485701120 Jun 10 10:54 ff607122771c2a785731a7889c19f68d
drwxrwxr-x 2 oneadmin oneadmin 0 Jun 9 18:35 ff607122771c2a785731a7889c19f68d.snap

================================
[oneadmin@fe1 ~]$ ll /var/lib/one/datastores/102/ff607122771c2a785731a7889c19f68d.snap/
total 1
lrwxrwxrwx 1 oneadmin oneadmin 60 Jun 9 18:35 0 -> /var/lib/one/datastores/102/ff607122771c2a785731a7889c19f68d
lrwxrwxrwx 1 oneadmin oneadmin 65 Jun 9 18:35 ff607122771c2a785731a7889c19f68d.snap -> /var/lib/one/datastores/102/ff607122771c2a785731a7889c19f68d.snap
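
One of the links in the listing above points back at the `.snap` directory it lives in. Whether that has anything to do with the stuck state is not established here, but a quick way to inspect the links in a `.snap` directory might look like the sketch below. The function name `check_snap_links` is made up for illustration and is not an OpenNebula tool; it assumes a `readlink` that supports `-f` (GNU coreutils).

```shell
#!/bin/sh
# Hypothetical helper (not part of OpenNebula): print every symlink in a
# .snap directory with its resolved target, and flag links that resolve
# back to the .snap directory itself or to a path that does not exist.
check_snap_links() {
    snap_dir=$1
    self=$(readlink -f "$snap_dir") || return 1
    for link in "$snap_dir"/*; do
        # Skip anything that is not a symlink (also covers an empty directory).
        [ -L "$link" ] || continue
        target=$(readlink -f "$link")
        if [ "$target" = "$self" ]; then
            echo "SELF   $(basename "$link") -> $target"
        elif [ -e "$target" ]; then
            echo "OK     $(basename "$link") -> $target"
        else
            echo "BROKEN $(basename "$link") -> $target"
        fi
    done
}

# Example (path taken from the listing above):
# check_snap_links /var/lib/one/datastores/102/ff607122771c2a785731a7889c19f68d.snap
```

Running it on each `.snap` directory in the datastore gives one line per link, which is easier to compare across images than the raw `ll` output.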

mount (on all nodes and fe): mfs#mfsmaster:9421 on /var/lib/one/datastores type fuse (rw,nosuid,nodev,relatime,user_id=0,group_id=0,default_permissions,allow_other)
fusectl on /sys/fs/fuse/connections type fusectl (rw,relatime)

================================

oned.log: Fri Jun 10 10:51:40 2016 [Z0][VMM][D]: VM 28 successfully monitored: DISK_SIZE=[ID=0,SIZE=1417] SNAPSHOT_SIZE=[ID=0,DISK_ID=0,SIZE=1417] DISK_SIZE=[ID=1,SIZE=1]

================================

28.log
Fri Jun 10 10:18:22 2016 [Z0][VM][I]: New LCM state is RUNNING
Fri Jun 10 10:19:24 2016 [Z0][VM][I]: New LCM state is HOTPLUG_SNAPSHOT
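
To put a number on how long the VM has been sitting in its last state, the log lines above can be parsed directly. This is only a sketch: the function names are invented for illustration, it assumes GNU `date`, and it assumes the per-VM log keeps the `New LCM state is ...` format shown above (typically under `/var/log/one/<vmid>.log`, which is an assumption about this installation).

```shell
#!/bin/sh
# Hypothetical helpers (not OpenNebula tools). They assume GNU date and the
# "<timestamp> [Z0][VM][I]: New LCM state is <STATE>" line format shown above.

# Print the most recent LCM state recorded in a VM log file.
last_lcm_state() {
    grep 'New LCM state is' "$1" | tail -n 1 | sed 's/.*New LCM state is //'
}

# Print the seconds elapsed since the last state change.
# $2 is optional: a reference time to use instead of the current time.
seconds_in_state() {
    line=$(grep 'New LCM state is' "$1" | tail -n 1)
    [ -n "$line" ] || return 1
    stamp=${line%% \[*}          # everything before the first " [" is the timestamp
    start=$(date -d "$stamp" +%s) || return 1
    now=$(date -d "${2:-now}" +%s) || return 1
    echo $((now - start))
}

# Example (log path is an assumption):
# last_lcm_state /var/log/one/28.log
# seconds_in_state /var/log/one/28.log
```

Passing the oned.log timestamp from above (`Fri Jun 10 10:51:40 2016`) as the second argument measures the gap between the state change and that monitoring message, a little over 32 minutes.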

================================

Any help would be appreciated.

Orhan

Which version are you using? We’ve fixed some snapshot problems in the RC (4.90.10). Is this the first snapshot you’re creating on that disk, or did one exist previously?

Can you retry the snapshot?

$ onevm recover 28 --failure
$ onevm disk-snapshot-create ...
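
If you want to script that retry, a minimal sketch could look like the following. Only `onevm show`, `onevm recover --failure` and `onevm disk-snapshot-create` are real CLI calls; the function names, the `POLL_INTERVAL` variable and the timeouts are made up for illustration, and the parsing assumes the `LCM_STATE : ...` line format that `onevm show` prints.

```shell
#!/bin/sh
# Hypothetical wrapper around the two commands above (not an OpenNebula tool).

# Extract LCM_STATE from `onevm show <vmid>` output.
lcm_state() {
    onevm show "$1" |
        awk -F' *: *' '$1 == "LCM_STATE" { sub(/ +$/, "", $2); print $2; exit }'
}

# Wait until the VM reports RUNNING; give up after $2 seconds (default 300).
wait_running() {
    vmid=$1; timeout=${2:-300}; waited=0
    while [ "$(lcm_state "$vmid")" != "RUNNING" ]; do
        [ "$waited" -ge "$timeout" ] && return 1
        sleep "${POLL_INTERVAL:-5}"
        waited=$((waited + ${POLL_INTERVAL:-5}))
    done
}

# If the VM is stuck in a *SNAPSHOT* state, mark the operation as failed,
# wait for RUNNING, then retry the disk snapshot and wait again.
retry_disk_snapshot() {
    vmid=$1; disk=$2; name=$3
    case "$(lcm_state "$vmid")" in
        *SNAPSHOT*)
            onevm recover "$vmid" --failure || return 1
            wait_running "$vmid" 60 || return 1
            ;;
    esac
    onevm disk-snapshot-create "$vmid" "$disk" "$name" || return 1
    wait_running "$vmid"
}

# Example: retry_disk_snapshot 28 0 test
```

A non-zero exit status from `retry_disk_snapshot` means the VM did not return to RUNNING within the timeout, which is the point at which the VM log and oned.log are worth checking again.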

My version is OpenNebula 4.90.5, and this is the first time I have tried a snapshot with this installation.

onevm recover 28 --failure
works and puts the VM back into RUNNING.

onevm disk-snapshot-create 28 0 test
No change; the VM ends up stuck in SNAPSHOT.

By the way, how do I upgrade to 4.90.10? I couldn't find it in the docs.

Thanks