I have an OpenNebula 4.6.2 setup with VMware.
When I suspend a virtual machine and then try to boot it from the SUSPENDED state, it always fails and the VM doesn't change state.
In VMware the VM appears powered off, and when I try to boot it from there it boots fine, but it isn't synchronized with Sunstone. In Sunstone it still appears suspended.
The problem is:
Tue Mar 31 12:17:49 2015 [VMM][I]: Command execution fail: /var/lib/one/remotes/vmm/vmware/restore '/vmfs/volumes/0/128/checkpoint' 'X.X.X.X' 'one-128' 128 X.X.X.X
Tue Mar 31 12:17:49 2015 [VMM][I]: /var/lib/one/remotes/vmm/vmware/vmware_driver.rb:212: warning: Object#id will be deprecated; use Object#object_id
Tue Mar 31 12:17:49 2015 [VMM][E]: restore: Error executing: virsh -c 'esx://X.X.X.X/?no_verify=1&auto_answer=1' snapshot-revert one-128 checkpoint err: ExitCode: 1
Tue Mar 31 12:17:49 2015 [VMM][I]: out:
Tue Mar 31 12:17:49 2015 [VMM][I]: error: internal error Could not revert to snapshot 'checkpoint': FileLocked - Unable to access file since it is locked
Tue Mar 31 12:17:49 2015 [VMM][I]:
Tue Mar 31 12:17:49 2015 [VMM][I]: ExitCode: 1
Tue Mar 31 12:17:49 2015 [VMM][I]: Failed to execute virtualization driver operation: restore.
Tue Mar 31 12:17:49 2015 [VMM][E]: Error restoring VM
Tue Mar 31 12:17:49 2015 [LCM][I]: Fail to boot VM. New VM state is SUSPENDED
The error seems to come from the snapshot file being locked. To better understand why, could you please send the vmware.log file? It should be located in the /vmfs/volumes/<system_ds_id>/<vid>/disk.0/ directory on the ESX, where <vid> is the id of the VM and <system_ds_id> is the system datastore id.
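For example, assuming the system datastore id is 0 and the VM id is 128 (adjust both to your setup) and that SSH access to the ESX host is enabled, something like this should copy the log to the front-end:
scp root@X.X.X.X:/vmfs/volumes/0/128/disk.0/vmware.log /tmp/vmware-128.log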
After analysing the log, we cannot find any reference to the locked file. We suggest following the steps proposed by VMware to solve the issue. If this does not work, we suggest copying the relevant disks from the VM, registering them again, and launching a new VM based on these new disks to avoid losing any data.
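In case it helps, one way to find out which host is holding a VMFS lock is to run vmkfstools -D against the suspect file from the ESXi shell; the owner MAC address in its output identifies the lock holder. The path below is only an illustration, with <locked-file> standing for whichever file in the VM directory appears to be locked:
vmkfstools -D /vmfs/volumes/0/128/<locked-file>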
Hello,
I have already tried the steps proposed by VMware to solve the problem. They don't work.
The thing is that this problem happens with all machines: once they are suspended, they never start again.
Could it be a template or image mistake?
Thank you. Regards.
We cannot reproduce the problem, but it is unlikely to be a template or image error. Does this happen only with VMs created from one particular VM template and one image?
If you access the ESX directly via the vSphere client, are you able to restore the “checkpoint” snapshot? Are you able to create more snapshots, and restore them?
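You can also try the same operations through the libvirt ESX connection that the OpenNebula driver uses, to see whether only that path fails. For example (the host IP and VM name below are just the ones from your log; adapt them as needed):
virsh -c 'esx://X.X.X.X/?no_verify=1&auto_answer=1' snapshot-list one-128
virsh -c 'esx://X.X.X.X/?no_verify=1&auto_answer=1' snapshot-revert one-128 checkpoint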
I have just created a new virtual machine and the problem is the same when I suspend the machine.
From the vSphere client I can create new snapshots and revert to the checkpoint snapshot. It's very strange.
Could it be a permissions problem in the datastores inside vSphere? I have created a new user called oneadmin, but when I create a machine, for example, the owner is root inside the ESXi.
Permissions could explain the problem. Are you using NFS? If you change the checkpoint snapshot file ownership to oneadmin, does the VM go from suspended to running?
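For example, from the ESXi shell, assuming the oneadmin user exists locally on the host and that the VM's files live under the system datastore directory (the ids below are only illustrative):
chown oneadmin /vmfs/volumes/0/128/*
Then retry the resume to see if the VM leaves the SUSPENDED state.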
We can confirm the permissions issue by setting ESX root credentials in /etc/one/vmwarerc and trying the process again. Please let us know the outcome of the experiment.
The vmwarerc looks fine. You can try to set the password without quotes (although it should not make a difference if at least one operation is working). The libvirt URI is correct.
The datacenter and vCenter variables can be commented out if you do not need the VMotion capabilities.
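For reference, a minimal /etc/one/vmwarerc along the lines discussed here could look like the following (the values are placeholders and the exact defaults may differ in your installation):
:libvirt_uri: "esx://@HOST@/?no_verify=1&auto_answer=1"
:username: root
:password: yourpassword
#:datacenter: ha-datacenter
#:vcenter: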
We are failing to reproduce this problem. Using root, do all the files in /vmfs/volumes/0/<vid> belong to root?
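A plain listing from the ESXi shell is enough to check this, e.g. for VM 126 on system datastore 0:
ls -l /vmfs/volumes/0/126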
I have removed the quotes around the password, but the error is the same.
The VMotion attributes are commented out now.
Yes, all the files in /vmfs/volumes/0/126 belong to root.
My hypervisor is ESXi 5.5.
I have a federated platform with another hypervisor, KVM, and there the suspend/resume action works fine.
The problem is with the VMware driver…
If I run these manual commands:
/var/lib/one/remotes/vmm/vmware/restore '/vmfs/volumes/0/126/checkpoint' 'X.X.X.X' 'one-126' 126 X.X.X.X
virsh -c 'esx://X.X.X.X/?no_verify=1&auto_answer=1' snapshot-revert one-126 checkpoint
Everything works fine!
The problem only appears when I run the suspend from Sunstone or with onevm suspend 126 and then try to resume through OpenNebula. It's strange.
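In case it is useful, this is how I reproduce it and watch the failure from the front-end (VM id 126, assuming the default log location):
onevm suspend 126
onevm resume 126
tail -f /var/log/one/126.log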
How could I change the owner user in the ESXi from root to oneadmin?
Not sure I understand the question, though ("How could I change the owner user in the ESXi from root to oneadmin?"). If you mean the owner of the files generated by the ESX, there is no way to change that. If you input the root username and password in vmwarerc, all the files should belong to root anyway.
Apologies, but since we cannot reproduce the error, we are running out of ideas.