VMs get into a disk error state when the NFS mount is unreachable for a few minutes

I have OpenNebula managing 2 hosts. The OS images are stored in /var/lib/one/datastores/1, which is on NFS and mounted on all the hosts. When launching a VM on a host, the NFS image seems to be referenced to bring up each VM.

When the NFS server becomes unreachable due to network issues or maintenance, even for a few minutes, the VMs get into a disk error state, even though /var/lib/one/datastores/0 is locally present on each host (not mounted on NFS). How can I avoid this? My expectation is that the NFS mount should be used only during VM launch to copy the image, and that everything needed afterwards should be available locally on the host. How can this be done?


*** Environment ***
OpenNebula 6.8.0
Hosts running Ubuntu 20.04
NFS mounts on hosts:
/var/lib/one/datastores/1
/var/lib/one/datastores/2

Locally available on each host:
/var/lib/one/datastores/0
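
For reference, the NFS entries in /etc/fstab on each host look roughly like this (server name and export paths are placeholders, not the real ones):

```
# Hypothetical fstab entries on each host
nfs-server:/export/one/datastores/1  /var/lib/one/datastores/1  nfs  defaults  0  0
nfs-server:/export/one/datastores/2  /var/lib/one/datastores/2  nfs  defaults  0  0
# /var/lib/one/datastores/0 is a plain local directory (not in fstab)
```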

Steps to reproduce:
Launch a VM from OpenNebula.
The image is cloned from NFS and the VM comes up fine.
Bring down the connection to the NFS server.

Current results:
When NFS is unreachable or down for some time, the VM stays in the running state, but the VM console shows disk errors and the VM becomes unusable. Sometimes the VM drops into maintenance mode and running fsck recovers it; sometimes it is unrecoverable.

Expected results:
There should be no NFS dependency once the VM has been launched.

Hi @Sendilraj_P :waving_hand:

By default, the NFS datastore doesn’t use local storage and relies entirely on access to the NFS mount to operate on the machines’ disks. To avoid this, make sure your datastore is configured to also use local storage. That setup ensures the disks remain accessible even if the NFS mount suffers interruptions.
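
For example, a quick way to check is to look at the transfer driver (TM_MAD) of your system datastore: with `ssh` the disks are copied to the host’s local datastore directory at deploy time, while with `shared` the running disks stay on the shared/NFS mount. A minimal sketch, assuming the datastore IDs from your post (changing the driver only affects newly deployed VMs, so test on a non-production datastore first):

```
# List datastores and their transfer drivers (TM column)
onedatastore list

# Inspect the system datastore (ID 0 in your setup)
onedatastore show 0

# Sketch: set the system datastore's transfer driver to "ssh" so that
# deployments copy the disks to /var/lib/one/datastores/0/<vm_id>/ on
# the host instead of accessing them over NFS at runtime
echo 'TM_MAD = "ssh"' > ds0.tmpl
onedatastore update 0 ds0.tmpl --append
```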

From what you’re saying, it seems like you have it configured that way. Could you confirm that your OpenNebula configuration is set up as described in this section of the documentation?
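
A quick sanity check on each host is to confirm which filesystem actually backs each datastore path (datastore 0 should sit on a local device, not under an NFS mount of the parent directory), e.g.:

```
# Datastore 0 should show a local device; 1 and 2 should show the NFS export
df -hT /var/lib/one/datastores/0 /var/lib/one/datastores/1
```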

Cheers,
Victor.

Thank you @vpalma for the pointer, will check.