OpenNebula EE 6.6.1 backup and restore issues

Hi team.

I have been testing OpenNebula's new backup solution since ON CE 6.6.0. There were quite a few issues, but I am happy that the ON team is working in the right direction. I am now testing ON EE 6.6.1 (the EE maintenance patch), which fixes some of the issues from ON CE 6.6.0, but there are new issues that I would like to report.

My Lab:

  • The remote backup server is 192.168.1.254 and shares its storage over NFS with the OpenNebula orchestrator, which runs on 192.168.1.71. Both servers run Ubuntu 22.04. The exported directory is "/var/lib/vault/rsync_backups/ondev". The orchestrator has the NFS client configured and the mount point "/var/lib/datastores/nfs_mount". The backup datastore is ID 100, and there is a symbolic link from "/var/lib/one/datastores/100" to "/var/lib/datastores/nfs_mount/100". The user and group "oneadmin" own both directories, and the "oneadmin" user can SSH into the orchestrator and access "/var/lib/one/datastores/100", which allows the rsync actions of the backup process. A test VM was created (using Alpine) and configured for INCREMENTAL backups; its template and a sketch of the NFS setup are shown below.
OS=[
  ARCH="x86_64",
  BOOT="disk0",
  UUID="5b906216-3a71-4d64-80d0-5fb42d1c134a" ]
GRAPHICS=[
  LISTEN="0.0.0.0",
  PORT="5904",
  TYPE="VNC" ]
CONTEXT=[
  DISK_ID="1",
  NETWORK="YES",
  PASSWORD="PAwdmvBzN1lD9dNu89fW2g==",
  SSH_PUBLIC_KEY="",
  TARGET="hda" ]
BACKUP_CONFIG=[
  ACTIVE_FLATTEN="NO",
  BACKUP_VOLATILE="YES",
  FS_FREEZE="AGENT",
  INCREMENTAL_BACKUP_ID="13",
  KEEP_LAST="3",
  LAST_INCREMENT_ID="2",
  MODE="INCREMENT" ]

Detected issues for Backup action

  • Backup #1: The first backup was successful (#0 FULL).
  • Backup #2: The 2nd backup was successful (now we have #0 FULL and #1 INC).
  • Backup #3: The 3rd backup was successful (now we have #0 FULL, #1 INC and #2 INC).
  • Backup #4: The 4th backup was successful, but one of the increments was removed (now we have #1 FULL, #2 INC and #3 INC), whereas I expected to have #0 FULL, #1 INC, #2 INC and #3 INC. This happens because KEEP_LAST="3" also limits the number of increments inside the chain, when it should only limit the number of incremental backup chains. In my case I only wanted 3 chains of incremental backups, with no limit on the increments inside each chain (it is recommended to have no more than 6 increments per chain, i.e. 1 FULL + 6 INCs). OpenNebula EE 6.6.1 does not provide an option to limit the number of increments in a chain. Right now it looks like it is doing a "synthetic full": with a chain limited to 3 increments (for example, 1 FULL and 2 INCs), the 4th backup makes backup #0 (FULL) become part of backup #1 (INC), so the full backup is now #1, the old backup #2 (INC) becomes the first incremental after the FULL, and a new incremental backup (#3 INC) is created. Team, is this deduction correct? (See the inspection sketch below.)
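
To double-check that deduction, this is how I inspect the chain after each backup (a minimal sketch; VM ID 9 and datastore 100 are from my lab, and the exact output of these commands may differ between versions):

# On the front-end: BACKUP_CONFIG shows LAST_INCREMENT_ID and the current chain state
onevm show 9 | grep -A 8 BACKUP_CONFIG

# The backup image created for the VM lists the increments of the current chain
oneimage list
oneimage show <backup_image_id>

# On the backup server: the increment files are stored under the rsync datastore path
# (the exact layout may vary)
ls -lh /var/lib/vault/rsync_backups/ondev/100/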

Detected issues for Restore action

  • For restore, clicking the "Restore" button does nothing; you need to refresh the web browser (Firefox) to see the new dialog for restoring backups.
  • When restoring a VM, you specify a name for the new image and the increment ID to be restored. The image is restored, imported into the image datastore, and a new template for it is also created, but neither of them has the name I specified for the image. The image is restored using the format <vmid>-<backupid>-disk-<diskid>, and the template is created using the format <vmid>-<backupid>.
  • Before each backup I created a file: bak1-chain0 before backup #1, bak2-chain0 before backup #2 and bak3-chain0 before backup #3. When I restored one of these backups I specified increment ID 2, and I expected to find only the files bak1-chain0 and bak2-chain0, but instead I found all the files up to the last one. This means that OpenNebula is always restoring from the last increment, no matter which increment ID is specified (see the sketch below).
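
This is how I verified it (a minimal sketch; the marker file names are the ones mentioned above, and the guest paths are just an example):

# Inside the test VM, before each backup:
touch /root/bak1-chain0   # created before backup #1 (FULL, increment #0)
touch /root/bak2-chain0   # created before backup #2 (increment #1)
touch /root/bak3-chain0   # created before backup #3 (increment #2)

# Inside the VM restored from increment ID 2 I expected only the first two
# markers, but every marker file is present:
ls /root/
bak1-chain0  bak2-chain0  bak3-chain0
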
cat /var/log/one/9.log
Tue Apr 25 14:29:17 2023 [Z0][VM][I]: New state is ACTIVE
Tue Apr 25 14:29:17 2023 [Z0][VM][I]: New LCM state is PROLOG
Tue Apr 25 14:29:21 2023 [Z0][VM][I]: New LCM state is BOOT
Tue Apr 25 14:29:21 2023 [Z0][VMM][I]: Generating deployment file: /var/lib/one/vms/9/deployment.0
Tue Apr 25 14:29:22 2023 [Z0][VMM][I]: Successfully execute transfer manager driver operation: tm_context.
Tue Apr 25 14:29:22 2023 [Z0][VMM][I]: Successfully execute network driver operation: pre.
Tue Apr 25 14:29:22 2023 [Z0][VMM][I]: ExitCode: 0
Tue Apr 25 14:29:22 2023 [Z0][VMM][I]: Successfully execute virtualization driver operation: /bin/mkdir -p.
Tue Apr 25 14:29:22 2023 [Z0][VMM][I]: ExitCode: 0
Tue Apr 25 14:29:22 2023 [Z0][VMM][I]: Successfully execute virtualization driver operation: /bin/cat - >/var/lib/one//datastores/0/9/vm.xml.
Tue Apr 25 14:29:22 2023 [Z0][VMM][I]: ExitCode: 0
Tue Apr 25 14:29:22 2023 [Z0][VMM][I]: Successfully execute virtualization driver operation: /bin/cat - >/var/lib/one//datastores/0/9/ds.xml.
Tue Apr 25 14:29:25 2023 [Z0][VMM][I]: ExitCode: 0
Tue Apr 25 14:29:25 2023 [Z0][VMM][I]: Successfully execute virtualization driver operation: deploy.
Tue Apr 25 14:29:25 2023 [Z0][VMM][I]: Successfully execute network driver operation: post.
Tue Apr 25 14:29:25 2023 [Z0][VM][I]: New LCM state is RUNNING
Tue Apr 25 14:30:22 2023 [Z0][VM][I]: New LCM state is HOTPLUG
Tue Apr 25 14:30:23 2023 [Z0][VMM][I]: ExitCode: 0
Tue Apr 25 14:30:23 2023 [Z0][VMM][I]: Successfully execute virtualization driver operation: prereconfigure.
Tue Apr 25 14:30:23 2023 [Z0][VMM][I]: Successfully execute transfer manager driver operation: tm_context.
Tue Apr 25 14:30:23 2023 [Z0][VMM][I]: ExitCode: 0
Tue Apr 25 14:30:23 2023 [Z0][VMM][I]: Successfully execute virtualization driver operation: reconfigure.
Tue Apr 25 14:30:23 2023 [Z0][VMM][I]: VM update conf succesfull.
Tue Apr 25 14:30:23 2023 [Z0][VM][I]: New LCM state is RUNNING
Tue Apr 25 14:31:54 2023 [Z0][VM][I]: New LCM state is BACKUP
Tue Apr 25 14:32:00 2023 [Z0][VMM][I]: Successfully execute transfer manager driver operation: prebackup_live.
Tue Apr 25 14:32:05 2023 [Z0][VMM][I]: Successfully execute  operation: backup.
Tue Apr 25 14:32:08 2023 [Z0][VMM][I]: Successfully execute transfer manager driver operation: postbackup_live.
Tue Apr 25 14:32:08 2023 [Z0][VMM][I]: VM backup successfully created.
Tue Apr 25 14:32:08 2023 [Z0][VM][I]: New LCM state is RUNNING
Tue Apr 25 14:33:05 2023 [Z0][VM][I]: New LCM state is BACKUP
Tue Apr 25 14:33:08 2023 [Z0][VMM][I]: Successfully execute transfer manager driver operation: prebackup_live.
Tue Apr 25 14:33:11 2023 [Z0][VMM][I]: Successfully execute  operation: backup.
Tue Apr 25 14:33:12 2023 [Z0][VMM][I]: Successfully execute transfer manager driver operation: postbackup_live.
Tue Apr 25 14:33:12 2023 [Z0][VMM][I]: VM backup successfully created.
Tue Apr 25 14:33:12 2023 [Z0][VM][I]: New LCM state is RUNNING
Tue Apr 25 14:33:47 2023 [Z0][VM][I]: New LCM state is BACKUP
Tue Apr 25 14:33:50 2023 [Z0][VMM][I]: Successfully execute transfer manager driver operation: prebackup_live.
Tue Apr 25 14:33:52 2023 [Z0][VMM][I]: Successfully execute  operation: backup.
Tue Apr 25 14:33:53 2023 [Z0][VMM][I]: Successfully execute transfer manager driver operation: postbackup_live.
Tue Apr 25 14:33:53 2023 [Z0][VMM][I]: VM backup successfully created.
Tue Apr 25 14:33:53 2023 [Z0][VM][I]: New LCM state is RUNNING
Tue Apr 25 15:47:09 2023 [Z0][VM][I]: New LCM state is BACKUP
Tue Apr 25 15:47:12 2023 [Z0][VMM][I]: Successfully execute transfer manager driver operation: prebackup_live.
Tue Apr 25 15:47:14 2023 [Z0][VMM][I]: Successfully execute  operation: backup.
Tue Apr 25 15:47:16 2023 [Z0][VMM][I]: Successfully execute transfer manager driver operation: postbackup_live.
Tue Apr 25 15:47:16 2023 [Z0][VMM][I]: VM backup successfully created.
Tue Apr 25 15:47:16 2023 [Z0][VM][I]: New LCM state is RUNNING
Tue Apr 25 15:47:16 2023 [Z0][LCM][I]: Removing 1 backup increments

Note the log line from the last backup: Tue Apr 25 15:47:16 2023 [Z0][LCM][I]: Removing 1 backup increments
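
For completeness, one way to set the retention on the VM (a minimal sketch, assuming BACKUP_CONFIG can be edited with onevm updateconf, which opens the VM configuration in an editor):

onevm updateconf 9

# ...and in the editor the BACKUP_CONFIG section is set to:
BACKUP_CONFIG=[
  MODE="INCREMENT",
  FS_FREEZE="AGENT",
  BACKUP_VOLATILE="YES",
  KEEP_LAST="2" ]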

This is an example with KEEP_LAST="2"; see the following screenshots:

[Screenshot: Backup #1]
[Screenshot: Backup #2]
[Screenshot: Backup #3]

Follow the full discussion here.