We are testing the new OpenNebula 5.12 community upgrade in our testbed, following the docs:
And also:
but during the onehost sync step, we get the error:
$ onehost sync
* Adding hyp107.altaria.os to upgrade
* Adding hyp106.altaria.os to upgrade
* Adding hyp105.altaria.os to upgrade
* Adding hyp104.altaria.os to upgrade
[========================================] 4/4 hyp104.altaria.os
Failed to update the following hosts:
* hyp107.altaria.os
* hyp105.altaria.os
* hyp106.altaria.os
* hyp104.altaria.os
The hypervisors are also in error status after the upgrade (and the VMs are in unknown status). We didn't get any errors during the RPM/DB upgrade. Is this a known issue? We upgraded from 5.8.1 to 5.12.0 using the community migrator package.
From the oned logs we can also see these error messages:
Usually the sync fails when some file lacks the required permissions or when there is a broken symbolic link. Let’s try the first one: could you run find /var/lib/one/remotes ! -user oneadmin -exec ls -l {} \; in your frontend and share the output?
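Both causes mentioned above can be checked in one pass. The following is only a minimal sketch, assuming GNU find (for -xtype) and the default remotes path; the function name check_remotes is hypothetical and not part of OpenNebula:

```shell
# check_remotes DIR OWNER  (hypothetical helper, not an OpenNebula command)
# Lists entries under DIR that are not owned by OWNER, then broken symlinks.
check_remotes() {
    dir=${1:-/var/lib/one/remotes}
    owner=${2:-oneadmin}

    echo "Not owned by $owner:"
    find "$dir" ! -user "$owner" -exec ls -ld {} \;

    echo "Broken symlinks:"
    # -xtype l is a GNU find extension: it matches symlinks whose target is missing
    find "$dir" -xtype l
}
```

Running check_remotes /var/lib/one/remotes oneadmin on the frontend should print nothing under either heading if ownership and links are fine.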
Ah, indeed, we had a few files there with root-only access, and they were interfering with the sync. We had made a backup of /var/lib/one/remotes/etc as root, so we ended up with several files with root permissions:
We should use the oneadmin user to make those backups next time. I have moved the spurious /var/lib/one/remotes/etc.xxxxxxx directories, and now the sync works correctly as oneadmin:
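One way to avoid leaving root-owned copies inside the remotes tree is to archive the directory somewhere else while logged in as oneadmin. This is just a sketch under stated assumptions: the helper name backup_remotes_etc and the destination /var/lib/one/backups are hypothetical, not OpenNebula conventions:

```shell
# backup_remotes_etc SRC DEST_DIR  (hypothetical helper; run it as oneadmin)
# Archives SRC into DEST_DIR, which should live outside /var/lib/one/remotes
# so the backup is never picked up by the sync.
backup_remotes_etc() {
    src=${1:-/var/lib/one/remotes/etc}
    dest_dir=${2:-/var/lib/one/backups}   # assumed location, adjust as needed

    mkdir -p "$dest_dir" || return 1
    archive="$dest_dir/remotes-etc-$(date +%Y%m%d%H%M%S).tar.gz"

    # -C keeps only the last path component of SRC inside the archive
    tar -czf "$archive" -C "$(dirname "$src")" "$(basename "$src")" \
        && echo "$archive"
}
```

On success the function prints the path of the archive it created.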
$ onehost sync
* Adding hyp107.altaria.os to upgrade
* Adding hyp106.altaria.os to upgrade
* Adding hyp105.altaria.os to upgrade
[========================================] 3/3 hyp105.altaria.os
More good news: with this fix the hosts are available again, so it seems this has also fixed another issue (Hosts in error after upgrading to 5.12.0).