Error when trying to adjust BRIDGE_LIST on ceph datastore

I’m trying to update the BRIDGE_LIST on a datastore for my Ceph images, as I’m having trouble uploading an image through Sunstone and suspect that one of my bridge hosts being down is the problem.

I get this error:

[one.datastore.update] Cannot update template. Attribute shared, ln_target or clone_target in TM_MAD_CONF for ceph is missing or has wrong value in oned.conf

This is the offending config:

TM_MAD_CONF = [
    NAME = "ceph", LN_TARGET = "NONE", CLONE_TARGET = "SELF", SHARED = "YES",
    DS_MIGRATE = "NO", DRIVER = "raw", ALLOW_ORPHANS="mixed",
    TM_MAD_SYSTEM = "ceph", LN_TARGET_SSH = "SYSTEM", CLONE_TARGET_SSH = "SYSTEM",
    DISK_TYPE_SSH = "FILE", LN_TARGET_SHARED = "NONE",
    CLONE_TARGET_SHARED = "SELF", DISK_TYPE_SHARED = "RBD",
    LN_TARGET = "NONE", CLONE_TARGET = "SELF"
]

SHARED, LN_TARGET and CLONE_TARGET are all present and accounted for, and from the documentation, look to be set to valid values.

oned.log isn’t any more informative:

Fri Sep 11 07:54:29 2020 [Z0][ReM][D]: Req:9360 UID:0 IP:127.0.0.1 one.datastore.update invoked , 100, "ALLOW_ORPHANS = "mix...", 0
Fri Sep 11 07:54:29 2020 [Z0][ReM][E]: Req:9360 UID:0 one.datastore.update result FAILURE [one.datastore.update] Cannot update template. Attribute shared, ln_target or clone_target in TM_MAD_CONF for ceph is missing or has wrong value in oned.conf

Pretty much the same message I got through the front-end. All I want to do is upload a new ISO image so I can boot a VM instance with it. The config above is migrated from an earlier OpenNebula 5.4 install, a migration necessitated by an operating system update.

In case you haven’t solved it already, the configuration should read:

TM_MAD_CONF = [
    NAME = "ceph", LN_TARGET = "NONE", CLONE_TARGET = "SELF", SHARED = "YES",
    DS_MIGRATE = "NO", DRIVER = "raw", ALLOW_ORPHANS="mixed",
    TM_MAD_SYSTEM = "ssh,shared", LN_TARGET_SSH = "SYSTEM", CLONE_TARGET_SSH = "SYSTEM",
    DISK_TYPE_SSH = "FILE", LN_TARGET_SHARED = "NONE",
    CLONE_TARGET_SHARED = "SELF", DISK_TYPE_SHARED = "RBD"
]

In this case the error is in TM_MAD_SYSTEM, which is used to define additional transfer modes for VMs. Ceph supports ssh and shared, meaning that some VMs can run from local storage; basically, the RBD volume is exported as a file. Because you set TM_MAD_SYSTEM = "ceph" (which is not needed), oned was looking for LN_TARGET_CEPH… Hope it helps.
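To spell out the naming convention behind the error: for each mode listed in TM_MAD_SYSTEM, oned expects a matching set of attributes suffixed with that mode's name (uppercased). A sketch with a hypothetical mode name "foo" for illustration:

    TM_MAD_SYSTEM = "foo",          # hypothetical mode, for illustration only
    LN_TARGET_FOO = "SYSTEM",       # oned looks for LN_TARGET_<MODE>
    CLONE_TARGET_FOO = "SYSTEM",    # ...and CLONE_TARGET_<MODE>
    DISK_TYPE_FOO = "FILE"          # ...and DISK_TYPE_<MODE>

That's why TM_MAD_SYSTEM = "ceph" sent oned hunting for LN_TARGET_CEPH and friends, which your config doesn't define, while "ssh,shared" matches the _SSH and _SHARED attributes you already have.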

No problems, I’ve amended the configuration.

I’ll admit the migration to 5.10 was done in a hurry: I had updated the underlying OS to Ubuntu 18.04 (from 16.04), then had to update OpenNebula because the old version wouldn’t run on the newer OS (and there wasn’t a build of the older OpenNebula for Ubuntu 18.04).

I should probably move up to 5.12 before I miss the boat a second time!

RC=0 stuartl@rikishi ~ $ diff -u old.conf new.conf 
--- old.conf    2020-09-15 08:02:18.250236800 +1000
+++ new.conf    2020-09-15 08:02:41.227599071 +1000
@@ -1,6 +1,5 @@
     NAME = "ceph", LN_TARGET = "NONE", CLONE_TARGET = "SELF", SHARED = "YES",
     DS_MIGRATE = "NO", DRIVER = "raw", ALLOW_ORPHANS="mixed",
-    TM_MAD_SYSTEM = "ceph", LN_TARGET_SSH = "SYSTEM", CLONE_TARGET_SSH = "SYSTEM",
+    TM_MAD_SYSTEM = "ssh,shared", LN_TARGET_SSH = "SYSTEM", CLONE_TARGET_SSH = "SYSTEM",
     DISK_TYPE_SSH = "FILE", LN_TARGET_SHARED = "NONE",
-    CLONE_TARGET_SHARED = "SELF", DISK_TYPE_SHARED = "RBD",
-    LN_TARGET = "NONE", CLONE_TARGET = "SELF"
+    CLONE_TARGET_SHARED = "SELF", DISK_TYPE_SHARED = "RBD"

Those changes seem to have made a difference: I’m now able to update the datastore configuration to take the downed host out of BRIDGE_LIST. VMs that refused to start before are now starting… the jury’s still out on uploading images, that’s still inconclusive, but I’ll keep chasing it from here.
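For the record, the BRIDGE_LIST change itself is just an edit to the datastore template. A sketch, assuming the datastore ID 100 from the log excerpt above and hypothetical host names:

    # onedatastore update 100   (opens the template in $EDITOR)
    BRIDGE_LIST = "ceph-bridge1 ceph-bridge2"   # downed host removed; names are hypothetical

The same edit can be made through Sunstone's datastore Attributes view, once the TM_MAD_CONF error above is out of the way.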