Issues setting up Ceph Datastores

Hi everyone,

I’ve been trying for a few days to get Ceph datastores working with my OpenNebula setup.


Versions of the related components and OS (frontend, hypervisors, VMs):

OS:

[root@opennebula1.example.com ~]# cat /etc/redhat-release
Rocky Linux release 8.9 (Green Obsidian)

[root@opennebula1.example.com ~]# uname -a
Linux opennebula1.example.com 4.18.0-513.5.1.el8_9.x86_64 #1 SMP Fri Nov 17 03:31:10 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux

Packages on frontend node (opennebula1.example.com):

[root@opennebula1.example.com ~]# yum list installed | grep -E '(nebula|ceph|virt|qemu)'
ceph-common.x86_64                    2:17.2.7-0.el8                                    @PROXY-Ceph
libcephfs2.x86_64                     2:17.2.7-0.el8                                    @PROXY-Ceph
opennebula.x86_64                     6.8.0-1.el8                                       @PROXY-OpenNebula
opennebula-common.noarch              6.8.0-1.el8                                       @PROXY-OpenNebula
opennebula-common-onecfg.noarch       6.8.0-1.el8                                       @PROXY-OpenNebula
opennebula-fireedge.x86_64            6.8.0-1.el8                                       @PROXY-OpenNebula
opennebula-flow.noarch                6.8.0-1.el8                                       @PROXY-OpenNebula
opennebula-gate.noarch                6.8.0-1.el8                                       @PROXY-OpenNebula
opennebula-guacd.x86_64               6.8.0-1.2.0+1.el8                                 @PROXY-OpenNebula
opennebula-libs.noarch                6.8.0-1.el8                                       @PROXY-OpenNebula
opennebula-provision.noarch           6.8.0-1.el8                                       @PROXY-OpenNebula
opennebula-provision-data.noarch      6.8.0-1.el8                                       @PROXY-OpenNebula
opennebula-rubygems.x86_64            6.8.0-1.el8                                       @PROXY-OpenNebula
opennebula-sunstone.noarch            6.8.0-1.el8                                       @PROXY-OpenNebula
opennebula-tools.noarch               6.8.0-1.el8                                       @PROXY-OpenNebula
python3-ceph-argparse.x86_64          2:17.2.7-0.el8                                    @PROXY-Ceph
python3-ceph-common.x86_64            2:17.2.7-0.el8                                    @PROXY-Ceph
python3-cephfs.x86_64                 2:17.2.7-0.el8                                    @PROXY-Ceph
qemu-img.x86_64                       15:6.2.0-49.module+el8.10.0+1752+38c6b60a         @appstream
virt-what.x86_64                      1.25-4.el8                                        @anaconda

Packages on KVM node(s) (kvm1.example.com):

[root@kvm1.example.com ~]# yum list installed | grep -E '(nebula|ceph|virt|qemu)'
ceph-common.x86_64                                2:17.2.7-0.el8                                              @ceph
ipxe-roms-qemu.noarch                             20181214-11.git133f4c47.el8                                 @appstream
libcephfs2.x86_64                                 2:17.2.7-0.el8                                              @ceph
librados2.x86_64                                  2:17.2.7-0.el8                                              @ceph
libradosstriper1.x86_64                           2:17.2.7-0.el8                                              @ceph
librbd1.x86_64                                    2:17.2.7-0.el8                                              @ceph
librgw2.x86_64                                    2:17.2.7-0.el8                                              @ceph
libvirt.x86_64                                    8.0.0-23.1.module+el8.10.0+1779+84732956                    @appstream
libvirt-client.x86_64                             8.0.0-23.1.module+el8.10.0+1779+84732956                    @appstream
libvirt-daemon.x86_64                             8.0.0-23.1.module+el8.10.0+1779+84732956                    @appstream
libvirt-daemon-config-network.x86_64              8.0.0-23.1.module+el8.10.0+1779+84732956                    @appstream
libvirt-daemon-config-nwfilter.x86_64             8.0.0-23.1.module+el8.10.0+1779+84732956                    @appstream
libvirt-daemon-driver-interface.x86_64            8.0.0-23.1.module+el8.10.0+1779+84732956                    @appstream
libvirt-daemon-driver-network.x86_64              8.0.0-23.1.module+el8.10.0+1779+84732956                    @appstream
libvirt-daemon-driver-nodedev.x86_64              8.0.0-23.1.module+el8.10.0+1779+84732956                    @appstream
libvirt-daemon-driver-nwfilter.x86_64             8.0.0-23.1.module+el8.10.0+1779+84732956                    @appstream
libvirt-daemon-driver-qemu.x86_64                 8.0.0-23.1.module+el8.10.0+1779+84732956                    @appstream
libvirt-daemon-driver-secret.x86_64               8.0.0-23.1.module+el8.10.0+1779+84732956                    @appstream
libvirt-daemon-driver-storage.x86_64              8.0.0-23.1.module+el8.10.0+1779+84732956                    @appstream
libvirt-daemon-driver-storage-core.x86_64         8.0.0-23.1.module+el8.10.0+1779+84732956                    @appstream
libvirt-daemon-driver-storage-disk.x86_64         8.0.0-23.1.module+el8.10.0+1779+84732956                    @appstream
libvirt-daemon-driver-storage-gluster.x86_64      8.0.0-23.1.module+el8.10.0+1779+84732956                    @appstream
libvirt-daemon-driver-storage-iscsi.x86_64        8.0.0-23.1.module+el8.10.0+1779+84732956                    @appstream
libvirt-daemon-driver-storage-iscsi-direct.x86_64 8.0.0-23.1.module+el8.10.0+1779+84732956                    @appstream
libvirt-daemon-driver-storage-logical.x86_64      8.0.0-23.1.module+el8.10.0+1779+84732956                    @appstream
libvirt-daemon-driver-storage-mpath.x86_64        8.0.0-23.1.module+el8.10.0+1779+84732956                    @appstream
libvirt-daemon-driver-storage-rbd.x86_64          8.0.0-23.1.module+el8.10.0+1779+84732956                    @appstream
libvirt-daemon-driver-storage-scsi.x86_64         8.0.0-23.1.module+el8.10.0+1779+84732956                    @appstream
libvirt-libs.x86_64                               8.0.0-23.1.module+el8.10.0+1779+84732956                    @appstream
opennebula-common.noarch                          6.8.0-1.el8                                                 @PROXY-OpenNebula
opennebula-common-onecfg.noarch                   6.8.0-1.el8                                                 @PROXY-OpenNebula
opennebula-node-kvm.noarch                        6.8.0-1.el8                                                 @PROXY-OpenNebula
python3-ceph-argparse.x86_64                      2:17.2.7-0.el8                                              @ceph
python3-ceph-common.x86_64                        2:17.2.7-0.el8                                              @ceph
python3-cephfs.x86_64                             2:17.2.7-0.el8                                              @ceph
python3-rados.x86_64                              2:17.2.7-0.el8                                              @ceph
python3-rbd.x86_64                                2:17.2.7-0.el8                                              @ceph
python3-rgw.x86_64                                2:17.2.7-0.el8                                              @ceph
qemu-img.x86_64                                   15:6.2.0-49.module+el8.10.0+1752+38c6b60a                   @appstream
qemu-kvm.x86_64                                   15:6.2.0-49.module+el8.10.0+1752+38c6b60a                   @appstream
qemu-kvm-block-curl.x86_64                        15:6.2.0-49.module+el8.10.0+1752+38c6b60a                   @appstream
qemu-kvm-block-gluster.x86_64                     15:6.2.0-49.module+el8.10.0+1752+38c6b60a                   @appstream
qemu-kvm-block-iscsi.x86_64                       15:6.2.0-49.module+el8.10.0+1752+38c6b60a                   @appstream
qemu-kvm-block-rbd.x86_64                         15:6.2.0-49.module+el8.10.0+1752+38c6b60a                   @appstream
qemu-kvm-block-ssh.x86_64                         15:6.2.0-49.module+el8.10.0+1752+38c6b60a                   @appstream
qemu-kvm-common.x86_64                            15:6.2.0-49.module+el8.10.0+1752+38c6b60a                   @appstream
qemu-kvm-core.x86_64                              15:6.2.0-49.module+el8.10.0+1752+38c6b60a                   @appstream
qemu-kvm-docs.x86_64                              15:6.2.0-49.module+el8.10.0+1752+38c6b60a                   @appstream
qemu-kvm-hw-usbredir.x86_64                       15:6.2.0-49.module+el8.10.0+1752+38c6b60a                   @appstream
qemu-kvm-ui-opengl.x86_64                         15:6.2.0-49.module+el8.10.0+1752+38c6b60a                   @appstream
qemu-kvm-ui-spice.x86_64                          15:6.2.0-49.module+el8.10.0+1752+38c6b60a                   @appstream
virt-what.x86_64                                  1.25-4.el8                                                  @anaconda

Steps to reproduce:

I have the following machines: the frontend (opennebula1.example.com) and the KVM node(s) (kvm1.example.com) listed above, plus an existing Ceph cluster.

I followed the instructions here: Ceph Datastore — OpenNebula 6.8.3 documentation

  • Ceph pool created with name libvirt-pool
  • Created /etc/ceph/ceph.conf on frontend and kvm nodes
    • mon_host = 10.7.177.54:6789,10.7.177.52:6789,10.7.177.56:6789
    • Username: client.newadmin
  • Created /etc/ceph/keyring on frontend and kvm nodes (a sketch of both files follows the secret-list output below)
  • Verified ceph config is working correctly with: rbd ls -n client.newadmin -c /etc/ceph/ceph.conf -p libvirt-pool
  • Set up libvirt secret on kvm nodes:
[root@kvm1.example.com ~]# cat secret.xml
<secret ephemeral='no' private='no'>
  <uuid>$UUID</uuid>
  <usage type='ceph'>
    <name>client.newadmin secret</name>
  </usage>
</secret>

[root@kvm1.example.com ~]# virsh -c qemu:///system secret-define secret.xml
[root@kvm1.example.com ~]# virsh -c qemu:///system secret-set-value --secret $UUID --base64 $(cat client.libvirt.key)

[root@kvm1.example.com ~]# virsh secret-list
 UUID                                   Usage
---------------------------------------------------------------------
 UUID                                   ceph client.newadmin secret
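
For completeness, this is roughly what the two Ceph client files mentioned above look like on the frontend and KVM nodes. The key below is a placeholder, not a real one; both files need to be readable by the oneadmin user, since the datastore drivers run the rbd commands as oneadmin:

[root@kvm1.example.com ~]# cat /etc/ceph/ceph.conf
[global]
mon_host = 10.7.177.54:6789,10.7.177.52:6789,10.7.177.56:6789

[root@kvm1.example.com ~]# cat /etc/ceph/keyring
[client.newadmin]
key = <BASE64_KEY_REDACTED>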

Then I set up two datastores on the frontend:

[oneadmin@opennebula1.example.com root]$ onedatastore show 108
DATASTORE 108 INFORMATION
ID             : 108
NAME           : ceph-test-system
USER           : oneadmin
GROUP          : oneadmin
CLUSTERS       : 0,100
TYPE           : SYSTEM
DS_MAD         : -
TM_MAD         : ceph
BASE PATH      : /var/lib/one//datastores/108
DISK_TYPE      : RBD
STATE          : READY

DATASTORE CAPACITY
TOTAL:         : 0M
FREE:          : 0M
USED:          : 0M
LIMIT:         : -

PERMISSIONS
OWNER          : uma
GROUP          : uma
OTHER          : u--

DATASTORE TEMPLATE
ALLOW_ORPHANS="mixed"
BRIDGE_LIST="10.203.0.11 10.203.0.12 10.203.0.13 10.203.0.14 10.203.0.15 10.203.0.16"
CEPH_HOST="10.7.177.54:6789 10.7.177.52:6789 10.7.177.56:6789"
CEPH_SECRET="<SECRET_UUID>"
CEPH_USER="client.newadmin"
DISK_TYPE="RBD"
DS_MIGRATE="NO"
POOL_NAME="libvirt-pool"
RESTIC_COMPRESSION="-"
RESTRICTED_DIRS="/"
SAFE_DIRS="/var/tmp"
SHARED="YES"
TM_MAD="ceph"
TYPE="SYSTEM_DS"

[oneadmin@opennebula1.example.com root]$ onedatastore show 109
DATASTORE 109 INFORMATION
ID             : 109
NAME           : ceph-test-images
USER           : oneadmin
GROUP          : oneadmin
CLUSTERS       : 0,100
TYPE           : IMAGE
DS_MAD         : ceph
TM_MAD         : ceph
BASE PATH      : /var/lib/one//datastores/109
DISK_TYPE      : RBD
STATE          : READY

DATASTORE CAPACITY
TOTAL:         : 0M
FREE:          : 0M
USED:          : 0M
LIMIT:         : -

PERMISSIONS
OWNER          : uma
GROUP          : uma
OTHER          : u--

DATASTORE TEMPLATE
ALLOW_ORPHANS="mixed"
BRIDGE_LIST="10.203.0.11 10.203.0.12 10.203.0.13 10.203.0.14 10.203.0.15 10.203.0.16"
CEPH_HOST="10.7.177.54:6789 10.7.177.52:6789 10.7.177.56:6789"
CEPH_SECRET="<SECRET_UUID>"
CEPH_USER="client.newadmin"
CLONE_TARGET="SELF"
CLONE_TARGET_SHARED="SELF"
CLONE_TARGET_SSH="SYSTEM"
COMPATIBLE_SYS_DS="108"
DISK_TYPE="RBD"
DISK_TYPE_SHARED="RBD"
DISK_TYPE_SSH="FILE"
DRIVER="raw"
DS_MAD="ceph"
LN_TARGET="NONE"
LN_TARGET_SHARED="NONE"
LN_TARGET_SSH="SYSTEM"
POOL_NAME="libvirt-pool"
RESTIC_COMPRESSION="-"
RESTRICTED_DIRS="/"
SAFE_DIRS="/var/tmp"
TM_MAD="ceph"
TM_MAD_SYSTEM="ssh,shared"
TYPE="IMAGE_DS"

However, both datastores are showing no available capacity:

DATASTORE CAPACITY
TOTAL:         : 0M
FREE:          : 0M
USED:          : 0M
LIMIT:         : -

Also, /var/log/one/oned.log is showing this:

[root@opennebula1.example.com one]# grep -r ceph ./*.log | tail -n 4
./oned.log:Wed Aug 28 07:06:20 2024 [Z0][InM][D]: Monitoring datastore ceph-test-system (108)
./oned.log:Wed Aug 28 07:06:20 2024 [Z0][InM][D]: Monitoring datastore ceph-test-images (109)
./oned.log:Wed Aug 28 07:06:22 2024 [Z0][ImM][I]: Command execution failed (exit code: 1): /var/lib/one/remotes/datastore/ceph/monitor 109
./oned.log:Wed Aug 28 07:06:22 2024 [Z0][ImM][I]: Command execution failed (exit code: 1): /var/lib/one/remotes/tm/ceph/monitor 108

Any help would be greatly appreciated; I feel like I may be missing something.

Thanks!

I figured out the problem; for anyone who runs into anything similar:

CEPH_USER expects the user ID (newadmin) rather than the user name (client.newadmin) that I originally provided.
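
Concretely, the fix is to change CEPH_USER on both datastores, either interactively with onedatastore update 108 / onedatastore update 109, or non-interactively from a small file (sketch below; the file name is just an example, and as far as I know --append replaces a single-valued attribute that already exists):

[oneadmin@opennebula1.example.com ~]$ cat fix_ceph_user.txt
CEPH_USER = "newadmin"

[oneadmin@opennebula1.example.com ~]$ onedatastore update 108 fix_ceph_user.txt --append
[oneadmin@opennebula1.example.com ~]$ onedatastore update 109 fix_ceph_user.txt --append

After that, the datastores reported the pool capacity on the next monitoring cycle.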

Ref: User Management — Ceph Documentation

Ceph has the concept of a type of user. For purposes of user management, the type will always be client. Ceph identifies users in a “period-delimited form” that consists of the user type and the user ID: for example, TYPE.ID, client.admin, or client.user1. The reason for user typing is that the Cephx protocol is used not only by clients but also non-clients, such as Ceph Monitors, OSDs, and Metadata Servers. Distinguishing the user type helps to distinguish between client users and other users. This distinction streamlines access control, user monitoring, and traceability.

Sometimes Ceph’s user type might seem confusing, because the Ceph command line allows you to specify a user with or without the type, depending upon your command line usage. If you specify --user or --id, you can omit the type. For example, client.user1 can be entered simply as user1. On the other hand, if you specify --name or -n, you must supply the type and name: for example, client.user1. We recommend using the type and name as a best practice wherever possible.
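
To make that distinction concrete, both of the following work on the CLI (illustrative commands against the same pool):

[root@kvm1.example.com ~]# rbd ls -p libvirt-pool --id newadmin
[root@kvm1.example.com ~]# rbd ls -p libvirt-pool -n client.newadmin

With --id, Ceph prepends client. itself, so passing client.newadmin there makes it look for a non-existent client.client.newadmin and authentication fails. As far as I can tell that is what the OpenNebula ceph drivers do with CEPH_USER, which is why the monitor scripts kept failing even though my manual rbd ls -n client.newadmin test worked fine.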
