Disk usage calculation for system datastore

Hello Developers,

I am working on the addon-storpool driver to handle the images created in the system datastore (context ISO and volatile disks).

I am almost ready; the work in progress is uploaded to GitHub in the next branch. The only piece that is not solved yet is the reported disk usage for the system datastore when the storpool TM_MAD is selected. As far as I can see, the size is calculated/reported when the first VM is instantiated in the datastore, and the numbers appear to be the used/free space of the filesystem that the system datastore BASE_PATH points to.

In our setup the system datastore filesystem holds only the VM configuration XML and the VM checkpoint file when a VM is stopped. All other disks are symlinks to StorPool block devices, so not much free space is needed there, but the reported free space does not account for StorPool usage stats.

Can you point me to the files related to the used/free space calculation for the system datastore, so I can check how to integrate the reporting of StorPool used/free space into them?

Kind regards,
Anton Todorov

Hi Anton,

The System Datastores are monitored through the monitor_ds.sh probe, i.e. using the host monitoring system (OpenNebula's collectd). It is based on BASE_PATH, and the current implementation relies on df for the two stock system DS drivers: shared & ssh.

You can probably take a look at it and update it as needed. As this information is gathered through the hosts, you need at least one host defined in OpenNebula to get it. Note that this scheme is used to make it compatible with both shared and non-shared System DS.
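
For reference, the df-based part boils down to something like this (a simplified sketch, not the verbatim probe; check src/im_mad/remotes/common.d/monitor_ds.sh for the exact variable names and output format):

#!/bin/bash
# Simplified sketch: derive the datastore location totals from df, in MiB.
DATASTORE_LOCATION=${1:-/var/lib/one/datastores}

# df -P -m: POSIX-format output with sizes in MiB;
# columns 2/3/4 are total/used/available.
read -r TOTAL_MB USED_MB FREE_MB < <(
    df -P -m "$DATASTORE_LOCATION" | tail -n 1 | awk '{print $2, $3, $4}'
)

echo "DS_LOCATION_TOTAL_MB=$TOTAL_MB"
echo "DS_LOCATION_USED_MB=$USED_MB"
echo "DS_LOCATION_FREE_MB=$FREE_MB"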

Cheers

Ruben

Forgot to say that if you are using the system DS only for files, the current monitoring should be OK. OpenNebula decides where cloning happens for the StorPool datastore based on its configuration parameters. For example, for Ceph:

TM_MAD_CONF = [
    name         = "ceph",
    ln_target    = "NONE",
    clone_target = "SELF",
    shared       = "yes"
]

means that clones are done in the Image datastore (SELF), so they won't take any space from the System DS.

Hi Ruben,

Thank you for the prompt hints. I will check the script and think about how to solve the issue.

My goal is to have both the system and image datastores driven by StorPool volumes. Our setup is already using clone_target = SELF. During the development of the system DS support in our driver I found out that it is possible to have the system datastore on a non-shared filesystem and the image datastore on a shared filesystem. This case is solved by introducing a new attribute for the system datastore (SP_SYSTEM) which, if set to ssh (SP_SYSTEM = ssh), overrides the default shared=yes value from oned.conf.
At the moment the shared attribute of the TM_MAD configuration in oned.conf is global for all datastores driven by the named TM_MAD, hence the workaround above. What do you think about making the shared attribute optional per datastore? I agree that there is extra work to adapt all TM_MAD drivers to support such a change, but it is doable, as you can see in our addon :smile: .
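
For illustration, a system datastore overriding the driver default could be registered roughly like this (only SP_SYSTEM comes from our addon; the rest of the template and the file name are just an example):

# Illustrative only: register a StorPool-backed system datastore that
# overrides the TM_MAD-level shared=yes default via SP_SYSTEM=ssh.
cat > system-storpool.ds <<'EOF'
NAME      = "system-storpool"
TYPE      = SYSTEM_DS
TM_MAD    = "storpool"
SP_SYSTEM = "ssh"
EOF
onedatastore create system-storpool.ds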

Cheers
Anton

Hi Anton,

I think it is better to keep your current approach. Default attributes for the driver are set in oned.conf (TM_MAD_CONF). Then you can override them or include other configuration attributes per datastore. This approach is used in other datastores to control the specific behavior of each one (e.g. specific access keys, pool names and so on).

Cheers

Ruben

Hi Ruben,

It is working fine so I will not touch it :smile:

Cheers
Anton

:slight_smile: You know what they say…

Hi Ruben,

I managed to create scripts to handle reporting the system datastore disk usage stats, but I need to clarify some details:

  1. The script src/im_mad/remotes/common.d/monitor_ds.sh is copied to each im/remotes/*-probes.d/ directory during install and then propagated to /var/tmp/one/im/… on each node. So the documented procedure to install our hook from the driver should be:
    a) patch $ONE_LOCATION/remotes/im/*-probes.d/monitor_ds.sh (just add one line of code to source our script; see the sketch after point 2)
    b) propagate the change to all nodes by issuing onehost sync hostname

  2. I am not sure exactly what monitor_ds.sh is reporting. It has:
    DS_LOCATION_{USED,TOTAL,FREE}_MB - the size on the filesystem of all datastores in the given DATASTORE_LOCATION path

Then, for each datastore in the given DATASTORE_LOCATION path, we have the following alternatives:
a) {USED,TOTAL,FREE}_MB - the size on the filesystem

b) if there is an LVM setup, the size from LVM is reported as {USED,TOTAL,FREE}_MB and the size on the filesystem as VOLATILE_{USED,TOTAL,FREE}_MB.

IMHO our case is b), but my understanding is that I should report the datastore size on the filesystem as {USED,TOTAL,FREE}_MB and the size used by the volatile and context images on StorPool as VOLATILE_{USED,TOTAL,FREE}_MB. Please correct me if I am wrong.
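
To make the question more concrete, here is a rough sketch of the kind of snippet that the one-line source hook from point 1a would pull in. The storpool_ds_usage_mb helper is only a placeholder (the real logic in our addon queries StorPool), and the exact output format should be cross-checked against the shipped monitor_ds.sh:

# Sketch of an extension sourced from monitor_ds.sh (the "one line of code"
# from point 1a above). The StorPool query is a placeholder; the real
# implementation parses the StorPool CLI/API in the addon.
storpool_ds_usage_mb() {
    # hypothetical helper: return "USED_MB TOTAL_MB FREE_MB" for a datastore
    local ds_id="$1"
    echo "0 0 0"
}

DATASTORE_LOCATION=${DATASTORE_LOCATION:-/var/lib/one/datastores}

for dir in "$DATASTORE_LOCATION"/*; do
    [ -d "$dir" ] || continue
    ds_id=$(basename "$dir")
    read -r USED_MB TOTAL_MB FREE_MB < <(storpool_ds_usage_mb "$ds_id")

    # Emit the same per-datastore structure that monitor_ds.sh already
    # produces, so nothing changes on the oned/collectd side.
    cat <<EOF
DS = [
  ID = $ds_id,
  USED_MB = $USED_MB,
  TOTAL_MB = $TOTAL_MB,
  FREE_MB = $FREE_MB
]
EOF
done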

And one minor note - the monitor_ds.sh script is missing copyright in its header. :wink:

Cheers,
Anton

Hi Ruben,

I’ve just pushed my latest changes to our ‘next’ branch of addon-storpool. Regarding the reporting of the used space, I ran some tests and figured out that the VOLATILE_{USED,TOTAL,FREE}_MB variables are not taken into account at all. So I tweaked our extension to the monitor_ds.sh script to calculate and report the most honest numbers via the {USED,TOTAL,FREE}_MB variables.

I am planning to push it to our master branch because I do not see anything further I could improve on it; it has been working for weeks in our testing lab.

There are pieces of upcoming work that I would like to base on the code from the next branch. My goal is for our addon to be ready (shortly if not immediately) after 4.14 is released. These feature branches will be named following your development naming convention; for example, feature-3782 is almost ready to push :smile:

Kind Regards,
Anton Todorov