Hello everyone,
I’m working at a university data centre and I’m planning a deployment of OpenNebula on bare-metal servers. I’m looking for up-to-date best practices, particularly around PXE boot for unattended installations and autoconfiguration to streamline the process. The goal is a scalable, automated environment that supports multi-tenant management, high availability, and robust security measures. To provide more context, I’ll outline my key requirements based on my actual setup and feature list.
First, I need guidance on using PXE boot effectively for network-based deployment. This includes setting up a PXE server with DHCP and TFTP to enable automatic booting of bare-metal servers, combined with tools like Kickstart or Cloud-Init for OS installation and initial configuration. I’m aiming for minimal manual intervention to ensure scalability and reliability in a production environment.
Additionally, autoconfiguration is crucial for my use case. I want to automate the provisioning of bare-metal servers with scripts that handle host registration, inventory management, and eventually also firmware updates right after boot. This ties into broader automation strategies using tools like Ansible or Terraform to manage configurations. The hosts must use Fibre Channel (FC) connections to a Huawei Dorado SAN array for data storage. The servers are Dell models (R640, R750, R6515, R7515). Optionally, HCI (storage inside the compute servers) with Ceph should also be possible.
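To make the host-registration part concrete, here is a minimal sketch of the kind of step I have in mind. It only builds the `onehost create` command lines for a list of freshly installed nodes and does not run them. The hostnames are hypothetical placeholders, and the KVM information and virtualization drivers are an assumption about my setup, not something I have settled on.

```python
import shlex


def onehost_commands(hostnames, im_driver="kvm", vm_driver="kvm"):
    """Build one `onehost create` command line per hostname.

    The driver names default to KVM, which is an assumption;
    nothing is executed here, the commands are only returned as strings.
    """
    return [
        shlex.join(["onehost", "create", name, "--im", im_driver, "--vm", vm_driver])
        for name in hostnames
    ]


if __name__ == "__main__":
    # Hypothetical hostnames, standing in for the real inventory.
    for command in onehost_commands(["node01.example.org", "node02.example.org"]):
        print(command)
```

In practice I would expect something like this to be driven by Ansible or a post-install hook rather than a standalone script, which is part of what I am asking about.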
From a management perspective, I’m focusing on multi-tenant features, such as integrating LDAP or OIDC for authentication, implementing strict quotas to prevent resource overuse, and enabling self-service portals via OpenNebula’s Sunstone UI. Security is a top priority, so I’m interested in best practices for RBAC, separated networks, multi-VLAN usage, microsegmentation, and regular updates to mitigate vulnerabilities.
If anyone has recent experiences or recommendations—such as sample scripts, configuration templates, or pitfalls to avoid—please share them. I’m particularly curious about integrating these elements into a CI/CD pipeline for ongoing maintenance and how to handle challenges like ensuring fault tolerance during deployments and upgrades.
To start with, there will be around 500 VMs on ~20 servers. What do you think? Is it possible to administer this setup with one person?
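As a rough sizing check on those numbers, the small sketch below computes the average VM density per host, including the case where one host is down for maintenance or has failed. It assumes VMs are spread evenly, which will not hold exactly in practice, and the function name is just something I made up for illustration.

```python
def vms_per_host(total_vms, hosts, hosts_down=0):
    """Average number of VMs per available host, assuming an even spread."""
    available = hosts - hosts_down
    if available <= 0:
        raise ValueError("at least one host must remain available")
    return total_vms / available


if __name__ == "__main__":
    print(vms_per_host(500, 20))                       # all 20 hosts up
    print(round(vms_per_host(500, 20, hosts_down=1), 1))  # one host down
```

With all hosts up that is 25 VMs per host, and roughly 26 per host with one server out of service.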
Thanks in advance for your insights and any resources you can point to!
Best regards,
Christian