We have observed that the Nebula service frequently gets stuck during VM lifecycle stage transitions.
For example, when more than 20 actions occur simultaneously (such as power-off to running, prolog to boot, or boot to running), the process often hangs. During this time, the VM log stops updating in real time, as seen when running:
tail -f /var/log/one/<vm_id>.log
No further log entries are generated until we restart the Nebula service. Once restarted, the logs resume and the VM states begin changing normally again.
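To spot the stall without watching tail by hand, a check along these lines could be used. This is only a minimal sketch: the function names and the 120-second threshold are our own assumptions, not anything provided by Nebula, and a quiet log can also simply mean that no actions are pending for that VM.

```python
import os
import sys
import time


def seconds_since_update(path, now=None):
    """Return the number of seconds since the file at `path` was last modified."""
    if now is None:
        now = time.time()
    return now - os.path.getmtime(path)


def is_stalled(path, threshold_s=120.0, now=None):
    """Return True if the file has not been written to for more than `threshold_s` seconds.

    The threshold is an arbitrary assumption; tune it to how often the
    VM log is normally written during a state transition.
    """
    return seconds_since_update(path, now) > threshold_s


if __name__ == "__main__":
    # Usage: python check_log_stall.py /var/log/one/<vm_id>.log [more logs...]
    for log_path in sys.argv[1:]:
        state = "STALLED" if is_stalled(log_path) else "ok"
        print(f"{log_path}: {state}")
```

Run periodically (for example from cron) against the logs of VMs that are known to be mid-transition, this would at least record when the hang starts, which may help correlate it with the number of concurrent actions.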
This suggests that the Nebula service becomes unresponsive during high-concurrency operations or multiple simultaneous state transitions. Could you please help us understand the root cause of this behavior?
Is this a known limitation or expected behavior of Nebula, or are there configuration parameters in oned.conf that can be adjusted to prevent such frequent hangs?
Currently, I need to restart the Nebula service often to recover from these situations, and I would appreciate your guidance on a permanent or configuration-based fix.