Running OpenNebula 5.2.1 on two Ubuntu 16.04 VMs, I have a similar issue: Pacemaker fails to start opennebula-sunstone:
Stack: corosync
Current DC: one-a (version 1.1.14-70404b0) - partition with quorum
2 nodes and 5 resources configured
Online: [ one-a one-b ]
Full list of resources:
Resource Group: opennebula-cluster
VIP (ocf::heartbeat:IPaddr2): Started one-b
opennebula (systemd:opennebula.service): Started one-b
opennebula-gate (systemd:opennebula-gate.service): Started one-b
opennebula-sunstone (systemd:opennebula-sunstone.service): Stopped
opennebula-flow (systemd:opennebula-flow.service): Stopped
Failed Actions:
* opennebula_monitor_30000 on one-b 'not running' (7): call=25, status=complete, exitreason='none',
last-rc-change='Mon Mar 13 10:54:37 2017', queued=0ms, exec=0ms
* opennebula-sunstone_start_0 on one-b 'not running' (7): call=28, status=complete, exitreason='none',
last-rc-change='Mon Mar 13 10:53:38 2017', queued=0ms, exec=2015ms
* opennebula-sunstone_start_0 on one-a 'not running' (7): call=28, status=complete, exitreason='none',
last-rc-change='Fri Mar 10 19:19:56 2017', queued=0ms, exec=2012ms
* opennebula_monitor_30000 on one-a 'OCF_PENDING' (196): call=25, status=complete, exitreason='none',
last-rc-change='Fri Mar 10 19:20:25 2017', queued=0ms, exec=0ms
I’ve put
ExecStartPre=/usr/bin/logger "sunstone systemd script invoked"
into /lib/systemd/system/opennebula-sunstone.service and rebooted that node. As expected, the message doesn’t show up anywhere (the output above comes from that reboot of node one-b), even though logger itself works:
root@one-b:~# /usr/bin/logger "TEST: sunstone systemd script invoked"
root@one-b:~# grep -r "sunstone systemd script invoked" /var/log/
/var/log/syslog:Mar 13 11:34:46 one-b root: TEST: sunstone systemd script invoked
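Side note for anyone reproducing this: rather than editing the packaged unit under /lib/systemd/system (which a package upgrade would overwrite), the same debug hook could go into a drop-in override. A sketch, assuming the stock unit layout; path and filename are my own choice:

```ini
# /etc/systemd/system/opennebula-sunstone.service.d/debug.conf
# Hypothetical drop-in; run `systemctl daemon-reload` after creating it.
[Service]
ExecStartPre=/usr/bin/logger "sunstone systemd script invoked"
```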
root@one-b:~# systemctl status opennebula-sunstone.service
● opennebula-sunstone.service - OpenNebula Web UI Server
Loaded: loaded (/lib/systemd/system/opennebula-sunstone.service; disabled; vendor preset: enabled)
Active: inactive (dead)
Mar 13 10:53:40 one-b systemd[1]: Stopped OpenNebula Web UI Server.
I’m no systemd expert, but to me it seems as if the service isn’t being started at all, and I can’t make out how it’s supposed to work in the first place:
root@one-b:~# grep opennebula.*service /lib/systemd/system/*.service
/lib/systemd/system/opennebula-novnc.service:Before=opennebula-sunstone.service
/lib/systemd/system/opennebula-scheduler.service:After=opennebula.service
/lib/systemd/system/opennebula-scheduler.service:BindTo=opennebula.service
/lib/systemd/system/opennebula.service:Before=opennebula-scheduler.service
/lib/systemd/system/opennebula.service:BindTo=opennebula-scheduler.service
/lib/systemd/system/opennebula-sunstone.service:After=opennebula.service
/lib/systemd/system/opennebula-sunstone.service:After=opennebula-novnc.service
/lib/systemd/system/opennebula-sunstone.service:BindTo=opennebula-novnc.service
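Instead of grepping the unit files, systemd itself can report the dependency graph it computed. A sketch of what one could run on the node (standard systemctl invocations; the output obviously depends on the units installed there):

```shell
# Who orders against / pulls in opennebula-novnc.service?
systemctl list-dependencies --reverse opennebula-novnc.service

# Show the requirement/ordering properties systemd computed for sunstone
systemctl show opennebula-sunstone.service -p Requires -p BindsTo -p Wants -p After
```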
opennebula-sunstone.service wants to be started after opennebula.service and opennebula-novnc.service, and the latter wants to be started before opennebula-sunstone.service. opennebula-novnc.service doesn’t seem to be started explicitly anywhere; if I read the systemd docs correctly, the BindTo= (nowadays spelled BindsTo=) in opennebula-sunstone.service implies a Requires=-style dependency, so starting Sunstone should pull noVNC in automatically. It is at least running on one-b:
root@one-b:~# systemctl status opennebula-novnc.service
● opennebula-novnc.service - OpenNebula noVNC Server
Loaded: loaded (/lib/systemd/system/opennebula-novnc.service; disabled; vendor preset: enabled)
Active: active (running) since Mon 2017-03-13 10:54:10 CET; 1h 39min ago
Process: 1399 ExecStart=/usr/bin/novnc-server start (code=exited, status=0/SUCCESS)
Main PID: 1638 (python2)
Tasks: 1
Memory: 27.9M
CPU: 1.923s
CGroup: /system.slice/opennebula-novnc.service
└─1638 python2 /usr/share/one/websockify/websocketproxy.py --target-config=/var/lib/one/sunstone_vnc_tokens 29876
Mar 13 10:53:38 one-b systemd[1]: Starting OpenNebula noVNC Server...
Mar 13 10:54:10 one-b novnc-server[1399]: VNC proxy started
Mar 13 10:54:10 one-b systemd[1]: Started OpenNebula noVNC Server.
I’m at a loss here: starting Sunstone via systemd manually “just works” at this point, while starting it via Pacemaker doesn’t, and it looks like the start isn’t even attempted ("status=complete, exitreason='none'", see above)?
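One way to narrow this down might be to bypass the cluster scheduling and have Pacemaker attempt the start directly while watching the journal. A sketch (crm_resource options as documented for Pacemaker 1.1.x; the resource name is taken from the status output above, adjust if yours differs):

```shell
# Ask Pacemaker to run the start action for the resource locally, verbosely
crm_resource --resource opennebula-sunstone --force-start -V

# In parallel, see everything systemd logged for the unit since boot
journalctl -u opennebula-sunstone.service -b --no-pager
```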