Hi,
We are running OpenNebula (5.12.0.3) on top of physical Ubuntu nodes.
Last week, a OneFlow service with four different roles went into WARNING, since one of the roles (SGE_ubuntu_standard) was in state WARNING. There are no running VMs in the service.
=> How can I check why the service / role is in WARNING? The only odd thing I see in the logs is "Update 18:SGE_ubuntu_standard cardinality to -1".
I am unsure whether the 6.0 docs apply to our 5.12, but in the life-cycle description I don't see a way out of that state. I was hoping I could recover the service, but that is not possible:
oneflow recover 18
Service cannot be recovered in state: WARNING
=> How can I clear the WARNING state for that service?
Yours,
Steffen
oneadmin@stratus:~$ oneflow show 18
SERVICE 18 INFORMATION
ID : 18
NAME : SGE
USER : oneadmin
GROUP : oneadmin
STRATEGY : none
SERVICE STATE : WARNING
PERMISSIONS
OWNER : um-
GROUP : ---
OTHER : ---
ROLE SGE_ubuntu_small
ROLE STATE : RUNNING
PARENTS : SGE_master
VM TEMPLATE : 24
CARDINALITY : 0
MIN VMS : 0
MAX VMS : 40
NODES INFORMATION
VM_ID NAME USER GROUP
ROLE SGE_ubuntu_standard
ROLE STATE : WARNING
PARENTS : SGE_master
VM TEMPLATE : 25
CARDINALITY : 0
MIN VMS : 0
MAX VMS : 12
NODES INFORMATION
VM_ID NAME USER GROUP
ROLE SGE_suse_small
ROLE STATE : RUNNING
PARENTS : SGE_master
VM TEMPLATE : 26
CARDINALITY : 0
MIN VMS : 0
MAX VMS : 40
NODES INFORMATION
VM_ID NAME USER GROUP
ROLE SGE_suse_standard
ROLE STATE : RUNNING
PARENTS : SGE_master
VM TEMPLATE : 27
CARDINALITY : 0
MIN VMS : 0
MAX VMS : 12
NODES INFORMATION
VM_ID NAME USER GROUP
ROLE SGE_master
ROLE STATE : RUNNING
VM TEMPLATE : 42
CARDINALITY : 0
NODES INFORMATION
VM_ID NAME USER GROUP
LOG MESSAGES
04/23/21 09:56 [I] Role SGE_ubuntu_standard scaling up from 0 to 1 nodes
04/23/21 09:56 [I] New state: SCALING
04/23/21 09:57 [I] New state: COOLDOWN
04/23/21 09:57 [I] New state: RUNNING
04/23/21 15:55 [I] Role SGE_ubuntu_standard scaling down from 1 to 0 nodes
04/25/21 12:14 [I] Role SGE_ubuntu_standard scaling up from 0 to 1 nodes
04/25/21 12:14 [I] New state: SCALING
04/25/21 12:14 [I] New state: COOLDOWN
04/25/21 12:14 [I] New state: RUNNING
04/25/21 14:16 [I] Role SGE_ubuntu_standard scaling up from 1 to 2 nodes
04/25/21 14:16 [I] New state: SCALING
04/25/21 14:17 [I] New state: COOLDOWN
04/25/21 14:17 [I] New state: RUNNING
04/25/21 15:55 [I] Role SGE_ubuntu_standard scaling down from 2 to 1 nodes
04/25/21 15:56 [I] Role SGE_ubuntu_standard scaling down from 1 to 0 nodes
04/25/21 16:58 [I] Role SGE_ubuntu_standard scaling up from 0 to 1 nodes
04/25/21 16:58 [I] New state: SCALING
04/25/21 16:58 [I] New state: COOLDOWN
04/25/21 16:58 [I] New state: RUNNING
04/25/21 16:59 [I] Role SGE_ubuntu_standard scaling down from 1 to 0 nodes
04/25/21 17:09 [I] Role SGE_ubuntu_standard scaling up from 0 to 10 nodes
04/25/21 17:09 [I] New state: SCALING
04/25/21 17:10 [I] New state: COOLDOWN
04/25/21 17:10 [I] New state: RUNNING
04/26/21 09:56 [I] Role SGE_ubuntu_standard scaling down from 10 to 9 nodes
04/26/21 09:56 [I] Role SGE_ubuntu_standard scaling down from 9 to 8 nodes
04/26/21 09:56 [I] Role SGE_ubuntu_standard scaling down from 8 to 7 nodes
04/26/21 09:56 [I] Role SGE_ubuntu_standard scaling down from 7 to 6 nodes
04/26/21 09:56 [I] Role SGE_ubuntu_standard scaling down from 6 to 5 nodes
04/26/21 09:56 [I] Role SGE_ubuntu_standard scaling down from 5 to 4 nodes
04/26/21 09:56 [I] Role SGE_ubuntu_standard scaling down from 4 to 3 nodes
04/26/21 09:56 [I] Role SGE_ubuntu_standard scaling down from 3 to 2 nodes
04/26/21 09:56 [I] Role SGE_ubuntu_standard scaling down from 2 to 1 nodes
04/26/21 09:56 [I] Role SGE_ubuntu_standard scaling down from 1 to 0 nodes
04/27/21 08:35 [I] Role SGE_ubuntu_standard scaling up from 0 to 1 nodes
04/27/21 08:35 [I] New state: SCALING
04/27/21 08:36 [I] New state: COOLDOWN
04/27/21 08:36 [I] New state: RUNNING
04/27/21 09:56 [I] Role SGE_ubuntu_standard scaling down from 1 to 0 nodes
04/27/21 10:43 [I] Role SGE_ubuntu_standard scaling up from 0 to 1 nodes
04/27/21 10:43 [I] New state: SCALING
04/27/21 10:44 [I] New state: COOLDOWN
04/27/21 10:44 [I] New state: RUNNING
04/28/21 09:56 [I] Role SGE_ubuntu_standard scaling down from 1 to 0 nodes
04/29/21 09:42 [I] Role SGE_ubuntu_standard scaling up from 0 to 1 nodes
04/29/21 09:42 [I] New state: SCALING
04/29/21 09:42 [I] New state: COOLDOWN
04/29/21 09:42 [I] New state: RUNNING
04/29/21 10:46 [I] New state: WARNING
04/29/21 21:56 [I] Role SGE_ubuntu_standard scaling down from 1 to 0 nodes
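To pin down when the service last changed state, the LOG MESSAGES above can be filtered for "New state" entries. This is only a minimal Python sketch: it assumes the `MM/DD/YY HH:MM [I] New state: X` layout seen in the excerpt, and `state_transitions` is a hypothetical helper name, not part of OpenNebula.

```python
import re
from datetime import datetime

# Assumed format, taken from the excerpt: "04/29/21 10:46 [I] New state: WARNING".
STATE_RE = re.compile(r"^(\d{2}/\d{2}/\d{2} \d{2}:\d{2}) \[\w\] New state: (\w+)$")


def state_transitions(lines):
    """Return (timestamp, state) pairs for every 'New state' entry, in log order."""
    transitions = []
    for line in lines:
        match = STATE_RE.match(line.strip())
        if match:
            when = datetime.strptime(match.group(1), "%m/%d/%y %H:%M")
            transitions.append((when, match.group(2)))
    return transitions


# Three lines copied from the service log above.
sample = [
    "04/29/21 09:42 [I] New state: RUNNING",
    "04/29/21 10:46 [I] New state: WARNING",
    "04/29/21 21:56 [I] Role SGE_ubuntu_standard scaling down from 1 to 0 nodes",
]

transitions = state_transitions(sample)
last_when, last_state = transitions[-1]
print(last_when.isoformat(), last_state)
```

On these sample lines the last transition is the one into WARNING at 10:46 on 04/29; the later scale-down line is not a state change, so it is skipped.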
/var/log/one/oneflow.log:
Sun Apr 25 06:26:40 2021 [I]: [AE] Checking policies for service: 18
Sun Apr 25 06:26:57 2021 [I]: [WD] Warning 18: SGE_ubuntu_standard is DONE
Sun Apr 25 06:26:57 2021 [I]: [WD] Update 18:SGE_ubuntu_standard cardinality to -1
...
Sun Apr 25 15:15:08 2021 [I]: [AE] Checking policies for service: 18
Sun Apr 25 15:15:26 2021 [I]: [WD] Running 18: SGE_ubuntu_standard is ACTIVE
Sun Apr 25 15:15:56 2021 [I]: [WD] Running 18: SGE_ubuntu_standard is ACTIVE
Sun Apr 25 15:16:26 2021 [I]: [WD] Running 18: SGE_ubuntu_standard is ACTIVE
...
Thu Apr 29 10:45:44 2021 [I]: [AE] Checking policies for service: 18
Thu Apr 29 10:45:48 2021 [I]: [WD] Running 18: SGE_ubuntu_standard is ACTIVE
Thu Apr 29 10:46:18 2021 [I]: [WD] Running 18: SGE_ubuntu_standard is ACTIVE
Thu Apr 29 10:46:28 2021 [I]: [WD] Warning 18: SGE_ubuntu_standard is ACTIVE
Thu Apr 29 10:46:58 2021 [I]: [WD] Warning 18: SGE_ubuntu_standard is ACTIVE
Thu Apr 29 10:47:28 2021 [I]: [WD] Warning 18: SGE_ubuntu_standard is ACTIVE
Thu Apr 29 10:47:58 2021 [I]: [WD] Warning 18: SGE_ubuntu_standard is ACTIVE
Thu Apr 29 10:48:28 2021 [I]: [WD] Warning 18: SGE_ubuntu_standard is ACTIVE
Thu Apr 29 10:48:58 2021 [I]: [WD] Warning 18: SGE_ubuntu_standard is ACTIVE
Thu Apr 29 10:49:28 2021 [I]: [WD] Warning 18: SGE_ubuntu_standard is ACTIVE
Thu Apr 29 10:49:50 2021 [I]: [WD] Warning 18: SGE_ubuntu_standard is POWEROFF
Thu Apr 29 10:50:20 2021 [I]: [WD] Warning 18: SGE_ubuntu_standard is POWEROFF
Thu Apr 29 10:50:50 2021 [I]: [WD] Warning 18: SGE_ubuntu_standard is POWEROFF
...
Thu Apr 29 21:55:20 2021 [I]: [WD] Warning 18: SGE_ubuntu_standard is POWEROFF
Thu Apr 29 21:55:50 2021 [I]: [WD] Warning 18: SGE_ubuntu_standard is POWEROFF
Thu Apr 29 21:56:20 2021 [I]: [WD] Warning 18: SGE_ubuntu_standard is POWEROFF
Thu Apr 29 21:56:51 2021 [I]: [WD] Warning 18: SGE_ubuntu_standard is DONE
Thu Apr 29 21:56:51 2021 [I]: [WD] Update 18:SGE_ubuntu_standard cardinality to 0
Thu Apr 29 21:57:21 2021 [I]: [WD] Warning 18: SGE_ubuntu_standard is DONE
Thu Apr 29 21:57:21 2021 [I]: [WD] Update 18:SGE_ubuntu_standard cardinality to -1
Thu Apr 29 21:57:51 2021 [I]: [WD] Warning 18: SGE_ubuntu_standard is DONE
Thu Apr 29 21:57:51 2021 [I]: [WD] Update 18:SGE_ubuntu_standard cardinality to -1
...
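The negative cardinality updates can be pulled out of oneflow.log with a short script. The following is a rough Python sketch under the assumption that watchdog lines always look like the `[WD] Update <service>:<role> cardinality to <n>` entries in the excerpt above; `negative_cardinality_updates` is a hypothetical helper, and the script only reads log text without touching OpenNebula itself.

```python
import re

# Assumed format, taken from the excerpt:
# "Thu Apr 29 21:57:21 2021 [I]: [WD] Update 18:SGE_ubuntu_standard cardinality to -1"
UPDATE_RE = re.compile(
    r"^(?P<ts>\w{3} \w{3} +\d+ \d{2}:\d{2}:\d{2} \d{4}) \[\w\]: \[WD\] "
    r"Update (?P<service>\d+):(?P<role>\S+) cardinality to (?P<value>-?\d+)$"
)


def negative_cardinality_updates(lines):
    """Return (timestamp, service_id, role, value) for updates that set a cardinality below zero."""
    hits = []
    for line in lines:
        match = UPDATE_RE.match(line.strip())
        if match and int(match.group("value")) < 0:
            hits.append(
                (
                    match.group("ts"),
                    int(match.group("service")),
                    match.group("role"),
                    int(match.group("value")),
                )
            )
    return hits


# Four lines copied from the oneflow.log excerpt above.
sample = [
    "Thu Apr 29 21:56:51 2021 [I]: [WD] Warning 18: SGE_ubuntu_standard is DONE",
    "Thu Apr 29 21:56:51 2021 [I]: [WD] Update 18:SGE_ubuntu_standard cardinality to 0",
    "Thu Apr 29 21:57:21 2021 [I]: [WD] Warning 18: SGE_ubuntu_standard is DONE",
    "Thu Apr 29 21:57:21 2021 [I]: [WD] Update 18:SGE_ubuntu_standard cardinality to -1",
]

hits = negative_cardinality_updates(sample)
for ts, service, role, value in hits:
    print(f"{ts}: service {service}, role {role} set to {value}")
```

To run it against the real file, the sample list could be replaced with `open("/var/log/one/oneflow.log")`, since the function accepts any iterable of lines.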