I’m running OpenNebula 7.0.1 and I’ve run into an issue with live migration.
Whenever I perform a live migration, the VM becomes unusable afterward and I see “I/O ERROR”.
However, cold/offline migration works fine with no issues.
The error shown inside the VM is as follows:
It looks like the underlying iSCSI block device becomes unresponsive, but I’ve been monitoring it closely: the iSCSI sessions and multipath mappings remain stable, with no link flapping or disconnects.
root@manager01:~# pvs
  PV                                   VG         Fmt  Attr PSize   PFree
  /dev/mapper/one_system_data_ceph_11T vg-one-106 lvm2 a--  <11.00t 10.96t
root@manager01:~# vgs
  VG         #PV #LV #SN Attr   VSize   VFree
  vg-one-106   1   2   0 wz--n- <11.00t 10.96t
root@manager01:~# lvs
  LV             VG         Attr       LSize  Pool           Origin Data%  Meta%  Move Log Cpy%Sync Convert
  lv-one-23-0    vg-one-106 Vwi---tz-k 40.00g lv-one-23-pool
  lv-one-23-pool vg-one-106 twi---tz-k 40.00g
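One detail in the `lvs` output above may be worth checking. According to lvs(8), the fifth character of the Attr field is the volume state and the tenth is the skip-activation flag. For both volumes here the fifth character is `-` (not active on the host where `lvs` ran) and the tenth is `k`. A minimal Python sketch for decoding that field follows. The function name `decode_lv_attr` and the returned keys are invented for illustration, and only a few of the ten positions are decoded.

```python
def decode_lv_attr(attr: str) -> dict:
    """Decode a few positions of an LVM lv_attr string such as 'Vwi---tz-k'.

    Positions follow lvs(8); this is a simplified sketch, not a full decoder.
    """
    if len(attr) < 10:
        raise ValueError(f"expected at least 10 attribute characters, got {attr!r}")
    return {
        "volume_type": attr[0],            # 'V' = thin volume, 't' = thin pool
        "active": attr[4] == "a",          # state: 'a' means active (other states ignored here)
        "open": attr[5] == "o",            # device currently open
        "target_type": attr[6],            # 't' = thin
        "skip_activation": attr[9] == "k", # 'k' = skip activation flag set
    }


if __name__ == "__main__":
    for name, attr in [("lv-one-23-0", "Vwi---tz-k"), ("lv-one-23-pool", "twi---tz-k")]:
        print(name, decode_lv_attr(attr))
```

This output was taken on manager01, so an inactive state there may be entirely normal. Running the same check on compute-node01 and compute-node2 right after a live migration would likely say more about where the volume is actually active.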
root@manager01:~# multipathd show paths
hcil     dev dev_t pri dm_st  chk_st dev_st  next_check
16:0:0:0 sdc 8:32  50  active ready  running XX........ 10/40
15:0:0:0 sdb 8:16  10  active ready  running XX........ 10/40
17:0:0:0 sdd 8:48  50  active ready  running XX........ 11/40
18:0:0:0 sde 8:64  10  active ready  running XX........ 10/40
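To repeat the path check above on every host during a migration rather than by eye, the `multipathd show paths` output can be parsed with a small script. This is only a sketch: the helper names `parse_paths` and `unhealthy_paths` are hypothetical, and it assumes the default column layout shown above (hcil, dev, dev_t, pri, dm_st, chk_st, dev_st, next_check).

```python
HEALTHY = ("active", "ready", "running")

SAMPLE = """\
hcil dev dev_t pri dm_st chk_st dev_st next_check
16:0:0:0 sdc 8:32 50 active ready running XX........ 10/40
15:0:0:0 sdb 8:16 10 active ready running XX........ 10/40
17:0:0:0 sdd 8:48 50 active ready running XX........ 11/40
18:0:0:0 sde 8:64 10 active ready running XX........ 10/40
"""


def parse_paths(output: str) -> list[dict]:
    """Turn 'multipathd show paths' text into one dict per path."""
    paths = []
    for line in output.splitlines():
        fields = line.split()
        # Skip blank lines, the header row, and anything too short to be a path.
        if len(fields) < 7 or fields[0] == "hcil":
            continue
        paths.append({
            "hcil": fields[0],
            "dev": fields[1],
            "pri": int(fields[3]),
            "dm_st": fields[4],
            "chk_st": fields[5],
            "dev_st": fields[6],
        })
    return paths


def unhealthy_paths(paths: list[dict]) -> list[dict]:
    """Return paths whose three state columns are not active/ready/running."""
    return [p for p in paths
            if (p["dm_st"], p["chk_st"], p["dev_st"]) != HEALTHY]


if __name__ == "__main__":
    parsed = parse_paths(SAMPLE)
    print(len(parsed), "paths,", len(unhealthy_paths(parsed)), "unhealthy")
```

For the sample above, every path is reported healthy, which matches the observation that multipath itself stays stable. Note that this output is from manager01 only; the same check on both compute nodes would be needed to rule multipath out.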
Dec 26 16:07:37 compute-node01 kernel: br2310: port 2(one-23-0) entered disabled state
Dec 26 16:07:37 compute-node01 kernel: device one-23-0 left promiscuous mode
Dec 26 16:07:37 compute-node01 kernel: br2310: port 2(one-23-0) entered disabled state
Dec 26 16:07:38 compute-node01 kernel: kauditd_printk_skb: 2 callbacks suppressed
Dec 26 16:07:38 compute-node01 kernel: audit: type=1400 audit(1766736458.318:438): apparmor="STATUS" operation="profile_remove" profile="unconfined" name="libvirt-5e011c1f-39bd-4a19-ab30-3d527c9d3f48" pid=342340 comm="apparmor_parser"
Dec 26 16:07:38 compute-node01 kernel: br2310: port 1(trunk0.2310) entered disabled state
Dec 26 16:07:38 compute-node01 kernel: device trunk0.2310 left promiscuous mode
Dec 26 16:07:38 compute-node01 kernel: br2310: port 1(trunk0.2310) entered disabled state
Dec 26 16:07:32 compute-node2 kernel: br2310: port 1(trunk0.2310) entered blocking state
Dec 26 16:07:32 compute-node2 kernel: br2310: port 1(trunk0.2310) entered disabled state
Dec 26 16:07:32 compute-node2 kernel: device trunk0.2310 entered promiscuous mode
Dec 26 16:07:32 compute-node2 kernel: br2310: port 1(trunk0.2310) entered blocking state
Dec 26 16:07:32 compute-node2 kernel: br2310: port 1(trunk0.2310) entered forwarding state
Dec 26 16:07:32 compute-node2 kernel: IPv6: ADDRCONF(NETDEV_CHANGE): br2310: link becomes ready
Dec 26 16:07:33 compute-node2 kernel: audit: type=1400 audit(1766736453.262:303): apparmor="STATUS" operation="profile_load" profile="unconfined" name="libvirt-5e011c1f-39bd-4a19-ab30-3d527c9d3f48" pid=1251122 comm="apparmor_parser"
Dec 26 16:07:33 compute-node2 kernel: audit: type=1400 audit(1766736453.402:304): apparmor="STATUS" operation="profile_replace" profile="unconfined" name="libvirt-5e011c1f-39bd-4a19-ab30-3d527c9d3f48" pid=1251125 comm="apparmor_parser"
Dec 26 16:07:33 compute-node2 kernel: audit: type=1400 audit(1766736453.550:305): apparmor="STATUS" operation="profile_replace" profile="unconfined" name="libvirt-5e011c1f-39bd-4a19-ab30-3d527c9d3f48" pid=1251129 comm="apparmor_parser"
Dec 26 16:07:33 compute-node2 kernel: audit: type=1400 audit(1766736453.698:306): apparmor="STATUS" operation="profile_replace" info="same as current profile, skipping" profile="unconfined" name="libvirt-5e011c1f-39bd-4a19-ab30-3d527c9d3f48" pid=1251133 comm="apparmor_parser"
Dec 26 16:07:33 compute-node2 kernel: br2310: port 2(one-23-0) entered blocking state
Dec 26 16:07:33 compute-node2 kernel: br2310: port 2(one-23-0) entered disabled state
Dec 26 16:07:33 compute-node2 kernel: device one-23-0 entered promiscuous mode
Dec 26 16:07:33 compute-node2 kernel: br2310: port 2(one-23-0) entered blocking state
Dec 26 16:07:33 compute-node2 kernel: br2310: port 2(one-23-0) entered forwarding state
Dec 26 16:07:33 compute-node2 kernel: audit: type=1400 audit(1766736453.866:307): apparmor="STATUS" operation="profile_replace" profile="unconfined" name="libvirt-5e011c1f-39bd-4a19-ab30-3d527c9d3f48" pid=1251158 comm="apparmor_parser"
Dec 26 16:07:34 compute-node2 kernel: audit: type=1400 audit(1766736454.014:308): apparmor="STATUS" operation="profile_replace" info="same as current profile, skipping" profile="unconfined" name="libvirt-5e011c1f-39bd-4a19-ab30-3d527c9d3f48" pid=1251161 comm="apparmor_parser"
Dec 26 16:07:34 compute-node2 kernel: audit: type=1400 audit(1766736454.150:309): apparmor="STATUS" operation="profile_replace" info="same as current profile, skipping" profile="unconfined" name="libvirt-5e011c1f-39bd-4a19-ab30-3d527c9d3f48" pid=1251164 comm="apparmor_parser"
Dec 26 16:07:34 compute-node2 kernel: audit: type=1400 audit(1766736454.286:310): apparmor="STATUS" operation="profile_replace" info="same as current profile, skipping" profile="unconfined" name="libvirt-5e011c1f-39bd-4a19-ab30-3d527c9d3f48" pid=1251167 comm="apparmor_parser"
Dec 26 16:07:34 compute-node2 kernel: audit: type=1400 audit(1766736454.426:311): apparmor="STATUS" operation="profile_replace" info="same as current profile, skipping" profile="unconfined" name="libvirt-5e011c1f-39bd-4a19-ab30-3d527c9d3f48" pid=1251170 comm="apparmor_parser"
Dec 26 16:07:34 compute-node2 kernel: audit: type=1400 audit(1766736454.570:312): apparmor="STATUS" operation="profile_replace" info="same as current profile, skipping" profile="unconfined" name="libvirt-5e011c1f-39bd-4a19-ab30-3d527c9d3f48" pid=1251173 comm="apparmor_parser"
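To line up the two kernel logs above on a single timeline, the epoch value inside each `audit(...)` field can be used instead of the syslog prefix, since it does not depend on locale. The sketch below is illustrative only: the function name `audit_time` is made up, and the UTC+8 offset is an assumption inferred from the syslog times matching the epoch values.

```python
import re
from datetime import datetime, timedelta, timezone

# Matches e.g. "audit(1766736458.318:438)" -> seconds, fraction, serial number.
AUDIT_RE = re.compile(r"audit\((\d+)\.(\d+):(\d+)\)")


def audit_time(line: str, utc_offset_hours: int = 0):
    """Return the audit timestamp in a log line as an aware datetime, or None."""
    match = AUDIT_RE.search(line)
    if match is None:
        return None
    seconds = int(match.group(1))
    # Pad or trim the fractional part to microseconds without using floats.
    micros = int(match.group(2).ljust(6, "0")[:6])
    tz = timezone(timedelta(hours=utc_offset_hours))
    return datetime.fromtimestamp(seconds, tz=tz).replace(microsecond=micros)


if __name__ == "__main__":
    source_remove = 'compute-node01 kernel: audit: type=1400 audit(1766736458.318:438): operation="profile_remove"'
    target_load = 'compute-node2 kernel: audit: type=1400 audit(1766736453.262:303): operation="profile_load"'
    t_remove = audit_time(source_remove, utc_offset_hours=8)  # assumed UTC+8
    t_load = audit_time(target_load, utc_offset_hours=8)
    print("profile_load on target:  ", t_load.isoformat())
    print("profile_remove on source:", t_remove.isoformat())
    print("gap in seconds:", (t_remove - t_load).total_seconds())
```

The comparison is only meaningful if both compute nodes have synchronized clocks, which is itself worth confirming.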
I’ve checked and troubleshot many parts of the environment but still can’t find the root cause. I’d really appreciate any guidance or support.
Thanks a lot!




