[SOLVED] Error running onedb fsck a second time in 5.x

Hi all

We hit this problem while upgrading our DB from OpenNebula 5.0.2 to the latest OpenNebula 5.2.1. We followed the usual steps: http://docs.opennebula.org/5.2/intro_release_notes/upgrades/upgrade_50.html

$ onedb upgrade -v -S localhost -u oneadmin -d opennebula

and then

$ onedb fsck -S localhost -u oneadmin -d opennebula

fsck detected some issues, such as:
Cluster 0 has not the proper reserved VNC ports

Everything was apparently fixed, so we ran fsck again to confirm the issues were gone. This time we got:

$ onedb fsck -S localhost -u oneadmin -d opennebula
MySQL Password: 
MySQL dump stored in /var/lib/one/mysql_localhost_opennebula_2017-1-20_9:38:23.sql
Use 'onedb restore' or restore the DB using the mysql command:
mysql -u user -h server -P port db_name < backup_file


Cluster 0 has not the proper reserved VNC ports

Mysql::Error: Table 'network_pool_new' already exists
/usr/share/gems/gems/sequel-4.27.0/lib/sequel/adapters/mysql.rb:175:in `query'
/usr/share/gems/gems/sequel-4.27.0/lib/sequel/adapters/mysql.rb:175:in `block in _execute'
/usr/share/gems/gems/sequel-4.27.0/lib/sequel/database/logging.rb:33:in `log_yield'
/usr/share/gems/gems/sequel-4.27.0/lib/sequel/adapters/mysql.rb:175:in `_execute'
/usr/share/gems/gems/sequel-4.27.0/lib/sequel/adapters/shared/mysql_prepared_statements.rb:34:in `block in execute'
/usr/share/gems/gems/sequel-4.27.0/lib/sequel/database/connecting.rb:249:in `block in synchronize'
/usr/share/gems/gems/sequel-4.27.0/lib/sequel/connection_pool/threaded.rb:103:in `hold'
/usr/share/gems/gems/sequel-4.27.0/lib/sequel/database/connecting.rb:249:in `synchronize'
/usr/share/gems/gems/sequel-4.27.0/lib/sequel/adapters/shared/mysql_prepared_statements.rb:34:in `execute'
/usr/share/gems/gems/sequel-4.27.0/lib/sequel/adapters/mysql.rb:155:in `execute_dui'
/usr/share/gems/gems/sequel-4.27.0/lib/sequel/database/query.rb:43:in `execute_ddl'
/usr/share/gems/gems/sequel-4.27.0/lib/sequel/database/query.rb:76:in `run'
/usr/lib/one/ruby/onedb/fsck.rb:1693:in `fsck'
/usr/lib/one/ruby/onedb/onedb.rb:280:in `fsck'
/bin/onedb:317:in `block (2 levels) in <main>'
/usr/lib/one/ruby/cli/command_parser.rb:449:in `call'
/usr/lib/one/ruby/cli/command_parser.rb:449:in `run'
/usr/lib/one/ruby/cli/command_parser.rb:76:in `initialize'
/bin/onedb:210:in `new'
/bin/onedb:210:in `<main>' 

Fortunately fsck restored the DB after this error, but I’m not sure whether the DB is corrupted at this point. Has anyone else run into the same issue when upgrading the database? Any clue why fsck throws this error?

Cheers and thanks!
Alvaro

By the way, the table network_pool_new is still in our DB:

MariaDB [opennebula]> show tables;
+----------------------------+
| Tables_in_opennebula       |
+----------------------------+
| acl                        |
| cluster_datastore_relation |
| cluster_network_relation   |
| cluster_pool               |
| cluster_vnc_bitmap         |
| datastore_pool             |
| db_versioning              |
| document_pool              |
| group_pool                 |
| group_quotas               |
| history                    |
| host_monitoring            |
| host_pool                  |
| image_pool                 |
| local_db_versioning        |
| marketplace_pool           |
| marketplaceapp_pool        |
| network_pool               |
| network_pool_new           |
| network_vlan_bitmap        |
| pool_control               |
| secgroup_pool              |
| system_attributes          |
| template_pool              |
| user_pool                  |
| user_quotas                |
| vdc_pool                   |
| vm_import                  |
| vm_monitoring              |
| vm_pool                    |
| vm_showback                |
| vrouter_pool               |
| zone_pool                  |
+----------------------------+

and it’s empty.
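
For anyone checking their own DB for the same kind of leftover: the "already exists" error suggests that fsck creates network_pool_new as a scratch table and expects it not to be there, so an earlier run may have left it behind. That is an assumption drawn from the trace, not something verified in fsck.rb. A minimal sketch that takes the table names from `show tables` and lists every `<name>_new` table whose base table also exists (the helper name is hypothetical and not part of onedb):

```ruby
# Hypothetical helper, not part of onedb: given the table names reported by
# `show tables`, return every "<name>_new" table whose base table also exists.
def leftover_new_tables(tables)
  tables.select do |t|
    t.end_with?('_new') && tables.include?(t.sub(/_new\z/, ''))
  end
end

# A few names taken from the listing above.
tables = %w[network_pool network_pool_new network_vlan_bitmap vm_pool]
puts leftover_new_tables(tables).inspect
```

With that sample list, only network_pool_new is reported. Whether such a table is safe to drop still depends on it being empty and on having a backup.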

OK, I removed the table network_pool_new; fsck no longer shows that error and is able to finish the check. But something still seems wrong with the old database: some VMs have lost some ARs:

[UNREPAIRED] VNet 5 AR 11 has a wrong lease for VM 302. IP does not match: … This can’t be fixed

And the error from fsck this time:

VNet 5 AR 11 has leased 172.24.27.2 to VM 302, but it is actually free
VNet 5 AR 11 has leased 172.24.24.0 to VM 311, but it is actually free

invalid address
/usr/share/ruby/ipaddr.rb:394:in `set'
/usr/share/ruby/ipaddr.rb:471:in `initialize'
/usr/lib/one/ruby/onedb/fsck.rb:1807:in `new'
/usr/lib/one/ruby/onedb/fsck.rb:1807:in `block (4 levels) in fsck'
/usr/lib/one/ruby/onedb/fsck.rb:1784:in `each'
/usr/lib/one/ruby/onedb/fsck.rb:1784:in `block (3 levels) in fsck'
/usr/lib/one/ruby/onedb/fsck.rb:1729:in `each'
/usr/lib/one/ruby/onedb/fsck.rb:1729:in `block (2 levels) in fsck'
/usr/share/gems/gems/sequel-4.27.0/lib/sequel/dataset/actions.rb:139:in `block in each'
/usr/share/gems/gems/sequel-4.27.0/lib/sequel/adapters/mysql.rb:312:in `block (2 levels) in fetch_rows'
/usr/share/gems/gems/sequel-4.27.0/lib/sequel/adapters/mysql.rb:369:in `yield_rows'
/usr/share/gems/gems/sequel-4.27.0/lib/sequel/adapters/mysql.rb:312:in `block in fetch_rows'
/usr/share/gems/gems/sequel-4.27.0/lib/sequel/adapters/mysql.rb:177:in `_execute'
/usr/share/gems/gems/sequel-4.27.0/lib/sequel/adapters/shared/mysql_prepared_statements.rb:34:in `block in execute'
/usr/share/gems/gems/sequel-4.27.0/lib/sequel/database/connecting.rb:249:in `block in synchronize'
/usr/share/gems/gems/sequel-4.27.0/lib/sequel/connection_pool/threaded.rb:99:in `hold'
/usr/share/gems/gems/sequel-4.27.0/lib/sequel/database/connecting.rb:249:in `synchronize'
/usr/share/gems/gems/sequel-4.27.0/lib/sequel/adapters/shared/mysql_prepared_statements.rb:34:in `execute'
/usr/share/gems/gems/sequel-4.27.0/lib/sequel/dataset/actions.rb:950:in `execute'
/usr/share/gems/gems/sequel-4.27.0/lib/sequel/adapters/mysql.rb:353:in `execute'
/usr/share/gems/gems/sequel-4.27.0/lib/sequel/adapters/mysql.rb:296:in `fetch_rows'
/usr/share/gems/gems/sequel-4.27.0/lib/sequel/dataset/actions.rb:139:in `each'
/usr/lib/one/ruby/onedb/fsck.rb:1720:in `block in fsck'
/usr/share/gems/gems/sequel-4.27.0/lib/sequel/database/transactions.rb:134:in `_transaction'
/usr/share/gems/gems/sequel-4.27.0/lib/sequel/database/transactions.rb:108:in `block in transaction'
/usr/share/gems/gems/sequel-4.27.0/lib/sequel/database/connecting.rb:249:in `block in synchronize'
/usr/share/gems/gems/sequel-4.27.0/lib/sequel/connection_pool/threaded.rb:103:in `hold'
/usr/share/gems/gems/sequel-4.27.0/lib/sequel/database/connecting.rb:249:in `synchronize'
/usr/share/gems/gems/sequel-4.27.0/lib/sequel/database/transactions.rb:97:in `transaction'
/usr/lib/one/ruby/onedb/fsck.rb:1719:in `fsck'
/usr/lib/one/ruby/onedb/onedb.rb:280:in `fsck'
/bin/onedb:329:in `block (2 levels) in <main>'
/usr/lib/one/ruby/cli/command_parser.rb:449:in `call'
/usr/lib/one/ruby/cli/command_parser.rb:449:in `run'
/usr/lib/one/ruby/cli/command_parser.rb:76:in `initialize'
/bin/onedb:222:in `new'
/bin/onedb:222:in `<main>'
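
A note on the "invalid address" trace above: it ends in `IPAddr.new` failing inside `set` in ipaddr.rb. As far as I can tell, in Ruby's ipaddr library that path is taken when the address is passed as an integer that falls outside the range of the address family, so one guess is that fsck computed a lease address outside the 32-bit IPv4 range. The sketch below only reproduces that library behaviour; the helper name is hypothetical, and what fsck.rb actually passes at line 1807 is an assumption.

```ruby
require 'ipaddr'
require 'socket'

# Hypothetical helper, not part of onedb: convert an integer into an IPv4
# address string, or return nil when IPAddr rejects the value.
def ipv4_from_int(value)
  IPAddr.new(value, Socket::AF_INET).to_s
rescue IPAddr::InvalidAddressError
  nil
end

puts ipv4_from_int(0xAC181B02).inspect  # in-range value
puts ipv4_from_int(2**32).inspect       # one past the largest IPv4 address
puts ipv4_from_int(-1).inspect          # negative values are rejected too
```

The first call prints "172.24.27.2" (one of the leases from the fsck output), and the other two print nil because IPAddr raises for them.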

Finally, I removed the spurious VNet and regenerated it, and now our DB seems to be OK:

$ onedb fsck -S localhost -u oneadmin -d opennebula
MySQL dump stored in /var/lib/one/mysql_localhost_opennebula_2017-1-20_11:31:3.sql
Use 'onedb restore' or restore the DB using the mysql command:
mysql -u user -h server -P port db_name < backup_file

Total errors found: 0
Total errors repaired: 0
Total errors unrepaired: 0
A copy of this output was stored in /var/log/one/onedb-fsck.log