Domain won't start - Failed to process consensus block=Unknown Block

jrwashburn · December 5, 2023, 10:33pm

After renaming the paritydb folder to back it up (renamed because of the OOM killer), the node synced okay, but now that the operator has re-registered, the operator fails to start and crashes with:

Dec 05 17:23:00 mmvt1 subspace-node[2191]: 2023-12-05T22:23:00.402805Z [Consensus] ⚙️  Syncing  0.0 bps, target=#525569 (40 peers), best: #409761 (0xf8c1…d11f), finalized #376724 (0x17d4…cfe4), ⬇ 106.7kiB/s ⬆ 13.1kiB/s
Dec 05 17:23:02 mmvt1 subspace-node[2191]: 2023-12-05T22:23:02.144594Z [Domain] Failed to process consensus block error=UnknownBlock("Header was not found in the database: 0x1e27b40e3142420329ab58b9e42fcf484f3ede9ee13ce4a5b7f853154f21a78a")
Dec 05 17:23:02 mmvt1 subspace-node[2191]: 2023-12-05T22:23:02.144731Z [Domain] Essential task `domain-operator-worker` failed. Shutting down service.
Dec 05 17:23:02 mmvt1 subspace-node[2191]: 2023-12-05T22:23:02.144873Z [Domain] Domain starter exited with an error Other("Essential task failed.")
Dec 05 17:23:02 mmvt1 subspace-node[2191]: 2023-12-05T22:23:02.144881Z [Domain] Essential task `domain` failed. Shutting down service.
Dec 05 17:23:02 mmvt1 subspace-node[2191]: Error: SubstrateService(Other("Essential task failed."))
Dec 05 17:23:02 mmvt1 systemd[815]: subspace-node.service: Main process exited, code=exited, status=1/FAILURE
Dec 05 17:23:02 mmvt1 systemd[815]: subspace-node.service: Failed with result 'exit-code'.

nazar-pc · December 6, 2023, 8:18am

@ning I think this is because the node database wasn’t renamed or something, right?

ning · December 6, 2023, 1:04pm

I’m not sure, but I have tested locally that if I rename the whole --base-path folder the operator can restart successfully.

A few questions/things I need @jrwashburn to help with to locate the problem:

Did you rename the whole --base-path or just chains/subspace_gemini_3g/paritydb?
Did the operator fail immediately after starting the node, or after the node had been syncing for some time?
Please run this command and let me know the result: subspace-node check-block 0x1e27b40e3142420329ab58b9e42fcf484f3ede9ee13ce4a5b7f853154f21a78a --chain gemini-3g --base-path <PATH>
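
For anyone hitting the same error, here is a minimal sketch of pulling the missing header hash out of the log and assembling the check-block command quoted above. The helper names extract_missing_hash and build_check_block_cmd are hypothetical (not part of subspace-node), and the script only prints the command rather than running it.

```shell
#!/usr/bin/env bash
# Hypothetical helpers, not part of subspace-node.

# Reads log text on stdin and prints the first 0x-prefixed 64-hex-digit hash
# that follows "Header was not found in the database: ".
extract_missing_hash() {
  sed -n 's/.*Header was not found in the database: \(0x[0-9a-fA-F]\{64\}\).*/\1/p' | head -n 1
}

# Prints (does not execute) the check-block command from this thread
# for the given hash and base path.
build_check_block_cmd() {
  local hash="$1" base_path="$2"
  printf 'subspace-node check-block %s --chain gemini-3g --base-path %s\n' "$hash" "$base_path"
}
```

Assuming the unit is named subspace-node.service as in the log above, something like `journalctl -u subspace-node | extract_missing_hash` should give the hash to pass to build_check_block_cmd along with your own base path.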

jrwashburn · December 6, 2023, 1:32pm

I renamed subspace_gemini_3g/paritydb and subspace_gemini_3g_evm_domain/paritydb.

It fails within a few seconds; logs: failed-consensus-block.log (Google Drive)

I renamed them both again and re-synced overnight, and the node is running okay this time. Would I need to take down the node, restore the old paritydb folders and then run the check-block? And if I do that, will I be able to just rename back to the good paritydb folders and not have to sync all over again?

ning · December 6, 2023, 2:37pm

Would I need to take down the node

No need to, as your node is running fine this time, but please check whether your domain node’s best block matches the RPC endpoint node’s:

Get the best block from the log (i.e. #160648 (0x3928…a566) in the following log):

[Domain] 💤 Idle (0 peers), best: #160648 (0x3928…a566), finalized #0 (0xf886…aeb8)

Check that the same block number (i.e. #160648) has the same hash (i.e. 0x3928…a566) on the RPC endpoint node
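
The two steps above can be sketched roughly as follows. The log only shows an abbreviated hash (0x3928…a566), so the comparison checks the prefix and suffix of a full hash. The helper names are hypothetical. Fetching the full hash from the RPC endpoint node is left as a comment: chain_getBlockHash is the standard Substrate JSON-RPC method, but the $RPC_URL endpoint is an assumption you would need to fill in.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for comparing the logged best block with an RPC node.

# Reads a status line on stdin and prints "<block number> <abbreviated hash>"
# taken from the "best: #N (0x....…....)" part.
parse_best_block() {
  sed -n 's/.*best: #\([0-9][0-9]*\) (\(0x[0-9a-f]*…[0-9a-f]*\)).*/\1 \2/p' | head -n 1
}

# Succeeds if the full 32-byte hash (0x + 64 hex digits) starts and ends
# with the pieces shown in the abbreviated form.
hash_matches_abbrev() {
  local full="$1" abbrev="$2"
  local prefix="${abbrev%%…*}" suffix="${abbrev##*…}"
  [[ ${#full} -eq 66 && "$full" == "$prefix"*"$suffix" ]]
}

# Assumed way to get the full hash for block 160648 from an RPC node
# (RPC_URL is a placeholder, not taken from this thread):
#   curl -s -H 'Content-Type: application/json' \
#     -d '{"id":1,"jsonrpc":"2.0","method":"chain_getBlockHash","params":[160648]}' \
#     "$RPC_URL"
```

If hash_matches_abbrev fails for the hash the RPC node returns at the same height, the local domain node is probably on a different block than the endpoint.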

restore the old paritydb folders and then run the check-block?

If your old paritydb folders still exist (i.e. have exactly the same data as when the node first shut down due to OOM), you can run the command directly in the old folder.

Rabinovitch · January 31, 2024, 10:45am

So the only option is to re-sync from scratch?

Rabinovitch · January 31, 2024, 10:54am

r9@r9:~$ /home/r9/subspace/target/production/subspace-node check-block --chain gemini-3g --base-path /media/nvme1/subspace-node/chains/subspace_gemini_3g/paritydb 0xad31eb63f0b0ccfd5dfd85c440bb62a40a9abc380bf9d16db9d9a34e7e46dcd0
2024-01-31 13:54:10+03:00 🔨 Initializing Genesis block/state (state: 0x09b5…b0b4, header-hash: 0x4180…180b)
Error: SubstrateCli(Service(Other("Unknown block")))

My error is

Jan 31 14:10:49 r9 subspace-node[58599]: 2024-01-31T11:10:49.685629Z [Domain] Failed to process consensus block error=UnknownBlock("Header was not found in the database: 0xad31eb63f0b0ccfd5dfd85c440bb62a40a9abc380bf9d16db9d9a34e7e46dcd0")
Jan 31 14:10:49 r9 subspace-node[58599]: 2024-01-31T11:10:49.685664Z [Domain] Essential task `domain-operator-worker` failed. Shutting down service.
Jan 31 14:10:49 r9 subspace-node[58599]: 2024-01-31T11:10:49.685747Z [Domain] Domain starter exited with an error Other("Essential task failed.")
Jan 31 14:10:49 r9 subspace-node[58599]: 2024-01-31T11:10:49.685760Z [Domain] Essential task `domain` failed. Shutting down service.
Jan 31 14:10:49 r9 subspace-node[58599]: Error: SubstrateService(Other("Essential task failed."))
Jan 31 14:10:49 r9 systemd[1]: subspace.service: Main process exited, code=exited, status=1/FAILURE
Jan 31 14:10:49 r9 systemd[1]: subspace.service: Failed with result 'exit-code'.
Jan 31 14:10:49 r9 systemd[1]: subspace.service: Consumed 5min 12.622s CPU time.

nazar-pc · January 31, 2024, 5:24pm

Yes, the domain must always start from consensus genesis or else you’ll run into issues. Support for starting at any other point is not implemented yet.

Rabinovitch · January 31, 2024, 5:26pm

OK, then how do I re-sync the domain?

nazar-pc · January 31, 2024, 5:41pm

You wipe all node data (consensus and domain) and start from scratch so both sync together.
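
A cautious way to do this, following the folder-rename approach used earlier in this thread, is to move the databases aside instead of deleting them. This is only a sketch: backup_paritydb_dirs is a hypothetical helper, it assumes that every directory named paritydb under the base path belongs to the node (consensus or domain), and the node must be stopped before running it.

```shell
#!/usr/bin/env bash
# Hypothetical helper: rename every "paritydb" directory under the base path
# to "paritydb.<suffix>" so the node starts with empty databases.
# Stop the node first (for example via systemctl; the unit name varies).
backup_paritydb_dirs() {
  local base_path="$1" suffix="${2:-bak}"
  local list dir
  # -prune keeps find from descending into a matched database directory.
  list=$(find "$base_path" -type d -name paritydb -prune -print)
  [ -n "$list" ] || return 0
  while IFS= read -r dir; do
    mv -- "$dir" "$dir.$suffix"
    printf 'moved %s -> %s\n' "$dir" "$dir.$suffix"
  done <<< "$list"
}
```

For example, `backup_paritydb_dirs /path/to/base-path old` would leave paritydb.old copies next to the originals, which can be deleted once both consensus and domain have re-synced.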
