Cannot start a runtime from within a runtime

After upgrading to gemini-3f-2023-sep-29, numerous errors of this sort started appearing:

thread 'plotting#0' panicked at 'Cannot start a runtime from within a runtime. This happens because a function (like `block_on`) attempted to block the current thread while the thread is being used to drive asynchronous tasks.', /root/subspace/crates/subspace-farmer/src/single_disk_farm/plotting.rs:226:28
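
For context, tokio raises this exact panic whenever `block_on` is called on a thread that is already inside a runtime context. A minimal, hypothetical reproduction (this is not the farmer's code) looks like this:

```rust
fn main() {
    let rt = tokio::runtime::Runtime::new().unwrap();
    rt.block_on(async {
        // This thread is now driving async tasks; entering a second runtime
        // here panics with "Cannot start a runtime from within a runtime".
        tokio::runtime::Runtime::new().unwrap().block_on(async {});
    });
}
```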

Hm, I have not seen it and wouldn't expect it; I will need to investigate.

Can you run it with RUST_BACKTRACE=1 and see what backtrace it has? I do not see how this could happen right now. Do you have any code changes applied or is this an official build?

Found where and when it may happen. Interesting edge case; working on a fix.

It'd help if you could run the app with RUST_BACKTRACE=1 to see the backtrace of this panic.

I tried to reproduce what I thought was happening with a small app and it worked just fine, so I'm confused about how this could have happened.

This is an unaltered release compiled from sources. The person with this problem said he would not be able to provide output with RUST_BACKTRACE until later. He is using 9 plots of 2300G, and plotting is not complete.

I see that this line corresponds to plotting. I have plotted some sectors with the same codebase and wasn't able to reproduce the issue. A backtrace would really help to figure out how this is even possible. Would they be able to try the official build as well and see if the same issue can be reproduced there?

Unfortunately, getting output with RUST_BACKTRACE=1 failed, as it takes too long to initialise the farmer, and after an hour of waiting he decided to roll back.

Why does it take an hour to initialize? It really shouldn't. And there will be no fix in following snapshots or the next Gemini version unless I can reproduce it or get the backtrace. This is what testnet is for :confused:

I don't know why. He gets this behaviour when using gemini-3f-2023-sep-29 with RUST_BACKTRACE=1. The first attempt took about 45 minutes; after the farmer was restarted, the subsequent attempt took about an hour.

I'll try to reproduce it on my end, but I don't have much free space on my drives (about 2 TiB). At the very least, it's worth a try.

What do you mean by "initialize" specifically? How do you identify that the farmer is "initialized"?

I used the wrong word. In this case, by "end of initialisation" I meant the first appearance of plotted messages (i.e. information about plotted sectors). This usually takes about 15 minutes on this server.

You can try building the latest version of the gemini-3f-maintenance branch (or I can trigger a CI build); it has some more improvements, but I'm waiting to fix this regression before making another release.

I'll report back if I get any results.

2023-10-01T03:06:12.001744Z INFO single_disk_farm{disk_farm_index=4}: subspace_farmer::single_disk_farm::plotting: Sector plotted successfully (0.03%) sector_index=0
2023-10-01T03:06:13.599021Z INFO single_disk_farm{disk_farm_index=8}: subspace_farmer::single_disk_farm::plotting: Sector plotted successfully (89.55%) sector_index=3144
2023-10-01T03:06:16.696080Z INFO single_disk_farm{disk_farm_index=9}: subspace_farmer::single_disk_farm::plotting: Sector plotted successfully (13.21%) sector_index=463
2023-10-01T03:06:17.041838Z INFO single_disk_farm{disk_farm_index=3}: subspace_farmer::single_disk_farm::plotting: Sector plotted successfully (6.52%) sector_index=228
2023-10-01T03:06:20.193227Z INFO single_disk_farm{disk_farm_index=0}: subspace_farmer::single_disk_farm::plotting: Sector plotted successfully (26.50%) sector_index=371
thread 'plotting#23' panicked at 'Cannot start a runtime from within a runtime. This happens because a function (like `block_on`) attempted to block the current thread while the thread is being used to drive asynchronous tasks.', /home/srv_subspace2/subspace/crates/subspace-farmer/src/single_disk_farm/plotting.rs:226:28
stack backtrace:
thread 'plotting#0' panicked at 'Cannot start a runtime from within a runtime. This happens because a function (like `block_on`) attempted to block the current thread while the thread is being used to drive asynchronous tasks.', /home/srv_subspace2/subspace/crates/subspace-farmer/src/single_disk_farm/plotting.rs:226:28
0: rust_begin_unwind
1: core::panicking::panic_fmt
2: tokio::runtime::context::runtime::enter_runtime
3: rayon_core::thread_pool::ThreadPool::install::{{closure}}
4: <rayon_core::job::StackJob<L,F,R> as rayon_core::job::Job>::execute
5: rayon_core::registry::WorkerThread::wait_until_cold
6: rayon_core::join::join_context::{{closure}}
7: blst_rust::data_availability_sampling::::das_fft_extension_stride
8: rayon_core::join::join_context::{{closure}}
9: blst_rust::data_availability_sampling::::das_fft_extension_stride
10: <rayon_core::job::StackJob<L,F,R> as rayon_core::job::Job>::execute
11: rayon_core::registry::WorkerThread::wait_until_cold
12: rayon_core::join::join_context::{{closure}}
13: blst_rust::data_availability_sampling::::das_fft_extension_stride
14: <rayon_core::job::StackJob<L,F,R> as rayon_core::job::Job>::execute
15: rayon_core::registry::WorkerThread::wait_until_cold
16: rayon_core::join::join_context::{{closure}}
17: blst_rust::data_availability_sampling::::das_fft_extension_stride
18: <rayon_core::job::StackJob<L,F,R> as rayon_core::job::Job>::execute
19: rayon_core::registry::WorkerThread::wait_until_cold
20: rayon_core::join::join_context::{{closure}}
21: blst_rust::data_availability_sampling::::das_fft_extension_stride
22: rayon_core::join::join_context::{{closure}}
23: blst_rust::data_availability_sampling::::das_fft_extension_stride
24: rayon_core::join::join_context::{{closure}}
25: blst_rust::data_availability_sampling::::das_fft_extension_stride
26: rayon_core::join::join_context::{{closure}}
27: blst_rust::data_availability_sampling::::das_fft_extension_stride
28: subspace_farmer_components::plotting::plot_sector::{{closure}}
29: tokio::runtime::context::runtime::enter_runtime
30: rayon_core::thread_pool::ThreadPool::install::{{closure}}
31: <rayon_core::job::StackJob<L,F,R> as rayon_core::job::Job>::execute
32: rayon_core::registry::WorkerThread::wait_until_cold
33: rayon_core::join::join_context::{{closure}}
34: rayon::iter::plumbing::bridge_producer_consumer::helper
35: <rayon_core::job::StackJob<L,F,R> as rayon_core::job::Job>::execute
36: rayon_core::registry::WorkerThread::wait_until_cold
37: rayon_core::join::join_context::{{closure}}
38: rayon::iter::plumbing::bridge_producer_consumer::helper
39: rayon_core::join::join_context::{{closure}}
40: rayon::iter::plumbing::bridge_producer_consumer::helper
41: <rayon_core::job::StackJob<L,F,R> as rayon_core::job::Job>::execute
42: rayon_core::registry::WorkerThread::wait_until_cold
43: rayon_core::join::join_context::{{closure}}
44: rayon::iter::plumbing::bridge_producer_consumer::helper
45: <rayon_core::job::StackJob<L,F,R> as rayon_core::job::Job>::execute
46: rayon_core::registry::WorkerThread::wait_until_cold
note: Some details are omitted, run with RUST_BACKTRACE=full for a verbose backtrace.
stack backtrace:
0: rust_begin_unwind
1: core::panicking::panic_fmt
2: tokio::runtime::context::runtime::enter_runtime
3: rayon_core::thread_pool::ThreadPool::install::{{closure}}
4: <rayon_core::job::StackJob<L,F,R> as rayon_core::job::Job>::execute
5: rayon_core::registry::WorkerThread::wait_until_cold
6: rayon_core::join::join_context::{{closure}}
7: blst_rust::data_availability_sampling::::das_fft_extension_stride
8: rayon_core::join::join_context::{{closure}}
9: blst_rust::data_availability_sampling::::das_fft_extension_stride
10: rayon_core::join::join_context::{{closure}}
11: blst_rust::data_availability_sampling::::das_fft_extension_stride
12: <rayon_core::job::StackJob<L,F,R> as rayon_core::job::Job>::execute
13: rayon_core::registry::WorkerThread::wait_until_cold
14: rayon_core::join::join_context::{{closure}}
15: rayon::iter::plumbing::bridge_producer_consumer::helper
16: <rayon_core::job::StackJob<L,F,R> as rayon_core::job::Job>::execute
17: rayon_core::registry::WorkerThread::wait_until_cold
18: rayon_core::join::join_context::{{closure}}
19: rayon::iter::plumbing::bridge_producer_consumer::helper
20: rayon_core::join::join_context::{{closure}}
21: rayon::iter::plumbing::bridge_producer_consumer::helper
22: <rayon_core::job::StackJob<L,F,R> as rayon_core::job::Job>::execute
23: rayon_core::registry::WorkerThread::wait_until_cold
24: rayon_core::join::join_context::{{closure}}
25: rayon::iter::plumbing::bridge_producer_consumer::helper
26: <rayon_core::job::StackJob<L,F,R> as rayon_core::job::Job>::execute
27: rayon_core::registry::WorkerThread::wait_until_cold
28: rayon_core::join::join_context::{{closure}}
29: rayon::iter::plumbing::bridge_producer_consumer::helper
30: rayon_core::join::join_context::{{closure}}
31: rayon::iter::plumbing::bridge_producer_consumer::helper
32: rayon_core::join::join_context::{{closure}}
33: rayon::iter::plumbing::bridge_producer_consumer::helper
34: subspace_proof_of_space::chiapos::tables::TablesGeneric<>::create_parallel
35: subspace_farmer_components::plotting::plot_sector::{{closure}}
36: tokio::runtime::context::runtime::enter_runtime
37: rayon_core::thread_pool::ThreadPool::install::{{closure}}
38: <rayon_core::job::StackJob<L,F,R> as rayon_core::job::Job>::execute
39: rayon_core::registry::WorkerThread::wait_until_cold
note: Some details are omitted, run with RUST_BACKTRACE=full for a verbose backtrace.
2023-10-01T03:06:21.625066Z INFO single_disk_farm{disk_farm_index=6}: subspace_farmer::single_disk_farm::plotting: Sector plotted successfully (85.17%) sector_index=2990
2023-10-01T03:08:48.311517Z INFO single_disk_farm{disk_farm_index=1}: subspace_farmer::single_disk_farm::plotting: Sector replotted successfully sector_index=268
2023-10-01T03:09:50.583093Z INFO single_disk_farm{disk_farm_index=9}: subspace_farmer::single_disk_farm::plotting: Sector plotted successfully (13.24%) sector_index=464
2023-10-01T03:09:55.274369Z INFO single_disk_farm{disk_farm_index=8}: subspace_farmer::single_disk_farm::plotting: Sector plotted successfully (89.58%) sector_index=3145
2023-10-01T03:09:57.802504Z INFO single_disk_farm{disk_farm_index=6}: subspace_farmer::single_disk_farm::plotting: Sector plotted successfully (85.19%) sector_index=2991
thread 'plotting#3' panicked at 'Cannot start a runtime from within a runtime. This happens because a function (like `block_on`) attempted to block the current thread while the thread is being used to drive asynchronous tasks.', /home/srv_subspace2/subspace/crates/subspace-farmer/src/single_disk_farm/plotting.rs:226:28
stack backtrace:
0: rust_begin_unwind
1: core::panicking::panic_fmt
2: tokio::runtime::context::runtime::enter_runtime
3: rayon_core::thread_pool::ThreadPool::install::{{closure}}
4: <rayon_core::job::StackJob<L,F,R> as rayon_core::job::Job>::execute
5: rayon_core::registry::WorkerThread::wait_until_cold
6: rayon_core::join::join_context::{{closure}}
7: rayon::iter::plumbing::bridge_producer_consumer::helper
8: rayon_core::join::join_context::{{closure}}
9: rayon::iter::plumbing::bridge_producer_consumer::helper
10: <rayon_core::job::StackJob<L,F,R> as rayon_core::job::Job>::execute
11: rayon_core::registry::WorkerThread::wait_until_cold
12: rayon_core::join::join_context::{{closure}}
13: rayon::iter::plumbing::bridge_producer_consumer::helper
14: rayon_core::join::join_context::{{closure}}
15: rayon::iter::plumbing::bridge_producer_consumer::helper
16: <rayon_core::job::StackJob<L,F,R> as rayon_core::job::Job>::execute
17: rayon_core::registry::WorkerThread::wait_until_cold
18: rayon_core::join::join_context::{{closure}}
19: rayon::iter::plumbing::bridge_producer_consumer::helper
20: rayon_core::join::join_context::{{closure}}
21: rayon::iter::plumbing::bridge_producer_consumer::helper
22: rayon_core::join::join_context::{{closure}}
23: rayon::iter::plumbing::bridge_producer_consumer::helper
24: rayon_core::join::join_context::{{closure}}
25: rayon::iter::plumbing::bridge_producer_consumer::helper
26: rayon_core::join::join_context::{{closure}}
27: rayon::iter::plumbing::bridge_producer_consumer::helper
28: <rayon_core::job::StackJob<L,F,R> as rayon_core::job::Job>::execute
29: rayon_core::registry::WorkerThread::wait_until_cold
30: rayon_core::join::join_context::{{closure}}
31: rayon::iter::plumbing::bridge_producer_consumer::helper
32: rayon_core::join::join_context::{{closure}}
33: rayon::iter::plumbing::bridge_producer_consumer::helper
34: <rayon_core::job::StackJob<L,F,R> as rayon_core::job::Job>::execute
35: rayon_core::registry::WorkerThread::wait_until_cold
36: rayon_core::join::join_context::{{closure}}
37: rayon::iter::plumbing::bridge_producer_consumer::helper
38: <rayon_core::job::StackJob<L,F,R> as rayon_core::job::Job>::execute
39: rayon_core::registry::WorkerThread::wait_until_cold
40: rayon_core::join::join_context::{{closure}}
41: rayon::iter::plumbing::bridge_producer_consumer::helper
42: rayon_core::join::join_context::{{closure}}
43: rayon::iter::plumbing::bridge_producer_consumer::helper
44: rayon_core::join::join_context::{{closure}}
45: rayon::iter::plumbing::bridge_producer_consumer::helper
46: rayon_core::join::join_context::{{closure}}
47: rayon::iter::plumbing::bridge_producer_consumer::helper
48: <rayon_core::job::StackJob<L,F,R> as rayon_core::job::Job>::execute
49: rayon_core::registry::WorkerThread::wait_until_cold
50: rayon_core::join::join_context::{{closure}}
51: rayon::iter::plumbing::bridge_producer_consumer::helper
52: rayon_core::join::join_context::{{closure}}
53: rayon::iter::plumbing::bridge_producer_consumer::helper
54: rayon_core::join::join_context::{{closure}}
55: rayon::iter::plumbing::bridge_producer_consumer::helper
56: rayon_core::join::join_context::{{closure}}
57: rayon::iter::plumbing::bridge_producer_consumer::helper
58: subspace_proof_of_space::chiapos::tables::TablesGeneric<>::create_parallel
59: subspace_farmer_components::plotting::plot_sector::{{closure}}
60: tokio::runtime::context::runtime::enter_runtime
61: rayon_core::thread_pool::ThreadPool::install::{{closure}}
62: <rayon_core::job::StackJob<L,F,R> as rayon_core::job::Job>::execute
63: rayon_core::registry::WorkerThread::wait_until_cold
note: Some details are omitted, run with RUST_BACKTRACE=full for a verbose backtrace.
2023-10-01T03:12:16.057993Z INFO single_disk_farm{disk_farm_index=1}: subspace_farmer::single_disk_farm::plotting: Sector replotted successfully sector_index=409
2023-10-01T03:13:01.012521Z INFO single_disk_farm{disk_farm_index=6}: subspace_farmer::single_disk_farm::plotting: Sector plotted successfully (85.22%) sector_index=2992
thread 'plotting#0' panicked at 'Cannot start a runtime from within a runtime. This happens because a function (like `block_on`) attempted to block the current thread while the thread is being used to drive asynchronous tasks.', /home/srv_subspace2/subspace/crates/subspace-farmer/src/single_disk_farm/plotting.rs:226:28
stack backtrace:
0: rust_begin_unwind
1: core::panicking::panic_fmt
2: tokio::runtime::context::runtime::enter_runtime
3: rayon_core::thread_pool::ThreadPool::install::{{closure}}
4: <rayon_core::job::StackJob<L,F,R> as rayon_core::job::Job>::execute
5: rayon_core::registry::WorkerThread::wait_until_cold
6: rayon_core::join::join_context::{{closure}}
7: rayon::iter::plumbing::bridge_producer_consumer::helper
8: rayon_core::join::join_context::{{closure}}
9: rayon::iter::plumbing::bridge_producer_consumer::helper
10: rayon_core::join::join_context::{{closure}}
11: rayon::iter::plumbing::bridge_producer_consumer::helper
12: subspace_proof_of_space::chiapos::tables::TablesGeneric<_>::create_parallel
13: subspace_farmer_components::plotting::plot_sector::{{closure}}
14: tokio::runtime::context::runtime::enter_runtime
15: rayon_core::thread_pool::ThreadPool::install::{{closure}}
16: <rayon_core::job::StackJob<L,F,R> as rayon_core::job::Job>::execute
17: rayon_core::registry::WorkerThread::wait_until_cold
note: Some details are omitted, run with RUST_BACKTRACE=full for a verbose backtrace.
2023-10-01T03:15:30.647429Z INFO single_disk_farm{disk_farm_index=2}: subspace_farmer::single_disk_farm::plotting: Sector replotted successfully sector_index=202
2023-10-01T03:16:06.714774Z INFO single_disk_farm{disk_farm_index=4}: subspace_farmer::single_disk_farm::plotting: Sector plotted successfully (0.06%) sector_index=1
2023-10-01T03:16:10.700005Z INFO single_disk_farm{disk_farm_index=5}: subspace_farmer::single_disk_farm::plotting: Sector plotted successfully (7.29%) sector_index=255
2023-10-01T03:18:46.269860Z INFO single_disk_farm{disk_farm_index=2}: subspace_farmer::single_disk_farm::plotting: Sector replotted successfully sector_index=228
2023-10-01T03:19:00.342971Z INFO single_disk_farm{disk_farm_index=5}: subspace_farmer::single_disk_farm::plotting: Sector plotted successfully (7.32%) sector_index=256
2023-10-01T03:19:20.431437Z INFO single_disk_farm{disk_farm_index=1}: subspace_farmer::single_disk_farm::plotting: Sector replotted successfully sector_index=434
2023-10-01T03:22:12
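
Reading the backtraces: `tokio::runtime::context::runtime::enter_runtime` appears twice on the same `plotting#N` thread, with rayon's `wait_until_cold` in between. Each plotting job calls `block_on` on a rayon worker and then fans out CPU-bound work (`TablesGeneric::create_parallel`, `das_fft_extension_stride`) through rayon. While that worker is parked waiting for a stolen branch of its own `join`, it can steal a *different* plotting job, whose `block_on` then re-enters the runtime. A hedged sketch of the race (hypothetical names, not the farmer's actual code; the panic is timing-dependent, which would explain why it only reproduced on some machines):

```rust
use std::time::Duration;
use tokio::runtime::Handle;

// Hypothetical stand-in for a plotting job: block on an async section that
// internally uses rayon for CPU-bound work, much like `plot_sector` above.
fn plot_job(handle: Handle) {
    handle.block_on(async {
        // The current rayon worker is now inside a tokio runtime context.
        // While `join` below waits for its stolen branch, this worker may
        // steal another queued `plot_job`; that job's `block_on` then panics
        // with "Cannot start a runtime from within a runtime".
        rayon::join(
            || std::thread::sleep(Duration::from_millis(50)),
            || std::thread::sleep(Duration::from_millis(50)),
        );
    });
}

fn main() {
    let rt = tokio::runtime::Runtime::new().unwrap();
    let pool = rayon::ThreadPoolBuilder::new()
        .num_threads(2)
        .build()
        .unwrap();

    // Queueing more blocking jobs than worker threads makes the steal
    // likely, though not guaranteed on every run.
    for _ in 0..8 {
        let handle = rt.handle().clone();
        pool.spawn(move || plot_job(handle));
    }
    std::thread::sleep(Duration::from_secs(2));
}
```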

Fix for this issue: Fix tokio re-entrance due to rayon thread pool behavior by nazar-pc · Pull Request #2029 · subspace/subspace · GitHub
Backport into Gemini 3f: Gemini 3f backport: Fix tokio re-entrance due to rayon thread pool behavior by nazar-pc · Pull Request #2030 · subspace/subspace · GitHub
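
For anyone hitting the same pattern in their own code: one general way to avoid this class of re-entrance (a sketch of the idea, not necessarily the exact change in the PRs above) is to never call `block_on` on rayon worker threads at all. Spawning the future onto the runtime and blocking the rayon thread on a plain sync channel sets no tokio thread-local context, so a work-stolen job cannot trip `enter_runtime`:

```rust
use tokio::runtime::Handle;

// Hypothetical helper: drive `fut` on the runtime's own threads while the
// calling rayon worker blocks on an ordinary channel instead of `block_on`.
fn run_future_on_rayon_thread<F>(handle: &Handle, fut: F) -> F::Output
where
    F: std::future::Future + Send + 'static,
    F::Output: Send + 'static,
{
    let (tx, rx) = std::sync::mpsc::sync_channel(1);
    handle.spawn(async move {
        // Ignore the error: the receiver is only gone if the caller went
        // away, in which case there is nothing left to notify.
        let _ = tx.send(fut.await);
    });
    // Blocking on `recv` does not enter a tokio runtime context, so even if
    // this call runs inside a work-stolen rayon job, nothing panics.
    rx.recv().expect("runtime shut down before the task completed")
}
```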

Will be included in the next release (probably later today after someone reviews the changes).

Should be fixed as of this release: Release gemini-3f-2023-oct-01 · subspace/subspace · GitHub

On it, trying it out. Go team, you rock! Patching on the weekend, weeee!!!
