More rewards on smaller plots

At the request of @Lazy_Penguin, I’m posting this on his behalf.

Summary

On drives of a similar class, smaller plots receive a greater number of rewards. NVMe SSDs also receive many more rewards than SATA SSDs.

System Information

CPU: AMD Ryzen 5900X
RAM: 128 GB DDR4 3200
Drives: Corsair MP600 NVMe SSD, Datacenter-grade Samsung U.2 NVMe SSD

Steps to reproduce:

Either create two plots of different sizes on the same drive, or create plots of the same size on different classes of drives (SATA SSD and NVMe SSD).

Actual Results:

Plot 0 – on Corsair MP600, 1600 GB
Plots 1-3 – on U.2 Samsung NVMe SSDs, each 2350 GB

Plotting has been completed for all plots.

[Screenshot: farmer example]

[Screenshot: medium CPU load]

[Screenshot: approximate reward ratio]

This continues a discussion of the problem that started in Discord.

2 Likes

Logs are not attached here, but were provided to me in Discord.

The reason here is that the farmer wasn’t able to audit the whole plot fast enough.
Here is a snippet of the audit for one slot:

2023-08-30T22:05:18.090262Z DEBUG single_disk_farm{disk_farm_index=3}: subspace_farmer::single_disk_farm: New slot slot_info=SlotInfo { slot_number: 1693433118, global_challenge: ..., solution_range: 23941329583, voting_solution_range: 239413295830 }
2023-08-30T22:05:18.090287Z DEBUG single_disk_farm{disk_farm_index=2}: subspace_farmer::single_disk_farm: New slot slot_info=SlotInfo { slot_number: 1693433118, global_challenge: ..., solution_range: 23941329583, voting_solution_range: 239413295830 }
2023-08-30T22:05:18.090296Z DEBUG single_disk_farm{disk_farm_index=1}: subspace_farmer::single_disk_farm: New slot slot_info=SlotInfo { slot_number: 1693433118, global_challenge: ..., solution_range: 23941329583, voting_solution_range: 239413295830 }
2023-08-30T22:05:18.090304Z DEBUG single_disk_farm{disk_farm_index=0}: subspace_farmer::single_disk_farm: New slot slot_info=SlotInfo { slot_number: 1693433118, global_challenge: ..., solution_range: 23941329583, voting_solution_range: 239413295830 }
2023-08-30T22:05:18.090335Z DEBUG single_disk_farm{disk_farm_index=0}: subspace_farmer::single_disk_farm::farming: Reading sectors slot=1693433118 sector_count=1498
2023-08-30T22:05:18.674040Z DEBUG single_disk_farm{disk_farm_index=3}: subspace_farmer::single_disk_farm::farming: Reading sectors slot=1693433118 sector_count=2200
2023-08-30T22:05:19.392995Z DEBUG single_disk_farm{disk_farm_index=1}: subspace_farmer::single_disk_farm::farming: Solution found slot=1693433116 sector_index=692
2023-08-30T22:05:19.393824Z DEBUG single_disk_farm{disk_farm_index=1}: subspace_farmer::single_disk_farm::farming: Reading sectors slot=1693433118 sector_count=2200
2023-08-30T22:05:19.762537Z DEBUG single_disk_farm{disk_farm_index=2}: subspace_farmer::single_disk_farm::farming: Reading sectors slot=1693433118 sector_count=2200

There are 4 farms on this machine, and they all receive new slot info within 1 ms of each other.
Now we can see that one solution was indeed found, but it was found more than a second later, which is too long.
Some farms only started auditing more than a second later, meaning they were likely still busy processing the previous slot by that time.

For a 2350 GB plot, the farmer would need to do ~2300 random 32-byte reads every second, which might be a bit too much for certain SSDs.
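
As a rough sanity check of that number (assuming a sector size of about 1 GiB, which matches the sector_count values in the log above, and roughly one audit pass per one-second slot):

const GIB: u64 = 1024 * 1024 * 1024;
const GB: u64 = 1_000_000_000;

fn main() {
    // 2350 GB plot from the report above.
    let plot_size_bytes = 2350 * GB;
    // Assumed sector size of ~1 GiB (matches sector_count=2200 in the log).
    let sector_size_bytes = GIB;
    let sectors = plot_size_bytes / sector_size_bytes;

    // One small read per sector per (roughly one-second) slot, and each
    // 32-byte read is served from a full 4 KiB page, so this is also the
    // approximate random-read IOPS the drive must sustain.
    println!("sectors to audit per slot: ~{sectors}");
}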

So either the drives are slow or something else impacts the performance; the farmer needs to do A LOT of random reads.

Future Gemini 3 versions after 3f will have a heavier compute component when producing a proof, but they will also allow more time for everything else, so this audit time will not be an issue. Still, if we can improve this, it’d be great.

By the way, what are those U.2 NVMe drives, ideally the exact model?

And is there significant CPU load on some of the cores during farming?
If compute impacts the audit, we might be able to do something about it.
In htop or a similar tool, if you enable thread names, you should be able to see which farmer operations consume the most CPU.
29% average CPU utilization seems quite high; I’d expect less on such a beefy processor.

1 Like

Clarified: it turns out there is one Samsung PM1723B (7.68 TB), and 3 plots are located on it.
The reason for this decision, he points out, is that plotting went faster this way.
A 32-byte read typically turns into a 4 KiB read (due to page size). There are a lot of SATA SSDs that can handle 2300 random read IOPS and even more.

[Screenshot: htop with thread names]

[Screenshot: htop CPU utilization]

Simple benchmarks of the drives:

[Benchmark screenshot: Corsair MP600]

[Benchmark screenshot: Samsung PM1723B]

That is helpful. So each farm uses ~50% of a CPU core at that plot size, which is not great; we should be able to optimize it further (at least I hope so).

3 farms on the same disk is probably suboptimal, but there is no way to know except to test it. While I agree that 2300 IOPS is not a lot, I’m not sure whether every read actually translates into a single I/O operation. It should.

According to the benchmarks, the drives should have no problem doing that many reads, which might indicate that auditing CPU overhead is the reason here.

We have auditing benches in the monorepo, but they generate plots each time you run them, making them very time-consuming to run. I’ve created the following ticket to implement such a benchmark: Create auditing benchmark that can be used with existing plots · Issue #1914 · subspace/subspace · GitHub

It’ll look something like this so anyone can run it:

subspace-farmer bench audit /path/to/farm
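
In the meantime, here is a minimal sketch of just the I/O side of such a measurement: it times one random 32-byte read per assumed ~1 GiB sector of an existing plot file. It is not the real benchmark and skips challenge derivation and any decoding, so it only approximates the read pattern.

use std::fs::File;
use std::io::{Read, Seek, SeekFrom};
use std::time::Instant;

fn main() -> std::io::Result<()> {
    let path = std::env::args().nth(1).expect("usage: audit-io-sketch <plot-file>");
    let mut file = File::open(&path)?;
    let file_len = file.metadata()?.len();

    // Assumed ~1 GiB sectors, matching the numbers earlier in this thread.
    let sector_size: u64 = 1024 * 1024 * 1024;
    let sectors = file_len / sector_size;

    let start = Instant::now();
    let mut buf = [0u8; 32];
    for sector_index in 0..sectors {
        // Pseudo-random offset inside the sector; real auditing derives the
        // offset from the slot's global challenge instead.
        let offset_in_sector = sector_index.wrapping_mul(2654435761) % (sector_size - 32);
        file.seek(SeekFrom::Start(sector_index * sector_size + offset_in_sector))?;
        file.read_exact(&mut buf)?;
    }
    let elapsed = start.elapsed();
    println!("{sectors} sector reads in {elapsed:?}");
    Ok(())
}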

Not very high on my list due to higher-priority tasks, but we’ll get it done.

BTW, if someone experienced in Rust wants to tackle it, I can provide some guidance and review the PR.

2 Likes

Here are my computer specs:
A factory Lenovo P620 with a Threadripper 5975, 512 GB DDR4 RAM, 2x Intel Arc 770 GPUs, and 7x 15.3 TB, 2x 6.8 TB, and 2x 3.8 TB drives, all Samsung PM1733 U.2 NVMe SSDs, on 3-foot U.2 cables run outside the case to an external rack with many fans for cooling, driven by two U.2 PCIe 4.0 x16 adapters and two PCIe 4.0 x8 adapters.

The six 500 GB plots on the 3.8 TB drives are mining this morning and finding blocks.
For some reason, once the seven 13 TB plots got to 2 TB, they all quit finding blocks, so I stopped them. I still have lots of testing to do.
I’ve decided I’m going to have to plot one drive at a time, take it offline, plot the next, and bring them all online for farming when done, since plotting maxes out the CPU. This could also be why they quit finding blocks, as I was plotting all 7 at the same time.

1 Like

Has plotting finished, though? If not, it still occupies CPU time and I/O and might be the reason for decreased rewards until it has finished. Otherwise, the earlier comment in this thread still applies: you can enable debug logs with the RUST_LOG=info,subspace_farmer=debug environment variable and see how the timing looks there.

I think that, outside of optimizing auditing as such, the solution here would be to start a separate auditing thread for each 1 TB of the plot rather than just one thread per farm; then, regardless of CPU speed, it should be able to keep up with arbitrarily sized farms.
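
A rough sketch of that idea, assuming ~1 GiB sectors so that roughly 1000 sectors correspond to ~1 TB; audit_sector here is just a placeholder for the real per-sector audit, not the farmer's actual code:

use std::thread;

// Placeholder for the real per-sector audit; returns a solution if one is found.
fn audit_sector(sector_index: usize) -> Option<usize> {
    let _ = sector_index;
    None
}

fn audit_in_parallel(sector_count: usize) -> Vec<usize> {
    // ~1000 sectors is roughly 1 TB at ~1 GiB per sector (an assumption).
    const SECTORS_PER_THREAD: usize = 1000;

    thread::scope(|scope| {
        let handles: Vec<_> = (0..sector_count)
            .step_by(SECTORS_PER_THREAD)
            .map(|start| {
                let end = (start + SECTORS_PER_THREAD).min(sector_count);
                // One dedicated auditing thread per range of sectors.
                scope.spawn(move || (start..end).filter_map(audit_sector).collect::<Vec<_>>())
            })
            .collect();

        // Gather solutions from all auditing threads.
        handles
            .into_iter()
            .flat_map(|handle| handle.join().unwrap())
            .collect()
    })
}

fn main() {
    // 2200 sectors corresponds to one of the 2350 GB farms above.
    let solutions = audit_in_parallel(2200);
    println!("found {} solutions", solutions.len());
}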

1 Like

I have also noticed that as my plot grows larger, I get fewer wins and fewer reward hashes signed.

1 Like

Same here. As my plots got larger, I got fewer and fewer signed rewards.
At maybe around a 700 GB plot and above, rewards have completely stopped.
Smaller plots, up to around 500 GB, still receive rewards.

1 Like

Plotting speed doesn’t seem to decrease, but rewards drop to almost nothing. I have 4 PCs farming/plotting. Two of them with identical configurations are at different phases of plotting. The one that is 25% done (4x 4 TB drives) still earns lots of rewards. The one that is 50% done (also 4x 4 TB drives) earns about 1/20th of what the 25% one does. Last night it was earning more. I just stopped and restarted plotting on the 50% one, and in the last 14 minutes it has signed 9 reward hashes. Once plotting resumes I will be able to tell whether this directly relates to plot sizes or whether I just need to restart the farmer/plotter every so often.

Have you tried running with the log level mentioned above to see what the timing looks like? Also check CPU usage per thread, as described above, to see if there is a bottleneck there.

In my case, audit benchmarks show results of around 530 ms for 936 sectors (a 1 TB plot).

Also, I (and other people with a similar problem) experience fairly high IOWAIT, even on NVMe SSDs (to a lesser degree). I’ll try to provide more accurate data later.

During the audit benchmarks, a single core was loaded at 100%, meaning there was almost no I/O waiting.

1 Like

You might want to randomize the benchmark input then, or else it audits exactly the same data every time; it is possible that it is cached somewhere, even though we do not request that in the app, and that would not represent real-world behavior.
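
Something as simple as generating a fresh 32-byte challenge per iteration would do; a minimal sketch using the rand crate (how the challenge then feeds into auditing is left out here):

use rand::RngCore;

// Produce a fresh pseudo-random 32-byte challenge for each benchmark round so
// repeated runs do not audit the exact same offsets.
fn random_challenge() -> [u8; 32] {
    let mut challenge = [0u8; 32];
    rand::thread_rng().fill_bytes(&mut challenge);
    challenge
}

fn main() {
    // Example: print one randomized challenge per "round".
    for round in 0..3 {
        println!("round {round}: {:02x?}", random_challenge());
    }
}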

I completely dropped the entire page cache via
echo 3 > /proc/sys/vm/drop_caches

The results are the same.

Auditing is performed on all sectors sequentially, so after cache drop their parts should not be in RAM.

But I noticed high IOWAIT before the audit benchmarks even started, probably at the stage of reading and decoding sector metadata.

I ran proving benchmarks on my plot, but only with 48 sectors, because with 49 an error appears after the audit. The big IOWAIT seems to appear exactly at the proving stage. Results: a single sector takes ~250 ms; 48 sectors take 11.7 s.

I tried running parallel audit benchmarks via just a .par_bridge() from Rayon. On an 8-core, 16-thread system, the results improved from 530 ms to 43 ms.
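
For reference, a minimal sketch of that kind of change, bridging the sequential sector iterator onto Rayon's thread pool; audit_sector is again a placeholder for the real per-sector audit, not the actual benchmark code:

use rayon::iter::{ParallelBridge, ParallelIterator};

// Placeholder for the real per-sector audit work.
fn audit_sector(sector_index: u64) -> Option<u64> {
    // Pretend every 1000th sector yields a solution, just to have output.
    (sector_index % 1000 == 0).then_some(sector_index)
}

fn main() {
    // 936 sectors matches the 1 TB plot mentioned above.
    let sector_count = 936u64;

    // Sequential version: (0..sector_count).filter_map(audit_sector).collect()
    // Parallel version: the same iterator, bridged onto Rayon's thread pool.
    let solutions: Vec<u64> = (0..sector_count)
        .par_bridge()
        .filter_map(audit_sector)
        .collect();

    println!("found {} candidate solutions", solutions.len());
}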

1 Like

Of course they are. Yes, you dropped the caches, but the next moment they start filling up again (the benchmark runs over and over many times).

That is probably because the cache was not populated yet at that point. After the first run the cache is full again, and you’re potentially running the benches from cache.

What about MADV_DONTNEED after each round of the benchmark?

Will it help?
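
For context, a sketch of what that could look like between rounds. MADV_DONTNEED only applies to memory mappings, so if the plot is read with plain read() calls, the closest per-file equivalent is posix_fadvise with POSIX_FADV_DONTNEED (Linux-specific, via the libc crate); whether it actually changes the numbers is exactly the open question here. The plot path is hypothetical.

use std::fs::File;
use std::os::unix::io::AsRawFd;

// Ask the kernel to evict this file's cached pages (the per-file analogue of
// MADV_DONTNEED for non-mmapped I/O). Linux-specific; requires the libc crate.
fn drop_file_cache(file: &File) -> std::io::Result<()> {
    // offset = 0, len = 0 means "the whole file".
    let code = unsafe { libc::posix_fadvise(file.as_raw_fd(), 0, 0, libc::POSIX_FADV_DONTNEED) };
    if code == 0 {
        Ok(())
    } else {
        // posix_fadvise returns the error number directly rather than setting errno.
        Err(std::io::Error::from_raw_os_error(code))
    }
}

fn main() -> std::io::Result<()> {
    // Hypothetical plot path, just for illustration.
    let file = File::open("/path/to/farm/plot.bin")?;
    drop_file_cache(&file)
}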