Proof-of-Time - Optimistic vs Pessimistic Verification Analysis

TLDR

Subspace’s protocol is potentially susceptible to long-range attacks, where an attacker can rewrite the chain’s history with less than 51% of current storage by controlling more than 51% of average storage since the chain’s inception. To protect against this attack, we are implementing a Proof-of-Time protocol, which adds the concept of Timekeepers that run verifiable computations proving that some amount of time has elapsed for each slot.

The initial planning for Proof-of-Time made some basic assumptions:

  • The amount of time and energy to verify AES-based Proof-of-Time would be the same as proving.
  • Verification would not be efficient on commodity hardware.
  • A secure optimistic approach to verification would be straightforward to define and implement.

However, none of these assumptions has fully held.

  • Due to instruction-level parallelism, verification is both faster and more energy efficient than proving. Benchmarks show verification is about seven times faster than proving while consuming about one-quarter of the energy. This energy consumption translates to an estimated $5/year.
  • Verification is efficient on modern commodity hardware, using an estimated 6% of CPU power on a typical modern quad-core machine. The bigger bottleneck for commodity hardware will be the proving process in our Proof-of-Space implementation.
  • Defining a secure optimistic approach has proved significantly more challenging than had been anticipated, both from a research and an implementation standpoint.

Based on the above observations, we designed a much simpler solution requiring every farmer to verify PoT independently. We call this solution pessimistic verification; it is both more secure and easier to implement than our original optimistic design. Additionally, we have found pessimistic verification to be more energy- and time-efficient than originally assumed. However, considering all of the above, the primary question remains: does pessimistic verification conflict with Subspace’s long-held goal to “replace compute-intensive mining with storage-intensive farming”?

Background

Proof-of-Time (PoT) aims to protect against long-range attacks, where an attacker can rewrite the chain’s history with less than 51% of current storage by controlling more than 51% of average storage since genesis. To protect against this attack, a protocol that proves some amount of time has passed for each block (or slot) is useful. This is the goal of our Proof-of-Time protocol.

Proving time has elapsed is done with an encryption algorithm (AES) that has been heavily optimized on most modern hardware, meaning it is unlikely to see massive efficiency gains that would give an adversary a large advantage in proving elapsed time. However, a downside to AES is that verifying the proof is computationally intensive, as mentioned by Srivatsan in our original Proof-of-Time document:

> However, verification will take as much time as evaluation! By publishing intermediate checkpoints, verification time can be reduced, but the proof size is large and still requires a lot of parallel computation. The computation required for verification may be too much for light clients.
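
For intuition, here is a minimal sketch of what the sequential evaluation looks like, assuming the RustCrypto `aes` crate; the function name and parameters are illustrative assumptions, not Subspace’s actual implementation:

```rust
use aes::Aes128;
use aes::cipher::{generic_array::GenericArray, BlockEncrypt, KeyInit};

/// Iterate AES-128 encryption `iterations` times. Each step depends on the
/// previous output, so the loop cannot be parallelized: completing it is
/// evidence that a corresponding amount of wall-clock time has elapsed.
fn prove(key: &[u8; 16], seed: &[u8; 16], iterations: u64) -> [u8; 16] {
    let cipher = Aes128::new(GenericArray::from_slice(key));
    let mut block = GenericArray::clone_from_slice(seed);
    for _ in 0..iterations {
        cipher.encrypt_block(&mut block);
    }
    block.into()
}
```

The sequential data dependency is what makes this a useful clock; the intermediate checkpoints mentioned in the quote above are what later allow verification to be split into independent pieces.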

We have held the following assumptions about PoT verification:

  1. Time: Verification would take nearly as long as proving unless heavily parallelized.
  2. Energy: Verification would require the same amount of energy as proving and potentially conflict with the effort to “replace compute-intensive mining with storage-intensive farming.”
  3. Computational burden: Verification while farming wouldn’t be feasible on commodity hardware, making farming inaccessible to the masses.

Optimistic Verification

The proposed solution was to utilize fraud proofs so that farmers could “optimistically” verify PoT (i.e., trust the proofs to be true without verification) and only need to actually verify when fraud is reported by other Timekeepers. The original analysis and plan for fraud proofs was straightforward. However, certain attack surfaces had not been considered, and securing against these has added significant complexity.

Security Considerations

  1. A Timekeeper’s primary role is to evaluate the PoT chain forward. This consumes 100% of a dedicated CPU core. Optimistic verification places another burden on Timekeepers: reporting fraud, a task that consumes potentially uncapped resources and can lead to denial of service.

    This can happen via a cheap and easy attack. Fake PoT is low-cost to produce and gives malicious actors the ability to grind on challenges and build many “longest” and “heaviest” chains with minimal storage. With pessimistic verification, every farmer will reject those right away. With optimistic verification, farmers rely on an honest Timekeeper to produce fraud proofs for each and every such fork.

    If we limit how many forks the honest Timekeeper processes, we lose the guarantee that all fraudulent forks will be reported, opening the door to long-range attacks.

  2. With optimistic verification, farmers rely on Timekeepers to not only provide slot challenges but also to warn of potential fraud. This requires a longer “challenge period” to give Timekeepers a chance to receive a fraudulent message, respond with a fraud proof, and propagate it back to farmers, which in turn gives adversaries a larger prediction window to produce on-demand plots and fake storage.

  3. Unverified PoT leads to unverified blocks potentially being included on-chain. This requires a second type of fraud proof, “on invalid block”, in addition to the simpler “on gossip” type (see the sketch after this list). While “on gossip” fraud can be slashed, malicious Timekeepers that make puppet farmers produce invalid blocks remain unpunished.

  4. Malicious farmers can generate invalid PoT in their block headers and then use these “faked” blocks to overload honest Timekeepers and trick honest farmers. To better understand this attack, consider a malicious farmer with only 0.01% of total storage. It can easily generate a longer, invalid private chain by manipulating PoT. By contrast, in PoW consensus it is extremely hard for a malicious miner with only 0.01% of total mining power to generate a longer private chain. The same conclusion applies to PoS consensus: although a malicious staker with 0.01% of total stake can generate as many private blocks as it wants (especially when c = 1), the growth rate of its private chain is only amplified by φ_c. In other words, this attack is unique to our PoAS consensus. (It is well mitigated by pessimistic verification, but not by optimistic verification.)
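
To make the distinction between the two fraud proof flavors above concrete, here is a purely hypothetical sketch; none of these type or field names exist in the Subspace codebase, they are assumptions chosen only to show the shape of the problem:

```rust
/// Hypothetical: a PoT output that a Timekeeper claims is invalid, together
/// with enough material (checkpoints) for anyone to re-check one segment.
struct DisputedPot {
    slot: u64,
    seed: [u8; 16],
    claimed_checkpoints: Vec<[u8; 16]>,
    /// Index of the first checkpoint segment that fails re-evaluation.
    invalid_segment: usize,
}

/// Hypothetical: the two fraud proof types discussed in point 3.
enum FraudProof {
    /// Invalid PoT observed on the gossip network; the offending
    /// Timekeeper can be identified and slashed.
    OnGossip {
        disputed: DisputedPot,
        offender: [u8; 32],
    },
    /// Invalid PoT that already made it into a block header; the block
    /// must be invalidated, but the Timekeeper that originated the bad
    /// proof goes unpunished.
    OnInvalidBlock {
        disputed: DisputedPot,
        block_hash: [u8; 32],
    },
}
```

Even in this toy form it is visible that the “on invalid block” path is strictly heavier: it touches block import and reorg logic, which is part of why the implementation estimate grew.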

There is a nearly complete specification for optimistic verification. However, there are still some open questions:

  • We have defined two operation modes for nodes: normal and high alert (potentially under attack). What are the conditions that trigger the “under attack” mode? And what are the conditions to switch back to normal operation?
  • If all forks are verified optimistically/probabilistically, we are no longer guaranteed that the chain is fully verified, potentially leading to deep reorgs and un-archiving. How do we handle that?
  • How many fraud proofs do we include in a block? Especially in “high alert” mode, there might be thousands.

Implementation Considerations

Creating the specification is merely the beginning of the process; the time and effort required for implementation must also be considered. Assuming @nazar-pc handles the implementation, the current estimate is well over one month. However, considering all the collaboration and potential adjustments, a more realistic timeline may extend beyond a quarter. Even with this extended period, the complexity of aligning numerous moving parts makes any timing assumptions very difficult and any estimate highly variable. Some specific implementation considerations include:

  1. Integration with staking will make implementation convoluted.
  2. Various types of fraud proofs add increasing complexity.
  3. Adjustable challenge periods with different modes of operation will be difficult to test.
  4. Substrate’s design will pose a challenge for invalidating previously valid blocks due to an invalid PoT.

Pessimistic Verification

As mentioned above, we have been working under at least three core assumptions regarding pessimistic verification: that it would be too time-intensive, too energy-intensive, and too computationally intensive for commodity hardware.

Time

The initial assumption was that the time to prove and verify would be equal or would require multiple cores to parallelize the process. However, this is not the case, and the reason lies in the existence of instruction-level parallelism in modern processors. This can hide much of the latency in various instructions, enabling a processor to execute multiple instructions simultaneously on a single core. For verification, we have 16 smaller chunks of work that can be handled in parallel instead of the single sequential process used in proving. This allows verification to occur more quickly and leverages the same single CPU core more efficiently. Future processors will only increase the amount of instruction-level parallelism, which will serve to continuously increase this efficiency. The benchmarks provided below illustrate that verification is almost 7 times faster than proving.

**AMD Ryzen 9 5900HX (busy core running at ~4.55 GHz)**
Prove time (166M iterations): 1.48s
Verify time (166M iterations): 0.22s
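
To illustrate where that gap comes from, here is a minimal sketch of checkpoint-based verification, continuing the hypothetical `prove` example above (again assuming the `aes` crate; the segment count and names are illustrative): the 16 segments are data-independent, so a single core can keep several AES pipelines busy at once.

```rust
use aes::Aes128;
use aes::cipher::{generic_array::GenericArray, BlockEncrypt, KeyInit};

const SEGMENTS: usize = 16;

/// Re-evaluate all 16 checkpoint segments. `checkpoints` holds the seed,
/// the 15 intermediate outputs, and the final output. The 16 AES streams
/// are data-independent, so instruction-level parallelism lets the CPU
/// overlap them within a single core; no extra threads are needed.
fn verify(key: &[u8; 16], checkpoints: &[[u8; 16]; SEGMENTS + 1], iterations: u64) -> bool {
    let cipher = Aes128::new(GenericArray::from_slice(key));
    // Start one AES stream at each checkpoint.
    let mut blocks: Vec<_> = checkpoints[..SEGMENTS]
        .iter()
        .map(|c| GenericArray::clone_from_slice(c))
        .collect();
    for _ in 0..iterations / SEGMENTS as u64 {
        // Independent encryptions in the loop body pipeline on the AES units.
        for block in blocks.iter_mut() {
            cipher.encrypt_block(block);
        }
    }
    // Each re-evaluated segment must land exactly on the next checkpoint.
    blocks
        .iter()
        .zip(&checkpoints[1..])
        .all(|(b, c)| b.as_slice() == c.as_slice())
}
```

Note that the inner loop interleaves 16 independent encryptions; no extra threads are involved, which is why the speedup comes “for free” on one core.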

Energy

An additional, related assumption centered on the excessive energy consumption of constantly running verification on a farmer. Running verification does place an extra burden on a farmer’s CPU, with one core being used ~25% of the time strictly for this purpose. However, energy consumption is not symmetric: proving costs roughly 4 times as much as verifying. On an absolute basis, rough calculations put the estimated cost of continuous verification at around $5.00 per year.

**AMD Ryzen 9 5900HX (busy core running at ~4.55 GHz)**
Idle: 25W
Prove: 35W (Idle+10W)
Verify: 42W (Idle+17W)
Prove time: 1.48s
Verify time: 0.22s
Power*time prove: 10*1.48=14.8 W*s
Power*time verify: 17*0.22=3.74 W*s
Prove/verify time difference: 1.48/0.22=6.73x
Prove/verify power difference: 14.8/3.74=3.96x

Estimated annual power consumption: 32.762 kWh, around $4.91 USD at $0.15 per kWh
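
The annual figure follows directly from the measurements above; here is a quick back-of-the-envelope check in code (numbers taken from the benchmark, one verification per one-second slot):

```rust
// Sanity check of the annual energy estimate above.
fn main() {
    let verify_power_w = 17.0;                  // extra power while verifying (42W - 25W idle)
    let verify_time_s = 0.22;                   // verification time per slot
    let slots_per_year = 365.0 * 24.0 * 3600.0; // one-second slots
    let kwh = verify_power_w * verify_time_s * slots_per_year / 3_600_000.0;
    let usd = kwh * 0.15;                       // at $0.15 per kWh
    println!("{kwh:.3} kWh/year, ~${usd:.2}/year"); // 32.762 kWh/year, ~$4.91/year
}
```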

Commodity Hardware

A third, and maybe the most important, consideration is whether verification can run on commodity hardware. A definition of “commodity hardware” is required to fairly test this assumption. I have heard Raspberry Pi mentioned, but clearly, from the results below, a Raspberry Pi won’t be sufficient for farming (the benchmark below uses k=17; with k=20, table generation will be about eight times slower, taking 797 ms * 8, or about 6.4 seconds), so it shouldn’t be considered in the Proof-of-Time verification calculations. More reasonable commodity hardware would be a modern quad-core Intel/AMD processor at a typical CPU frequency. The benchmarks below approximate something like a low-end Intel Core i3-13100 (the actual processor will be faster, as it uses more powerful cores and has double the number of logical threads; this is just an approximation). This benchmark shows a verification time of about 235 ms, so with one-second slots verification will use one CPU core less than 25% of the time.

**Modern lower-end Intel processor (Raptor Lake, 4 cores at 4.3 GHz)**
**PoS Benchmarks (k=17)**
chia/table/single       time:   [202.14 ms 202.91 ms 203.77 ms]
chia/table/parallel     time:   [71.583 ms 72.187 ms 72.795 ms]
chia/quality/no-solution
                        time:   [20.806 ns 20.822 ns 20.847 ns]
chia/quality/solution   time:   [52.979 ns 52.996 ns 53.006 ns]
chia/proof              time:   [241.27 ns 241.62 ns 241.80 ns]
chia/verification       time:   [18.843 µs 18.847 µs 18.852 µs]
**PoT Benchmarks (166M iterations)**
verify                  time:   [234.92 ms 234.96 ms 235.01 ms]

**Raspberry Pi 4**
**PoS Benchmarks (k=17)**
chia/table/single       time:   [2.4096 s 2.4098 s 2.4101 s]
chia/table/parallel     time:   [793.20 ms 797.26 ms 800.91 ms]
chia/quality/no-solution
                        time:   [82.897 ns 82.899 ns 82.900 ns]
chia/quality/solution   time:   [690.77 ns 690.78 ns 690.78 ns]
chia/proof              time:   [957.34 ns 957.53 ns 957.69 ns]
chia/verification       time:   [116.12 µs 116.13 µs 116.14 µs]
**PoT Benchmarks (166M iterations)**
verify                  time:   [48.532 s 48.534 s 48.537 s] (No AES hardware acceleration)

**Orange Pi 3 (Cortex-A53 quad core at 1.8 GHz; a Raspberry Pi 3 competitor from ~2018/2019)**
**PoS Benchmarks (k=17)**
chia/table/single       time:   [2.1608 s 2.1839 s 2.1983 s]
chia/table/parallel     time:   [885.79 ms 889.88 ms 894.23 ms]
chia/quality/no-solution
                        time:   [213.04 ns 215.29 ns 217.67 ns]
chia/quality/solution   time:   [928.43 ns 952.65 ns 970.11 ns]
chia/proof              time:   [2.3155 µs 2.3951 µs 2.4472 µs]
chia/verification       time:   [156.32 µs 157.33 µs 159.74 µs]
**PoT Benchmarks (166M iterations)**
pot/verify              time:   [2.0775 s 2.0775 s 2.0775 s]
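
As a sanity check on the ~6% figure quoted in the TLDR, the CPU share follows directly from the Raptor Lake result above:

```rust
// Share of a quad-core CPU spent on PoT verification,
// using the Raptor Lake result above (~235 ms per one-second slot).
fn main() {
    let verify_s = 0.235; // verification time per slot
    let slot_s = 1.0;     // slot duration
    let cores = 4.0;      // typical modern quad-core machine
    let per_core = verify_s / slot_s * 100.0; // ~23.5% of one core
    let overall = per_core / cores;           // ~5.9% of the whole CPU
    println!("{per_core:.1}% of one core, {overall:.1}% of the CPU overall");
}
```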

Conclusion

When first approaching Proof-of-Time, certain assumptions were held about verification time, energy consumption, and the ability to run on commodity hardware. An additional assumption was that optimistic verification would be straightforward and secure. The last two months of researching PoT at a much deeper level and implementing the “simpler” pessimistic version have given the team much more insight into each of these assumptions, and to a large extent, none of them has fully held.

The efficiency gained from instruction-level parallelism not only improved verification time relative to proving but also made for more efficient energy consumption. Regarding the protocol working on commodity hardware, the bottleneck will not be with PoT verification but will be driven by our Proof-of-Space implementation. The straightforward approach to adding fraud proofs and optimistically verifying PoT has turned out to be much more challenging from both a security and an implementation standpoint.

To sum up, optimistic verification is much less secure and harder to implement than pessimistic verification, and pessimistic verification is more energy-efficient than we originally expected. In fact, it can be made even more efficient with probabilistic verification. The final question to be answered, then, is whether pessimistic verification conflicts with our goal to “replace compute-intensive mining with storage-intensive farming.” Given the small energy footprint and small burden on the farmer’s machine, I would argue it does not.


Thanks for posting this, @Jeremy_Frank. It’s really detailed and provides a clear picture. Curious about feedback from others in the community.


Interested to hear any perspectives from the community on this important subject.


Very interesting! In your opinion, as hardware technology continues to evolve, how might future processors impact the current conclusions about PoT, especially in relation to instruction-level parallelism?

Supranational did a study at our request, and their theoretical maximum estimate was a 3x speedup. Such maturity of the algorithm and near-bare-metal performance were among the big reasons we chose AES as a VDF substitute. Other VDFs, based on complicated algebra, have theoretically more room for speedup and are not as well optimized in either software or hardware.


Hardware in the near future should add support for VAES, which can process 4 blocks at the same time, so verification time will decrease further, and energy consumption along with it. The corresponding instructions already exist in server CPUs, but not yet in their desktop counterparts. AVX-512 made a brief appearance in Intel CPUs but was then removed. With the new push for AVX10, we’ll hopefully get wider AES instructions in a few generations.


What is the (software) architecture level that this is required at? Is it just a node? Node + Operators?

Either way, it seems PoT is a clear requirement, and I prefer the simplicity of pessimistic verification, even with the increased energy cost, over the additional complexity of timekeepers + fraud proofs. (Caveat: I recall there is a need for fraud proofs in the PoAS implementation; if that is correct and those mechanisms could be shared, there may not be as much incremental complexity as it seems just for PoT.)


Congrats on the update. Sounds like we are getting close to launch. Excited to see Pessimistic Verification in the wild.


The energy and computational requirements for pessimistic verification are minimal, ensuring that the primary focus remains on storage-intensive farming.

PoAS as such doesn’t need fraud proofs. The closest thing we have is equivocation by farmers that produce multiple blocks/votes with the same solution (which only really happens when you try to cheat and clone your farm instead of plotting properly). And PoT would need fraud proofs unless we use pessimistic verification.

There are fraud proofs related to execution layer, but they are not a part of the base consensus.

I realize I didn’t actually answer the question. Yes, I think this conflicts with the goal referenced. You could say that having a node or a farmer conflicts as well, as does the existence of executors, etc. So while some things might conflict with the goal, they are also necessary to the existence of the network. PoT is necessary as well, so I would think about it like the node and farmer. There are also goals to be decentralized and permissionless. I think this decision has to be made within the larger context, and the specific question may not be designed to get the best answer. A better approach might be to ask: assuming the approach conflicts with this specific goal, is it still a better solution for other goals and for long-term sustainability, maintenance, etc.?

Operators actually do not conflict, because they are a separate role that doesn’t constrain farmer participation. Pessimistic PoT verification is a constraint in the sense that it is something every node, both farmers and operators, would have to do in case we decide to move forward with it.

The farmer as such also doesn’t conflict, because it is a core part of the protocol without which the network wouldn’t exist.

In the end, we can’t reduce computation to absolute zero, so it is a spectrum for sure. The question was primarily to see where the community sees the red line.

For instance, maybe pessimistic PoT is “too much” computation and will no longer be seen as an energy-efficient solution to the problem we set out to solve. Or maybe it is not that much computation at all, and the community doesn’t see it as a problem.

This is why we are looking for community input: we want the community to look at the problem and the data we were able to collect, ask clarifying questions or request more data if necessary, and ultimately help us make the right decision.
