Usage of total_pieces

This post describes an attack on the current PoS construction. The attack uses the fact that it is possible to grind over the value of total_pieces without needing to reconstruct the PoS table. Additionally, some issues with the design are presented, as well as a rational strategy for when to replot, also centered on total_pieces.

To mitigate the attack, the recommendation is to integrate total_pieces into the evaluation seed for the PoS table.

Background

  • total_pieces is the total number of (recorded-history) pieces of the chain at a given time.
  • It is used for a single task – placing records within sectors. More precisely, given a sector and a slot position i within it, farmers derive the record to be placed at that slot by computing hash(sector_id, i) mod total_pieces, where sector_id is derived from the farmer’s public key and the sector index (a number). The result, a non-negative number smaller than total_pieces, is the index (in the global history) of the record associated with that slot (see the sketch after this list).
  • total_pieces is part of the farmer’s proof (the Solution), as verifiers need to make the same computation.
  • PoS tables do not use this value: for each slot position, farmers derive the table’s seed from sector_id and record_position (the specific slot position).
  • In particular, PoS tables are independent of the chain history and of the actual content of the sector.
  • Farmers collect proofs of space within the PoS table, starting at start_index, which is set to 0.
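
To make the mapping concrete, here is a minimal sketch of the slot-to-record derivation described above (the hash function, blake3, and the byte encodings are illustrative choices, not necessarily what the spec prescribes):

```rust
// Illustrative sketch of the slot -> record mapping. The hash function
// (blake3) and byte encodings are stand-ins, not the spec's exact ones.

fn hash_to_u64(sector_id: &[u8; 32], slot: u64) -> u64 {
    let mut hasher = blake3::Hasher::new();
    hasher.update(sector_id);
    hasher.update(&slot.to_le_bytes());
    let bytes: [u8; 8] = hasher.finalize().as_bytes()[..8].try_into().unwrap();
    u64::from_le_bytes(bytes)
}

/// Index (in the global history) of the record placed at `slot`.
fn record_for_slot(sector_id: &[u8; 32], slot: u64, total_pieces: u64) -> u64 {
    hash_to_u64(sector_id, slot) % total_pieces
}
```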

Attack

We illustrate the attack as follows:

  • Suppose the attacker has the entire blockchain history (the entire history is not needed for the attack, but it is a realistic assumption).
  • The attacker fixes sector_id and generates the PoS table for each slot.
  • Note: the attacker still hasn’t determined the actual records for the sector, and can now range over different values of total_pieces to obtain different content for the same sector.
  • Given a challenge (the current global_challenge), the attacker derives the audit index j for the sector (s_bucket_audit_index).
  • The attacker evaluates each record (recall that records can be represented as polynomials) at the audit index j, XORs the result with the corresponding proof of space (if one exists), and checks whether the result gives a “winning ticket”.
  • For each such winning ticket, the attacker can range over total_pieces (going down from the current actual total number of pieces) in the aforementioned computation, to see whether the winning record falls into the desired slot (the one corresponding to the PoS table).
  • In fact, since the mapping slot → record follows a simple algebraic formula, the attacker can directly compute the desired value of total_pieces (if one exists) and hope that it falls within the acceptable range (not in the future, and not expired yet); see the sketch after this list. This may even be pre-computed, so the attacker does not have to check all records at all slots.
  • Moreover, since start_index is set to 0, the attacker may evaluate all records at the first 50000 (s_buckets_per_sector) inputs in advance, as the overhead of storing these evaluations instead of a degree-32768 polynomial is small.
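
The “compute the desired total_pieces” step can be made explicit. Writing h = hash(sector_id, i) and r for the winning record index, h mod total_pieces = r holds exactly when total_pieces divides h − r and total_pieces > r. A sketch, under the simplifying assumption that the hash output is reduced to a 64-bit value, with min_tp/max_tp standing in for the acceptable range:

```rust
// Grinding step: given slot i with h = hash(sector_id, i) and a winning
// record index r, find all total_pieces values with
//     h mod total_pieces == r.
// These are exactly the divisors of h - r that exceed r (the degenerate
// case h == r, where any total_pieces > r works, is omitted).

fn admissible_total_pieces(h: u64, r: u64, min_tp: u64, max_tp: u64) -> Vec<u64> {
    assert!(h > r);
    let n = h - r;
    let mut out = Vec::new();
    // Naive O(sqrt(n)) divisor scan; enough for a sketch.
    let mut d = 1u64;
    while d <= n / d {
        if n % d == 0 {
            for cand in [d, n / d] {
                if cand > r && (min_tp..=max_tp).contains(&cand) {
                    out.push(cand);
                }
            }
        }
        d += 1;
    }
    out.sort_unstable();
    out.dedup();
    out
}
```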

Storing the entire history is amortized over all of the attacker’s sectors. For each sector, the attacker needs to store the PoS table (I’m not sure about its actual size compared to the ‘honest plot’ size, but even if larger it should be comparable). The attacker has a wide range in which to set total_pieces, and each additional value yields a fresh set of “lottery tickets” for the same stored table, multiplying the weight of the attacker’s storage accordingly.

The crux of this attack is the observation that the PoS table, an essential piece of the construction, is independent of the sector’s content. Hence, the recommendation is to integrate total_pieces into the table’s seed, as sketched below.
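
Concretely, the fix could look like the following sketch (hash choice and byte layout are illustrative; the current spec derives the seed from sector_id and record_position only):

```rust
// Sketch of the proposed fix: bind total_pieces into the PoS table seed,
// so that grinding over total_pieces forces a fresh table per attempt.
// Hash choice (blake3) and byte layout are illustrative.

fn pos_table_seed(sector_id: &[u8; 32], record_position: u64, total_pieces: u64) -> [u8; 32] {
    let mut hasher = blake3::Hasher::new();
    hasher.update(sector_id);
    hasher.update(&record_position.to_le_bytes());
    // The new ingredient: without it, the same table can be reused
    // across different total_pieces values for the same sector_id.
    hasher.update(&total_pieces.to_le_bytes());
    *hasher.finalize().as_bytes()
}
```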

Note: in the attack above, the farmer stores the PoS table instead of the encoded polynomial. Doing so circumvents the cost of producing this table on the fly, which is the main reason it was introduced in the first place.

Replotting strategy

We return to the computation hash(sector_id, i) mod total_pieces, and recall that sector_id is derived from a public key and the sector index, which is a number.

When a sector expires, nothing prevents the farmer from setting exactly the same sector_id for their new plot. Denote the value for total_pieces within this sector by tp.

Suppose that record r was in the i-th slot of the expired sector, and write h = hash(sector_id, i), so that r = h mod tp. One can ask what the new value of total_pieces, denote it by tp', should be so that record r falls again in the i-th slot, i.e. h mod tp' = r. It can be shown that tp' must be a multiple of tp. The smallest multiple is 2, so set tp' = 2*tp, which also matches the proposed expiry time of the original sector (to be double-checked).

Record r is not guaranteed to end up in the i-th slot of the new sector: writing h = q*tp + r, we have h mod 2*tp = r only when q is even, and otherwise the slot receives record tp + r instead. However, since the hash is distributed uniformly, there’s a 50% chance that slot i again holds record r.

We see that by following this strategy – replotting at times that correspond to twice the previous total_pieces, with the same sector_id – farmers are expected to keep half of their old plot. A small simulation sketch of this claim follows.
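
A quick empirical check of the 50% figure (reusing the illustrative hash_to_u64 from the earlier sketch):

```rust
// Empirical check of the 50% retention claim: slot i keeps its record
// after replotting with tp' = 2*tp iff
//     hash(sector_id, i) mod (2*tp) == hash(sector_id, i) mod tp,
// i.e. iff floor(hash / tp) is even. For a uniform hash this holds for
// about half of the slots. Reuses hash_to_u64 from the earlier sketch.

fn retained_fraction(sector_id: &[u8; 32], tp: u64, num_slots: u64) -> f64 {
    let kept = (0..num_slots)
        .filter(|&i| {
            let h = hash_to_u64(sector_id, i);
            h % (2 * tp) == h % tp
        })
        .count();
    kept as f64 / num_slots as f64 // tends to ~0.5
}
```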

Following the advice above and integrating total_pieces into the PoS table’s seed will also prevent this strategy from working. Moreover, we can make the (currently constant) start_index dynamic. The latter seems like a good idea in general.

Thanks a lot, Barak! In my research-style paper on Subspace consensus v2 (which will be released this Friday or a bit later), evaluation_seed is defined in a slightly different way, namely evaluation_seed = sector_id XOR hash(KZG commitment of the selected record). Does the above attack apply there?

I was following the consensus spec.

In this case, the attack probably won’t go through, as the PoS table depends on the sector’s content. However, I believe that the described replotting strategy still holds.

If I understand correctly, then for a given sector_id, having some record r in position j or in position k != j would yield the same PoS table. This means that we may want to keep start_index fixed, as otherwise we may fall into a variant of the attack. It may also expand the replotting strategies, but I’m not sure about that.

Thinking about it now, an easier description of the attack is to form several sectors, all with the same sector_id but different total_pieces, and store the same PoS table for all of them.

Basically, if the probability of duplicate records (which would either win or lose at the same time) is low, farmers may prefer to have just one sector_id for all of their sectors (by keeping sector_index constant).

It may be better to integrate total_pieces into sector_id; that would resolve the issues above.

Totally agree! It is a great idea to “seal” or “bind” total_pieces into sector_id or evaluation_seed in order to eliminate such grinding opportunities. May I ask how you would define evaluation_seed? Thanks!

Spec says:

The plot seed is obtained from farmer public key, current sector and piece position within the sector.

It doesn’t currently say what “current sector” is, but if it is the contents (or something that uniquely represents them, like all of the piece indexes hashed together), then in order to determine a winning ticket you’d have to derive the proof-of-space first. In other words, the PoS table would no longer be fixed for the same public key, sector index, and piece position (which is what makes it reusable); it would essentially tie back into total_pieces, turning grinding into PoW, which is supposedly not feasible to do efficiently.

So yes, PoS table should include total_pieces directly or indirectly.

@dariolina can you clarify what “current sector” means?

I agree that mixing total_pieces into PoS table seed generation is a generally good idea. When considering how to do that, we should remember that verifiers need this seed to validate the proof_of_space in the Solution. We would also like not to include it in the Solution, but rather have verifiers re-derive the seed. This approach could work, as a verifier has all the info, but it doesn’t seem to stand against the selective replotting attack Barak describes, where the same record is plotted after sector expiry, and which doesn’t depend on record position.

To answer Nazar: “current sector” meant sector_id, as in the formula used during plotting.

Regardless of seed generation, we should also select space_k large enough that storing a full PoS table is inconvenient, but derivation during farming is not too slow. IMO, the currently listed space_k=16 is too small, but we can easily bench and adjust it during implementation.

Also, total_pieces is always a multiple of 256, namely 256 times the total number of archived segments to date, because new pieces are always created in batches of 256 when a segment is archived. So it makes sense to use total_segments as an indicator of sector plotting time instead, as Nazar suggested. Otherwise, when verifiers check the validity of total_pieces, apart from checking that it is not too large or too small, they also need to check that it is a multiple of 256.

Given the above, I suggest something like this:

evaluation_seed = Hash(sector_id || i || total_segments) mod total_pieces
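
Read literally, the proposal might look like the following sketch (blake3, the byte encodings, and the 256 pieces-per-segment constant from the paragraph above are assumptions; the real derivation may differ):

```rust
// Literal sketch of the proposed derivation. total_pieces is not taken
// from the Solution: it is re-derived as 256 * total_segments, which is
// what removes the "multiple of 256" check for verifiers.

const PIECES_PER_SEGMENT: u64 = 256;

fn evaluation_seed(sector_id: &[u8; 32], i: u64, total_segments: u64) -> u64 {
    let total_pieces = PIECES_PER_SEGMENT * total_segments;
    let mut hasher = blake3::Hasher::new();
    hasher.update(sector_id);
    hasher.update(&i.to_le_bytes());
    hasher.update(&total_segments.to_le_bytes());
    let bytes: [u8; 8] = hasher.finalize().as_bytes()[..8].try_into().unwrap();
    u64::from_le_bytes(bytes) % total_pieces
}
```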

What do you think?

To your other point, Barak:

We could, and initially wanted to, but that complicates erasure coding (and specifically recovery) a lot. I have not found an implementation. This is something we should explore, but in a different thread.

Chia provides (page 16) this formula for the final table size, without accounting for Hellman attacks:
0.762 * k * 2^k bytes
So we can choose k in such a way that it’s inconvenient to store tables. For instance, k = 18 already makes a ~3.5 MB table vs a 1 MB record; see the sketch below.
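
For reference, a few values of this formula next to a 1 MiB record (a quick sketch; the 0.762 constant is taken from the Chia paper cited above):

```rust
// Chia's estimate for the final plot-table size: 0.762 * k * 2^k bytes.
// Prints a few k values next to a 1 MiB record for comparison.

fn table_size_bytes(k: u32) -> f64 {
    0.762 * k as f64 * (1u64 << k) as f64
}

fn main() {
    const MIB: f64 = (1u64 << 20) as f64;
    let record_mib = 1.0; // 1 MiB record, per the post above
    for k in 16..=20 {
        let table_mib = table_size_bytes(k) / MIB;
        println!(
            "k = {k}: table ≈ {table_mib:.2} MiB ({:.1}x record)",
            table_mib / record_mib
        );
    }
}
```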

@Barak could you clarify how a dynamic start_index helps mitigate the attack you described? I understand that with a dynamic start_index, same as with a fixed one, the attacker could still store the PoS table and just query it at different indices.

It depends on what we want to achieve. In your proposal (using the record’s commitment), one could change the record position (by changing total_pieces) and get the same seed. I don’t see any issue with that, but that does not mean there isn’t one.

I am in favor of injecting total_pieces into the seed (either directly or through sector_id), such that different values of total_pieces yield different seeds.

I wouldn’t call the selective replotting an attack. Farmers would still store the history; it is just a rational strategy to reduce the load of replotting (at the cost of replotting a bit earlier than prescribed). Note that even though there would be a point in time where a farmer could use either the total_pieces = tp sector or the 2*tp one, since it is exactly the same embedded record, it would either “win” in both sectors or “lose” in both. So there is no “storage gain” in doing so.

It is a countermeasure for the replotting strategy. If start_index differs, then even though farmers could steer the exact same record into the exact same slot (with the same sector_id but a different total_pieces), they would need to embed different parts of it. However, if k=16 is used, there would be quite a lot of overlap in the indices.

Yes, the PoS table needs to be larger than an encoded record, otherwise it is tempting to store the PoS table itself, making further on-demand computations rather easy.

Still, we should keep in mind Hellman’s result: maybe one could store 0.5 MB of “auxiliary information” twice, instead of a single ‘honest’ encoded record, and then reconstruct two PoS tables with (significantly) less than the required memory.

If I understand correctly, the 0.762 factor accounts for indices with no proof of space. I don’t have a good understanding of the table construction, but there may be other preprocessing one could do (I assume Chia has done a good analysis, but maybe we have different needs), for example removing multiple proofs from the same index (as we only use one).

Due to the usefulness of storage, it is not just the storage gain we are concerned about, but also the uniformity of the distribution of pieces. We really want to not allow a farmer to choose what they store, but rather only decide how much space they want to occupy.
