Can someone explain how plotting works?

I have a few questions, can someone please help?

  1. The node has peers and the farmer has a connection flag, so are they related to each other? In other words, does the farmer get data from its node’s peers, or from its own connections to other farmers in the network?

  2. What is the actual plotting process? Does the CPU do some calculation, does the farmer download data from the network, or both?

  3. I don’t think the plotting process downloads all of a sector’s data, is that correct? We know each sector has 1000 pieces and each piece is 1 MB, so does the farmer really need to download 1 GB for each sector it plots?

  4. Can you explain replotting? Is it a forever loop?

Thank you.

Hi @Gracevn! Welcome to the Subspace Network community! Let me try to answer your questions:

  1. The farmer and the node are indeed related to each other and work in tandem, but each has different concerns. The node is a peer on the network, receiving and propagating transactions, blocks, etc. The farmer is a peer on the Distributed Storage Network (DSN) and can serve pieces of history to other farmers and nodes to help them sync and plot. You could run a node without a farmer; however, the farmer application alone won’t work without a node. For instance, the node relays challenges to the farmer every second, each one a chance to win a block.
  2. Farmers plot the chain’s history (in pieces), encoded uniquely for each farmer. So it’s both: the farmer downloads data, and the CPU does a calculation. You can read more details on plotting in this Medium post.
  3. Yes, the farmer has to download 1 GiB for each sector it plots.
  4. Replotting is a forever loop, but it becomes less and less frequent as the chain grows. See this answer for more detail.

The download may not be necessary if the pieces are cached locally, but as the network grows, the probability of that will get smaller and smaller.
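As a rough back-of-the-envelope check of the numbers in this thread, here is a small Python sketch. It assumes “1 MB” means 1 MiB and takes the 1000-pieces-per-sector figure from the question; the function name `sector_download_bytes` and the count of locally cached pieces are illustrative only, not anything from the farmer's actual code.

```python
# Figures quoted in this thread: 1000 pieces per sector, about 1 MiB per piece.
PIECES_PER_SECTOR = 1000
PIECE_SIZE_BYTES = 1024 * 1024  # assumption: "1 MB" in the question means 1 MiB


def sector_download_bytes(cached_pieces: int = 0) -> int:
    """Bytes to fetch from the network for one sector.

    `cached_pieces` is a hypothetical count of pieces already available
    locally, which do not need to be downloaded again.
    """
    if not 0 <= cached_pieces <= PIECES_PER_SECTOR:
        raise ValueError("cached_pieces must be between 0 and PIECES_PER_SECTOR")
    return (PIECES_PER_SECTOR - cached_pieces) * PIECE_SIZE_BYTES


if __name__ == "__main__":
    full = sector_download_bytes()
    print(f"{full / 2**30:.3f} GiB")  # 0.977 GiB, i.e. roughly 1 GiB per sector
```

With nothing cached, that works out to 1000 MiB, which is where the "about 1 GiB per sector" figure comes from.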

Thank you very much. So am I understanding correctly that the node’s network and the farmer’s network are basically different? And my node can have Users A, B, and C as its peers, while my farmer has Users D, E, and F as its peers?

Also, I notice my CPU is under constant high load when it has to recover pieces. Is that normal? Can this part be optimized? It seems to me that piece recovery is not very optimized; I’m wondering whether it would be better and smoother to just discard the whole sector and download a new one.

Farmers use Subspace networking only; nodes use both the Substrate and Subspace networks at the same time. All three can, and likely will, have unique sets of peers.

The Pulsar implementation is more efficient than the reference implementation because it runs only one Subspace networking stack, shared internally by both node and farmer, rather than two as with separate applications. Eventually Pulsar will have just a single Subspace networking stack and Substrate networking will go away, but we’re not there yet.

Yes, both piece recovery and reading from the plot are costly and will remain that way; there is no way to make them cheap. You can think of the plot as a cold storage backup. The DSN has a cache for this purpose, and if you upgrade to the latest release of the software, the chance of hitting the cache should be much better than with previous versions. We’ve made some improvements to make sure recovery happens as rarely as possible; ideally you’ll get all the pieces from the cache all the time.
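To illustrate the idea of "cache first, recovery only as a last resort", here is a minimal Python sketch. It is not how the farmer is actually implemented; `get_piece`, the dictionary cache, and the `recover` callback are all hypothetical names made up for this example.

```python
from typing import Callable, Dict


def get_piece(
    piece_index: int,
    cache: Dict[int, bytes],
    recover: Callable[[int], bytes],
) -> bytes:
    """Return a piece, preferring the cheap cache over costly recovery.

    `cache` stands in for the DSN cache and `recover` for the expensive
    path (recovering the piece or reading it from a plot). Both are
    placeholders for illustration only.
    """
    piece = cache.get(piece_index)
    if piece is not None:
        return piece  # cheap path: cache hit
    return recover(piece_index)  # expensive path: only taken on a cache miss


if __name__ == "__main__":
    recovered = []

    def slow_recover(index: int) -> bytes:
        recovered.append(index)  # track how often the costly path runs
        return b"recovered"

    cache = {1: b"cached"}
    get_piece(1, cache, slow_recover)  # served from the cache
    get_piece(2, cache, slow_recover)  # falls back to recovery
    print(recovered)  # [2]
```

The better the cache hit rate, the less often the expensive branch runs, which is the effect the improvements described above are aiming for.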

The way the farmer is implemented right now, it expects sectors to be sequential and doesn’t support “skipping”. That should not be necessary either if the network is functioning correctly. This is exactly why we have a testnet and frequent releases with various improvements.

Thank you Nazar. From your answer “Eventually Pulsar will have just a single Subspace networking stack and Substrate networking will go away, but we’re not there yet.” I believe you mean that eventually the CLI (node and farmer) will run only on the Subspace network and we can phase out Substrate completely, right?

Yes, we will phase out Substrate networking at some point, but not Substrate as a framework.

I was just making the distinction that with separate node and farmer applications, each of them connects to Subspace networking individually, while with Pulsar the networking will still be combined for better efficiency.

Yes, right. But Pulsar doesn’t give farmers many options to change settings right now, and it doesn’t support multiple plots yet either.

I suggest Pulsar come with a settings.toml template that exposes all the flags available in the CLI. Then farmers will be able to play with the numbers that fit their system.

I believe there is a way to specify many parameters in Pulsar as well, maybe just not very well documented. Pulsar is meant to be simpler to use, but should still offer most of the features of the reference implementation.

@Parth do we support multiple plots there yet? If not, is there an ETA?

A template settings.toml on GitHub with all default values would help us. Farmers would be happy with that.

The way we run it could also be changed a bit, please. Farmers would download Pulsar and settings.toml into the same folder, then make changes directly in settings.toml before running ‘pulsar farm’. That means we wouldn’t need pulsar init anymore.

Right now quite a few people still struggle to find settings.toml.

Another suggestion for Pulsar: please do not encrypt the wallet address; this part confuses everyone.

Great suggestion. I’ve added the template idea to this issue (which I will also try and get prioritised). Thanks!

Another suggestion: please package both the executable and the template in a zip file (I mean for Windows; I’m not familiar with Mac or Ubuntu). Name the zip file with the version and date, but keep the file names inside it the same for all releases, i.e. pulsar, subspace-node, subspace-farmer.

Many other projects do it this way. Your team wouldn’t have to update the guide every time the version changes.

@nazar-pc The Pulsar CLI needs to be updated to support it.

There is no ETA on this at the moment. I will look into it and set priority.