Disk error “Error while reading piece from cache”

Issue Report

Environment

  • Operating System: Ubuntu 22.04 LTS, AMD-7950x, 4xINTEL-P4420-7.68T
  • Pulsar/Advanced CLI/Docker: [gemini-3h-2024-feb-01]
  • File system: XFS

Problem

I have about 400 SSDs, most of them INTEL P4420. Almost every INTEL P4420 reports errors while farming, whereas the sn640 drives work correctly.

2024-02-03T04:24:52.673750Z ERROR single_disk_farm{disk_farm_index=1}: subspace_farmer::single_disk_farm::farming: Failed to prove slot=185265 sector_index=2 error=Record reading error: Invalid chunk at location 8678311 s-bucket 13820 encoded true, possible disk corruption: Invalid scalar
2024-02-03T04:24:52.689654Z ERROR single_disk_farm{disk_farm_index=5}: subspace_farmer::single_disk_farm::farming: Failed to prove slot=185264 sector_index=1 error=Record reading error: Invalid chunk at location 14675975 s-bucket 23396 encoded true, possible disk corruption: Invalid scalar
2024-02-03T04:24:53.165380Z ERROR single_disk_farm{disk_farm_index=5}: subspace_farmer::single_disk_farm::farming: Failed to prove slot=185265 sector_index=2 error=Record reading error: Invalid chunk at location 3281471 s-bucket 5233 encoded true, possible disk corruption: Invalid scalar
2024-02-03T04:24:53.227316Z ERROR single_disk_farm{disk_farm_index=7}: subspace_farmer::single_disk_farm::farming: Failed to prove slot=185265 sector_index=2 error=Record reading error: Invalid chunk at location 25378838 s-bucket 40455 encoded true, possible disk corruption: Invalid scalar
2024-02-03T04:24:53.614345Z ERROR subspace_farmer::piece_cache: Error while reading piece from cache, might be a disk corruption error=Checksum mismatch disk_farm_index=2 key=Key(b"\x90\xb2\xce\x05\x08\xa6\x16\0\0\0\0\0\0") offset=2781
2024-02-03T04:24:53.711920Z ERROR subspace_farmer::piece_cache: Error while reading piece from cache, might be a disk corruption error=Checksum mismatch disk_farm_index=7 key=Key(b"\x90\xb2\xce\x05\x08\x0e\r\0\0\0\0\0\0") offset=4231
2024-02-03T04:24:53.811111Z ERROR subspace_farmer::piece_cache: Error while reading piece from cache, might be a disk corruption error=Checksum mismatch disk_farm_index=4 key=Key(b"\x90\xb2\xce\x05\x08\x99\xb0\0\0\0\0\0\0") offset=527
2024-02-03T04:24:54.037053Z ERROR subspace_farmer::piece_cache: Error while reading piece from cache, might be a disk corruption error=Checksum mismatch disk_farm_index=2 key=Key(b"\x90\xb2\xce\x05\x08;=\0\0\0\0\0\0") offset=1916
2024-02-03T04:24:54.328341Z ERROR subspace_farmer::piece_cache: Error while reading piece from cache, might be a disk corruption error=Checksum mismatch disk_farm_index=4 key=Key(b"\x90\xb2\xce\x05\x08\xe5\x88\0\0\0\0\0\0") offset=3562
2024-02-03T04:24:54.341470Z ERROR subspace_farmer::piece_cache: Error while reading piece from cache, might be a disk corruption error=Checksum mismatch disk_farm_index=7 key=Key(b"\x90\xb2\xce\x05\x08\xceD\0\0\0\0\0\0") offset=2189
2024-02-03T04:24:54.420407Z ERROR subspace_farmer::piece_cache: Error while reading piece from cache, might be a disk corruption error=Checksum mismatch disk_farm_index=3 key=Key(b"\x90\xb2\xce\x05\x08\x80E\0\0\0\0\0\0") offset=4326
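
A rough way to see whether errors like the ones above cluster on particular farms is to tally the ERROR lines per disk_farm_index. This is only a sketch: the function name errors_per_farm is made up for illustration, and it assumes each farm index maps to one physical disk, which depends on how the farmer was started.

```python
import re
from collections import Counter

# Matches the "disk_farm_index=N" field that appears in the farmer log lines.
FARM_INDEX = re.compile(r"disk_farm_index=(\d+)")


def errors_per_farm(log_lines):
    """Count ERROR-level log lines per disk_farm_index (hypothetical helper)."""
    counts = Counter()
    for line in log_lines:
        if " ERROR " not in line:
            continue  # skip lines at other log levels
        match = FARM_INDEX.search(line)
        if match:
            counts[int(match.group(1))] += 1
    return counts
```

For example, with the log saved to a file (the name farmer.log is an assumption), `errors_per_farm(open("farmer.log"))` returns a counter keyed by farm index. In the excerpt above the errors are spread over farm indices 1, 2, 3, 4, 5 and 7 rather than a single one.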

In 3g it shows:

Error while reading piece from cache, might be a disk corruption error=Checksum mismatch disk_farm_index=0 key=Key(b"\x90\xb2\xce\x05\x08\xe4\x97\0\0\0\0\0\0") offset=21700

Well, as the error says, this looks like disk corruption. Either the disks are defective, the file system got corrupted, or the RAM is faulty and is corrupting the contents. Either way, it doesn't look like a protocol issue; it looks like you need to test your hardware.
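
One simple form such a hardware test can take is a write/read-back check: write random blocks to a file on the suspect disk, remember their checksums, then read the file back and compare. The sketch below is a minimal illustration, not a replacement for proper tools (SMART checks, a memory tester). The function names and block size are my own choices. Note that the read-back may be served from the page cache instead of the disk; the posix_fadvise call is only a hint to the kernel, so for a meaningful test use a file larger than RAM or drop caches first.

```python
import hashlib
import os

BLOCK_SIZE = 1024 * 1024  # 1 MiB per block (arbitrary choice)


def write_blocks(path, num_blocks):
    """Write random blocks to `path` and return their SHA-256 digests."""
    digests = []
    with open(path, "wb") as f:
        for _ in range(num_blocks):
            block = os.urandom(BLOCK_SIZE)
            digests.append(hashlib.sha256(block).digest())
            f.write(block)
        f.flush()
        os.fsync(f.fileno())  # push the data to the device before reading back
        if hasattr(os, "posix_fadvise"):
            # Advisory only: ask the kernel to drop cached pages for this file.
            os.posix_fadvise(f.fileno(), 0, 0, os.POSIX_FADV_DONTNEED)
    return digests


def verify_blocks(path, digests):
    """Re-read `path` and return the indices of blocks whose digest changed."""
    bad = []
    with open(path, "rb") as f:
        for index, expected in enumerate(digests):
            block = f.read(BLOCK_SIZE)
            if hashlib.sha256(block).digest() != expected:
                bad.append(index)
    return bad
```

Usage would be something like `digests = write_blocks("/mnt/farm1/readback.bin", 1024)` followed by `verify_blocks("/mnt/farm1/readback.bin", digests)`, where the mount path is a placeholder. An empty list means every block read back intact; any indices returned point to blocks that changed between write and read, which implicates the disk, the file system, or RAM rather than the farmer.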