Efficiently transferring large ZFS datasets

From what I’ve seen online, there are a lot of metrics that professional computer nerds use to judge their homelab setups. Some strive for a very satisfying low hum, some try to maximize the number of blinking lights, etc. For me, I’ve always prized efficiency over raw horsepower or size. Running a processor with a TDP above thirty-five watts feels like an unnecessary extravagance.

And since I tend to run relatively anemic processors, when I try to shuffle large ZFS datasets around (during upgrades and the like) I quickly find the throughput bogged down by single-threaded CPU performance, nearly always in the SSH process. Of the approaches I’ve researched for shifting the bottleneck back to disk or network throughput, zfs send/recv piped over TCP with mbuffer gets the closest. Here’s a quick tutorial on how to do it:

Important note regarding security: mbuffer DOES NOT encrypt data in transit, so this method should only be used when transferring data between two nodes on the same local network. If you need to transfer over the internet, you should switch to using SSH for transport, or consider using a VPN.
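
For reference, the encrypted alternative is the classic single SSH pipeline; a minimal sketch, using the same placeholder hostnames and dataset names as the numbered steps below:

    zfs send -R --raw tank@backup-snapshot | ssh backup-dest zfs recv tank/backup

It is that ssh process which tends to pin a single core on an anemic CPU, and that is exactly what the mbuffer variant below avoids.
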
  1. Take a recursive snapshot of the dataset to be replicated, on the sending node:
    zfs snapshot -r tank@backup-snapshot
    
  2. Start the mbuffer | zfs recv process on the receiving system (-s 128k sets mbuffer’s block size to match the default 128K ZFS recordsize, and -m 1G gives it a 1 GiB in-memory buffer):
    mbuffer -s 128k -m 1G -I backup-src:9001 | zfs recv tank/backup
    
  3. On the sending node, pipe the recursive, raw zfs send through mbuffer:
    zfs send -R --raw tank@backup-snapshot | mbuffer -s 128k -m 1G -O backup-dest:9001
    

    Using the --raw parameter ensures that only the on-disk data stream is sent, avoiding unnecessary CPU overhead from having to re-compress or re-encrypt the underlying data, recalculate checksums, and so on. (A dry-run size estimate is sketched just after this list.)

  4. Once the receive is finished, I usually scrub the receiving zpool to flush out any bits that may have flipped in transit. They shouldn’t have, since mbuffer uses TCP, but what’s the fun of using a checksummed, erasure-coded filesystem if you’re not scrubbing your data? (A couple of follow-up checks are sketched after this list.)
    zpool scrub tank
    
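If you want to preview how much data step 3 will actually push over the wire, zfs send has a dry-run mode; a minimal sketch using the same snapshot as above, where -n skips the actual send and -v prints the estimated stream size:

    zfs send -R --raw -nv tank@backup-snapshot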

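A couple of optional sanity checks once the transfer and scrub are done, assuming the same pool and dataset names as above: confirm the snapshots actually arrived on the receiving side, then review the scrub results.

    zfs list -t snapshot -r tank/backup
    zpool status -v tank
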
Using this method to back up my data to a relatively underpowered Intel Atom C3758 system over 10GbE, I was able to achieve an average throughput of ~583 MB/sec. Not too shabby.
