Phase 9 - File Transfer Architecture

Date: 2026-03-11 Status: Complete ADRs: ADR-R01 to ADR-R09

File transfer is the first production plugin built on the Phase 9A service infrastructure. It spans ~6,100 lines across 10 source files in https://github.com/shurlinet/shurli/blob/main/pkg/p2pnet/, with full daemon integration, CLI commands, and a management API.

ADR-R01: Own FastCDC Implementation

Date: 2026-03-08 Status: Accepted

Context

Content-defined chunking (CDC) is required for deduplication and resumable transfers. Options: use an existing Go CDC library, or write our own.

Decision

Write our own FastCDC in https://github.com/shurlinet/shurli/blob/main/pkg/p2pnet/chunker.go (180 lines). It is single-pass and streaming: bytes are fed to BLAKE3 while the chunker scans for the boundary, so the chunk hash is available the moment the boundary is detected. No second pass.
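A minimal sketch of the single-pass idea, not the contents of chunker.go. It assumes a simplified gear-hash boundary test (real FastCDC adds normalized chunking with two masks) and uses crypto/sha256 from the standard library as a stand-in for BLAKE3. The names Chunk, gear, and chunkAll are hypothetical.

```go
package main

import (
	"crypto/sha256"
	"math/rand"
)

// Chunk describes one content-defined chunk and its digest.
type Chunk struct {
	Offset, Length int
	Hash           [32]byte
}

// gear is the per-byte random table used by the rolling fingerprint.
// A fixed seed keeps chunk boundaries reproducible between runs.
var gear = func() (t [256]uint64) {
	rng := rand.New(rand.NewSource(1))
	for i := range t {
		t[i] = rng.Uint64()
	}
	return t
}()

// chunkAll cuts data into chunks and hashes each chunk in the same pass.
// avgSize must be a power of two. SHA-256 stands in for BLAKE3 here.
func chunkAll(data []byte, minSize, avgSize, maxSize int) []Chunk {
	mask := uint64(avgSize - 1)
	h := sha256.New()
	var chunks []Chunk
	var fp uint64
	start := 0
	for i, b := range data {
		// The hash advances together with the boundary scan. Production
		// code would feed the hasher in larger batches.
		h.Write(data[i : i+1])
		fp = (fp << 1) + gear[b]
		n := i + 1 - start
		atEnd := i == len(data)-1
		if n >= maxSize || (n >= minSize && fp&mask == 0) || atEnd {
			c := Chunk{Offset: start, Length: n}
			copy(c.Hash[:], h.Sum(nil)) // digest is ready at the boundary
			chunks = append(chunks, c)
			h.Reset()
			fp, start = 0, i+1
		}
	}
	return chunks
}
```

Because the digest is finalized at the moment the boundary condition fires, the caller never rereads a chunk to hash it.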

Chunk sizes are adaptive based on file size:

File size	Min	Avg	Max
< 250 MB	64 KB	128 KB	256 KB
< 1 GB	128 KB	256 KB	512 KB
< 4 GB	256 KB	512 KB	1 MB
>= 4 GB	512 KB	1 MB	2 MB
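
The table above can be expressed as a small lookup. This is a sketch: the type and function names are hypothetical, and binary units (1 KB = 1024 bytes) are an assumption the source does not state.

```go
package main

// ChunkParams holds the FastCDC size bounds for one transfer, in bytes.
type ChunkParams struct {
	Min, Avg, Max int
}

const (
	kb = 1 << 10
	mb = 1 << 20
	gb = 1 << 30
)

// chunkParamsFor mirrors the adaptive chunk size table.
// Binary units are assumed; the name is hypothetical.
func chunkParamsFor(fileSize int64) ChunkParams {
	switch {
	case fileSize < 250*mb:
		return ChunkParams{64 * kb, 128 * kb, 256 * kb}
	case fileSize < 1*gb:
		return ChunkParams{128 * kb, 256 * kb, 512 * kb}
	case fileSize < 4*gb:
		return ChunkParams{256 * kb, 512 * kb, 1 * mb}
	default:
		return ChunkParams{512 * kb, 1 * mb, 2 * mb}
	}
}
```

Each row's upper bound is exclusive, so a file of exactly 250 MB falls into the second row.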

Why Not a Library

Every Go CDC library we evaluated either required a second pass for hashing, pulled in unnecessary dependencies, or didn’t support adaptive chunk sizes. 180 lines of self-contained code with zero dependencies (beyond BLAKE3, which we already use for the Merkle tree) is simpler than managing an external dependency for marginal benefit.

Reference: https://github.com/shurlinet/shurli/blob/main/pkg/p2pnet/chunker.go

ADR-R02: BLAKE3 for All Hashing

Date: 2026-03-08 Status: Accepted

Context

File transfer needs hashing for: per-chunk integrity, Merkle tree root verification, and checkpoint matching.

Decision

BLAKE3 everywhere. zeebo/blake3 (CC0/public domain). Used for:

Per-chunk hash during FastCDC (single-pass, computed as chunks are cut)
Merkle tree nodes (https://github.com/shurlinet/shurli/blob/main/pkg/p2pnet/merkle.go, 48 lines, binary tree with odd-node promotion)
Transfer checkpoint filenames (.shurli-ckpt-<root-hash>)
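
To illustrate the binary tree with odd-node promotion, here is a sketch rather than the code in merkle.go. It uses crypto/sha256 as a stand-in for BLAKE3. The function names, the plain concatenation of child hashes, and the all-zero root for an empty input are assumptions.

```go
package main

import "crypto/sha256"

// hashPair combines two child hashes into a parent hash.
// SHA-256 stands in for BLAKE3; plain concatenation is an assumption.
func hashPair(left, right [32]byte) [32]byte {
	var buf [64]byte
	copy(buf[:32], left[:])
	copy(buf[32:], right[:])
	return sha256.Sum256(buf[:])
}

// merkleRoot reduces the leaf hashes level by level. When a level has an
// odd number of nodes, the last node is promoted unchanged to the next
// level instead of being paired with a duplicate of itself.
func merkleRoot(leaves [][32]byte) [32]byte {
	if len(leaves) == 0 {
		return [32]byte{} // assumption: empty input yields a zero root
	}
	level := append([][32]byte(nil), leaves...)
	for len(level) > 1 {
		next := make([][32]byte, 0, (len(level)+1)/2)
		for i := 0; i+1 < len(level); i += 2 {
			next = append(next, hashPair(level[i], level[i+1]))
		}
		if len(level)%2 == 1 {
			next = append(next, level[len(level)-1]) // odd-node promotion
		}
		level = next
	}
	return level[0]
}
```

With three leaves a, b, c, the root is hashPair(hashPair(a, b), c): c is promoted once and paired at the next level.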

Why Not SHA-256

BLAKE3 is ~3-5x faster than SHA-256 on modern hardware. For large file transfers where every chunk is hashed, this matters. The CC0 license means zero legal overhead. SHA-256 would work correctly, but it would be slower for no benefit.

Reference: https://github.com/shurlinet/shurli/blob/main/pkg/p2pnet/merkle.go, https://github.com/shurlinet/shurli/blob/main/pkg/p2pnet/chunker.go

ADR-R03: zstd On-By-Default with Bomb Protection

Date: 2026-03-08 Status: Accepted

Context

Compression reduces transfer time on all but already-compressed data. The question is whether to make it opt-in or opt-out.

Decision

zstd compression on by default (klauspost/compress/zstd, BSD-3). Opt-out via transfer.compress: false in config.

Incompressible data is auto-detected: if compressed output is larger than input, the chunk is sent uncompressed (flagged in the wire format).
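
A sketch of the fallback, assuming a one-byte flag in front of each chunk payload. The flag values and the function name are hypothetical, and compress/flate from the standard library stands in for zstd so the example has no external dependencies.

```go
package main

import (
	"bytes"
	"compress/flate"
)

// Hypothetical wire flags marking how a chunk payload is encoded.
const (
	flagRaw        byte = 0
	flagCompressed byte = 1
)

// encodeChunk compresses a chunk and falls back to the raw bytes when
// compression makes it larger. flate is a stand-in for zstd.
func encodeChunk(chunk []byte) (flag byte, payload []byte, err error) {
	var buf bytes.Buffer
	w, err := flate.NewWriter(&buf, flate.BestSpeed)
	if err != nil {
		return 0, nil, err
	}
	if _, err := w.Write(chunk); err != nil {
		return 0, nil, err
	}
	if err := w.Close(); err != nil {
		return 0, nil, err
	}
	if buf.Len() > len(chunk) {
		return flagRaw, chunk, nil // incompressible: send as-is
	}
	return flagCompressed, buf.Bytes(), nil
}
```

The payload is never larger than the input chunk, so enabling compression by default cannot inflate a transfer beyond the flag overhead.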

Bomb protection: maxDecompressRatio = 10. If decompressed output exceeds 10x compressed input size, decompression aborts immediately. This prevents a malicious peer from sending a tiny compressed payload that expands to fill disk or memory.
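
The ratio check can be sketched around any streaming decoder. maxDecompressRatio comes from the text above. The function name, the error value, and the zeroStream type (a stand-in for a real zstd decoder) are hypothetical.

```go
package main

import (
	"errors"
	"io"
)

const maxDecompressRatio = 10

var errDecompressBomb = errors.New("decompressed size exceeds ratio limit")

// readBounded drains a decompressing reader and fails as soon as the
// output would exceed maxDecompressRatio times the compressed size.
func readBounded(dec io.Reader, compressedLen int) ([]byte, error) {
	limit := int64(compressedLen) * maxDecompressRatio
	// Allow one byte past the limit so "exactly at the limit" and
	// "over the limit" can be told apart without reading further.
	out, err := io.ReadAll(io.LimitReader(dec, limit+1))
	if err != nil {
		return nil, err
	}
	if int64(len(out)) > limit {
		return nil, errDecompressBomb
	}
	return out, nil
}

// zeroStream is a stand-in for a decoder: it yields n zero bytes.
type zeroStream struct{ n int }

func (z *zeroStream) Read(p []byte) (int, error) {
	if z.n == 0 {
		return 0, io.EOF
	}
	k := len(p)
	if k > z.n {
		k = z.n
	}
	for i := 0; i < k; i++ {
		p[i] = 0
	}
	z.n -= k
	return k, nil
}
```

Because the reader is capped, at most limit+1 bytes are ever buffered, however large the payload claims to expand.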

Why On-By-Default

95%+ of real files (documents, source code, logs, databases) compress well. The 5% that don’t (JPEG, MP4, ZIP) are detected automatically and sent uncompressed. The extra cost of detecting incompressible data is negligible: one size comparison after the compression attempt. The benefit of not requiring users to remember a flag is significant.

Reference: https://github.com/shurlinet/shurli/blob/main/pkg/p2pnet/compress.go (41 lines), https://github.com/shurlinet/shurli/blob/main/pkg/p2pnet/transfer.go (maxDecompressRatio)

ADR-R04: Reed-Solomon Stripe-Based Erasure Coding

Date: 2026-03-09 Status: Accepted

Context

WAN transfers lose chunks to network instability. Without forward error correction, every lost chunk requires a full round-trip retransmit.

Decision

Reed-Solomon erasure coding via klauspost/reedsolomon (MIT). Stripe-based: file is divided into stripes of defaultStripeSize = 100 data chunks each. Parity chunks are generated per stripe and appended to the manifest.

Key constraints:

Max parity overhead: maxParityOverhead = 0.50 (50% cap)
Max total parity chunks: maxParityCount = maxChunkCount / 2
Auto-enabled on Direct WAN only (disabled on LAN where loss is negligible)
Configurable via transfer.erasure_overhead (default 0.2 = 20%)
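
As a worked example of the constraints above, the sketch below computes parity counts per stripe. defaultStripeSize and maxParityOverhead come from the text. The function names and the round-up rule are assumptions, and the maxParityCount cap is not modeled because the value of maxChunkCount is not given here.

```go
package main

import "math"

const (
	defaultStripeSize = 100  // data chunks per stripe
	maxParityOverhead = 0.50 // hard cap on configured overhead
)

// parityForStripe returns the parity chunk count for one stripe.
// Rounding up is an assumption; overhead is handled as whole percent
// to avoid floating-point edge cases.
func parityForStripe(dataChunks int, overhead float64) int {
	if overhead <= 0 || dataChunks <= 0 {
		return 0
	}
	if overhead > maxParityOverhead {
		overhead = maxParityOverhead
	}
	pct := int(math.Round(overhead * 100))
	return (dataChunks*pct + 99) / 100
}

// totalParity walks the file stripe by stripe; the last stripe may be
// shorter than defaultStripeSize.
func totalParity(totalChunks int, overhead float64) (stripes, parity int) {
	for remaining := totalChunks; remaining > 0; remaining -= defaultStripeSize {
		n := remaining
		if n > defaultStripeSize {
			n = defaultStripeSize
		}
		parity += parityForStripe(n, overhead)
		stripes++
	}
	return stripes, parity
}
```

For a 250-chunk file at the default 0.2 overhead, this gives three stripes (100, 100, 50 data chunks) with 20 + 20 + 10 = 50 parity chunks.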

Why Stripe-Based

The alternative is whole-file RS encoding, which requires holding the entire file’s chunk set in memory. Stripe-based encoding bounds memory to one stripe (100 data chunks plus its parity) regardless of file size. A 100 GB file holds no more chunks in memory at once than a 100 MB file.

Reference: https://github.com/shurlinet/shurli/blob/main/pkg/p2pnet/transfer_erasure.go (384 lines)

ADR-R05: RaptorQ Fountain Codes for Multi-Source

Date: 2026-03-09 Status: Accepted

Context

When multiple peers hold the same file, downloading from all of them simultaneously increases throughput. Traditional chunk-based multi-source requires coordination to avoid duplicates. Fountain codes solve this: each peer generates statistically independent symbols, so any combination of enough symbols from any peers reconstructs the data.

Decision

RaptorQ via xssnick/raptorq (MIT). Constants:

raptorqSymbolSize = 1024 bytes
raptorqRepairRatio = 0.2 (20% repair symbols per peer)

Wire protocol: /shurli/file-multi-peer/1.0.0 (https://github.com/shurlinet/shurli/blob/main/pkg/p2pnet/transfer_multipeer.go, 874 lines). Requesting peer sends a manifest to each source, each source encodes independently and streams symbols back. The receiver collects symbols from all sources and decodes when it has enough.
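
The receiver-side bookkeeping can be sketched without the codec itself. This is an illustration only: encoding and decoding are left to the RaptorQ library, and the collector type, its methods, and the extra margin are hypothetical. It assumes each symbol carries an ID, so duplicates from different peers can be ignored.

```go
package main

const raptorqSymbolSize = 1024 // bytes, as in the constants above

// sourceSymbolCount returns K, the number of source symbols needed to
// represent an object of the given size.
func sourceSymbolCount(objectSize int64) int {
	return int((objectSize + raptorqSymbolSize - 1) / raptorqSymbolSize)
}

// symbolCollector tracks distinct symbols by ID, whichever peer sent them.
type symbolCollector struct {
	need int
	seen map[uint32][]byte
}

// newSymbolCollector waits for K+extra distinct symbols before decoding.
// extra is a hypothetical safety margin chosen by the caller.
func newSymbolCollector(objectSize int64, extra int) *symbolCollector {
	return &symbolCollector{
		need: sourceSymbolCount(objectSize) + extra,
		seen: make(map[uint32][]byte),
	}
}

// add records one symbol and reports whether enough distinct symbols
// have arrived to attempt decoding.
func (c *symbolCollector) add(id uint32, data []byte) bool {
	if _, dup := c.seen[id]; !dup {
		c.seen[id] = data
	}
	return len(c.seen) >= c.need
}
```

The collector never asks which peer a symbol came from, which is the point of the design: sources need no coordination as long as their symbols are distinct.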

Why RaptorQ Over Plain Multi-Source

Plain multi-source (each peer sends different chunks) requires a coordinator to prevent duplicates and handle stragglers. RaptorQ eliminates coordination entirely: symbols are statistically independent, so peers can encode at their own pace. The receiver just needs “enough” symbols from any combination. This is the same approach TON uses for its DHT, where it has been battle-tested at scale.

Reference: https://github.com/shurlinet/shurli/blob/main/pkg/p2pnet/transfer_raptorq.go (105 lines), https://github.com/shurlinet/shurli/blob/main/pkg/p2pnet/transfer_multipeer.go

ADR-R06: Adaptive Parallel Streams

Date: 2026-03-09 Status: Accepted

Context

A single QUIC stream underutilizes available bandwidth on high-BDP (bandwidth-delay product) links. Multiple streams allow the transport to fill the pipe.

Decision

Parallel chunk transfer with transport-aware defaults (https://github.com/shurlinet/shurli/blob/main/pkg/p2pnet/transfer_parallel.go, 592 lines):

Transport	Default Streams	Max Streams
LAN	8	32
Direct WAN	4	20
Relay	1 (single stream)	1

Auto-reduction: if the transfer has fewer than minChunksPerStream * streamCount chunks (minChunksPerStream = 4), the stream count is reduced so that per-stream overhead does not exceed the benefit.
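
The defaults and the auto-reduction rule can be sketched as one function. The transport constants and minChunksPerStream are named in this document. The function names and the exact reduction formula (chunks divided by minChunksPerStream, never below one stream) are assumptions.

```go
package main

// Transport identifies the path a transfer runs over.
type Transport int

const (
	TransportLAN    Transport = 1
	TransportDirect Transport = 2
	TransportRelay  Transport = 4
)

const minChunksPerStream = 4

// streamLimits returns the default and maximum stream counts per transport.
func streamLimits(t Transport) (def, limit int) {
	switch t {
	case TransportLAN:
		return 8, 32
	case TransportDirect:
		return 4, 20
	default:
		return 1, 1 // relay: single stream
	}
}

// effectiveStreams picks a stream count for a transfer. requested <= 0
// means "use the default". The reduction formula is an assumption.
func effectiveStreams(t Transport, requested, chunks int) int {
	def, limit := streamLimits(t)
	n := requested
	if n <= 0 {
		n = def
	}
	if n > limit {
		n = limit
	}
	// Auto-reduction: keep at least minChunksPerStream chunks per stream.
	if byChunks := chunks / minChunksPerStream; n > byChunks {
		n = byChunks
	}
	if n < 1 {
		n = 1
	}
	return n
}
```

A 10-chunk file on LAN ends up with 2 streams under this rule, since 8 streams would leave most of them with a single chunk.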

Why Different Defaults

LAN has near-zero latency and high bandwidth. 8 streams is conservative for gigabit+. WAN has higher latency and congestion is likelier; 4 streams balances throughput against congestion. Relay is already bandwidth-limited (signaling-only by default); parallel streams through relay would multiply relay load for minimal gain.

Reference: https://github.com/shurlinet/shurli/blob/main/pkg/p2pnet/transfer_parallel.go

ADR-R07: AirDrop-Style Receive Permissions

Date: 2026-03-08 Status: Accepted

Context

Unsolicited file transfers are a spam vector. The system needs a permission model that balances convenience with control.

Decision

Five receive modes, controlled via transfer.receive_mode config:

Mode	Behavior
`off`	Reject all incoming transfers
`contacts`	Auto-accept from authorized peers (default)
`ask`	Queue all transfers for manual approval
`open`	Accept from any authorized peer without prompt
`timed`	Temporarily open, reverts to previous mode after duration

The contacts default means: if a peer passed the connection gater (is in authorized_keys), their transfers are accepted automatically. Unknown peers are rejected silently (no error message, no information leakage).
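
The timed mode's revert behavior can be sketched as a small state holder. The mode strings come from the table above. The Go type, field, and method names are hypothetical, and passing the current time explicitly is a choice made here to keep the example deterministic.

```go
package main

import "time"

// ReceiveMode mirrors the transfer.receive_mode config values.
type ReceiveMode string

const (
	ModeOff      ReceiveMode = "off"
	ModeContacts ReceiveMode = "contacts"
	ModeAsk      ReceiveMode = "ask"
	ModeOpen     ReceiveMode = "open"
	ModeTimed    ReceiveMode = "timed"
)

// receiveState remembers the mode to return to when a timed window ends.
type receiveState struct {
	mode     ReceiveMode
	previous ReceiveMode
	expires  time.Time
}

// openFor switches to timed mode for d, remembering the current mode.
func (s *receiveState) openFor(d time.Duration, now time.Time) {
	if s.mode != ModeTimed {
		s.previous = s.mode
	}
	s.mode = ModeTimed
	s.expires = now.Add(d)
}

// current returns the effective mode, reverting once the window expires.
func (s *receiveState) current(now time.Time) ReceiveMode {
	if s.mode == ModeTimed && !now.Before(s.expires) {
		s.mode = s.previous
	}
	return s.mode
}
```

Reverting lazily on read means no background timer is needed, and extending an already open window keeps the original previous mode.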

Why This Model

Apple’s AirDrop proved this UX works: most users want “contacts only” and occasionally switch to “everyone” for a specific situation. The timed mode handles the “open for 10 minutes” scenario without the risk of forgetting to turn it off. Silent rejection for unauthorized peers follows the same principle as the connection gater: don’t reveal your existence to strangers.

Reference: https://github.com/shurlinet/shurli/blob/main/pkg/p2pnet/transfer.go (ReceiveMode constants)

ADR-R08: Fixed-Window Rate Limiting with Silent Rejection

Date: 2026-03-09 Status: Accepted

Context

A malicious or buggy peer could flood transfer requests. Rate limiting is needed, but the choice of algorithm affects complexity and information leakage.

Decision

Fixed-window rate limiter: 10 transfer requests per minute per peer. 60-second window. Excess requests are silently rejected (stream reset, no error message).

Implementation: transferRateLimiter struct in https://github.com/shurlinet/shurli/blob/main/pkg/p2pnet/transfer.go. Per-peer counters with periodic cleanup of stale entries.

Also applied to multi-peer requests in HandleMultiPeerRequest (same limiter instance).
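
A fixed-window limiter along these lines fits in a few dozen lines. The struct name transferRateLimiter appears in the text. Its fields, the method names, and the string peer key are assumptions, and the caller is expected to reset the stream silently when allow returns false.

```go
package main

import (
	"sync"
	"time"
)

const (
	maxRequestsPerWindow = 10
	rateWindow           = time.Minute
)

// peerWindow is the whole per-peer state: a counter and a timestamp.
type peerWindow struct {
	start time.Time
	count int
}

// transferRateLimiter is a sketch; its fields are assumptions.
type transferRateLimiter struct {
	mu    sync.Mutex
	peers map[string]*peerWindow
}

func newTransferRateLimiter() *transferRateLimiter {
	return &transferRateLimiter{peers: make(map[string]*peerWindow)}
}

// allow reports whether a request from peer is within the limit.
func (l *transferRateLimiter) allow(peer string, now time.Time) bool {
	l.mu.Lock()
	defer l.mu.Unlock()
	w, ok := l.peers[peer]
	if !ok || now.Sub(w.start) >= rateWindow {
		l.peers[peer] = &peerWindow{start: now, count: 1} // new window
		return true
	}
	if w.count >= maxRequestsPerWindow {
		return false
	}
	w.count++
	return true
}

// cleanup drops peers whose window has expired; run it periodically.
func (l *transferRateLimiter) cleanup(now time.Time) {
	l.mu.Lock()
	defer l.mu.Unlock()
	for id, w := range l.peers {
		if now.Sub(w.start) >= rateWindow {
			delete(l.peers, id)
		}
	}
}
```

The window starts at a peer's first request rather than on a global clock tick, so a peer can send at most 20 requests around a window edge, the 2x burst accepted in this ADR.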

Why Fixed-Window Over Sliding Window

Fixed-window is simpler (a counter and a timestamp per peer) and sufficient for anti-spam. Sliding window adds complexity (sorted event lists or ring buffers) for marginal accuracy improvement at window boundaries. The 2x worst-case burst at window edges is acceptable: 20 requests in 2 seconds instead of 10 is not a meaningful attack vector when each request still requires connection gater approval.

Why Silent Rejection

Informative error messages (“rate limited, try again in X seconds”) help attackers calibrate their request rate. Silent stream resets are indistinguishable from network failures. The legitimate peer experience is unaffected: 10 transfers per minute is generous for real use.

Reference: https://github.com/shurlinet/shurli/blob/main/pkg/p2pnet/transfer.go (transferRateLimiter)

ADR-R09: AI Compression Deferral

Date: 2026-03-09 Status: Accepted (Revisit 2028-2029)

Context

Neural compression achieves better ratios than classical algorithms for some data types. Should Shurli use AI-based compression for file transfer?

Decision

No. Classical zstd is the right choice today. AI compression is deferred, with a checkpoint in 2028-2029.

Technologies assessed:

DZip (neural lossless): ~10-30x slower than zstd. Compute cost is prohibitive for real-time P2P transfer where both sides need to encode/decode.
DCVC-RT / Cool-Chic (neural video): inference requires GPU. Most Shurli nodes are headless Linux boxes or phones.
NVIDIA NTC (GPU neural textures): CUDA-only. Not portable.

Revisit Criteria

Re-evaluate when:

Hardware accelerators for neural codecs ship in consumer devices (NPUs, dedicated silicon)
A neural lossless codec comes within 2x of zstd encode speed on CPU
A cross-platform (CPU + GPU) implementation exists with a permissive license

Until then, zstd’s combination of speed, ratio, and universality is unmatched for general-purpose P2P file transfer. There is no point paying 10x or more in compute overhead for 10-20% better ratios.

Summary: Transport Policy

All file transfer operations are gated by PluginPolicy (https://github.com/shurlinet/shurli/blob/main/pkg/p2pnet/plugin_policy.go, 106 lines):

Transport	Bitmask	File Transfer
LAN	`TransportLAN` (1)	Allowed
Direct WAN	`TransportDirect` (2)	Allowed
Relay	`TransportRelay` (4)	Blocked by default

Default: TransportLAN | TransportDirect. Relay is excluded because file transfer through relay would consume relay bandwidth that should be reserved for signaling. This drives adoption of direct connectivity and own-relay deployment.
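
The bitmask check is a single AND. The constant names and values come from the table above, and PluginPolicy is named in the text. The Allowed field and the helper names are assumptions.

```go
package main

// Transport is a bitmask of connection types.
type Transport uint8

const (
	TransportLAN    Transport = 1
	TransportDirect Transport = 2
	TransportRelay  Transport = 4
)

// PluginPolicy is sketched with a single field; the real struct may differ.
type PluginPolicy struct {
	Allowed Transport
}

// defaultFileTransferPolicy allows LAN and direct WAN, but not relay.
func defaultFileTransferPolicy() PluginPolicy {
	return PluginPolicy{Allowed: TransportLAN | TransportDirect}
}

// allows reports whether the policy permits the given transport.
func (p PluginPolicy) allows(t Transport) bool {
	return p.Allowed&t != 0
}
```

Opting in to relay transfers, for example on a self-hosted relay, would then amount to OR-ing TransportRelay into the mask.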

Summary: Wire Protocols

Protocol	ID	Purpose
File Transfer	`/shurli/file-transfer/2.0.0`	Send/receive files
File Browse	`/shurli/file-browse/1.0.0`	Browse shared files
File Download	`/shurli/file-download/1.0.0`	Download shared files
Multi-Peer	`/shurli/file-multi-peer/1.0.0`	RaptorQ multi-source

Summary: Fire-and-Forget Daemon Model

shurli send <file> <peer> POSTs to the daemon’s /v1/send endpoint, receives a transfer ID, and exits. The daemon manages the transfer in the background. No terminal needs to stay open. Users check progress with shurli transfers or opt into inline progress with shurli send --follow.

This is deliberate: a CLI that blocks until transfer completion ties up a terminal and fails if the terminal is closed. The daemon-mediated model means transfers survive terminal disconnection, SSH timeouts, and laptop lid closes.
