Reliability & Resilience
The SDK is designed for hostile network environments. It aims for zero data loss, within the limits set by the spill policy, through disk-backed buffering, automatic reconnection with exponential backoff, circuit breaking, and adaptive bandwidth management.
Spill Buffer
When the in-memory queue fills (e.g. during network outage), messages spill to a file-backed ring buffer on disk. On reconnect, spilled messages drain before the live queue to preserve FIFO ordering.
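The drain order described above can be sketched as follows. This is a minimal in-memory model, not the SDK's implementation: `Outbox`, `push`, and `drain` are hypothetical names, the spill buffer is a `VecDeque` standing in for the file-backed ring buffer, and it assumes the oldest in-memory message is the one that spills when the queue is full (which is what makes "spilled first, then live" equivalent to FIFO).

```rust
use std::collections::VecDeque;

/// Hypothetical model of the two queues; names are illustrative only.
struct Outbox {
    live: VecDeque<Vec<u8>>,    // in-memory queue
    spilled: VecDeque<Vec<u8>>, // stand-in for the on-disk ring buffer
    live_capacity: usize,
}

impl Outbox {
    fn new(live_capacity: usize) -> Self {
        Outbox {
            live: VecDeque::new(),
            spilled: VecDeque::new(),
            live_capacity,
        }
    }

    /// Enqueue a message. Assumption: when memory is full, the oldest
    /// in-memory message moves to the spill buffer to make room.
    fn push(&mut self, msg: Vec<u8>) {
        if self.live.len() >= self.live_capacity {
            if let Some(oldest) = self.live.pop_front() {
                self.spilled.push_back(oldest);
            }
        }
        self.live.push_back(msg);
    }

    /// On reconnect: drain spilled messages first, then the live queue.
    fn drain(&mut self) -> Vec<Vec<u8>> {
        let mut out: Vec<Vec<u8>> = self.spilled.drain(..).collect();
        out.extend(self.live.drain(..));
        out
    }
}
```

Because every spilled message is older than every message still in memory, draining the spill buffer first yields the original send order.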
yaml SpillPolicy auto-generated
| Field | Type | Default | Description |
|---|---|---|---|
| max_disk_bytes | u64 | 64 * 1024 * 1024 | Maximum size of the spill file in bytes. When the file exceeds this limit, the `EvictionStrategy` is applied. Default: **67 108 864** (64 MiB). |
| max_age_secs | u64 | 3600 | Maximum age of spilled messages in seconds. Messages older than this are eligible for eviction regardless of disk usage. Default: **3 600** (1 hour). |
| eviction | EvictionStrategy | OldestFirst | Strategy applied when the spill file reaches `max_disk_bytes`. Default: `OldestFirst`. |
| spill_path | String | /tmp/nerve-sdk-spill.bin | File system path for the spill file. The directory must exist and be writable. The file is created lazily on the first spill. Default: `"/tmp/nerve-sdk-spill.bin"`. |
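As a rough illustration of the table, the fields and defaults could be modelled like this. The struct layout mirrors the documented fields, but the helper functions `is_expired` and `over_limit` are hypothetical, and only the `OldestFirst` variant is shown because it is the only one named here.

```rust
/// Only the variant named in the table; the real enum may have more.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum EvictionStrategy {
    OldestFirst,
}

#[derive(Debug, Clone, PartialEq, Eq)]
struct SpillPolicy {
    max_disk_bytes: u64,
    max_age_secs: u64,
    eviction: EvictionStrategy,
    spill_path: String,
}

impl Default for SpillPolicy {
    fn default() -> Self {
        SpillPolicy {
            max_disk_bytes: 64 * 1024 * 1024, // 67,108,864 bytes (64 MiB)
            max_age_secs: 3600,               // 1 hour
            eviction: EvictionStrategy::OldestFirst,
            spill_path: "/tmp/nerve-sdk-spill.bin".to_string(),
        }
    }
}

/// Hypothetical helper: a message older than `max_age_secs` is
/// eligible for eviction regardless of disk usage.
fn is_expired(policy: &SpillPolicy, age_secs: u64) -> bool {
    age_secs > policy.max_age_secs
}

/// Hypothetical helper: the eviction strategy applies once the spill
/// file exceeds `max_disk_bytes`.
fn over_limit(policy: &SpillPolicy, file_bytes: u64) -> bool {
    file_bytes > policy.max_disk_bytes
}
```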
```yaml
# Production spill config
spill_policy:
  max_disk_bytes: 134217728  # 128 MiB
  max_age_secs: 7200         # 2 hours
  eviction: oldest_first
  spill_path: /var/data/nerve-spill.bin
```
Circuit Breaker
After `circuit_breaker_threshold` (default: 5) consecutive connection failures, the circuit breaker trips to the `Open` state. While open, the worker makes no connection attempts and all messages spill to disk. After a cooldown period, the breaker enters `HalfOpen` and probes with a single connection attempt.
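The three states can be sketched as a small state machine. This is an illustrative model under stated assumptions, not the SDK's code: the `CircuitBreaker` type and its methods are hypothetical, the cooldown length is a constructor parameter because its value is not given here, and a failed probe in `HalfOpen` is assumed to reopen the circuit.

```rust
use std::time::{Duration, Instant};

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum CircuitState {
    Closed,
    Open,
    HalfOpen,
}

/// Hypothetical breaker; field and method names are illustrative.
struct CircuitBreaker {
    state: CircuitState,
    consecutive_failures: u32,
    threshold: u32,     // circuit_breaker_threshold
    cooldown: Duration, // assumed parameter; value not documented here
    opened_at: Option<Instant>,
}

impl CircuitBreaker {
    fn new(threshold: u32, cooldown: Duration) -> Self {
        CircuitBreaker {
            state: CircuitState::Closed,
            consecutive_failures: 0,
            threshold,
            cooldown,
            opened_at: None,
        }
    }

    /// Returns true if the worker may attempt a connection at `now`.
    fn allow_attempt(&mut self, now: Instant) -> bool {
        match self.state {
            CircuitState::Closed => true,
            // A single probe is already in flight; allow no others.
            CircuitState::HalfOpen => false,
            CircuitState::Open => {
                let cooled = self
                    .opened_at
                    .map_or(false, |t| now.saturating_duration_since(t) >= self.cooldown);
                if cooled {
                    self.state = CircuitState::HalfOpen;
                }
                cooled
            }
        }
    }

    fn on_failure(&mut self, now: Instant) {
        self.consecutive_failures = self.consecutive_failures.saturating_add(1);
        // Assumption: a failed probe in HalfOpen reopens the circuit.
        if self.state == CircuitState::HalfOpen || self.consecutive_failures >= self.threshold {
            self.state = CircuitState::Open;
            self.opened_at = Some(now);
        }
    }

    fn on_success(&mut self) {
        self.state = CircuitState::Closed;
        self.consecutive_failures = 0;
        self.opened_at = None;
    }
}
```

Passing `now` explicitly, rather than calling `Instant::now()` inside the breaker, keeps the transitions easy to test.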
Reconnection Strategy
On disconnect, the worker uses exponential backoff:
- Base delay: `backoff_base_ms` (default: 100 ms)
- Max delay: `backoff_max_ms` (default: 30,000 ms = 30 seconds)
- Pattern: 100 ms → 200 ms → 400 ms → 800 ms → ... → 30 s (capped)
- On success: Backoff resets to base. Circuit breaker transitions to Closed.
- During backoff: All messages spill to disk automatically.
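The doubling pattern in the list above can be computed with a short helper. This is a sketch only: `backoff_delay` is a hypothetical function name, `attempt` is assumed to count from 0 for the first retry, and no jitter is applied because none is described here.

```rust
use std::time::Duration;

/// Delay before retry number `attempt` (0-based): base * 2^attempt,
/// capped at `max_ms`. Saturating arithmetic avoids overflow for
/// large attempt counts.
fn backoff_delay(attempt: u32, base_ms: u64, max_ms: u64) -> Duration {
    let factor = 1u64.checked_shl(attempt).unwrap_or(u64::MAX);
    let ms = base_ms.saturating_mul(factor).min(max_ms);
    Duration::from_millis(ms)
}
```

With the defaults (`base_ms = 100`, `max_ms = 30_000`), attempts 0 through 8 give 100 ms up to 25,600 ms, and every later attempt is capped at 30 s. Resetting to the base delay on success corresponds to restarting `attempt` at 0.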
Adaptive Bandwidth
The `BandwidthEstimator` samples RTT after each batch flush and reclassifies link quality every 5 seconds. Batch parameters adjust automatically to match the resulting tier.
enum LinkTier auto-generated
| Variant | Fields | Description |
|---|---|---|
| HighBandwidth | — | Low latency, high throughput (average RTT < 50 ms). Uses default MTU-sized batches with no compression overhead. Typical for LAN and co-located deployments. |
| Constrained | — | Moderate latency (average RTT of 50 to 200 ms). Increases batch size to 4 KiB, extends flush interval to 50 ms, and enables compression. Typical for cellular (4G/5G) or cross-region links. |
| Degraded | — | High latency or packet loss (average RTT > 200 ms). Maximises batch size to 8 KiB, extends flush interval to 200 ms, and forces compression. Typical for satellite links or severely degraded networks. |
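The RTT thresholds in the table can be expressed as a simple classifier. This sketch makes several assumptions: `average_rtt_ms`, `classify`, `batch_params`, and `BatchParams` are hypothetical names, the average is a plain mean over the samples in the window, and packet loss (which can also trigger `Degraded`) is not modelled. Values the table does not state, such as the `HighBandwidth` batch size and flush interval, are left as `None`.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum LinkTier {
    HighBandwidth,
    Constrained,
    Degraded,
}

/// Hypothetical batch settings; `None` means "SDK default, not stated here".
struct BatchParams {
    max_batch_bytes: Option<usize>,
    flush_interval_ms: Option<u64>,
    compression: bool,
}

/// Mean RTT over one sampling window, or None if there are no samples.
fn average_rtt_ms(samples: &[f64]) -> Option<f64> {
    if samples.is_empty() {
        return None;
    }
    Some(samples.iter().sum::<f64>() / samples.len() as f64)
}

/// Classify by average RTT only: < 50 ms, 50 to 200 ms, > 200 ms.
fn classify(avg_rtt_ms: f64) -> LinkTier {
    if avg_rtt_ms < 50.0 {
        LinkTier::HighBandwidth
    } else if avg_rtt_ms <= 200.0 {
        LinkTier::Constrained
    } else {
        LinkTier::Degraded
    }
}

fn batch_params(tier: LinkTier) -> BatchParams {
    match tier {
        LinkTier::HighBandwidth => BatchParams {
            max_batch_bytes: None,
            flush_interval_ms: None,
            compression: false,
        },
        LinkTier::Constrained => BatchParams {
            max_batch_bytes: Some(4 * 1024),
            flush_interval_ms: Some(50),
            compression: true,
        },
        LinkTier::Degraded => BatchParams {
            max_batch_bytes: Some(8 * 1024),
            flush_interval_ms: Some(200),
            compression: true,
        },
    }
}
```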
Heartbeat Protocol
A dedicated fifth QUIC stream sends 19-byte heartbeat frames every `heartbeat_interval_ms` (default: 5,000 ms). The server uses these frames to monitor client health and backpressure.
Wire format (19 bytes, magic 0xBEAF):
[magic: u16 LE][ts_ns: u64 LE][queue_depth: u32 LE][spill_depth: u32 LE][circuit_state: u8]
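The layout above adds up to 2 + 8 + 4 + 4 + 1 = 19 bytes. A minimal encode/decode sketch follows; the `Heartbeat` struct and the `encode` and `decode` functions are hypothetical names for illustration, and only the field order, widths, little-endian byte order, and magic value come from the wire format.

```rust
const MAGIC: u16 = 0xBEAF;
const FRAME_LEN: usize = 19;

/// Hypothetical in-memory view of one heartbeat frame.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Heartbeat {
    ts_ns: u64,
    queue_depth: u32,
    spill_depth: u32,
    circuit_state: u8,
}

/// Serialize into the 19-byte little-endian wire layout.
fn encode(hb: &Heartbeat) -> [u8; FRAME_LEN] {
    let mut buf = [0u8; FRAME_LEN];
    buf[0..2].copy_from_slice(&MAGIC.to_le_bytes());
    buf[2..10].copy_from_slice(&hb.ts_ns.to_le_bytes());
    buf[10..14].copy_from_slice(&hb.queue_depth.to_le_bytes());
    buf[14..18].copy_from_slice(&hb.spill_depth.to_le_bytes());
    buf[18] = hb.circuit_state;
    buf
}

/// Parse a frame; returns None on wrong length or wrong magic.
fn decode(buf: &[u8]) -> Option<Heartbeat> {
    if buf.len() != FRAME_LEN {
        return None;
    }
    if u16::from_le_bytes([buf[0], buf[1]]) != MAGIC {
        return None;
    }
    let mut ts = [0u8; 8];
    ts.copy_from_slice(&buf[2..10]);
    let mut queue = [0u8; 4];
    queue.copy_from_slice(&buf[10..14]);
    let mut spill = [0u8; 4];
    spill.copy_from_slice(&buf[14..18]);
    Some(Heartbeat {
        ts_ns: u64::from_le_bytes(ts),
        queue_depth: u32::from_le_bytes(queue),
        spill_depth: u32::from_le_bytes(spill),
        circuit_state: buf[18],
    })
}
```

Because the magic is little-endian, the first two bytes on the wire are `0xAF` followed by `0xBE`.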
circuit_state: 0=Closed (healthy), 1=Open (tripped), 2=HalfOpen (probing)
Questions?
Reach out for help with integration, deployment, or custom domain codecs.