KEIBIDROP Benchmarks, April 2026

Encrypted throughput, FUSE overhead, and PostgreSQL on a virtual filesystem

10 min read | KEIBIDROP Series

After fixing a file descriptor reuse race that caused 50% data loss during PostgreSQL initdb, we ran our benchmark suite on two machines. The numbers below document where we are. Some of them are good, some reflect the constraints of the hardware, and all of them are honest.

Test machines

The Mac is an Intel Core i7-9750H at 2.60 GHz with 6 cores and 12 threads, 32 GB of RAM, and a 500 GB NVMe SSD over PCIe. It runs macOS with macFUSE.

The Linux machine is a Contabo VPS with 4 vCPUs (Intel Xeon Gold 6140 at 2.30 GHz, single-threaded), 16 GB of RAM, and a 96 GB virtual disk that reports as rotational. The VPS shares its physical CPU and disk I/O with other tenants. IOPS are throttled by the hypervisor, which affects write-heavy workloads. It runs Ubuntu 24.04 with libfuse.

Both machines run Go 1.24. KEIBIDROP negotiates AES-256-GCM on both since the i7-9750H and the Xeon Gold 6140 both have AES-NI.

Encrypted channel throughput

The raw encrypted pipe, measured in isolation without FUSE or disk I/O.

Metric                         Mac           VPS
SecureConn, 1 MiB blocks       1,707 MB/s    460 MB/s
SecureConn, 16 MiB blocks      1,740 MB/s    502 MB/s
Raw net.Pipe (no encryption)   19,432 MB/s   4,901 MB/s

Encryption costs about 10x compared to raw memory copies on both machines. The absolute throughput exceeds what any realistic network link between two peers will deliver, so the crypto is not the bottleneck for WAN transfers.
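As a sanity check on where that 10x goes, the cipher alone can be micro-benchmarked with the standard library. This is a minimal sketch, not KEIBIDROP's SecureConn code: sealThroughput is our own name, and the fixed nonce is acceptable only in a benchmark, never on real traffic.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"fmt"
	"time"
)

// sealThroughput encrypts iters blocks of blockSize bytes with
// AES-256-GCM and returns the throughput in MB/s. The nonce is fixed
// and the output buffer reused, so only the cipher cost is measured.
func sealThroughput(blockSize, iters int) float64 {
	key := make([]byte, 32) // 32 bytes selects AES-256
	if _, err := rand.Read(key); err != nil {
		panic(err)
	}
	block, err := aes.NewCipher(key)
	if err != nil {
		panic(err)
	}
	aead, err := cipher.NewGCM(block)
	if err != nil {
		panic(err)
	}
	plaintext := make([]byte, blockSize)
	nonce := make([]byte, aead.NonceSize())
	dst := make([]byte, 0, blockSize+aead.Overhead())

	start := time.Now()
	for i := 0; i < iters; i++ {
		dst = aead.Seal(dst[:0], nonce, plaintext, nil)
	}
	return float64(blockSize*iters) / 1e6 / time.Since(start).Seconds()
}

func main() {
	fmt.Printf("AES-256-GCM seal, 1 MiB blocks: %.0f MB/s\n", sealThroughput(1<<20, 64))
}
```

With AES-NI present, a single core should land in the same multi-hundred-MB/s to GB/s range as the SecureConn rows above; without it, the number drops sharply.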

File transfer without FUSE

Encrypted gRPC streaming, peer-to-peer over loopback. This is the PullFile path: one peer has a file, the other requests it. Measures gRPC + encryption + disk write.

Size     Mac        VPS
10 MB    290 MB/s   100 MB/s
100 MB   547 MB/s   213 MB/s
1 GB     585 MB/s   288 MB/s

The VPS plateaus at 288 MB/s, limited by virtual disk write throughput.

Round-trip transfer: write on one peer, pull from the other

Alice (FUSE) writes a file to her mount. The FUSE handler writes to the backing store and sends a notification to Bob (no-FUSE) over the encrypted gRPC channel. Bob receives the notification, then pulls the full file from Alice. Both peers run on the same machine over loopback, so network latency is zero. The timer starts before Alice's write and stops after Bob has the complete file on his disk.

Size     FUSE Write   Pull (gRPC)   Total      Total MB/s
1 MB     494 MB/s     99 MB/s       315 ms     3.2 MB/s
10 MB    736 MB/s     244 MB/s      358 ms     28 MB/s
100 MB   1,018 MB/s   547 MB/s      584 ms     171 MB/s
1 GB     756 MB/s     498 MB/s      3,714 ms   276 MB/s

The 1 MB total of 3.2 MB/s is almost entirely notification overhead. The file itself transfers in a few milliseconds, but the FUSE Release handler runs a deferred notification path that lstats the file several times to confirm the size has stabilized (a workaround for macOS fcopyfile behavior). At 100 MB this fixed latency cost becomes negligible and the total throughput reaches 171 MB/s.

The reverse direction (Bob adds file via API, Alice reads full file from FUSE mount, streaming from Bob over gRPC) shows different characteristics because it skips the deferred notification path:

Size     Throughput
1 MB     154 MB/s
10 MB    261 MB/s
100 MB   227 MB/s
1 GB     259 MB/s

The asymmetry is expected. Writing through FUSE is fast (kernel page cache), but the notification adds latency. Reading through FUSE triggers an on-demand gRPC stream, which starts immediately.

Where the time goes (100 MB read through FUSE)

Layer                     Duration   Overhead
Encrypted gRPC alone      210 ms     baseline
+ copy into user buffer   210 ms     0.1%
+ pwrite to local cache   241 ms     8.2%
Full FUSE end-to-end      371 ms     35%

The FUSE kernel overhead is 35% of total time for a 100 MB read. Each read crosses the kernel-userspace boundary twice: once from the reading process into the kernel, once from the kernel into our FUSE handler, and back. This cost is fixed per operation, not per byte, so it amortizes over larger reads.
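This fixed-plus-linear cost structure can be put into a one-line model: total time = fixed per-operation cost + bytes / streaming rate. The 300 ms fixed cost and 500 MB/s stream rate below are rough values read off the round-trip table, used only to illustrate how the fixed cost amortizes; they are not measured constants.

```go
package main

import "fmt"

// effectiveMBps models a transfer whose total time is a fixed
// per-operation cost (fixedMs) plus a per-byte streaming cost
// (sizeMB / streamMBps). Effective throughput is size over total time.
func effectiveMBps(sizeMB, fixedMs, streamMBps float64) float64 {
	totalMs := fixedMs + sizeMB/streamMBps*1000
	return sizeMB / (totalMs / 1000)
}

func main() {
	// Small files are dominated by the fixed cost; large files approach
	// the streaming rate.
	fmt.Printf("1 MB:   %.1f MB/s\n", effectiveMBps(1, 300, 500))   // prints "1 MB:   3.3 MB/s"
	fmt.Printf("100 MB: %.0f MB/s\n", effectiveMBps(100, 300, 500)) // prints "100 MB: 200 MB/s"
}
```

Even with these rough inputs, the model lands close to the measured 3.2 MB/s at 1 MB and a couple hundred MB/s at 100 MB, which is consistent with the overhead being per-operation rather than per-byte.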

Latency

FUSE mount operations on Mac, measured per-file:

Size   Create+Write   Read     Total
1 KB   914 us         412 us   1.3 ms
1 MB   2.2 ms         1.8 ms   3.9 ms

For comparison, the same operations on local disk without FUSE: 1 KB takes 1.3 ms total, 1 MB takes 1.1 ms. FUSE adds 1 to 3 ms per operation, which is the kernel round-trip cost.

Open/Close latency for 100 iterations: average open 218 us, average close 99 us.

Optimal block size and worker count

Block size sweep on Mac (no-FUSE PullFile): 256 KiB averages 406 MB/s, 1 MiB averages 571 MB/s, 4 MiB averages 657 MB/s, 16 MiB averages 690 MB/s. Diminishing returns above 4 MiB.

Worker count sweep on Mac: 1 worker does 544 MB/s, 4 workers peak at 741 MB/s, 8 workers drop to 523 MB/s. On the VPS: 1 worker does 153 MB/s, 4 workers peak at 329 MB/s, 8 workers drop to 301 MB/s. Four workers is the sweet spot on both machines. Beyond that, goroutine scheduling overhead and lock contention eat the parallelism gains.
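The worker-pool shape behind that sweep is the standard Go fan-out pattern: a fixed number of goroutines draining a channel of block indices. This is a sketch of the pattern, with pullBlocks and the fetch callback as stand-ins for the actual PullFile internals.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// pullBlocks fans numBlocks block indices out to a fixed pool of
// workers. Each worker drains the jobs channel and invokes fetch,
// which in the real system would pull one encrypted block over gRPC.
func pullBlocks(numBlocks, workers int, fetch func(i int)) {
	jobs := make(chan int)
	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := range jobs {
				fetch(i)
			}
		}()
	}
	for i := 0; i < numBlocks; i++ {
		jobs <- i
	}
	close(jobs) // workers exit when the channel drains
	wg.Wait()
}

func main() {
	var fetched int64
	pullBlocks(16, 4, func(i int) { atomic.AddInt64(&fetched, 1) })
	fmt.Printf("fetched %d blocks with 4 workers\n", fetched) // prints "fetched 16 blocks with 4 workers"
}
```

The sweep result matches what this pattern predicts: past the point where workers saturate the disk or the crypto, extra goroutines only add channel contention and scheduler churn.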

PostgreSQL on FUSE

The reason we ran these benchmarks. After fixing the fd reuse race, PostgreSQL runs on the FUSE mount without data corruption.

On the Linux VPS we ran the following sequence: initdb (968 files, 168 legitimately empty, matching native exactly), pg_ctl start, CREATE TABLE with three indexes, INSERT 10,000 rows, INSERT 5,000 more rows interleaved with UPDATEs (archiving completed orders every 100 iterations), VACUUM ANALYZE, aggregation queries over 15,000 rows, clean shutdown. Final state: 980 files, 42 MB on the backing store.

All operations completed without errors or data corruption. PostgreSQL runs with full ACID guarantees on the FUSE mount.

POSIX compliance

We run the pjdfstest suite on Linux against the FUSE mount. Excluding symlink and hardlink tests (which we do not support, intentionally, because symlinks in a P2P sync context create path traversal risks):

1,256 of 1,657 tests passed, a 75.8% pass rate.

The remaining failures are chown tests that require root privileges and rename/unlink edge cases involving nlink counts, which are hardlink semantics. We do not plan to add hardlink or symlink support.

What these numbers mean in practice

For our target use case of syncing files between two machines, the throughput is sufficient. A medium git repository clones between peers in seconds. PostgreSQL runs with full ACID guarantees on the FUSE mount. The FUSE kernel overhead of 35% is a fixed cost of the architecture and we cannot reduce it without moving to a kernel-native filesystem, which would eliminate the cross-platform advantage.

The bottleneck for real WAN transfers between two machines will be the network bandwidth, not the crypto or the FUSE overhead. A 100 Mbit/s connection tops out at 12.5 MB/s, well below any of the numbers above. We have not yet benchmarked over a real WAN link, only over loopback. That test is planned.
