Practical Information Security Tradeoffs: choose the right primitive for immediate value

A practical look at confidentiality, integrity, and availability benchmarks

“There’s no such thing as 100% security.” There is always a tradeoff between confidentiality, integrity, and availability when it comes to data. For files in production, choose the tool that matches their risk profile. Do you need extremely fast checks to see whether files were modified? Cryptographic-strength integrity checks for tamper resistance? Do you care more about the confidentiality of the files? Or some combination of all three?

This post explores techniques and tradeoffs (with runnable benchmarks) for file confidentiality, integrity, and availability.

Premise / Audience

Target readers: engineers, DevOps, SREs, security architects, and CTOs who want actionable guidance.

Problem: teams often pick the “strongest” crypto (SHA-512, the longest keys, slow checks) or the simplest, fastest check without matching it to the actual risk. That leads to over-investment (high infrastructure cost), operational complexity nobody can sustain, or false confidence.

Goal: show when to use SHA-512, AES-GCM, ChaCha20-Poly1305, and a fast non-cryptographic hash (Murmur) for real tasks: availability checks, tamper detection, and encryption. We provide four different approaches and benchmark them in two programming languages: Haskell and Rust.

People say "make this as secure as Fort Knox"

People say "make this as secure as Fort Knox", but security is risk management and operational capacity. For file pipelines (CDN, object storage, backups), the useful questions are:

  1. Is the file available for download? (Availability)
  2. Has the file been tampered with / changed? (Integrity)
  3. Is the file encrypted? (Confidentiality + authenticity)

These are different requirements. The next sections incrementally build up techniques to address them, along with one more question: what is the cost of processing this file, in computing resources?

The availability of the files

If you only care about the availability of the files, then you just need to store them and serve them on each request. This usually suffices for files meant for public use: no need to encrypt them, hash them, or anything else. The End.
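
Well, almost the end: if you want to automate the availability check itself, a plain HTTP HEAD request is enough. A minimal sketch in Rust, assuming the reqwest crate with its "blocking" feature; the URL is hypothetical:

```rust
// Availability probe: issue an HTTP HEAD request and check the status code.
// Sketch only – assumes the `reqwest` crate with the "blocking" feature.
fn is_available(url: &str) -> bool {
    reqwest::blocking::Client::new()
        .head(url) // HEAD: we only need the headers, not the body
        .send()
        .map(|resp| resp.status().is_success())
        .unwrap_or(false) // network errors count as "not available"
}

fn main() {
    // Hypothetical file URL, for illustration.
    println!("{}", is_available("https://example.com/files/report.pdf"));
}
```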

The integrity of the files

If you care about the integrity of the files, hash each file: if any bit changes, the hash changes too. The End. Or not?

There are two different types of integrity checks:

Cryptographic integrity checks, which are collision- and second-preimage-resistant. In practice this means that, given a file and its hash, an attacker finds it computationally infeasible to modify the file so that the new file's hash matches the old one.

The industry standards for such hash functions are usually the ones approved and recommended by NIST. More details about the strength of different cryptographic hash functions can be found at the NIST Computer Security Resource Center.

Let's take SHA-512 for closer inspection. It is among the strongest on the NIST list in terms of how hard it is to find a collision: its 512-bit digest gives roughly 2^256 collision resistance.

With SHA-512 the trade-off is usually CPU time, so it might not work well in low-latency environments.

For example, just run the built-in speed test: openssl speed sha512

Note: openssl speed hashes in-memory buffers, so it measures pure CPU throughput (no disk I/O) across various block sizes.
The 'numbers' are in 1000s of bytes per second processed.
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes  16384 bytes
sha512           34411.15k   134991.65k   318518.91k   565828.85k   752341.73k   768283.57k

We also benchmarked two SHA-512 implementations, one written in Rust and one in Haskell.

For a 1 GB file of random data, four read strategies (a Rust sketch of the chunked variant follows the list):

  1. Load the full file in memory and hash it:
    (rust) SHA512 full avg over 10 runs: 2.750s
    (haskell) SHA512 full: 2.957566s
  2. Be RAM-conscious and read the file in system-block-size chunks (4096 bytes in my case – on macOS you can find it with diskutil info /):
    (rust) SHA512 4K blocks avg: 2.733s
    (haskell) SHA512 4K blocks: 3.950727s
  3. Be a bit less RAM-conscious and use the Linux cp default buffer size (128 KiB):
    (rust) SHA512 128K blocks avg: 2.409s
    (haskell) SHA512 128K blocks: 3.365536s
  4. Be even less RAM-conscious and use a 10 MB block size (I found this performs better on macOS and Windows in terms of read/write speed):
    (rust) SHA512 10M blocks avg: 2.484s
    (haskell) SHA512 10M blocks: 2.770903s
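
For reference, here is roughly what the chunked variant looks like in Rust – a sketch assuming the RustCrypto sha2 crate and the hex crate; the file name and block size are placeholders:

```rust
// Stream a file through SHA-512 in fixed-size blocks, so memory use stays
// bounded regardless of file size. Sketch assuming the `sha2` and `hex` crates.
use sha2::{Digest, Sha512};
use std::fs::File;
use std::io::Read;

fn sha512_file(path: &str, block_size: usize) -> std::io::Result<String> {
    let mut file = File::open(path)?;
    let mut hasher = Sha512::new();
    let mut buf = vec![0u8; block_size];
    loop {
        let n = file.read(&mut buf)?;
        if n == 0 {
            break; // EOF
        }
        hasher.update(&buf[..n]); // feed only the bytes actually read
    }
    Ok(hex::encode(hasher.finalize()))
}

fn main() -> std::io::Result<()> {
    // Hypothetical 1 GB test file; 128 KiB blocks (strategy 3 above).
    println!("{}", sha512_file("random-1g.bin", 128 * 1024)?);
    Ok(())
}
```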

OK, but what if you do not need a cryptographic-strength hash function? What if you mostly care about checking whether a file changed or data went missing, or you just want to use the hash as a key into a hash table, and no attacker is involved?

There is a class of non-cryptographic hash functions that fills this use case, and my favorite is Murmur – mostly because there is also an unrelated Romanian clothing brand named murmur.
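
A minimal sketch of hashing a file with MurmurHash3, assuming the Rust murmur3 crate (which streams from any Read source); the file name is a placeholder:

```rust
// Hash a file with MurmurHash3 (x64, 128-bit variant). Fast, but offers no
// protection against a deliberate attacker. Sketch assuming the `murmur3` crate.
use std::fs::File;
use std::io::BufReader;

fn main() -> std::io::Result<()> {
    let file = File::open("random-1g.bin")?; // hypothetical test file
    let mut reader = BufReader::with_capacity(128 * 1024, file); // 128 KiB reads
    let hash = murmur3::murmur3_x64_128(&mut reader, 0)?; // seed = 0
    println!("{:032x}", hash);
    Ok(())
}
```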

Now let's run the same benchmarks for Murmur. Again there are two implementations, one in Rust and one in Haskell; note that the Rust crate implements MurmurHash3 while the Haskell library implements MurmurHash2, as labeled below.

For a 1 GB file of random data:

  1. Load the full file in memory and hash it:
    (rust: murmur 3) Murmur full avg: 0.790s
    (haskell: murmur 2) Murmur full: 3.036555s
  2. Read the file in system-block-size chunks (4096 bytes):
    (rust: murmur 3) Murmur 4K blocks avg: 0.602s
    (haskell: murmur 2) Murmur 4K blocks: 3.234807s
  3. Use the Linux cp default buffer size (128 KiB):
    (rust: murmur 3) Murmur 128K blocks avg: 0.492s
    (haskell: murmur 2) Murmur 128K blocks: 3.022566s
  4. Use a 10 MB block size:
    (rust: murmur 3) Murmur 10M blocks avg: 0.557s
    (haskell: murmur 2) Murmur 10M blocks: 2.951578s

Setting aside that my Haskell skills are weak, the gap between the Rust and Haskell results is large; I attribute it to the difference in language paradigms and to my unoptimized, purely functional Haskell implementation.

Let's compare SHA-512 to Murmur to see the difference in wall-clock time.

Block size       SHA512 (Rust)   SHA512 (Haskell)   Murmur (Rust)   Murmur (Haskell)
Full file (1G)   2.750s          2.957566s          0.790s          3.036555s
4K blocks        2.733s          3.950727s          0.602s          3.234807s
128K blocks      2.409s          3.365536s          0.492s          3.022566s
10M blocks       2.484s          2.770903s          0.557s          2.951578s

That is the tradeoff; draw your own conclusions about how to mitigate your integrity and availability risks.

Now, moving on to confidentiality.

The confidentiality of the files

In terms of pure confidentiality, you would use a symmetric cipher to encrypt the files. If you do not care about integrity at all, a stream cipher is quite fast – but you would not know if bits were flipped or data was modified.

When integrity matters as well, use an authenticated encryption mode (AEAD, authenticated encryption with associated data).

More details can be found in the NIST Cryptographic Standards and Guidelines.

But for this case, let's limit ourselves to the following two: AES-256-GCM and ChaCha20-Poly1305.

I chose them for the following reasons:

If your target hardware has hardware AES support (the AES-NI instruction set on x86), then AES should be faster than ChaCha; otherwise, the other way around.

(On macOS you can check with sysctl -a | grep -i aes; on Linux with grep -m1 -o aes /proc/cpuinfo.)

I will not go into detail about how these ciphers work internally, but will limit myself to minimal usage sketches next to the benchmarks below.
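
First, an AES-256-GCM round trip in Rust – a sketch assuming the RustCrypto aes-gcm crate, with a randomly generated key and nonce for illustration:

```rust
// Minimal AES-256-GCM round trip. Every message needs a unique nonce per key;
// here we draw one at random. Sketch assuming the `aes-gcm` crate.
use aes_gcm::{
    aead::{Aead, AeadCore, KeyInit, OsRng},
    Aes256Gcm,
};

fn main() {
    let key = Aes256Gcm::generate_key(OsRng);
    let cipher = Aes256Gcm::new(&key);
    let nonce = Aes256Gcm::generate_nonce(&mut OsRng); // 96-bit nonce
    let ciphertext = cipher
        .encrypt(&nonce, b"file contents".as_ref())
        .expect("encryption failure");
    // Decryption verifies the authentication tag; a flipped bit fails here.
    let plaintext = cipher
        .decrypt(&nonce, ciphertext.as_ref())
        .expect("decryption failure (tampered?)");
    assert_eq!(&plaintext, b"file contents");
}
```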

Enough talk! Give me the benchmarks!

AES-256-GCM

openssl speed -evp aes-256-gcm
The 'numbers' are in 1000s of bytes per second processed.
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes  16384 bytes
AES-256-GCM      69207.00k   255371.29k   825316.51k  2208601.77k  4060318.38k  4421708.46k

Benchmarks on a 1 GB file of random data (Rust vs Haskell):

  1. Load the full file in memory & encrypt:
    (rust) AES256-GCM full avg: 2.389s
    (haskell) AES256-GCM full: 4.929817s
  2. Read file in 4K blocks:
    (rust) 2.048s
    (haskell) 5.440893s
  3. Read file in 128K blocks:
    (rust) 1.897s
    (haskell) 4.88268s
  4. Read file in 10M blocks:
    (rust) 1.850s
    (haskell) 4.621191s

ChaCha20-Poly1305

openssl speed -evp chacha20-poly1305
The 'numbers' are in 1000s of bytes per second processed.
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes  16384 bytes
ChaCha20-Poly1305   283875.88k   574783.89k  1213670.23k  2217616.18k  2424297.74k  2418109.10k

Benchmarks on a 1 GB file of random data (Rust vs Haskell):

  1. Load the full file in memory & encrypt:
    (rust) ChaCha full avg: 1.910s
    (haskell) 2.789824s
  2. Read file in 4K blocks:
    (rust) 1.999s
    (haskell) 4.417005s
  3. Read file in 128K blocks:
    (rust) 1.416s
    (haskell) 3.160261s
  4. Read file in 10M blocks:
    (rust) 1.326s
    (haskell) 2.878134s
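
For completeness, the equivalent round trip with ChaCha20-Poly1305 – a sketch assuming the chacha20poly1305 crate, whose API mirrors aes-gcm, so swapping ciphers in the Rust code is nearly mechanical:

```rust
// Minimal ChaCha20-Poly1305 round trip. Sketch assuming the
// `chacha20poly1305` crate; same structure as the AES-256-GCM example.
use chacha20poly1305::{
    aead::{Aead, AeadCore, KeyInit, OsRng},
    ChaCha20Poly1305,
};

fn main() {
    let key = ChaCha20Poly1305::generate_key(&mut OsRng);
    let cipher = ChaCha20Poly1305::new(&key);
    let nonce = ChaCha20Poly1305::generate_nonce(&mut OsRng); // 96-bit nonce
    let ciphertext = cipher
        .encrypt(&nonce, b"file contents".as_ref())
        .expect("encryption failure");
    let plaintext = cipher
        .decrypt(&nonce, ciphertext.as_ref())
        .expect("decryption failure (tampered?)");
    assert_eq!(&plaintext, b"file contents");
}
```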

OpenSSL Benchmark (AES vs ChaCha)

Type                16 bytes     64 bytes     256 bytes     1024 bytes    8192 bytes    16384 bytes
AES-256-GCM         69207.00k    255371.29k   825316.51k    2208601.77k   4060318.38k   4421708.46k
ChaCha20-Poly1305   283875.88k   574783.89k   1213670.23k   2217616.18k   2424297.74k   2418109.10k

Rust and Haskell Benchmarks

File read method      AES-256-GCM (Rust)   AES-256-GCM (Haskell)   ChaCha20-Poly1305 (Rust)   ChaCha20-Poly1305 (Haskell)
Full file in memory   2.389s               4.929817s               1.910s                     2.789824s
4K blocks             2.048s               5.440893s               1.999s                     4.417005s
128K blocks           1.897s               4.88268s                1.416s                     3.160261s
10M blocks            1.850s               4.621191s               1.326s                     2.878134s

This makes me suspect I did something wrong in both the Rust and Haskell implementations, because they contradict OpenSSL: openssl speed shows AES-256-GCM well ahead of ChaCha20-Poly1305 at large block sizes, while my implementations consistently show ChaCha20-Poly1305 winning.

Closing remarks

  1. Speed + scalability: a fast non-cryptographic hash (Murmur) is an excellent hot-path detector for accidental corruption and availability checks. Use it where adversary resistance isn’t required.
  2. Strong integrity: for legal/audit/forensic needs, SHA-512 remains a clear choice: more expensive, but auditable and collision-resistant.
  3. Confidentiality + integrity: AEAD is mandatory; AES-GCM wins on servers with hardware AES, while ChaCha20-Poly1305 performs better on CPUs without it (although my implementations and system info say the opposite).
  4. Blockwise AEAD + metadata AEAD: a pragmatic pattern for streaming encryption that also detects reordered or dropped blocks (see the sketch after this list).
  5. Tradeoffs: pick the primitive that matches your risk, performance budget, and operational capacity.

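To make point 4 concrete, here is a blockwise-AEAD sketch in Rust, again assuming the aes-gcm crate: each block gets a nonce derived from a random per-file base plus a counter, and the block index is bound into the associated data, so a reordered, dropped, or replayed block fails authentication. A production system should prefer a vetted streaming construction (for example the aead crate's stream module) over a hand-rolled one.

```rust
// Blockwise AEAD sketch: per-block nonce = random base with a block counter in
// its last 4 bytes; the block index also goes into the AAD. Assumes `aes-gcm`.
use aes_gcm::{
    aead::{Aead, AeadCore, KeyInit, OsRng, Payload},
    Aes256Gcm, Key,
};

fn encrypt_blocks(key: &Key<Aes256Gcm>, blocks: &[&[u8]]) -> Vec<Vec<u8>> {
    let cipher = Aes256Gcm::new(key);
    let base = Aes256Gcm::generate_nonce(&mut OsRng); // random 96-bit base nonce
    blocks
        .iter()
        .enumerate()
        .map(|(i, block)| {
            let mut nonce = base;
            nonce[8..].copy_from_slice(&(i as u32).to_be_bytes()); // unique per block
            // Binding the index into the AAD makes reordering/dropping detectable.
            let aad = (i as u64).to_be_bytes();
            cipher
                .encrypt(&nonce, Payload { msg: block, aad: &aad })
                .expect("encryption failure")
        })
        .collect()
}

fn main() {
    let key = Aes256Gcm::generate_key(OsRng);
    let blocks: [&[u8]; 2] = [b"block zero", b"block one"];
    let ciphertexts = encrypt_blocks(&key, &blocks);
    println!("encrypted {} blocks", ciphertexts.len());
}
```
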
Annex

Code repository