Benchmarks

rLightning includes a comprehensive benchmarking suite built on Criterion for statistically rigorous performance measurement.

Available Benchmarks

Benchmark                  Description
storage_bench              Core storage engine operations (set, get, delete)
protocol_bench             RESP protocol parsing and serialization
throughput_bench           End-to-end request throughput
redis_comparison_bench     Direct comparison against Redis server
replication_bench          Replication command propagation and sync
auth_bench                 Authentication and ACL checking overhead
command_category_bench     Per-category command performance
advanced_features_bench    Streams, scripting, and advanced data structures
cluster_bench              Cluster mode hash slot routing and redirection

Running Benchmarks

Run All Benchmarks

cargo bench

Run a Specific Benchmark Suite

cargo bench --bench storage_bench
cargo bench --bench protocol_bench
cargo bench --bench throughput_bench
cargo bench --bench redis_comparison_bench

Run a Specific Benchmark Function

Filter by function name within a benchmark suite:

cargo bench --bench storage_bench -- set_get
cargo bench --bench storage_bench -- concurrent_writes
cargo bench --bench protocol_bench -- parse_bulk_string

View HTML Reports

Criterion generates detailed HTML reports with statistical analysis and comparison charts:

# After running benchmarks, open the report
open target/criterion/report/index.html

Key Performance Optimizations

rLightning achieves high throughput through several architectural choices:

jemalloc Allocator

rLightning uses jemalloc as its global memory allocator. jemalloc often performs better than the system allocator for the allocation patterns typical of a data store: many small, short-lived allocations mixed with long-lived cached data.
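Rust selects the process-wide allocator through the `#[global_allocator]` attribute. The sketch below shows that mechanism using only the standard library: a hypothetical `CountingAlloc` wrapper forwards to the system allocator and counts calls. It is not rLightning's code. A jemalloc build would put an allocator type from a jemalloc binding crate in the same position, and which crate rLightning uses is not stated here.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical wrapper: forwards every request to the system allocator
/// and counts how many allocations were made.
pub struct CountingAlloc;

/// Number of allocation calls observed so far.
pub static ALLOCATIONS: AtomicUsize = AtomicUsize::new(0);

unsafe impl GlobalAlloc for CountingAlloc {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        ALLOCATIONS.fetch_add(1, Ordering::Relaxed);
        // SAFETY: the caller upholds the `GlobalAlloc::alloc` contract,
        // and the request is forwarded unchanged.
        unsafe { System.alloc(layout) }
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        // SAFETY: `ptr` came from `System.alloc` with this same layout.
        unsafe { System.dealloc(ptr, layout) }
    }
}

// Every heap allocation in the binary now goes through `CountingAlloc`.
// A jemalloc-backed build would name a jemalloc allocator type here instead.
#[global_allocator]
static GLOBAL: CountingAlloc = CountingAlloc;
```

A counter like this is also a cheap way to check how many allocations a hot path performs before comparing allocators.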

Sharded-Lock Concurrency with DashMap

The storage engine uses DashMap, a concurrent hash map that splits its entries across independently locked shards. Threads that touch keys in different shards do not block each other, so contention is limited to operations that land on the same shard. Compared to a single RwLock<HashMap>, where every writer serializes all access, this typically scales much better as CPU cores are added.
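The idea behind sharded locking can be sketched with the standard library alone. The `ShardedMap` type below is a hypothetical illustration. It is neither DashMap's actual implementation nor rLightning's storage engine, and its method names (`set`, `get`, `delete`) are chosen only to mirror the operations that storage_bench measures.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};
use std::sync::RwLock;

/// Hypothetical sharded map: each shard has its own lock, so operations
/// on keys in different shards never wait on each other.
pub struct ShardedMap {
    shards: Vec<RwLock<HashMap<String, Vec<u8>>>>,
}

impl ShardedMap {
    /// Create a map with `shard_count` independently locked shards.
    pub fn new(shard_count: usize) -> Self {
        assert!(shard_count > 0, "at least one shard is required");
        let shards = (0..shard_count)
            .map(|_| RwLock::new(HashMap::new()))
            .collect();
        Self { shards }
    }

    /// Pick the shard that owns `key` by hashing it.
    fn shard_index(&self, key: &str) -> usize {
        let mut hasher = DefaultHasher::new();
        key.hash(&mut hasher);
        (hasher.finish() as usize) % self.shards.len()
    }

    /// Write-lock only the owning shard; all other shards stay available.
    pub fn set(&self, key: &str, value: Vec<u8>) {
        let idx = self.shard_index(key);
        self.shards[idx].write().unwrap().insert(key.to_string(), value);
    }

    /// Read-lock only the owning shard and clone the value out.
    pub fn get(&self, key: &str) -> Option<Vec<u8>> {
        let idx = self.shard_index(key);
        self.shards[idx].read().unwrap().get(key).cloned()
    }

    /// Remove `key`, reporting whether it was present.
    pub fn delete(&self, key: &str) -> bool {
        let idx = self.shard_index(key);
        self.shards[idx].write().unwrap().remove(key).is_some()
    }
}
```

With a single shard this degenerates into the RwLock<HashMap> baseline, which makes the sketch a convenient way to compare the two designs under the same workload.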

Tokio Async Runtime

All network I/O is handled through Tokio’s async runtime, enabling thousands of concurrent client connections with minimal thread overhead. The work-stealing scheduler distributes load evenly across worker threads.

memchr Buffer Scanning

RESP protocol parsing uses the memchr crate for SIMD-accelerated byte scanning when searching for delimiters (\r\n) in the input buffer. This significantly speeds up protocol parsing for large payloads.
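As a rough illustration of the delimiter search, the sketch below finds the first `\r\n` using only the standard library. The function name `find_crlf` is hypothetical. The `position` call marks where a parser built on the memchr crate would call `memchr::memchr(b'\r', haystack)`, which returns the same `Option<usize>` but uses a vectorized search.

```rust
/// Return the index of the first `\r\n` pair in `buf`, if any.
/// Std-only stand-in: `position` is where a memchr-based search would go.
pub fn find_crlf(buf: &[u8]) -> Option<usize> {
    let mut start = 0;
    while start < buf.len() {
        // Jump straight to the next candidate `\r`; stop if none is left.
        let cr = start + buf[start..].iter().position(|&b| b == b'\r')?;
        // Accept the candidate only if `\n` follows immediately.
        if buf.get(cr + 1) == Some(&b'\n') {
            return Some(cr);
        }
        // Lone `\r` (for example inside binary data): keep scanning after it.
        start = cr + 1;
    }
    None
}
```

Searching for the single byte `\r` and then checking the next byte keeps the hot loop to a one-byte scan, which is the case byte-search routines are optimized for.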

Zero-Copy Parsing

Where possible, rLightning avoids copying data during protocol parsing. Byte slices from the read buffer are used directly, reducing memory allocations and improving cache locality.
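A minimal sketch of what zero-copy parsing looks like for a RESP bulk string (`$<len>\r\n<payload>\r\n`): the returned payload is a slice that borrows from the input buffer instead of a freshly allocated copy. The function name and return shape are assumptions made for illustration, not rLightning's parser API.

```rust
/// Parse one RESP bulk string from the front of `buf`.
/// On success, returns the payload as a slice borrowed from `buf`
/// together with the total number of bytes consumed.
/// Returns `None` for incomplete or malformed input, including the null
/// bulk string `$-1`; a real parser would distinguish these cases.
pub fn parse_bulk_string(buf: &[u8]) -> Option<(&[u8], usize)> {
    if buf.first() != Some(&b'$') {
        return None;
    }
    // Locate the end of the `$<len>` header line.
    let header_end = buf.windows(2).position(|w| w == b"\r\n")?;
    let len: usize = std::str::from_utf8(&buf[1..header_end])
        .ok()?
        .parse()
        .ok()?;
    let start = header_end + 2;
    let end = start.checked_add(len)?;
    // The payload must be fully buffered and followed by `\r\n`.
    if buf.len() < end.checked_add(2)? || &buf[end..end + 2] != b"\r\n" {
        return None;
    }
    // No copy: the payload slice points into the caller's buffer.
    Some((&buf[start..end], end + 2))
}
```

The borrow ties the payload's lifetime to the read buffer, so the buffer cannot be reused until the payload has been consumed or copied into storage. That trade-off is one reason the technique applies only where possible.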

Benchmarking with redis-benchmark

You can also benchmark rLightning using the standard redis-benchmark tool that ships with Redis. This provides a direct comparison under identical test conditions.

Basic Throughput Test

# Start rLightning
cargo run --release

# Run redis-benchmark against it (default: 50 connections, 100000 requests)
redis-benchmark -h 127.0.0.1 -p 6379

Specific Command Tests

# Benchmark SET/GET operations
redis-benchmark -h 127.0.0.1 -p 6379 -t set,get -n 1000000

# Benchmark with pipelining (16 commands per pipeline)
redis-benchmark -h 127.0.0.1 -p 6379 -t set,get -n 1000000 -P 16

# Benchmark with specific data size (256 bytes)
redis-benchmark -h 127.0.0.1 -p 6379 -t set -n 500000 -d 256

# Benchmark with multiple clients
redis-benchmark -h 127.0.0.1 -p 6379 -t set,get -n 1000000 -c 100
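
To make the pipelining and data-size options concrete, the sketch below builds the kind of byte stream a pipelined client sends: several RESP-encoded SET commands concatenated into one buffer and written in a single round trip. The helper names and the `key:<n>` key pattern are hypothetical, and the exact keys redis-benchmark generates may differ.

```rust
/// Encode one command as a RESP array of bulk strings.
pub fn encode_command(args: &[&[u8]]) -> Vec<u8> {
    let mut out = Vec::new();
    // Array header: number of arguments.
    out.extend_from_slice(format!("*{}\r\n", args.len()).as_bytes());
    for arg in args {
        // Each argument is a length-prefixed bulk string.
        out.extend_from_slice(format!("${}\r\n", arg.len()).as_bytes());
        out.extend_from_slice(arg);
        out.extend_from_slice(b"\r\n");
    }
    out
}

/// Concatenate `depth` SET commands into one buffer, as a client using
/// a pipeline depth of `depth` would before a single socket write.
pub fn build_pipeline(depth: usize, value: &[u8]) -> Vec<u8> {
    let mut batch = Vec::new();
    for i in 0..depth {
        // Hypothetical key pattern, used only for illustration.
        let key = format!("key:{}", i);
        let command = encode_command(&[&b"SET"[..], key.as_bytes(), value]);
        batch.extend_from_slice(&command);
    }
    batch
}
```

For example, `build_pipeline(16, &[b'x'; 256])` corresponds to a pipeline depth of 16 with 256-byte values. The server still parses and executes 16 commands, but the client pays for one network round trip instead of 16.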

Comparison Against Redis

To compare rLightning against Redis under identical conditions:

# Start Redis on port 6379
redis-server --port 6379

# Start rLightning on port 6380
cargo run --release -- --port 6380

# Benchmark Redis
redis-benchmark -h 127.0.0.1 -p 6379 -t set,get -n 1000000 -P 16 -q

# Benchmark rLightning
redis-benchmark -h 127.0.0.1 -p 6380 -t set,get -n 1000000 -P 16 -q

The -q flag outputs a quiet summary with just the operations-per-second figure, making it easy to compare results side by side.
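
When the two runs are captured to files, the comparison can be automated. The sketch below assumes each summary line has the form `<COMMAND>: <number> requests per second`, optionally followed by extra fields. That format and both function names are assumptions, so check them against the output of your redis-benchmark version.

```rust
/// Extract (command name, requests per second) from one summary line,
/// e.g. `SET: 100000.00 requests per second` (assumed format).
pub fn parse_summary_line(line: &str) -> Option<(String, f64)> {
    // If progress updates separated by carriage returns share the line,
    // keep only the final segment.
    let line = line.trim().rsplit('\r').next()?.trim();
    let (name, rest) = line.split_once(':')?;
    // The first token after the colon is expected to be the throughput.
    let rps: f64 = rest.split_whitespace().next()?.parse().ok()?;
    Some((name.trim().to_string(), rps))
}

/// Ratio of candidate throughput to baseline throughput (1.0 means parity).
/// Returns `None` when the baseline is not a positive number.
pub fn throughput_ratio(candidate_rps: f64, baseline_rps: f64) -> Option<f64> {
    if baseline_rps > 0.0 {
        Some(candidate_rps / baseline_rps)
    } else {
        None
    }
}
```

Parsing the SET line from each run and passing the rLightning figure as `candidate_rps` and the Redis figure as `baseline_rps` gives a single number to track across changes.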

For meaningful benchmarks, consider testing with: