Performance¶
GoCC is designed for high throughput while maintaining simplicity.
Benchmarks¶
Measured on an Apple M4 Pro laptop:
Internal (No HTTP)¶
| Metric | Value |
|---|---|
| Throughput | ~6 million req/s |
| Latency | ~150 ns/request |
This represents the raw actor model performance without network overhead.
With HTTP¶
| Protocol | Endpoint | Throughput |
|---|---|---|
| HTTP/2 | /rate/:key | ~350-450k req/s |
| HTTP/2 | /healthz | ~500k req/s |
| HTTP/1.1 | /rate/:key | ~80k req/s |
| HTTP/1.1 | /healthz | ~85k req/s |
Bottleneck Analysis¶
| Component | Time per Request |
|---|---|
| Rate limiter logic | ~150 ns |
| HTTP/2 overhead | ~2000 ns |
| HTTP/1.1 overhead | ~12000 ns |
| Network (typical) | ~1-10 ms |
The HTTP layer is the bottleneck, not the rate limiting logic.
Why HTTP/2 is Faster¶
| Feature | HTTP/1.1 | HTTP/2 |
|---|---|---|
| Connections | One request per connection (or keep-alive) | Multiplexed streams |
| Headers | Text, repeated | Binary, compressed (HPACK) |
| Pipelining | Not reliable | Native multiplexing |
Use HTTP/2 clients for best performance:
- Go: http.Client with an HTTP/2 transport (see the example after this list)
- CLI: h2load for benchmarking
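As a minimal sketch, here is one way to get an HTTP/2 connection from a Go client. The address and key are placeholders, and whether you need h2c (cleartext HTTP/2, shown here) or TLS depends on how your GoCC endpoint is served; over TLS, the standard http.Transport negotiates HTTP/2 automatically.

```go
package main

import (
	"crypto/tls"
	"fmt"
	"net"
	"net/http"

	"golang.org/x/net/http2"
)

func main() {
	// h2c (HTTP/2 over cleartext) transport.
	client := &http.Client{
		Transport: &http2.Transport{
			AllowHTTP: true,
			DialTLS: func(network, addr string, _ *tls.Config) (net.Conn, error) {
				return net.Dial(network, addr) // plain TCP, no TLS
			},
		},
	}

	// Placeholder address; substitute your GoCC host and key.
	resp, err := client.Get("http://localhost:8080/rate/user-1")
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()
	fmt.Println(resp.Proto, resp.Status) // expect "HTTP/2.0"
}
```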
Theoretical Limits¶
Network Bandwidth¶
Assuming 100 Mbit/s per pod:

- Request size: ~300-500 bytes
- Max theoretical: ~25-33k req/s

For example, 100 Mbit/s is ~12.5 MB/s, so ~500-byte requests cap out near 25k req/s of wire capacity. In practice, local/loopback traffic is much faster.
Actor Model¶
With pipelining/batching (not implemented):

- Theoretical: ~300 million req/s
- See snail for batched networking
GoCC uses strict request→response without batching for simplicity.
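To make the strict request→response point concrete, here is a minimal, hypothetical sketch of that pattern; the types are illustrative, not GoCC's actual implementation. Each request carries its own reply channel, and the actor answers it before dequeuing the next request, so there is never a batch in flight.

```go
package main

import "fmt"

// request carries its own reply channel: one response per request.
type request struct {
	key   string
	reply chan bool
}

// actor processes requests strictly one at a time, replying
// before it dequeues the next request (no batching).
func actor(requests <-chan request) {
	counts := map[string]int{}
	for req := range requests {
		counts[req.key]++
		req.reply <- counts[req.key] <= 100 // toy limit: 100 per key
	}
}

func main() {
	reqs := make(chan request)
	go actor(reqs)

	r := request{key: "user-1", reply: make(chan bool, 1)}
	reqs <- r
	fmt.Println("allowed:", <-r.reply)
}
```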
Optimization Decisions¶
What We Optimized¶
- HTTP/2 by default - 4-5x faster than HTTP/1.1
- Sharded managers - Parallel processing across 25 shards (see the sketch after this list)
- Direct response path - Responses bypass manager
- Lazy instance creation - No overhead for unused keys
- Auto-expiration - Memory reclaimed for idle keys
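A minimal sketch of what sharded routing can look like, assuming a simple hash-modulo scheme; the package, helper name, and hash choice are illustrative, not GoCC's actual code:

```go
package gocc // hypothetical package name

import "hash/fnv"

const numShards = 25

// shardFor routes a key to a fixed shard so that all requests for
// the same key hit the same manager, while different keys spread
// across shards and are processed in parallel.
func shardFor(key string) int {
	h := fnv.New32a()
	h.Write([]byte(key))
	return int(h.Sum32() % numShards)
}
```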
What We Didn't Optimize¶
- No batching - Simplicity over maximum throughput
- No custom protocol - HTTP compatibility
- No connection pooling - Clients handle this
- No pre-allocation - Go's GC handles it well
Scaling Strategies¶
Vertical Scaling¶
More CPU cores = more parallelism:

- Manager shards run in parallel
- Each instance is independent
- HTTP server handles concurrent requests
Horizontal Scaling¶
Multiple GoCC instances:

- Deploy as a Kubernetes StatefulSet
- Consistent hashing distributes keys (see the sketch below)
- No coordination overhead
- See Kubernetes Deployment
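A minimal consistent-hashing sketch under those assumptions: stable StatefulSet pod names form a hash ring, and each key maps to the first pod at or after its hash. This is illustrative only; GoCC's actual key distribution may differ, and production rings usually add virtual nodes for smoother balance.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"sort"
)

func hash32(s string) uint32 {
	h := fnv.New32a()
	h.Write([]byte(s))
	return h.Sum32()
}

// podFor maps a key to the first pod at or after the key's position
// on the ring, so most keys keep their pod when pods are added or removed.
func podFor(key string, pods []string) string {
	sort.Slice(pods, func(i, j int) bool { return hash32(pods[i]) < hash32(pods[j]) })
	k := hash32(key)
	for _, p := range pods {
		if hash32(p) >= k {
			return p
		}
	}
	return pods[0] // wrap around the ring
}

func main() {
	pods := []string{"gocc-0:8080", "gocc-1:8080", "gocc-2:8080"}
	fmt.Println(podFor("user-1", pods))
}
```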
Comparison with Alternatives¶
vs. In-Process Rate Limiting¶
```go
import "errors"
import "golang.org/x/time/rate"

// In-process (e.g., golang.org/x/time/rate):
// allow 100 requests/second with a burst of 10.
limiter := rate.NewLimiter(100, 10)
if !limiter.Allow() {
	return errors.New("rate limited")
}
```
| Aspect | In-Process | GoCC |
|---|---|---|
| Latency | ~10 ns | ~2 μs (HTTP/2) |
| Deployment | Per-service | Centralized |
| Configuration | Code change | Hot-reload |
| Cross-service | No | Yes |
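For comparison, a hypothetical GoCC-side check is just an HTTP call. The address and the status-to-decision mapping below are assumptions, not GoCC's documented contract:

```go
import "net/http"

// allow asks GoCC's /rate/:key endpoint for a decision. With native
// FIFO queueing, the call may block until the request is admitted
// or the client's timeout fires.
func allow(client *http.Client, key string) (bool, error) {
	resp, err := client.Get("http://localhost:8080/rate/" + key)
	if err != nil {
		return false, err
	}
	defer resp.Body.Close()
	return resp.StatusCode == http.StatusOK, nil
}
```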
vs. Redis-Based¶
| Aspect | Redis | GoCC |
|---|---|---|
| Persistence | Yes | No (in-memory) |
| Latency | ~1 ms | ~2 μs (local) |
| Queueing | Limited | Native FIFO |
| Dependencies | Redis server | None |
vs. API Gateway Rate Limiting¶
| Aspect | API Gateway | GoCC |
|---|---|---|
| Latency | Varies | Low |
| Configuration | Gateway-specific | Universal |
| Queueing | Usually no | Yes |
| Protocol | HTTP only | HTTP (extensible) |
Benchmarking Tools¶
HTTP/1.1¶
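Any HTTP/1.1 load generator works here; for example, hey (the port and endpoint are placeholders):

```bash
hey -n 100000 -c 50 http://localhost:8080/healthz
```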
HTTP/2¶
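h2load, mentioned above, speaks HTTP/2 natively; a typical run (the URL is a placeholder) looks like:

```bash
h2load -n 100000 -c 10 -m 32 http://localhost:8080/rate/user-1
```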
Internal¶
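For the internal (no-HTTP) numbers, Go's built-in benchmark runner is the natural fit; the package path is a placeholder:

```bash
go test -bench=. -benchmem ./...
```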
Performance Tips¶
For Clients¶
- Use HTTP/2 - 4-5x faster than HTTP/1.1
- Reuse connections - Avoid connection setup overhead
- Set timeouts - Don't wait forever for queued requests (see the example after this list)
- Batch where possible - Combine related operations
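A minimal sketch of the timeout advice above, using a request-scoped context; the address is a placeholder:

```go
package main

import (
	"context"
	"fmt"
	"net/http"
	"time"
)

func main() {
	// Bound how long we are willing to wait in GoCC's FIFO queue.
	ctx, cancel := context.WithTimeout(context.Background(), 2*time.Second)
	defer cancel()

	req, err := http.NewRequestWithContext(ctx, http.MethodGet,
		"http://localhost:8080/rate/user-1", nil)
	if err != nil {
		panic(err)
	}
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		fmt.Println("timed out or failed:", err)
		return
	}
	defer resp.Body.Close()
	fmt.Println(resp.Status)
}
```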
For Operators¶
- Deploy close to clients - Minimize network latency
- Size instances appropriately - More CPU = more throughput
- Monitor queue depth - High queues indicate underprovisioning
- Use appropriate window sizes - Smaller windows = faster feedback