Architecture¶
GoCC is built on the actor model, providing isolation, scalability, and simplicity.
Overview¶
graph TD
subgraph "HTTP Layer"
H[HTTP Server<br>Echo + HTTP/2]
end
subgraph "Manager Layer"
M1[Manager Shard 1]
M2[Manager Shard 2]
M3[Manager Shard ...]
M25[Manager Shard 25]
end
subgraph "Instance Layer"
I1[key-a limiter]
I2[key-b limiter]
I3[key-c limiter]
I4[key-d limiter]
end
H --> M1
H --> M2
H --> M3
H --> M25
M1 --> I1
M1 --> I2
M2 --> I3
M25 --> I4
Layers¶
1. HTTP Layer¶
- Echo framework with HTTP/2 support
- Handles request parsing and response formatting
- Routes requests to appropriate manager shard
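As a rough sketch of this layer (assuming the `/rate/:key` route implied by the message-flow diagram below, and with a stub goroutine standing in for the manager shards), the handler parses the key, hands it to a manager channel, and translates the reply into 200 or 429:

```go
package main

import (
	"net/http"

	"github.com/labstack/echo/v4"
)

// managerMessage is an illustrative envelope; the field names are
// assumptions, not GoCC's actual types.
type managerMessage struct {
	key   string
	reply chan bool // true = allowed, false = rate limited
}

func main() {
	e := echo.New()

	// A single manager channel stands in for the 25 shards here;
	// shard selection is sketched in the Sharding section below.
	manager := make(chan managerMessage, 1024)
	go func() {
		for msg := range manager {
			msg.reply <- true // stub: a real manager forwards to a limiter instance
		}
	}()

	e.POST("/rate/:key", func(c echo.Context) error {
		reply := make(chan bool, 1)
		manager <- managerMessage{key: c.Param("key"), reply: reply}
		if <-reply {
			return c.NoContent(http.StatusOK)
		}
		return c.NoContent(http.StatusTooManyRequests)
	})

	e.Logger.Fatal(e.Start(":8080"))
}
```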
2. Manager Layer¶
- 25 shards (by default) for parallelism
- Each shard is an independent goroutine
- Routes requests to rate limiter instances
- Creates new instances on demand
- Expires idle instances
Sharding Algorithm: FNV-1a hash of key → modulo 25
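A minimal sketch of that shard selection using Go's standard hash/fnv package (the `shardFor` name is illustrative, not GoCC's actual function):

```go
package main

import (
	"fmt"
	"hash/fnv"
)

const numShards = 25 // the default shard count

// shardFor maps a rate limit key to a manager shard index with FNV-1a.
func shardFor(key string) int {
	h := fnv.New32a()
	h.Write([]byte(key)) // an fnv hash's Write never returns an error
	return int(h.Sum32() % numShards)
}

func main() {
	// The same key always lands on the same shard.
	fmt.Println("user-123 ->", shardFor("user-123"))
}
```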
3. Instance Layer¶
- One goroutine per unique rate limit key
- Maintains request counter and queue
- Resets counter on window tick
- Self-expires after 3 windows of inactivity
Message Flow¶
sequenceDiagram
participant C as Client
participant H as HTTP Handler
participant M as Manager Shard
participant I as Rate Limiter Instance
C->>H: POST /rate/my-key
H->>M: RateLimitRequest{key: "my-key"}
M->>I: Forward to instance (or create)
I->>I: Check counter/queue
I->>C: Response (200 or 429)
Note: Responses go directly from the instance back to the client's waiting HTTP handler, bypassing the manager for performance.
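One way this can work is to carry a per-request reply channel inside the message, so the instance answers the waiting handler without going back through the manager. A sketch with illustrative type and field names:

```go
package main

import "fmt"

// rateLimitRequest carries a per-request reply channel end to end;
// names are illustrative, not the actual GoCC types.
type rateLimitRequest struct {
	key     string
	allowed chan bool // written exactly once by the instance
}

func main() {
	toManager := make(chan rateLimitRequest)
	toInstance := make(chan rateLimitRequest)

	// Manager: only routes, never touches the reply channel.
	go func() {
		for req := range toManager {
			toInstance <- req
		}
	}()

	// Instance: decides and replies straight to the waiting handler.
	go func() {
		for req := range toInstance {
			req.allowed <- true // or false when over the limit
		}
	}()

	// HTTP handler side: send, then block on the reply.
	req := rateLimitRequest{key: "my-key", allowed: make(chan bool, 1)}
	toManager <- req
	fmt.Println("approved:", <-req.allowed)
}
```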
Instance Lifecycle¶
stateDiagram-v2
[*] --> Created: First request for key
Created --> Active: Processing requests
Active --> Active: Requests within window
Active --> WindowReset: Timer tick
WindowReset --> Active: More requests
WindowReset --> Idle: No requests
Idle --> Active: New request
Idle --> Expired: 3 windows idle
Expired --> [*]
Creation¶
Instances are created lazily on first request:
- Request arrives for unknown key
- Manager creates new instance goroutine
- Instance initializes counter and timer
- Request is processed
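A sketch of the lazy-create path, using simplified stand-in types rather than GoCC's actual LimiterManager/LimiterInstance structures:

```go
package main

import (
	"fmt"
	"time"
)

// Minimal stand-ins for the manager-side types; names are illustrative.
type instance struct{ msgChan chan string }

type manager struct{ instances map[string]*instance }

// getOrCreate returns the limiter instance for key, spawning it lazily
// on the first request.
func (m *manager) getOrCreate(key string) *instance {
	if inst, ok := m.instances[key]; ok {
		return inst
	}
	inst := &instance{msgChan: make(chan string, 16)}
	m.instances[key] = inst
	go func() {
		// A real instance would initialize its counter and window timer here.
		for req := range inst.msgChan {
			fmt.Println(key, "processing", req)
		}
	}()
	return inst
}

func main() {
	m := &manager{instances: make(map[string]*instance)}
	m.getOrCreate("my-key").msgChan <- "first request"  // creates the goroutine
	m.getOrCreate("my-key").msgChan <- "second request" // reuses it
	time.Sleep(50 * time.Millisecond)                   // let the example output flush
}
```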
Window Reset¶
Every window_millis (default 1000ms):
- Timer fires
- Counter resets to 0
- Queued requests are released (FIFO)
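A sketch of one window tick, assuming a slice-backed FIFO queue and a limit of 2 per window (the window is shortened so the example finishes quickly; all names are illustrative):

```go
package main

import (
	"fmt"
	"time"
)

// waitingRequest holds a queued caller's reply channel.
type waitingRequest struct{ reply chan bool }

func main() {
	const limit = 2                      // requests allowed per window
	const window = 50 * time.Millisecond // shortened stand-in for window_millis

	// Three callers queued while the previous window was exhausted.
	queue := []waitingRequest{}
	for i := 0; i < 3; i++ {
		queue = append(queue, waitingRequest{reply: make(chan bool, 1)})
	}

	ticker := time.NewTicker(window)
	defer ticker.Stop()

	<-ticker.C // timer fires: a new window begins
	approved := 0
	for len(queue) > 0 && approved < limit { // release FIFO until the limit is hit
		queue[0].reply <- true
		queue = queue[1:]
		approved++
	}
	fmt.Println("released:", approved, "still queued:", len(queue)) // released: 2 still queued: 1
}
```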
Expiration¶
After 3 windows with no activity:
- Instance marks itself for expiration
- Notifies manager
- Manager removes from registry
- Goroutine exits
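A sketch of the self-expiry handshake, assuming the instance counts consecutive idle windows and reports its key on a channel the manager drains (the names and the stubbed window history are illustrative):

```go
package main

import "fmt"

const idleWindowsBeforeExpiry = 3

func main() {
	// expired carries the key of an instance asking to be removed.
	expired := make(chan string, 1)
	registry := map[string]bool{"my-key": true} // stand-in for the manager's map

	// Instance side: after each window tick, track consecutive idle
	// windows and exit once the threshold is reached.
	go func(key string) {
		idleWindows := 0
		activityPerWindow := []bool{false, false, false} // stubbed window history
		for _, sawRequests := range activityPerWindow {
			if sawRequests {
				idleWindows = 0
				continue
			}
			idleWindows++
			if idleWindows >= idleWindowsBeforeExpiry {
				expired <- key // notify the manager, then the goroutine exits
				return
			}
		}
	}("my-key")

	// Manager side: remove the expired instance from the registry.
	delete(registry, <-expired)
	fmt.Println("instances remaining:", len(registry)) // instances remaining: 0
}
```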
Data Structures¶
Manager¶
type LimiterManager struct {
    instances map[string]*LimiterInstance
    msgChan   chan ManagerMessage
    config    Config
}
Instance¶
type LimiterInstance struct {
    key                string
    config             InstanceConfig
    approvedThisWindow int
    deniedThisWindow   int
    queue              []WaitingRequest
    msgChan            chan InstanceMessage
}
Concurrency Model¶
No Shared State¶
Each actor owns its data:
- Managers own their instance maps
- Instances own their counters and queues
Message Passing¶
All communication via channels:
- HTTP handler → Manager: chan ManagerMessage
- Manager → Instance: chan InstanceMessage
- Instance → Client: Direct response channel
No Locks in Hot Path¶
The only synchronization points are channel operations:
- Bounded channels provide back-pressure
- No mutexes in request processing
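A sketch of how a bounded channel provides back-pressure: a non-blocking send that sheds load when a shard's mailbox is full. The shed-load policy shown here is an assumption, not necessarily how GoCC reacts to overflow:

```go
package main

import "fmt"

func main() {
	// A bounded mailbox: its capacity caps how much work can pile up per shard.
	mailbox := make(chan string, 2)
	mailbox <- "req-1"
	mailbox <- "req-2"

	// Non-blocking send: if the shard is saturated, fail fast instead of
	// blocking the HTTP handler (illustrative overflow policy).
	select {
	case mailbox <- "req-3":
		fmt.Println("enqueued")
	default:
		fmt.Println("mailbox full, shedding load") // e.g. reply 429 or 503
	}
}
```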
Sharding¶
Managers are sharded to reduce contention:
Key: "user-123"
↓
Hash: FNV-1a("user-123") = 0x7a3b2c1d
↓
Shard: 0x7a3b2c1d % 25 = 19
↓
Manager 19 handles this key
Benefits:
- Parallel processing across shards
- Keys with similar prefixes may hash to different shards
- Consistent routing (same key → same shard)
Memory Usage¶
Per instance:
- Fixed overhead: ~500 bytes
- Queue: ~100 bytes per waiting request
- Config: ~50 bytes
Example: 10,000 keys with avg 10 queued requests:
- Instances: 10,000 × 500 bytes = 5 MB
- Queues: 10,000 × 10 × 100 bytes = 10 MB
- Total: ~15 MB
Distributed Mode¶
See Kubernetes Deployment for multi-instance setup.
In distributed mode:
- Each GoCC instance handles a subset of keys
- Consistent hashing routes requests
- No instance-to-instance communication
- Clients can hit any instance (request forwarding)
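As an illustration of key ownership, the sketch below uses rendezvous hashing, one simple consistent-hashing scheme; GoCC's actual routing and forwarding logic may differ:

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// ownerOf picks the node responsible for a key using rendezvous
// (highest-random-weight) hashing.
func ownerOf(key string, nodes []string) string {
	var best string
	var bestScore uint32
	for _, node := range nodes {
		h := fnv.New32a()
		h.Write([]byte(node + "/" + key))
		if score := h.Sum32(); best == "" || score > bestScore {
			best, bestScore = node, score
		}
	}
	return best
}

func main() {
	nodes := []string{"gocc-0", "gocc-1", "gocc-2"}
	self := "gocc-1"

	owner := ownerOf("user-123", nodes)
	if owner == self {
		fmt.Println("handle locally")
	} else {
		fmt.Println("forward to", owner) // any node can accept, then forward
	}
}
```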