Building a Zero-Dependency Rate Limiter in Go (Token Bucket, Leaky Bucket, Sliding Window)

Rate limiting is essential for protecting APIs from abuse, ensuring fair resource allocation, and maintaining system stability. While there are existing solutions, I wanted to build something lightweight, performant, and easy to integrate into any Go project.

Today, I'm sharing kazrl - a zero-dependency rate limiter library that implements three different algorithms and comes with ready-to-use middleware for popular Go web frameworks.

The Problem

Most rate limiting libraries either:

  • Come with heavy dependencies
  • Support only one algorithm
  • Require complex setup for per-client limiting
  • Lack middleware integration

I needed something that:

  • Has zero external dependencies
  • Supports multiple algorithms (Token Bucket, Leaky Bucket, Sliding Window)
  • Works with popular frameworks out of the box
  • Provides flexible per-client rate limiting

Installation

go get github.com/Makennsky/kazrl

That's it! No transitive dependencies to worry about.

Quick Start

Here's the simplest way to add rate limiting to your HTTP handler:

import (
    "net/http"
    "github.com/Makennsky/kazrl"
    "github.com/Makennsky/kazrl/middleware"
)

func main() {
    // Create a rate limiter: 100 requests per second, burst of 200
    limiter := kazrl.NewTokenBucket(100, 200)

    // Apply middleware
    rateLimitMiddleware := middleware.HTTP(limiter)

    handler := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
        w.Write([]byte("Hello, World!"))
    })

    http.Handle("/api/", rateLimitMiddleware(handler))
    http.ListenAndServe(":8080", nil)
}

That's only a handful of lines of code to add rate limiting!

Three Algorithms, One Interface

Different use cases need different strategies. kazrl implements three battle-tested algorithms:

1. Token Bucket

Perfect for APIs that need to allow bursts while maintaining average rate limits.

limiter := kazrl.NewTokenBucket(10, 20)
// 10 requests per second, allows bursts up to 20

Use case: Public APIs, user-facing endpoints

2. Leaky Bucket

Smooths out traffic spikes by processing requests at a constant rate.

limiter := kazrl.NewLeakyBucket(10, 20)
// Processes 10 req/s, queues up to 20

Use case: Protecting downstream services, database queries
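Conceptually, a leaky bucket is a queue that drains at a constant rate: requests join the queue, and the bucket "leaks" them out at the configured pace. The following is a rough, hypothetical sketch of that idea (my own illustration, not kazrl's internals); it assumes an import of "time":

// Hypothetical leaky-bucket sketch, not kazrl's actual code.
// The bucket drains `rate` requests per second; up to `capacity`
// requests may be queued before new ones are rejected.
type leakySketch struct {
    rate     float64   // drain rate in requests per second
    capacity float64   // maximum queued requests
    level    float64   // current queue level
    last     time.Time // last time the level was updated
}

func (l *leakySketch) allow(now time.Time) bool {
    // Drain the bucket for the time elapsed since the last check.
    l.level -= now.Sub(l.last).Seconds() * l.rate
    if l.level < 0 {
        l.level = 0
    }
    l.last = now

    // Reject if accepting this request would overflow the bucket.
    if l.level+1 > l.capacity {
        return false
    }
    l.level++
    return true
}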

3. Sliding Window

Provides the most accurate rate limiting without fixed window edge cases.

limiter := kazrl.NewSlidingWindow(10, 20)
// 10 req/s with a sliding time window

Use case: Strict rate enforcement, billing APIs
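To see why a sliding window avoids fixed-window edge cases, picture a log of recent request timestamps: only requests inside the trailing interval count, so a burst straddling a window boundary cannot double the effective rate. Here's a rough, hypothetical sketch of that semantics (my own illustration, not kazrl's zero-allocation implementation); it assumes an import of "time":

// Hypothetical sliding-window-log sketch, not kazrl's actual code.
type slidingSketch struct {
    window time.Duration // length of the sliding window, e.g. time.Second
    limit  int           // max requests allowed inside the window
    times  []time.Time   // timestamps of accepted requests
}

func (s *slidingSketch) allow(now time.Time) bool {
    cutoff := now.Add(-s.window)

    // Drop timestamps that have fallen out of the window.
    i := 0
    for i < len(s.times) && s.times[i].Before(cutoff) {
        i++
    }
    s.times = s.times[i:]

    if len(s.times) >= s.limit {
        return false
    }
    s.times = append(s.times, now)
    return true
}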

All three implement the same interface, so switching is trivial:

type RateLimiter interface {
    Allow() bool
    AllowN(n int) bool
    Wait(ctx context.Context) error
    WaitN(ctx context.Context, n int) error
    Reserve() time.Duration
    ReserveN(n int) time.Duration
}
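In practice that means your code can depend on the interface and pick the algorithm at startup, for example from a flag or config value. A small sketch using the constructors shown above:

// chooseLimiter picks an algorithm by name; all three constructors
// return a value satisfying kazrl.RateLimiter, so callers don't care which.
func chooseLimiter(algo string) kazrl.RateLimiter {
    switch algo {
    case "leaky":
        return kazrl.NewLeakyBucket(10, 20)
    case "sliding":
        return kazrl.NewSlidingWindow(10, 20)
    default:
        return kazrl.NewTokenBucket(10, 20)
    }
}

func handle(limiter kazrl.RateLimiter, w http.ResponseWriter, r *http.Request) {
    if !limiter.Allow() {
        http.Error(w, "rate limit exceeded", http.StatusTooManyRequests)
        return
    }
    w.Write([]byte("ok"))
}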

Per-Client Rate Limiting Made Easy

The real power comes with per-client limiting. Here's how to rate limit by IP address:

rateLimitMiddleware := middleware.HTTPWithKeyFunc(
    func() kazrl.RateLimiter {
        return kazrl.NewTokenBucket(10, 20) // 10 req/s per IP
    },
    middleware.KeyByIP, // Built-in IP extractor
)

http.Handle("/api/", rateLimitMiddleware(handler))

Each IP address automatically gets its own rate limiter instance. The library handles X-Forwarded-For and X-Real-IP headers correctly.
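Under the hood, per-key limiting boils down to keeping one limiter per key and creating it lazily from the factory function you pass in. The registry below is my own simplified sketch of that pattern, not kazrl's exact internals (it assumes an import of "sync"):

// Hypothetical per-key limiter registry, not kazrl's exact code.
type perKeyLimiters struct {
    mu       sync.Mutex
    limiters map[string]kazrl.RateLimiter
    factory  func() kazrl.RateLimiter // creates a fresh limiter for a new key
}

func (p *perKeyLimiters) get(key string) kazrl.RateLimiter {
    p.mu.Lock()
    defer p.mu.Unlock()
    lim, ok := p.limiters[key]
    if !ok {
        lim = p.factory()
        p.limiters[key] = lim
    }
    return lim
}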

Rate Limit by API Key

rateLimitMiddleware := middleware.HTTPWithKeyFunc(
    func() kazrl.RateLimiter {
        return kazrl.NewTokenBucket(100, 200)
    },
    middleware.KeyByAPIKey, // Extracts from Authorization header
)

Custom Key Functions

Need something more complex? Write your own key extractor:

customKeyFunc := func(r *http.Request) string {
    // Rate limit by IP + endpoint combination
    return middleware.KeyByIP(r) + ":" + r.URL.Path
}

rateLimitMiddleware := middleware.HTTPWithKeyFunc(
    func() kazrl.RateLimiter {
        return kazrl.NewTokenBucket(5, 10)
    },
    customKeyFunc,
)

Framework Integration

kazrl provides native middleware for popular frameworks:

Gin

r := gin.Default()
limiter := kazrl.NewTokenBucket(100, 200)
r.Use(middleware.Gin(limiter))

Echo

e := echo.New()
limiter := kazrl.NewTokenBucket(100, 200)
e.Use(middleware.Echo(limiter))

Fiber

app := fiber.New()
limiter := kazrl.NewTokenBucket(100, 200)
app.Use(middleware.Fiber(limiter))

Chi

r := chi.NewRouter()
limiter := kazrl.NewTokenBucket(100, 200)
r.Use(middleware.Chi(limiter))

Multi-Layer Rate Limiting

For advanced scenarios, you can stack multiple rate limiters:

// Global limit: 1000 req/s for all clients
globalLimiter := kazrl.NewTokenBucket(1000, 2000)
globalMiddleware := middleware.HTTP(globalLimiter)

// Per-IP limit: 10 req/s per client
perIPMiddleware := middleware.HTTPWithKeyFunc(
    func() kazrl.RateLimiter {
        return kazrl.NewTokenBucket(10, 20)
    },
    middleware.KeyByIP,
)

// Stack them!
handler := globalMiddleware(perIPMiddleware(yourHandler))

This protects against both individual abuse and total system overload.

Performance

Benchmarks on a modern system (Intel i7-1355U):

BenchmarkTokenBucketAllow-12        4,574,689 ops    255.6 ns/op    0 allocs/op
BenchmarkLeakyBucketAllow-12        5,218,902 ops    208.3 ns/op    0 allocs/op
BenchmarkSlidingWindowAllow-12      6,476,462 ops    198.6 ns/op    0 allocs/op

200-260 nanoseconds per operation with zero allocations. That's fast enough for the most demanding applications.
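If you want to reproduce numbers like these on your own machine, a benchmark in the standard Go style does the job. This is my own sketch, not necessarily the benchmark file shipped with kazrl; it assumes an import of "testing":

// Run with: go test -bench=. -benchmem
func BenchmarkTokenBucketAllow(b *testing.B) {
    // Use a high rate and burst so Allow rarely rejects during the benchmark.
    limiter := kazrl.NewTokenBucket(1000000, 2000000)
    b.ReportAllocs()
    b.ResetTimer()
    for i := 0; i < b.N; i++ {
        limiter.Allow()
    }
}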

Production-Ready Features

Context Support

All blocking operations support context cancellation:

ctx, cancel := context.WithTimeout(context.Background(), 1*time.Second)
defer cancel()

if err := limiter.Wait(ctx); err != nil {
    // Handle timeout or cancellation
}

Reservation API

For advanced use cases, you can reserve tokens and schedule work:

waitDuration := limiter.Reserve()
if waitDuration > 0 {
    // Schedule for later
    time.AfterFunc(waitDuration, processRequest)
} else {
    // Process immediately
    processRequest()
}

Thread-Safe

All operations are thread-safe. You can safely use the same limiter instance across multiple goroutines.
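For example, a single limiter can be shared by many goroutines with no extra locking on the caller's side. A small sketch (assumes imports of "fmt", "sync", and "sync/atomic", and Go 1.19+ for atomic.Int64):

limiter := kazrl.NewTokenBucket(100, 200)

var wg sync.WaitGroup
var allowed atomic.Int64

for i := 0; i < 8; i++ {
    wg.Add(1)
    go func() {
        defer wg.Done()
        for j := 0; j < 1000; j++ {
            if limiter.Allow() { // safe to call concurrently
                allowed.Add(1)
            }
        }
    }()
}
wg.Wait()
fmt.Println("requests allowed:", allowed.Load())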

Algorithm Comparison

Algorithm        Burst Support   Smoothing   Memory   Best For
Token Bucket     Yes             No          Low      Public APIs, burst tolerance
Leaky Bucket     Queued          Yes         Medium   Downstream protection
Sliding Window   No              No          High     Strict enforcement

Implementation Insights

Why Zero Dependencies?

Dependencies are a security and maintenance burden. By keeping kazrl dependency-free:

  • No supply chain attacks via transitive dependencies
  • Faster installation and smaller binaries
  • No version conflicts with your other dependencies
  • Easy to audit (< 2000 lines of code)

Concurrency Design

Each algorithm uses sync.Mutex for thread-safety. While this might seem simple, it's actually the right choice here:

type tokenBucket struct {
    mu         sync.Mutex
    rate       float64
    burst      int
    tokens     float64
    lastUpdate time.Time
}

Lock contention is minimal because:

  1. Operations are extremely fast (< 300ns)
  2. The critical section is tiny (just token math)
  3. Per-client limiting distributes the load

For most applications, you'll never see contention. If you're handling millions of requests per second per endpoint, you might need a distributed solution anyway.

Memory Management

The library is designed to minimize allocations:

// No allocations in the hot path
func (tb *tokenBucket) Allow() bool {
    tb.mu.Lock()
    defer tb.mu.Unlock()

    now := time.Now()
    tb.refillTokens(now) // Pure math, no allocations

    if tb.tokens >= 1.0 {
        tb.tokens -= 1.0
        return true
    }
    return false
}

The only allocations happen when creating new per-client limiters, which is infrequent.
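The refillTokens helper referenced above isn't shown in this post; the sketch below is my own guess at what that "pure math" step involves, based on the struct fields shown earlier, not kazrl's exact code:

// Hypothetical sketch: add rate*elapsed tokens, capped at the burst size.
func (tb *tokenBucket) refillTokens(now time.Time) {
    elapsed := now.Sub(tb.lastUpdate).Seconds()
    tb.tokens += elapsed * tb.rate
    if tb.tokens > float64(tb.burst) {
        tb.tokens = float64(tb.burst)
    }
    tb.lastUpdate = now
}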

Real-World Example

Here's a complete example of a production-ready API server:

package main

import (
    "encoding/json"
    "net/http"

    "github.com/Makennsky/kazrl"
    "github.com/Makennsky/kazrl/middleware"
)

func main() {
    // Global rate limit: 10,000 req/s
    globalLimiter := kazrl.NewTokenBucket(10000, 20000)
    globalMiddleware := middleware.HTTP(globalLimiter)

    // Per-IP rate limit: 100 req/s
    perIPMiddleware := middleware.HTTPWithKeyFunc(
        func() kazrl.RateLimiter {
            return kazrl.NewTokenBucket(100, 200)
        },
        middleware.KeyByIP,
    )

    // API handler
    apiHandler := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
        response := map[string]string{
            "status": "ok",
            "message": "Request processed",
        }
        json.NewEncoder(w).Encode(response)
    })

    // Stack middleware
    http.Handle("/api/", globalMiddleware(perIPMiddleware(apiHandler)))

    http.ListenAndServe(":8080", nil)
}

When to Use Each Algorithm

Token Bucket - Default choice

  • Public APIs
  • User-facing endpoints
  • Services that need burst capacity
  • Not suitable when strict rate enforcement is needed

Leaky Bucket - Traffic shaping

  • Protecting slow downstream services
  • Database query rate limiting
  • Smoothing traffic spikes
  • Not suitable when you need to allow bursts

Sliding Window - Strict enforcement

  • Billing/metered APIs
  • When accuracy is critical
  • Preventing gaming fixed windows
  • Not suitable when you need burst capacity

Future Plans

Ideas I'm considering:

  • Distributed rate limiting (Redis backend)
  • Prometheus metrics integration
  • Response header injection (X-RateLimit-*)
  • Dynamic rate adjustment based on system load
  • gRPC interceptors

What would you find useful? Let me know in the comments!

Try It Out

Give kazrl a try in your next project! It's production-ready, built on battle-tested algorithms, and takes a couple of minutes to integrate.

go get github.com/Makennsky/kazrl

If you find it useful, please star the repo on GitHub!

What rate limiting challenges have you faced? Share your experiences in the comments below!

Built in Kazakhstan

