Building a Production API Gateway on Cloudflare Workers with Hono


This content originally appeared on DEV Community and was authored by Young Gao

Modern APIs need rate limiting, authentication, caching, and observability — but running a dedicated gateway server adds cost and complexity. Cloudflare Workers lets you build a full-featured API gateway at the edge, with zero cold starts and global distribution.

In this guide, we'll build a production-ready API gateway using Hono (a lightweight web framework), Durable Objects (for distributed rate limiting), and Workers' built-in Cache API.

Prerequisites

  • Node.js 18+ and npm
  • A Cloudflare account (free tier works)
  • Basic familiarity with TypeScript and REST APIs

Project Setup

npm create cloudflare@latest api-gateway -- --framework=hono
cd api-gateway
npm install hono jose

Your wrangler.toml needs Durable Object bindings:

name = "api-gateway"
main = "src/index.ts"
compatibility_date = "2024-01-01"

[durable_objects]
bindings = [
  { name = "RATE_LIMITER", class_name = "RateLimiter" }
]

[[migrations]]
tag = "v1"
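# SQLite-backed classes are the kind available on the free plan;
# the storage.get/put calls used in Step 3 work with them.
new_sqlite_classes = ["RateLimiter"]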

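# Plain-text settings only. JWT_SECRET is a secret: set it with
# `npx wrangler secret put JWT_SECRET` instead of listing it here.
[vars]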
UPSTREAM_URL = "https://api.example.com"
RATE_LIMIT_RPM = "60"
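
Because RATE_LIMIT_RPM is quoted in the config, it reaches the Worker as a string and has to be parsed before use. The following is a minimal sketch of a defensive parser, not part of the original gateway: the helper name `parseRpm` and its validation rules are assumptions, and the fallback of 60 simply mirrors the default used later in the rate-limit middleware.

```typescript
// Hypothetical helper: turn the string binding RATE_LIMIT_RPM into a
// safe positive integer, falling back to a default on bad input.
const DEFAULT_RPM = 60; // same fallback the rate-limit middleware uses

function parseRpm(raw: string | undefined, fallback: number = DEFAULT_RPM): number {
  if (raw === undefined) return fallback;
  const n = Number.parseInt(raw, 10);
  // Reject NaN, zero, and negative values instead of passing them downstream.
  return Number.isInteger(n) && n > 0 ? n : fallback;
}
```

A middleware could then call `parseRpm(c.env.RATE_LIMIT_RPM)` rather than a bare `parseInt`, so a typo in the config degrades to the default instead of producing `NaN`.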

Gateway Architecture

Client -> [Auth] -> [Rate Limit] -> [Cache Check] -> [Proxy] -> [Log] -> Response

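Each step is a Hono middleware. If a step fails, it returns an error response and the remaining steps never run. Logging is registered first so that it wraps the whole chain, but it writes its entry last, after the response is ready.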
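
To make the short-circuit behaviour concrete, here is a dependency-free model of such a chain. It is a simplification and not Hono's implementation: real Hono middleware is async and continues by calling `next()`, while this sketch is synchronous and continues by returning `null`. All names (`Req`, `Res`, `Step`, `runPipeline`) are invented for illustration.

```typescript
// Simplified, synchronous model of a short-circuiting middleware chain.
type Req = { token?: string; requestsInWindow: number };
type Res = { status: number; body: string };
// A step returns a response to stop the chain, or null to continue.
type Step = (req: Req) => Res | null;

const authStep: Step = (req) =>
  req.token ? null : { status: 401, body: 'Missing or invalid Authorization header' };

const rateLimitStep: Step = (req) =>
  req.requestsInWindow < 60 ? null : { status: 429, body: 'Rate limit exceeded' };

// The final step always answers, standing in for the upstream call.
const proxyStep: Step = () => ({ status: 200, body: 'upstream response' });

function runPipeline(steps: Step[], req: Req): Res {
  for (const step of steps) {
    const res = step(req);
    if (res !== null) return res; // first response wins; later steps never run
  }
  return { status: 404, body: 'Not found' };
}

const gatewaySteps: Step[] = [authStep, rateLimitStep, proxyStep];
```

With this ordering, an unauthenticated request is rejected before it can consume any rate-limit budget, which is also why the real gateway registers auth ahead of the rate limiter.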

Step 1: The Hono Application Shell

// src/index.ts
import { Hono } from 'hono';
import { cors } from 'hono/cors';
import { RateLimiter } from './rate-limiter';
import { authMiddleware } from './middleware/auth';
import { rateLimitMiddleware } from './middleware/rate-limit';
import { cacheMiddleware } from './middleware/cache';
import { loggingMiddleware } from './middleware/logging';
import { proxyHandler } from './handlers/proxy';

type Bindings = {
  RATE_LIMITER: DurableObjectNamespace;
  UPSTREAM_URL: string;
  RATE_LIMIT_RPM: string;
  JWT_SECRET: string;
};

const app = new Hono<{ Bindings: Bindings }>();

app.use('*', cors());
app.use('*', loggingMiddleware);

app.get('/health', (c) => c.json({ status: 'ok', edge: c.req.header('cf-ray') }));

app.use('/api/*', authMiddleware);
app.use('/api/*', rateLimitMiddleware);
app.get('/api/*', cacheMiddleware);
app.all('/api/*', proxyHandler);

export default app;
export { RateLimiter };

Step 2: JWT Authentication

// src/middleware/auth.ts
import { createMiddleware } from 'hono/factory';
import * as jose from 'jose';

export const authMiddleware = createMiddleware(async (c, next) => {
  const authHeader = c.req.header('Authorization');
  if (!authHeader?.startsWith('Bearer ')) {
    return c.json({ error: 'Missing or invalid Authorization header' }, 401);
  }

  const token = authHeader.slice(7);
  try {
    const secret = new TextEncoder().encode(c.env.JWT_SECRET);
    const { payload } = await jose.jwtVerify(token, secret, {
      algorithms: ['HS256'],
    });
    c.set('userId', payload.sub as string);
    c.set('scopes', (payload.scopes as string[]) || []);
    await next();
  } catch (err) {
    if (err instanceof jose.errors.JWTExpired) {
      return c.json({ error: 'Token expired' }, 401);
    }
    return c.json({ error: 'Invalid token' }, 401);
  }
});

Why jose over jsonwebtoken? jose uses the Web Crypto API natively -- perfect for edge runtimes without Node.js polyfills.
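
The auth middleware stores `scopes` on the context, but nothing else in this guide reads them. As a sketch of how a route might enforce them, the helper below checks that every required scope was granted. The helper name `hasScopes` and the scope string in the usage note are assumptions, not part of the original code.

```typescript
// Hypothetical helper: true when every required scope appears in the
// scopes granted by the token.
function hasScopes(granted: readonly string[], required: readonly string[]): boolean {
  const grantedSet = new Set(granted);
  return required.every((scope) => grantedSet.has(scope));
}
```

Inside a handler this could look like `if (!hasScopes(c.get('scopes'), ['read:orders'])) return c.json({ error: 'Insufficient scope' }, 403);`, where `read:orders` is a made-up scope name.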

Step 3: Distributed Rate Limiting with Durable Objects

// src/rate-limiter.ts
export class RateLimiter {
  private state: DurableObjectState;
  private requests: number[] = [];

  constructor(state: DurableObjectState) {
    this.state = state;
  }

  async fetch(request: Request): Promise<Response> {
    const url = new URL(request.url);
    const limit = parseInt(url.searchParams.get('limit') || '60', 10);
    const windowMs = parseInt(url.searchParams.get('window') || '60000', 10);
    const now = Date.now();

    // Reload persisted timestamps in case the object was evicted from memory.
    const stored = await this.state.storage.get<number[]>('requests');
    if (stored) this.requests = stored;

    // Sliding window: drop timestamps older than windowMs.
    this.requests = this.requests.filter((ts) => now - ts < windowMs);

    if (this.requests.length >= limit) {
      const oldestInWindow = Math.min(...this.requests);
      const retryAfter = Math.ceil((oldestInWindow + windowMs - now) / 1000);
      return new Response(
        JSON.stringify({ error: 'Rate limit exceeded', retryAfter, limit, remaining: 0 }),
        {
          status: 429,
          headers: {
            'Content-Type': 'application/json',
            'Retry-After': retryAfter.toString(),
            'X-RateLimit-Limit': limit.toString(),
            'X-RateLimit-Remaining': '0',
          },
        }
      );
    }

    this.requests.push(now);
    await this.state.storage.put('requests', this.requests);

    const remaining = limit - this.requests.length;
    return new Response(
      JSON.stringify({ allowed: true, remaining, limit }),
      {
        headers: {
          'X-RateLimit-Limit': limit.toString(),
          'X-RateLimit-Remaining': remaining.toString(),
        },
      }
    );
  }
}

The middleware:

// src/middleware/rate-limit.ts
import { createMiddleware } from 'hono/factory';

export const rateLimitMiddleware = createMiddleware(async (c, next) => {
  const userId = c.get('userId') || c.req.header('cf-connecting-ip') || 'anonymous';
  const id = c.env.RATE_LIMITER.idFromName(userId);
  const limiter = c.env.RATE_LIMITER.get(id);

  const limit = parseInt(c.env.RATE_LIMIT_RPM || '60');
  const resp = await limiter.fetch(
    new Request(`https://limiter/?limit=${limit}&window=60000`)
  );

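  const result = await resp.json<{
    allowed?: boolean; remaining: number; limit: number; retryAfter?: number;
  }>();

  c.header('X-RateLimit-Limit', result.limit.toString());
  c.header('X-RateLimit-Remaining', result.remaining.toString());

  // The limiter's 429 body has no `allowed` field, so a missing value means "blocked".
  if (!result.allowed) {
    // Forward the limiter's Retry-After so clients know when to try again.
    if (result.retryAfter !== undefined) c.header('Retry-After', result.retryAfter.toString());
    return c.json({ error: 'Rate limit exceeded', retryAfter: result.retryAfter }, 429);
  }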
  await next();
});

Each user gets their own Durable Object instance -- rate limits are per-user and globally consistent across all edge locations.
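
The sliding-window logic inside the Durable Object is easier to follow as a pure function with concrete numbers. This sketch restates that logic without storage or HTTP; the function name `slidingWindow` and the `Decision` type are assumptions introduced for illustration.

```typescript
type Decision = {
  allowed: boolean;
  remaining: number;
  retryAfter: number; // seconds until the oldest request leaves the window
  timestamps: number[]; // state to persist for the next call
};

function slidingWindow(
  timestamps: number[],
  now: number,
  limit: number,
  windowMs: number,
): Decision {
  // Keep only requests that are still inside the window.
  const live = timestamps.filter((ts) => now - ts < windowMs);
  if (live.length >= limit) {
    const oldest = Math.min(...live);
    const retryAfter = Math.ceil((oldest + windowMs - now) / 1000);
    return { allowed: false, remaining: 0, retryAfter, timestamps: live };
  }
  const updated = [...live, now];
  return { allowed: true, remaining: limit - updated.length, retryAfter: 0, timestamps: updated };
}

// Worked example: limit of 3 per 60 s, with requests at t = 0 s, 10 s, 20 s.
const fullWindow = [0, 10000, 20000];
// At t = 30 s all three are live, so the request is blocked for 30 more seconds.
const blockedAt30s = slidingWindow(fullWindow, 30000, 3, 60000);
// At t = 61 s the t = 0 entry has aged out, so the request is allowed.
const allowedAt61s = slidingWindow(fullWindow, 61000, 3, 60000);
```

Unlike a fixed window that resets on the minute, this approach never lets a client send a double burst across a boundary, at the cost of storing one timestamp per request.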

Step 4: Response Caching

// src/middleware/cache.ts
import { createMiddleware } from 'hono/factory';

export const cacheMiddleware = createMiddleware(async (c, next) => {
  if (c.req.method !== 'GET') { await next(); return; }

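  const cache = caches.default;
  // The key is the URL alone, so entries are shared by every caller.
  // Only cache routes whose responses do not depend on the user.
  const cacheKey = new Request(c.req.url, { method: 'GET' });

  const cached = await cache.match(cacheKey);
  if (cached) {
    const headers = Object.fromEntries(cached.headers.entries());
    // Header names come back lowercased; overwrite any stored X-Cache value.
    headers['x-cache'] = 'HIT';
    // Stream the cached body; decoding it as text would corrupt binary payloads.
    return c.body(cached.body, 200, headers);
  }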

  c.header('X-Cache', 'MISS');
  await next();

  if (c.res.status === 200) {
    const response = c.res.clone();
    const cacheResponse = new Response(response.body, {
      headers: {
        ...Object.fromEntries(response.headers.entries()),
        'Cache-Control': 'public, max-age=60',
      },
    });
    c.executionCtx.waitUntil(cache.put(cacheKey, cacheResponse));
  }
});
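
The cache key above is the raw request URL, which means that `?a=1&b=2` and `?b=2&a=1` are cached separately and that every user shares the same entries. The sketch below shows one way to build a canonical key; the helper name `cacheKeyUrl` and the `__user` parameter are assumptions, and whether per-user keys are needed depends on the upstream API.

```typescript
// Hypothetical helper: build a canonical URL to use as the cache key.
function cacheKeyUrl(rawUrl: string, userId?: string): string {
  const url = new URL(rawUrl);
  // Sort query parameters so equivalent URLs share one cache entry.
  url.searchParams.sort();
  // Fragments are never sent to the server; drop them defensively.
  url.hash = '';
  if (userId !== undefined) {
    // Assumed parameter name; pick one the upstream API does not use.
    url.searchParams.set('__user', userId);
  }
  return url.toString();
}
```

In the middleware the key would then become `new Request(cacheKeyUrl(c.req.url, c.get('userId')), { method: 'GET' })`. Per-user keys lower the hit rate, so they are only worth it for responses that actually differ between users.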

Step 5: Request Proxying

// src/handlers/proxy.ts
import { createMiddleware } from 'hono/factory';

export const proxyHandler = createMiddleware(async (c) => {
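  // Start from the upstream base and set the path explicitly. Resolving the path with
  // new URL(path, base) would read "//host/..." as a protocol-relative URL and change the host.
  // Setting pathname replaces any path already present in UPSTREAM_URL.
  const upstreamUrl = new URL(c.env.UPSTREAM_URL);
  upstreamUrl.pathname = c.req.path.replace(/^\/api(?=\/|$)/, '') || '/';

  // Copy the raw query string so repeated keys (?tag=a&tag=b) are preserved.
  upstreamUrl.search = new URL(c.req.url).search;

  const headers = new Headers(c.req.raw.headers);
  headers.delete('Authorization'); // the upstream never sees the client's JWT
  headers.set('X-Forwarded-For', c.req.header('cf-connecting-ip') || '');
  // Reuse the request ID from the logging middleware when it has set one.
  headers.set('X-Request-ID', c.get('requestId') ?? crypto.randomUUID());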

  const upstreamReq = new Request(upstreamUrl.toString(), {
    method: c.req.method,
    headers,
    body: ['GET', 'HEAD'].includes(c.req.method) ? null : c.req.raw.body,
  });

  const startTime = Date.now();
  const response = await fetch(upstreamReq);
  const duration = Date.now() - startTime;

  const responseHeaders = new Headers(response.headers);
  responseHeaders.set('X-Gateway-Duration', `${duration}ms`);
  responseHeaders.set('X-Request-ID', headers.get('X-Request-ID')!);

  return new Response(response.body, {
    status: response.status,
    headers: responseHeaders,
  });
});
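
The URL mapping is the part of the proxy most likely to hide surprises, so it helps to isolate it in a pure function and try a few inputs. This is a sketch under assumptions: the helper name `buildUpstreamUrl` is invented, and it sets `pathname` on a copy of the base URL rather than resolving the path against it, which keeps a path such as `//evil.example/x` from being treated as a protocol-relative URL.

```typescript
// Hypothetical helper: map a gateway URL onto the upstream origin.
function buildUpstreamUrl(requestUrl: string, upstreamBase: string): string {
  const incoming = new URL(requestUrl);
  // Strip only a leading /api segment, never a later "/api" inside the path.
  const path = incoming.pathname.replace(/^\/api(?=\/|$)/, '') || '/';
  const upstream = new URL(upstreamBase);
  // Setting pathname keeps the upstream host and replaces any base path.
  upstream.pathname = path;
  // Copy the raw query string so repeated keys are preserved.
  upstream.search = incoming.search;
  return upstream.toString();
}

// Repeated query keys survive, and the /api prefix is removed.
const mapped = buildUpstreamUrl(
  'https://gw.example.com/api/users/42?tag=a&tag=b',
  'https://api.example.com',
);
```

Note that a base such as `https://api.example.com/v1` loses its `/v1` segment under this scheme; if the upstream lives under a path prefix, that prefix has to be prepended to the path explicitly.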

Step 6: Structured Logging

// src/middleware/logging.ts
import { createMiddleware } from 'hono/factory';

export const loggingMiddleware = createMiddleware(async (c, next) => {
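  const requestId = crypto.randomUUID();
  const startTime = Date.now();
  // Share the ID with later handlers so logs and upstream calls can be correlated.
  c.set('requestId', requestId);
  c.header('X-Request-ID', requestId);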

  await next();

  const logEntry = {
    timestamp: new Date().toISOString(),
    requestId,
    method: c.req.method,
    path: c.req.path,
    status: c.res.status,
    duration: Date.now() - startTime,
    ip: c.req.header('cf-connecting-ip'),
    userAgent: c.req.header('user-agent'),
    country: c.req.header('cf-ipcountry'),
    userId: c.get('userId') || null,
    cacheStatus: c.res.headers.get('X-Cache') || 'N/A',
  };

  console.log(JSON.stringify(logEntry));
});

Performance

Component                      Overhead
-----------------------------  -----------------------
JWT verification               ~1-2ms
Rate limit (Durable Object)    ~5-15ms
Cache hit                      ~1ms
Cache miss + proxy             Upstream latency + ~2ms
Logging                        <1ms

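Because rate limiting runs before the cache check, a cache hit still pays for JWT verification and the Durable Object round trip: roughly 7-18ms in total by the estimates above. A cache miss costs roughly 8-19ms on top of the upstream latency.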
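
A quick way to sanity-check totals like these is to add the per-step estimates along each path, following the route order from Step 1 (auth, then rate limit, then cache, then proxy). The sketch below does that arithmetic; the numbers are the estimates from the table, not measurements, and the names `MsRange` and `totalMs` are invented.

```typescript
// Per-step estimates taken from the table above, in milliseconds.
type MsRange = { min: number; max: number };

const jwtMs: MsRange = { min: 1, max: 2 };
const rateLimitMs: MsRange = { min: 5, max: 15 };
const cacheHitMs: MsRange = { min: 1, max: 1 };
const proxyMs: MsRange = { min: 2, max: 2 }; // excludes upstream latency

// Steps run one after another, so their lower and upper bounds add up.
function totalMs(...steps: MsRange[]): MsRange {
  return steps.reduce(
    (acc, s) => ({ min: acc.min + s.min, max: acc.max + s.max }),
    { min: 0, max: 0 },
  );
}

const hitPath = totalMs(jwtMs, rateLimitMs, cacheHitMs); // { min: 7, max: 18 }
const missPath = totalMs(jwtMs, rateLimitMs, proxyMs); // { min: 8, max: 19 }
```

The Durable Object round trip dominates both paths, so the rate limiter is the first place to look if the gateway's added latency matters.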

Production Hardening

app.onError((err, c) => {
  console.error(JSON.stringify({
    error: err.message,
    stack: err.stack,
    path: c.req.path,
  }));
  return c.json({ error: 'Internal gateway error' }, 500);
});

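// Request size limit. Register this before the /api/* proxy route so it runs first.
// Content-Length can be absent (chunked uploads), in which case this check passes.
app.use('/api/*', async (c, next) => {
  const contentLength = parseInt(c.req.header('content-length') || '0', 10);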
  if (contentLength > 10 * 1024 * 1024) {
    return c.json({ error: 'Request too large' }, 413);
  }
  await next();
});
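
The size check above treats a missing or malformed Content-Length as 0, so such requests pass silently. A sketch of a stricter classification follows, which lets the caller decide what to do with requests of unknown size. The helper name `checkContentLength` and the three result labels are assumptions.

```typescript
const MAX_BODY_BYTES = 10 * 1024 * 1024; // 10 MiB, the same limit as above

type SizeCheck = 'ok' | 'too-large' | 'unknown';

// Hypothetical helper: classify a request by its Content-Length header.
function checkContentLength(
  header: string | undefined,
  maxBytes: number = MAX_BODY_BYTES,
): SizeCheck {
  if (header === undefined) return 'unknown'; // e.g. chunked uploads
  const value = header.trim();
  if (!/^\d+$/.test(value)) return 'unknown'; // malformed value
  return Number(value) > maxBytes ? 'too-large' : 'ok';
}
```

A gateway that only proxies small JSON payloads could reject `'unknown'` for POST and PUT requests, while one that accepts streamed uploads would need to count bytes as they arrive instead.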

Conclusion

Under 300 lines of TypeScript gives you authentication, distributed rate limiting, caching, and structured logging at the edge. Key advantages:

  • Zero cold starts and global distribution across 300+ cities
  • Pay per request ($0.50/million on paid plan), with Durable Object usage billed separately
  • Strongly consistent rate limiting via Durable Objects

Next steps: API key management (KV), request transformation, A/B routing, WebSocket proxying.


Young Gao | Sciencx (2026-03-21T02:30:09+00:00) Building a Production API Gateway on Cloudflare Workers with Hono. Retrieved from https://www.scien.cx/2026/03/21/building-a-production-api-gateway-on-cloudflare-workers-with-hono/
