▶ Token bucket vs sliding window vs fixed window: which one should I use?
Fixed window: simplest, per-minute counters, prone to edge-case bursts at window boundary. Token bucket: smooth refill at constant rate, handles bursts gracefully, industry standard for public APIs (GitHub, Stripe use it). Sliding window: accurate but complex, higher memory cost in distributed systems. Default to token bucket for new APIs; fixed window only for simple internal services.
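The token bucket logic is compact enough to show inline. A minimal single-process sketch in Python (class name and parameters are illustrative, not from any particular library):

```python
import time

class TokenBucket:
    """Token bucket: at most `capacity` tokens, refilled at `rate` tokens/sec."""

    def __init__(self, capacity: float, rate: float):
        self.capacity = capacity        # max burst size
        self.rate = rate                # steady refill rate, tokens per second
        self.tokens = capacity          # start full
        self.last = time.monotonic()

    def allow(self, cost: float = 1.0) -> bool:
        now = time.monotonic()
        # refill proportionally to elapsed time, capped at capacity
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

A bucket with `capacity=5, rate=1` permits a burst of 5 requests, then settles to roughly one request per second as tokens refill.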
▶ How do I implement distributed rate limiting across multiple servers?
Single-server rate limiting (in-memory counters) fails immediately when you scale. Use Redis as the source of truth: each request increments a counter with TTL = window size. Lua scripts atomically check-and-increment to avoid race conditions. For geo-distributed systems (multi-region), replicate Redis across regions or accept slightly stale limits for performance.
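As a sketch, the atomic check-and-increment can be a short Lua script (assuming a client such as redis-py that exposes script evaluation); the `FakeWindowStore` below mirrors the same fixed-window semantics in memory purely for illustration, since the real state lives in Redis:

```python
import time

# Lua for Redis: INCR the per-window counter, set its TTL on the first hit,
# return 1 if under the limit and 0 if over. Runs atomically server-side.
FIXED_WINDOW_LUA = """
local current = redis.call('INCR', KEYS[1])
if current == 1 then
  redis.call('EXPIRE', KEYS[1], ARGV[1])
end
if current > tonumber(ARGV[2]) then
  return 0
end
return 1
"""

class FakeWindowStore:
    """In-memory stand-in with the same check-and-increment semantics."""

    def __init__(self):
        self.data = {}  # key -> (count, window_expires_at)

    def incr_with_limit(self, key: str, window: float, limit: int) -> bool:
        now = time.monotonic()
        count, expires = self.data.get(key, (0, 0.0))
        if now >= expires:                  # window elapsed: start fresh
            count, expires = 0, now + window
        count += 1
        self.data[key] = (count, expires)
        return count <= limit
```

With redis-py the script would be registered once via `client.register_script(FIXED_WINDOW_LUA)` and invoked per request with the counter key, the window TTL, and the limit.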
▶ What headers should I return: RateLimit-* or X-RateLimit-*?
The RateLimit-Limit / RateLimit-Remaining / RateLimit-Reset headers come from the IETF httpapi draft "RateLimit header fields for HTTP" (draft-ietf-httpapi-ratelimit-headers); RFC 6585 defines the 429 status code, not these headers. GitHub uses X-RateLimit-* (a legacy de-facto standard, still widely recognized). Return both for compatibility: RateLimit-* for modern clients, X-RateLimit-* for legacy ones. Always include a Retry-After header (seconds or HTTP-date format) with 429 responses.
▶ Per-user, per-IP, or per-API-key rate limiting: how do I choose?
Per-user (for authenticated requests): fairest, prevents account abuse. Per-IP (for public/unauthenticated endpoints): blocks abusers but punishes users behind a shared NAT (e.g. an entire office on one IP). Per-API-key: best for B2B APIs, allows tiered limits per subscription. Combine all three: the strictest limit applies. Example: 1000/day per user, 100/min per IP, 10k/day per API key.
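A "strictest limit wins" check can be sketched as below (plain in-memory counters with no window expiry, purely to show the combination logic; dimension names and limits are illustrative):

```python
def allowed(counters: dict, user: str, ip: str, api_key: str,
            limits: dict) -> bool:
    """Strictest-limit-wins: every dimension must be under its own cap."""
    keys = [("user", user), ("ip", ip), ("key", api_key)]
    # Check all dimensions before incrementing any, so a denied request
    # does not consume quota on the dimensions that still had room.
    for dim, ident in keys:
        if counters.get((dim, ident), 0) >= limits[dim]:
            return False
    for dim, ident in keys:
        counters[(dim, ident)] = counters.get((dim, ident), 0) + 1
    return True
```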
▶ How do I handle rate limiting with exponential backoff on the client side?
Check the Retry-After header (in seconds or HTTP-date). Implement exponential backoff: wait 2^attempt seconds (1, 2, 4, 8, 16…) with jitter (±20%) to avoid thundering herd. For 429 responses, obey Retry-After strictly. For other transient failures (5xx), backoff is optional. Libraries: `axios-retry` (Node.js), `tenacity` (Python), `@shopify/network` (TypeScript).
▶ Free tier vs paid tier rate limiting: how do I differentiate?
Embed the tier in the request context (user object or API key lookup). Apply tight limits to the free tier (100 req/min) and generous limits to premium (1000 req/min). Use the same algorithm for both; just parameterize the limit. Track limit exhaustion per tier for analytics/billing. Offer a burst allowance to premium tiers (smooths out spikes).
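Parameterizing the same limiter per tier might look like this (tier names and numbers are illustrative, taken from the figures above; real values would come from your billing/config system):

```python
from dataclasses import dataclass

@dataclass
class TierLimits:
    requests_per_min: int
    burst: int              # extra burst allowance on top of the steady rate

# Illustrative tiers; in practice loaded from billing/config.
TIERS = {
    "free":    TierLimits(requests_per_min=100,  burst=0),
    "premium": TierLimits(requests_per_min=1000, burst=200),
}

def limit_for(tier: str) -> int:
    """Same algorithm for every tier; only the parameters differ."""
    t = TIERS[tier]
    return t.requests_per_min + t.burst
```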
▶ Should I return 429 Too Many Requests or 503 Service Unavailable when rate limited?
Use HTTP 429 Too Many Requests (RFC 6585) when the client exceeded its rate limit. Use 503 Service Unavailable only for server overload (not the client's fault). 429 signals 'retry later' (safe to retry after the indicated delay), while 503 suggests 'the backend is struggling' (retrying may just add load). Always pair 429 with a Retry-After header.
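A small dispatch helper makes the distinction concrete (a sketch; the `reason` strings are hypothetical):

```python
def throttle_response(reason: str, retry_after: int = 30):
    """429 when the client exhausted its quota; 503 when the server itself
    is overloaded. Both may carry Retry-After, but only 429 requires it here."""
    if reason == "client_quota":
        return 429, {"Retry-After": str(retry_after)}
    if reason == "server_overload":
        return 503, {"Retry-After": str(retry_after)}   # optional for 503
    raise ValueError(f"unknown throttle reason: {reason}")
```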