▶Layer 4 vs Layer 7 load balancing — which should I use?
L4 (transport: TCP/UDP) is fast with minimal per-packet overhead and handles any protocol, but has no request awareness — good for raw throughput, VoIP, gaming. L7 (application: HTTP/HTTPS) inspects headers/cookies, enabling path-based routing, cookie stickiness, and compression — best for web apps. Hybrid: L4 in front for DDoS absorption/capacity, then L7 for app logic. Latency: L7 typically adds ~5-15ms vs L4's <1ms.
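A minimal sketch of the L7 difference using Go's standard reverse proxy — it inspects the request path and picks a backend per request, which an L4 balancer (forwarding raw TCP bytes) never could. Backend addresses are placeholders:

```go
package main

import (
	"net/http"
	"net/http/httputil"
	"net/url"
)

func main() {
	// Hypothetical backends — an L7 balancer routes on the request path,
	// which never reaches an L4 balancer's decision logic.
	api, _ := url.Parse("http://10.0.0.10:8080")
	static, _ := url.Parse("http://10.0.0.20:8080")

	mux := http.NewServeMux()
	mux.Handle("/api/", httputil.NewSingleHostReverseProxy(api))
	mux.Handle("/", httputil.NewSingleHostReverseProxy(static))

	// An L4 balancer would forward the whole connection to one backend;
	// here each HTTP request is inspected and routed individually.
	http.ListenAndServe(":80", mux)
}
```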
▶When do I need sticky sessions and what's the tradeoff?
Sticky sessions bind a user to one backend server — needed for in-memory session state (PHP $_SESSION, shopping carts without Redis). Cost: reduces load-balancing efficiency and complicates scale-in (draining a sticky server strands its sessions). Better: externalize sessions to Redis/Memcached (stateless backends), use JWT tokens, or store them in the database. If sticky is required: use IP hash or cookie-based affinity, set a TTL, and monitor for uneven load.
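If sticky is unavoidable, cookie-based affinity can look roughly like this Go sketch — cookie name, TTL, and backend list are illustrative, not a standard:

```go
package main

import (
	"hash/fnv"
	"net/http"
	"net/http/httputil"
	"net/url"
	"strconv"
)

var backends = mustParse("http://10.0.0.10:8080", "http://10.0.0.11:8080")

func mustParse(raw ...string) []*url.URL {
	out := make([]*url.URL, len(raw))
	for i, r := range raw {
		u, err := url.Parse(r)
		if err != nil {
			panic(err)
		}
		out[i] = u
	}
	return out
}

func pickBackend(w http.ResponseWriter, r *http.Request) *url.URL {
	// Reuse the backend pinned in the affinity cookie, if present and valid.
	if c, err := r.Cookie("lb_affinity"); err == nil {
		if i, err := strconv.Atoi(c.Value); err == nil && i >= 0 && i < len(backends) {
			return backends[i]
		}
	}
	// First visit: hash the client address for a deterministic pick,
	// then pin it with a TTL so load can eventually rebalance.
	h := fnv.New32a()
	h.Write([]byte(r.RemoteAddr))
	i := int(h.Sum32() % uint32(len(backends)))
	http.SetCookie(w, &http.Cookie{Name: "lb_affinity", Value: strconv.Itoa(i), MaxAge: 3600})
	return backends[i]
}

func main() {
	// A real balancer would reuse proxies; one-per-request keeps the sketch short.
	http.ListenAndServe(":80", http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		httputil.NewSingleHostReverseProxy(pickBackend(w, r)).ServeHTTP(w, r)
	}))
}
```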
▶What autoscaling pitfalls should I avoid?
Metrics lag: CPU is averaged over ~5 min, so burst traffic sees a ~5-min delay before scaling kicks in. Too aggressive = flapping/thrashing; too conservative = users see latency spikes and errors before capacity arrives. Solutions: use multiple metrics (CPU + request queue depth + network), make scale-up faster than scale-down (sketched below), and use predictive scaling (forecast by time of day). Always test with a load generator; don't trust defaults.
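A Go sketch of the asymmetric-threshold idea — scale up on any hot signal immediately, scale down only after a long quiet cooldown. The thresholds, cooldown, and queue-depth signal are illustrative assumptions:

```go
package main

import (
	"fmt"
	"time"
)

// Scaler scales up aggressively and down conservatively to avoid flapping.
type Scaler struct {
	replicas      int
	lastScaleDown time.Time
}

// Decide combines two signals — CPU alone lags bursty traffic.
func (s *Scaler) Decide(cpu float64, queueDepth int) {
	switch {
	case cpu > 0.70 || queueDepth > 100:
		// Scale up immediately, no cooldown.
		s.replicas++
		fmt.Printf("scale up -> %d replicas\n", s.replicas)
	case cpu < 0.30 && queueDepth == 0 && time.Since(s.lastScaleDown) > 10*time.Minute:
		// Scale down only after a long quiet period.
		if s.replicas > 1 {
			s.replicas--
			s.lastScaleDown = time.Now()
			fmt.Printf("scale down -> %d replicas\n", s.replicas)
		}
	}
}

func main() {
	s := &Scaler{replicas: 2}
	s.Decide(0.85, 40) // burst: scales up right away
	s.Decide(0.20, 0)  // quiet: waits out the cooldown before shrinking
}
```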
▶What load balancing algorithms exist and when do I use each?
Round-robin: simple, even distribution (good when servers are identical). Least-connections: routes to the server with the fewest active connections (good for long-lived connections). Weighted: routes proportionally (good when servers differ in size: 70%→new, 30%→old during a canary). IP hash: deterministic per client (good for session stickiness without cookies). Random: simplest to implement; its power-of-two-choices variant (pick two at random, route to the less loaded) is provably near-optimal at scale and common in large CDN/proxy fleets. Latency-aware: actively measures each backend's response time (rarer, heavier to operate).
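Two of these in miniature, assuming a balancer that tracks per-backend connection counts (addresses are placeholders):

```go
package main

import (
	"fmt"
	"math/rand"
	"sync/atomic"
)

type Backend struct {
	Addr        string
	ActiveConns int64 // incremented/decremented by the connection handler
}

var backends = []*Backend{
	{Addr: "10.0.0.10"}, {Addr: "10.0.0.11"}, {Addr: "10.0.0.12"},
}

var rrCounter uint64

// Round-robin: rotate through the backends in order.
func roundRobin() *Backend {
	n := atomic.AddUint64(&rrCounter, 1)
	return backends[n%uint64(len(backends))]
}

// Power-of-two-choices: sample two backends at random,
// route to whichever has fewer active connections.
func powerOfTwo() *Backend {
	a := backends[rand.Intn(len(backends))]
	b := backends[rand.Intn(len(backends))]
	if atomic.LoadInt64(&a.ActiveConns) <= atomic.LoadInt64(&b.ActiveConns) {
		return a
	}
	return b
}

func main() {
	for i := 0; i < 3; i++ {
		fmt.Println("rr :", roundRobin().Addr)
	}
	fmt.Println("p2c:", powerOfTwo().Addr)
}
```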
▶DNS-based load balancing vs anycast — what's the difference?
DNS LB: client asks DNS, gets an A record pointing to one LB instance, then connects through it. Slow to react: an uncached lookup costs ~100ms, and clients cache answers, so failover waits on TTL expiry. Anycast: multiple sites advertise the same IP and the network routes each packet to the nearest one — no extra lookup step, failover happens automatically via route withdrawal, but it requires BGP + complex operations. Hybrid: use DNS for geo-routing (client → nearest region), then anycast within the region.
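The client's side of DNS LB, sketched in Go — once the resolver's answer is cached, the balancer has no say until the TTL expires (hostname is a placeholder):

```go
package main

import (
	"fmt"
	"math/rand"
	"net"
)

func main() {
	// A DNS-balanced name resolves to several A records; which one the
	// client uses, and for how long (per TTL), is out of the balancer's
	// hands once the answer is cached.
	ips, err := net.LookupIP("lb.example.com") // placeholder hostname
	if err != nil {
		panic(err)
	}
	target := ips[rand.Intn(len(ips))]
	fmt.Println("connecting to", target) // failover waits on TTL expiry
}
```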
▶How do I implement health checks without false positives?
Simple ping (ICMP) = unreliable: it proves the host is up, not the app. Better: TCP connect (proves the port is open), or HTTP GET + 200 status (proves the app responds). Advanced: synthetic transactions (hit a /health endpoint that verifies database connectivity). Pitfall: interval too short = probe traffic overhead; too long = failures sit undetected for interval × threshold (easily 30-60 sec). Rule of thumb: 3-5 sec interval, 3 consecutive failures to mark unhealthy, 1 success to mark healthy. Avoid health checks that trigger expensive operations (e.g. a full DB scan).
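The 3-failures-down / 1-success-up rule from above, sketched in Go (backend URL and exact timings are illustrative):

```go
package main

import (
	"fmt"
	"net/http"
	"time"
)

type Checker struct {
	url      string
	failures int
	healthy  bool
}

func (c *Checker) probe(client *http.Client) {
	resp, err := client.Get(c.url) // cheap GET, not an expensive synthetic op
	ok := err == nil && resp.StatusCode == http.StatusOK
	if resp != nil {
		resp.Body.Close()
	}
	if ok {
		c.failures = 0
		c.healthy = true // 1 success to mark healthy
		return
	}
	c.failures++
	if c.failures >= 3 { // 3 consecutive failures to mark unhealthy
		c.healthy = false
	}
}

func main() {
	c := &Checker{url: "http://10.0.0.10:8080/health", healthy: true}
	client := &http.Client{Timeout: 2 * time.Second} // timeout shorter than the interval
	for range time.Tick(4 * time.Second) {           // 3-5 sec interval
		c.probe(client)
		fmt.Println("healthy:", c.healthy, "failures:", c.failures)
	}
}
```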
▶How do blue-green deployments interact with load balancing?
Blue (live) + Green (new) environments run in parallel. The LB switches traffic either 100% at once (instant, easy rollback) or gradually (canary: 5% → 50% → 100%). Requires: health checks that catch a bad green before full cutover, plus the ability to update DNS/LB routing instantly. Gotcha: if sessions are sticky to blue servers, green servers sit idle. Solution: an external session store, or a canary that slowly drains blue, as sketched below.
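The gradual cutover is essentially a weight dial; a Go sketch where pool names and percentages follow the canary steps above:

```go
package main

import (
	"fmt"
	"math/rand"
)

// greenPercent is the dial moved during the canary: 5 -> 50 -> 100.
var greenPercent = 5

func pickPool() string {
	if rand.Intn(100) < greenPercent {
		return "green" // new version
	}
	return "blue" // current live version
}

func main() {
	counts := map[string]int{}
	for i := 0; i < 10000; i++ {
		counts[pickPool()]++
	}
	fmt.Println(counts) // roughly 95% blue, 5% green at this stage
	// Rollback is just greenPercent = 0; full cutover is 100.
}
```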