
Event-Driven Architecture

Tier 2 · Tech
Salary impact: High
Time to learn: 8 months
Difficulty: Hard
Careers: 7
TL;DR

Event-driven architecture decouples services by replacing direct calls with events: publishers emit events to a broker (Kafka, RabbitMQ, AWS EventBridge) and subscribers react asynchronously. Core patterns: pub/sub, event sourcing, CQRS, sagas. Career path: Practitioner (queues + retries, $110-140k) → Architect (sourcing + CQRS + idempotency, $140-180k) → Expert (mesh + stream processing + observability, $180-240k). Learning time: 6-8 months. Large Kafka deployments process trillions of events per day. Essential at scale: Netflix, Airbnb, and Uber are all event-driven at their core.

What is Event-Driven Architecture

Asynchronous, decoupled systems that communicate through events. Building blocks: message queues, event sourcing, CQRS. A common communication pattern for modern microservices; it powers the Uber, Netflix, and Airbnb backends. Learning curve: Medium-Hard (async mindset + patterns).

🔧 TOOLS & ECOSYSTEM
Apache Kafka, RabbitMQ, NATS, AWS EventBridge, AWS Kinesis, Amazon SNS, Amazon SQS, Apache Pulsar, Redpanda, Debezium, EventStore, Kafka Streams, Schema Registry, Flink

💰 Salary by region

Region   Junior   Mid      Senior
USA      $110k    $155k    $210k
UK       £65k     £90k     £135k
EU       €70k     €95k     €145k
Canada   C$115k   C$160k   C$220k

❓ FAQ

Event-driven vs request/response (REST) — when do I use which?
REST: synchronous, blocking, point-to-point. Client waits for response. Works for simple flows: fetch user, update profile. Event-driven: asynchronous, non-blocking, decoupled. Service A emits 'UserCreated', services B/C/D react independently. Use REST for queries/reads; events for side effects (emails, analytics, notifications). Hybrid: REST API + event backbone for workflows. Airbnb uses REST for searches, events for booking cascades.
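To make the 'UserCreated' fan-out concrete, here is a minimal in-memory sketch. The `EventBus` class, `create_user`, and the two handlers are hypothetical names invented for illustration. Delivery here is a plain in-process loop; a real broker such as Kafka or RabbitMQ would deliver asynchronously and across processes.

```python
from collections import defaultdict
from typing import Callable


class EventBus:
    """Toy in-memory stand-in for a broker (hypothetical, for illustration)."""

    def __init__(self) -> None:
        self._subscribers: dict[str, list[Callable[[dict], None]]] = defaultdict(list)

    def subscribe(self, event_type: str, handler: Callable[[dict], None]) -> None:
        self._subscribers[event_type].append(handler)

    def publish(self, event_type: str, payload: dict) -> None:
        # The publisher never names its consumers; it only names the event.
        for handler in self._subscribers[event_type]:
            handler(payload)


sent_emails: list[str] = []
analytics_rows: list[dict] = []

bus = EventBus()
# Two independent side effects react to the same event.
bus.subscribe("UserCreated", lambda e: sent_emails.append(e["email"]))
bus.subscribe("UserCreated", lambda e: analytics_rows.append({"signup": e["name"]}))


def create_user(name: str, email: str) -> dict:
    """Write path: store the user (omitted), then announce what happened."""
    user = {"name": name, "email": email}
    bus.publish("UserCreated", user)
    return user


create_user("Ada", "ada@example.com")
```

Adding a third reaction (say, a notification) means one more `subscribe` call; `create_user` stays unchanged, which is the decoupling the answer describes.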
What is CQRS and event sourcing — are they the same?
No. Event sourcing: store every change as an immutable event. Current state = the result of replaying all events. Enables an audit trail, time travel (rebuild state as of any past point), and replay. CQRS (Command Query Responsibility Segregation): separate the write model (commands that emit events) from the read model (projections optimized for queries). They are usually combined: commands emit events → the event store persists them → read models subscribe and update. CQRS alone (without sourcing) = just split reads and writes. Sourcing + CQRS = full audit trail + read performance.
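The relationship between the two can be sketched in a few lines. This is a toy under stated assumptions: the account domain, `Event`, `handle_command`, `project`, and `replay` are all hypothetical names, the event store is a Python list, and the projection runs inline rather than through a subscription.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    kind: str    # "Deposited" or "Withdrawn" in this sketch
    amount: int


event_store: list[Event] = []    # append-only log: the source of truth
balance_view = {"balance": 0}    # read model (CQRS query side)


def project(event: Event) -> None:
    """Update the read model from a single event."""
    if event.kind == "Deposited":
        balance_view["balance"] += event.amount
    elif event.kind == "Withdrawn":
        balance_view["balance"] -= event.amount


def handle_command(kind: str, amount: int) -> None:
    """Write side: a command becomes an immutable event, then is projected."""
    event = Event(kind, amount)
    event_store.append(event)
    project(event)


def replay(events: list[Event]) -> int:
    """Event sourcing: rebuild state purely from the log."""
    balance = 0
    for e in events:
        balance += e.amount if e.kind == "Deposited" else -e.amount
    return balance


handle_command("Deposited", 100)
handle_command("Withdrawn", 30)
handle_command("Deposited", 5)
```

Queries read `balance_view` without touching the log, while `replay(event_store[:2])` reconstructs the balance as it was after the second event, which is the "time travel" property.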
How do I guarantee exactly-once delivery in event systems?
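Exactly-once is hard; in practice, target at-least-once delivery + idempotent handlers. Three layers: (1) Broker: Kafka with acks=all on the producer + min.insync.replicas=2 on the topic prevents loss of acknowledged writes. (2) Consumer: track offsets and commit only after processing, so a crash causes a redelivery rather than a lost event. (3) Idempotency: handlers must tolerate duplicates (store an idempotency key in the DB, check before insert). Kafka transactions give exactly-once semantics for read-process-write flows inside Kafka, but side effects in external systems still need idempotency. For distributed transactions, use the saga pattern with compensations. Exactly-once is expensive; Netflix/Uber accept at-least-once + idempotent ops.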
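Layer (3) is the part most teams write themselves. Below is a minimal sketch of an idempotent handler using SQLite from the standard library; the table names, the `event_id` field, and `handle_payment` are assumptions made for illustration. The key idea is that the dedupe marker and the side effect commit in one transaction, so a redelivered event is rejected by the primary key.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE processed (event_id TEXT PRIMARY KEY)")
db.execute("CREATE TABLE payments (order_id TEXT, amount INTEGER)")


def handle_payment(event: dict) -> bool:
    """Apply the event once. Returns False when it was a duplicate."""
    try:
        with db:  # one transaction: commits on success, rolls back on error
            db.execute(
                "INSERT INTO processed (event_id) VALUES (?)",
                (event["event_id"],),
            )
            db.execute(
                "INSERT INTO payments (order_id, amount) VALUES (?, ?)",
                (event["order_id"], event["amount"]),
            )
        return True
    except sqlite3.IntegrityError:
        # Primary-key conflict: this event_id was already processed.
        return False


first = {"event_id": "evt-1", "order_id": "o-1", "amount": 50}
second = {"event_id": "evt-2", "order_id": "o-2", "amount": 20}

# At-least-once delivery: the broker redelivers evt-1.
results = [handle_payment(e) for e in (first, first, second)]
payment_count = db.execute("SELECT COUNT(*) FROM payments").fetchone()[0]
```

With this in place the consumer can safely commit its offset after `handle_payment` returns, whether the result was `True` or `False`.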
CQRS + eventual consistency = how do users see fresh data?
Eventual consistency: the write succeeds, but the read model lags, typically by 100-500ms. For sensitive flows, either confirm synchronously (the command succeeds, then the caller waits for the read model to catch up) or query the write model directly (read from the primary, not the projection). Example: the payment service publishes 'PaymentSucceeded' to the event bus, answers the UI immediately from its own state, and the read model catches up in the background. CQRS shines when read lag is acceptable (dashboards, analytics). Avoid it for transactional flows (checkout, bank transfers).
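The "answer from the write model, let the read model catch up" flow can be sketched as follows. Everything here is hypothetical: `write_model`, `read_model`, `pending`, `pay`, and `run_projector` are illustrative names, and the projector is called by hand to simulate the lag that a background consumer would introduce.

```python
from collections import deque

write_model: dict[str, str] = {}    # authoritative state owned by the payment service
read_model: dict[str, str] = {}     # projection, updated later
pending: deque[tuple[str, str]] = deque()    # published events not yet projected


def pay(payment_id: str) -> str:
    """Command handler: write, publish, and answer from the write side."""
    write_model[payment_id] = "succeeded"
    pending.append((payment_id, "succeeded"))   # stands in for 'PaymentSucceeded'
    return write_model[payment_id]


def run_projector() -> None:
    """Background consumer that brings the read model up to date."""
    while pending:
        payment_id, status = pending.popleft()
        read_model[payment_id] = status


status_for_ui = pay("p-1")           # the UI gets a fresh answer immediately
stale = read_model.get("p-1")        # the projection has not seen the event yet
run_projector()
fresh = read_model.get("p-1")        # now the read model has caught up
```

Between `pay` and `run_projector` the read model returns nothing for `p-1`; that window is the lag the answer refers to.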
Schema evolution — how do I evolve event schemas without breaking subscribers?
Version your events: carry a version in the event payload or register schemas in a schema registry (e.g. Confluent Schema Registry). Old subscribers ignore fields they do not know; new subscribers apply defaults for fields missing from old events. Example: 'UserCreated' v1 has {name, email}, v2 adds {country}. Old code ignores country; new code treats a missing country as null. A schema registry can enforce compatibility rules automatically, including for Debezium change events. Never remove fields; mark them deprecated instead. Test backward and forward compatibility before deploying. LinkedIn and Netflix rely heavily on schema evolution.
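The 'UserCreated' v1/v2 example can be written out as a tolerant-reader sketch. The two reader functions are hypothetical; real deployments would usually express the same rules in Avro, Protobuf, or JSON Schema rather than hand-written parsing.

```python
def read_user_created_v1(payload: dict) -> dict:
    """Old subscriber: picks the fields it knows and ignores everything else."""
    return {"name": payload["name"], "email": payload["email"]}


def read_user_created_v2(payload: dict) -> dict:
    """New subscriber: a missing 'country' (old events) defaults to None."""
    return {
        "name": payload["name"],
        "email": payload["email"],
        "country": payload.get("country"),
    }


v1_event = {"version": 1, "name": "Ada", "email": "ada@example.com"}
v2_event = {"version": 2, "name": "Bo", "email": "bo@example.com", "country": "SE"}

old_reads_new = read_user_created_v1(v2_event)   # forward compatibility
new_reads_old = read_user_created_v2(v1_event)   # backward compatibility
```

Both directions work only because v2 added an optional field; removing or renaming `email` would break `read_user_created_v1`, which is why the answer says to deprecate rather than remove.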
How does the saga pattern handle distributed transactions?
A saga is a sequence of local transactions, coordinated by choreography or orchestration. Choreography: each service listens for events and emits the next one (BookingService → PaymentService → NotificationService). If PaymentService fails, the steps already completed are compensated (e.g. the booking is cancelled; a payment that was already captured would be refunded). Orchestration: a central coordinator drives the steps (Airbnb Booking Saga Manager). Sagas tolerate partial failures better than 2PC. Netflix uses sagas for reservations. Drawback: complex error paths and eventual consistency. Use for non-atomic, long-running flows (booking, onboarding).
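A minimal orchestration sketch, assuming each step is a pair of plain functions (action, compensation). `run_saga`, the step names, and the in-memory `state` are hypothetical; a production orchestrator would also persist saga progress so it can resume after a crash.

```python
from typing import Callable

# (step name, action, compensation)
Step = tuple[str, Callable[[], None], Callable[[], None]]


def run_saga(steps: list[Step], log: list[str]) -> bool:
    """Run steps in order; on failure, undo completed steps in reverse order."""
    completed: list[Step] = []
    for step in steps:
        name, action, _ = step
        try:
            action()
            log.append(f"done:{name}")
            completed.append(step)
        except Exception:
            log.append(f"failed:{name}")
            for done_name, _, compensate in reversed(completed):
                compensate()
                log.append(f"compensated:{done_name}")
            return False
    return True


state = {"booking": None}


def reserve_booking() -> None:
    state["booking"] = "reserved"


def cancel_booking() -> None:
    state["booking"] = "cancelled"


def charge_card() -> None:
    raise RuntimeError("card declined")   # simulated PaymentService failure


log: list[str] = []
ok = run_saga(
    [
        ("reserve_booking", reserve_booking, cancel_booking),
        ("charge_card", charge_card, lambda: None),
        ("notify", lambda: None, lambda: None),
    ],
    log,
)
```

The failed step itself is not compensated, only the steps that completed before it, and the `notify` step never runs.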
Event-driven observability — how do I debug async systems?
Traditional logs are hard to follow because request context is lost across async hops. Use: (1) Distributed tracing: attach a correlation ID to every event in a chain and trace across services (Jaeger, Zipkin). (2) Event replay: store events and replay them to reproduce a bug. (3) Event sourcing: the event log is itself an audit log. (4) Metrics: consumer lag (how far a consumer is behind), throughput, errors per event type. (5) Dead letter queues: capture failed events for inspection and replay. Datadog or New Relic with their Kafka integrations are common choices. Budget for it: observability can be as much as half of the cost and complexity of an event-driven system.
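Points (1) and (5) combine naturally in a consumer wrapper. The sketch below is illustrative only: `new_event`, `consume`, `dead_letters`, and `trace_log` are hypothetical names, the retry loop has no backoff, and a real system would send these records to a tracing backend and a dedicated dead-letter topic or queue.

```python
import uuid
from typing import Callable, Optional

dead_letters: list[dict] = []             # stand-in for a dead letter queue
trace_log: list[tuple[str, str]] = []     # (correlation_id, message)


def new_event(kind: str, payload: dict, parent: Optional[dict] = None) -> dict:
    """Child events inherit the parent's correlation ID; roots get a new one."""
    correlation_id = parent["correlation_id"] if parent else str(uuid.uuid4())
    return {"kind": kind, "payload": payload, "correlation_id": correlation_id}


def consume(event: dict, handler: Callable[[dict], None], max_attempts: int = 3) -> bool:
    """Retry the handler; after max_attempts, park the event in the DLQ."""
    last_error = ""
    for _ in range(max_attempts):
        try:
            handler(event)
            trace_log.append((event["correlation_id"], f"{event['kind']}:ok"))
            return True
        except Exception as exc:
            last_error = repr(exc)
    dead_letters.append({**event, "error": last_error, "attempts": max_attempts})
    trace_log.append((event["correlation_id"], f"{event['kind']}:dead-lettered"))
    return False


def failing_handler(event: dict) -> None:
    raise ValueError("smtp unavailable")   # simulated downstream outage


order = new_event("OrderPlaced", {"order_id": 1})
email = new_event("EmailRequested", {"to": "ada@example.com"}, parent=order)

consume(order, lambda e: None)
consume(email, failing_handler)
```

Filtering `trace_log` by `order["correlation_id"]` yields both the successful order step and the dead-lettered email step, which is the cross-service view that plain per-service logs lose.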
