Event-Driven Architecture

Decoupling with messaging: broker choices, the transactional outbox, idempotent consumers, ordering, DLQs.

Why event-driven

Decouple producers from consumers. The service that records "order placed" shouldn't know or care that email, analytics, and inventory all need to react. It emits an event; consumers subscribe. This is what lets parts of the system evolve and scale independently.
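
As a rough illustration of that decoupling, here is a minimal in-process sketch. The `EventBus` class, the `order.placed` event name, and the three subscribers are hypothetical stand-ins for a real broker and real services; the point is only that `place_order` never references its consumers.

```python
from collections import defaultdict
from typing import Callable

Handler = Callable[[dict], None]


class EventBus:
    """Toy in-process stand-in for a broker: maps event types to subscribers."""

    def __init__(self) -> None:
        self._subscribers = defaultdict(list)

    def subscribe(self, event_type: str, handler: Handler) -> None:
        self._subscribers[event_type].append(handler)

    def publish(self, event_type: str, payload: dict) -> None:
        # The publisher does not know who is listening, or whether anyone is.
        for handler in self._subscribers.get(event_type, []):
            handler(payload)


bus = EventBus()
reactions: list[str] = []  # records what each consumer did, for demonstration

# Consumers register themselves; adding a fourth needs no producer change.
bus.subscribe("order.placed", lambda e: reactions.append(f"email:{e['order_id']}"))
bus.subscribe("order.placed", lambda e: reactions.append(f"analytics:{e['order_id']}"))
bus.subscribe("order.placed", lambda e: reactions.append(f"inventory:{e['order_id']}"))


def place_order(order_id: str) -> None:
    """Producer: records the order and emits one event, nothing more."""
    bus.publish("order.placed", {"order_id": order_id})
```

A real broker adds what this toy lacks: durability, delivery across processes, and consumers that can be offline when the event is emitted.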

Messaging backbone options

  • Kafka / Redpanda: high-throughput, ordered, replayable log. The right choice when you need durability, replay, and many independent consumers. (Redpanda is Kafka-compatible with simpler ops: no JVM, no ZooKeeper.)
  • RabbitMQ: flexible routing, good for task distribution and RPC-style messaging.
  • Redis Streams: lightweight, good for moderate volume without standing up Kafka.
  • Postgres + LISTEN/NOTIFY or an outbox table: perfectly fine at small/medium scale, and one less system to run. NOTIFY only reaches sessions that are listening at that moment, so back it with a table when events must survive a disconnected consumer.

The two patterns that make it reliable

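Transactional outbox: the fix for the dual-write problem. You can't atomically write to Postgres and publish to a broker. If you commit first and the publish then fails, you've lost the event; if you publish first and the commit then fails, you've announced a change that never happened. Instead: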

  1. In the same DB transaction as your business write, insert the event into an outbox table.
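  2. A separate relay polls the outbox and publishes to the broker, marking rows sent only after the publish succeeds.
  3. The event now exists if and only if the business change committed. If the relay crashes between publishing and marking, it republishes the row on restart, so delivery is at-least-once.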
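
The steps above can be sketched as follows. This is a minimal illustration, not a production relay: it uses the standard-library `sqlite3` module as a stand-in for Postgres, and the table layout, `place_order`, `relay_once`, and the `publish` callback are all hypothetical names chosen for the example.

```python
import json
import sqlite3
import uuid
from typing import Callable

Publish = Callable[[str, str, dict], None]  # (event_type, event_id, payload)


def init_schema(conn: sqlite3.Connection) -> None:
    conn.executescript(
        """
        CREATE TABLE IF NOT EXISTS orders (
            id TEXT PRIMARY KEY,
            status TEXT NOT NULL
        );
        CREATE TABLE IF NOT EXISTS outbox (
            event_id   TEXT PRIMARY KEY,
            event_type TEXT NOT NULL,
            payload    TEXT NOT NULL,
            sent       INTEGER NOT NULL DEFAULT 0
        );
        """
    )


def place_order(conn: sqlite3.Connection, order_id: str) -> str:
    """Step 1: business write and outbox insert share one transaction."""
    event_id = str(uuid.uuid4())
    with conn:  # commits both inserts, or rolls both back on an exception
        conn.execute(
            "INSERT INTO orders (id, status) VALUES (?, 'placed')", (order_id,)
        )
        conn.execute(
            "INSERT INTO outbox (event_id, event_type, payload) VALUES (?, ?, ?)",
            (event_id, "order.placed", json.dumps({"order_id": order_id})),
        )
    return event_id


def relay_once(conn: sqlite3.Connection, publish: Publish, batch_size: int = 100) -> int:
    """Step 2: publish unsent rows in insertion order, then mark each as sent."""
    rows = conn.execute(
        "SELECT event_id, event_type, payload FROM outbox "
        "WHERE sent = 0 ORDER BY rowid LIMIT ?",
        (batch_size,),
    ).fetchall()
    published = 0
    for event_id, event_type, payload in rows:
        # If publish raises, the row stays unsent and is retried on the next poll.
        publish(event_type, event_id, json.loads(payload))
        with conn:
            conn.execute("UPDATE outbox SET sent = 1 WHERE event_id = ?", (event_id,))
        published += 1
    return published
```

In use, something would call `relay_once` on a timer with a `publish` function that wraps the real broker client. Because a row is marked only after `publish` returns, a crash in between causes a repeat publish, which is why consumers still need to handle duplicates.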

Idempotent consumers: most brokers deliver at-least-once, and redeliveries after a timeout, rebalance, or relay retry mean duplicates happen. Every consumer must tolerate seeing the same event twice:

  • Carry a unique event ID; record processed IDs and skip duplicates, or
  • Make the handler naturally idempotent (upserts, "set status = X" rather than "increment").
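
The first option might look like the sketch below, which wraps a non-idempotent "increment" in a processed-ID check. It again uses `sqlite3` as a stand-in database; the `processed_events` and `inventory` tables and the `handle_once` function are hypothetical names for illustration.

```python
import sqlite3


def init_consumer_schema(conn: sqlite3.Connection) -> None:
    conn.executescript(
        """
        CREATE TABLE IF NOT EXISTS processed_events (
            event_id TEXT PRIMARY KEY
        );
        CREATE TABLE IF NOT EXISTS inventory (
            sku      TEXT PRIMARY KEY,
            reserved INTEGER NOT NULL
        );
        """
    )


def handle_once(conn: sqlite3.Connection, event_id: str, sku: str, quantity: int) -> bool:
    """Apply a stock reservation at most once per event ID.

    Returns True if the event was applied, False if it was a duplicate.
    """
    with conn:  # the dedupe record and the side effect commit together
        cur = conn.execute(
            "INSERT OR IGNORE INTO processed_events (event_id) VALUES (?)",
            (event_id,),
        )
        if cur.rowcount == 0:
            return False  # ID already recorded: skip the duplicate

        # An increment is not idempotent by itself; the check above makes it safe.
        conn.execute(
            "INSERT OR IGNORE INTO inventory (sku, reserved) VALUES (?, 0)", (sku,)
        )
        conn.execute(
            "UPDATE inventory SET reserved = reserved + ? WHERE sku = ?",
            (quantity, sku),
        )
    return True
```

Recording the event ID in the same transaction as the side effect matters: if the two were committed separately, a crash between them could either lose the update or apply it twice.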

Other essentials

  • Ordering: only guaranteed within a partition/key. Partition by entity ID (e.g. client_id) when per-entity order matters.
  • Schema evolution: version your event payloads; add fields, don't repurpose them. A schema registry helps once you have many producers/consumers.
  • Dead-letter queue: events that fail repeatedly go to a DLQ for inspection instead of blocking the stream.
  • Choreography vs orchestration: choreography (services react to events) is loosely coupled but hard to trace; orchestration (a coordinator like Temporal drives the steps) is easier to reason about. Mix deliberately.
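
Two of these points, key-based partitioning and dead-lettering, can be sketched in a few lines. Both functions are hypothetical illustrations: `partition_for` shows the general idea of hashing a key to a partition and is not the partitioner any particular broker uses, and `consume_with_dlq` collects failures in a list where a real system would publish them to a DLQ topic or queue.

```python
import hashlib
from typing import Callable, Iterable


def partition_for(key: str, num_partitions: int) -> int:
    """Map an entity key to a partition with a stable hash.

    The same key always lands on the same partition, so events for one
    entity keep their relative order.
    """
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


def consume_with_dlq(
    events: Iterable[dict],
    handler: Callable[[dict], None],
    max_attempts: int = 3,
) -> list[dict]:
    """Process events in order; after max_attempts failures, set the event aside."""
    dead_letters: list[dict] = []
    for event in events:
        for attempt in range(1, max_attempts + 1):
            try:
                handler(event)
                break  # success: move on to the next event
            except Exception as exc:
                if attempt == max_attempts:
                    # Keep the error alongside the event for later inspection.
                    dead_letters.append(
                        {"event": event, "error": repr(exc), "attempts": attempt}
                    )
    return dead_letters
```

Built-in `hash()` is avoided for the partition key because Python randomizes string hashes per process, which would send the same key to different partitions across restarts. A production consumer would usually also add a backoff delay between attempts, which is omitted here.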