Event-Driven Architecture
Decoupling with messaging: broker choices, the transactional outbox, idempotent consumers, ordering, DLQs.
Why event-driven
Decouple producers from consumers. The service that records "order placed" shouldn't know or care that email, analytics, and inventory all need to react. It emits an event; consumers subscribe. This is what lets parts of the system evolve and scale independently.
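As a rough illustration of that decoupling, here is a minimal in-process sketch. The `EventBus` class, the `order_placed` event name, and the handlers are all hypothetical; a real system would put a broker where this dictionary of handlers sits.

```python
from collections import defaultdict
from typing import Callable


class EventBus:
    """Hypothetical in-process stand-in for a message broker."""

    def __init__(self) -> None:
        self._subscribers: dict[str, list[Callable[[dict], None]]] = defaultdict(list)

    def subscribe(self, event_type: str, handler: Callable[[dict], None]) -> None:
        self._subscribers[event_type].append(handler)

    def publish(self, event_type: str, payload: dict) -> None:
        # The producer only names the event; it never references a consumer.
        for handler in self._subscribers[event_type]:
            handler(payload)


bus = EventBus()
reactions: list[str] = []

# Consumers register themselves; adding a third needs no change to the producer.
bus.subscribe("order_placed", lambda e: reactions.append(f"email for order {e['order_id']}"))
bus.subscribe("order_placed", lambda e: reactions.append(f"inventory for order {e['order_id']}"))

# The order service emits the event and is done.
bus.publish("order_placed", {"order_id": 42})
```

The point of the sketch is the direction of the dependency: consumers know about the event, but the producer knows nothing about the consumers.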
Messaging backbone options
- Kafka / Redpanda — high-throughput, ordered, replayable log. Right when you need durability, replay, and many consumers. (Redpanda = Kafka-compatible, simpler ops, no JVM/ZooKeeper.)
- RabbitMQ — flexible routing, good for task distribution and RPC-style messaging.
- Redis Streams — lightweight, good for moderate volume without standing up Kafka.
- Postgres + LISTEN/NOTIFY or an outbox table — perfectly fine at small/medium scale, and one less system to run.
The two patterns that make it reliable
Transactional outbox — the fix for dual-write. You can't atomically write to Postgres and publish to a broker; if the publish fails after the commit, you've lost the event. Instead:
- In the same DB transaction as your business write, insert the event into an outbox table.
- A separate relay polls the outbox and publishes to the broker, marking rows sent.
- Now the event is guaranteed to exist if and only if the business change committed.
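The steps above can be sketched as follows. This is a minimal illustration, not a production relay: it uses SQLite from the Python standard library as a stand-in for Postgres, and the table layout, the `place_order` and `relay_once` functions, and the `publish` callable are all assumptions made for the example.

```python
import json
import sqlite3
import uuid


def init_schema(conn: sqlite3.Connection) -> None:
    # Hypothetical schema: one business table plus the outbox.
    conn.executescript("""
        CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT NOT NULL);
        CREATE TABLE outbox (
            event_id   TEXT PRIMARY KEY,
            event_type TEXT NOT NULL,
            payload    TEXT NOT NULL,
            sent       INTEGER NOT NULL DEFAULT 0
        );
    """)


def place_order(conn: sqlite3.Connection, order_id: int) -> None:
    # One transaction: the order row and the outbox row commit together,
    # or the whole block rolls back and neither exists.
    with conn:
        conn.execute("INSERT INTO orders (id, status) VALUES (?, 'placed')", (order_id,))
        conn.execute(
            "INSERT INTO outbox (event_id, event_type, payload) VALUES (?, ?, ?)",
            (str(uuid.uuid4()), "order_placed", json.dumps({"order_id": order_id})),
        )


def relay_once(conn: sqlite3.Connection, publish) -> int:
    """Publish every unsent outbox row, oldest first; return how many were found."""
    rows = conn.execute(
        "SELECT event_id, event_type, payload FROM outbox WHERE sent = 0 ORDER BY rowid"
    ).fetchall()
    for event_id, event_type, payload in rows:
        # If publish raises, the row stays unsent and is retried on the next poll.
        publish(event_type, json.loads(payload))
        with conn:
            conn.execute("UPDATE outbox SET sent = 1 WHERE event_id = ?", (event_id,))
    return len(rows)
```

Because the relay publishes before it marks the row sent, a crash between those two steps causes the event to be published again on the next poll. The outbox therefore gives at-least-once delivery, which is why it is paired with idempotent consumers.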
Idempotent consumers — brokers deliver at-least-once, so duplicates happen. Every consumer must tolerate seeing the same event twice:
- Carry a unique event ID; record processed IDs and skip duplicates, or
- Make the handler naturally idempotent (upserts, "set status = X" rather than "increment").
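A minimal sketch combining both options, again using SQLite as a stand-in database. The `processed_events` table, the event shape (`event_id`, `order_id`, `status`), and the `handle_event` function are hypothetical names chosen for the example.

```python
import sqlite3


def init_consumer_schema(conn: sqlite3.Connection) -> None:
    conn.executescript("""
        CREATE TABLE processed_events (event_id TEXT PRIMARY KEY);
        CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT NOT NULL);
    """)


def handle_event(conn: sqlite3.Connection, event: dict) -> bool:
    """Apply the event once; return False if this event ID was already processed."""
    with conn:  # dedup record and business write commit in the same transaction
        cur = conn.execute(
            "INSERT OR IGNORE INTO processed_events (event_id) VALUES (?)",
            (event["event_id"],),
        )
        if cur.rowcount == 0:
            # The ID was already recorded: a duplicate delivery, so skip it.
            return False
        # Upsert that sets an absolute value, so replaying it is harmless.
        # (ON CONFLICT ... DO UPDATE needs SQLite 3.24 or newer.)
        conn.execute(
            "INSERT INTO orders (id, status) VALUES (?, ?) "
            "ON CONFLICT(id) DO UPDATE SET status = excluded.status",
            (event["order_id"], event["status"]),
        )
    return True
```

Recording the event ID in the same transaction as the business write matters: if the two were committed separately, a crash between them could either lose the update or let a duplicate through.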
Other essentials
- Ordering: only guaranteed within a partition/key. Partition by entity ID (e.g. client_id) when per-entity order matters.
- Schema evolution: version your event payloads; add fields, don't repurpose them. A schema registry helps once you have many producers/consumers.
- Dead-letter queue: events that fail repeatedly go to a DLQ for inspection instead of blocking the stream.
- Choreography vs orchestration: choreography (services react to events) is loosely coupled but hard to trace; orchestration (a coordinator like Temporal drives the steps) is easier to reason about. Mix deliberately.
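Two of the points above, key-based partitioning and dead-lettering, can be sketched in a few lines. This is illustrative only: `partition_for` uses SHA-256 for a stable hash, which is an assumption for the example and not the hash any particular broker uses, and `consume` is a hypothetical loop over an in-memory list, not a real client API.

```python
import hashlib


def partition_for(key: str, num_partitions: int) -> int:
    """Map a key to a partition with a stable hash, so one entity always lands in one partition."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


def consume(events, handler, dead_letters, max_attempts: int = 3) -> None:
    """Process events in order; after max_attempts failures, park the event and move on."""
    for event in events:
        for attempt in range(1, max_attempts + 1):
            try:
                handler(event)
                break  # success: go to the next event
            except Exception as exc:
                if attempt == max_attempts:
                    # Give up on this event so it does not block the rest of the stream.
                    dead_letters.append(
                        {"event": event, "error": repr(exc), "attempts": attempt}
                    )
```

Note the trade-off this makes visible: once an event is parked in the dead-letter list, later events for the same key are processed ahead of it, so replaying from a DLQ can break per-entity ordering. Whether that is acceptable depends on the handler.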