Saga Pattern
Manage distributed transactions across services using a sequence of local transactions with compensating rollbacks on failure.
3/5 · System topology: how multiple services are organised
How it works
A Saga is a sequence of local transactions in which each step publishes an event or message that triggers the next. If a step fails, compensating transactions undo the steps that already completed, typically in reverse order.
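The mechanism can be sketched as a small in-process runner. This is a minimal illustration, not a production framework: `Step`, `run_saga`, and the A/B/C steps are hypothetical names, and real local transactions would be calls to separate services rather than lambdas.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Step:
    name: str
    action: Callable[[], None]      # the local transaction
    compensate: Callable[[], None]  # undoes the effect of the action


def run_saga(steps: List[Step]) -> bool:
    """Run steps in order; on failure, compensate completed steps in reverse."""
    completed: List[Step] = []
    for step in steps:
        try:
            step.action()
        except Exception:
            # The failed step did not complete, so only earlier steps are undone.
            for done in reversed(completed):
                done.compensate()
            return False
        completed.append(step)
    return True


log: List[str] = []


def fail() -> None:
    raise RuntimeError("step C failed")


steps = [
    Step("A", lambda: log.append("do A"), lambda: log.append("undo A")),
    Step("B", lambda: log.append("do B"), lambda: log.append("undo B")),
    Step("C", fail, lambda: log.append("undo C")),
]
ok = run_saga(steps)
print(ok, log)
```

Because step C fails, the runner undoes B and then A, and never calls C's compensation.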
Two flavours:
- Choreography: each service listens for events and reacts; there is no central coordinator.
- Orchestration: a central Saga orchestrator sends commands to each service.
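To make the choreography flavour concrete, here is a hedged sketch in which services only subscribe to and publish events. The `EventBus` class is an in-memory, synchronous stand-in for a real message broker, and the event names, the handlers, and the "amount over 100 is declined" rule are all assumptions made for illustration.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, Dict, List

Handler = Callable[[Dict], None]


class EventBus:
    """In-memory stand-in for a message broker (synchronous delivery)."""

    def __init__(self) -> None:
        self.handlers: DefaultDict[str, List[Handler]] = defaultdict(list)
        self.history: List[str] = []

    def subscribe(self, event_type: str, handler: Handler) -> None:
        self.handlers[event_type].append(handler)

    def publish(self, event_type: str, payload: Dict) -> None:
        self.history.append(event_type)
        for handler in list(self.handlers[event_type]):
            handler(payload)


bus = EventBus()
stock = {"widget": 1}


def inventory_on_order_placed(order: Dict) -> None:
    stock[order["item"]] -= 1  # local transaction: reserve stock
    bus.publish("InventoryReserved", order)


def payment_on_inventory_reserved(order: Dict) -> None:
    if order["amount"] > 100:  # hypothetical decline rule
        bus.publish("PaymentFailed", order)
    else:
        bus.publish("PaymentCharged", order)


def inventory_on_payment_failed(order: Dict) -> None:
    stock[order["item"]] += 1  # compensating transaction: release stock
    bus.publish("InventoryReleased", order)


# No coordinator: each service wires itself to the events it cares about.
bus.subscribe("OrderPlaced", inventory_on_order_placed)
bus.subscribe("InventoryReserved", payment_on_inventory_reserved)
bus.subscribe("PaymentFailed", inventory_on_payment_failed)

bus.publish("OrderPlaced", {"item": "widget", "amount": 250})
print(bus.history, stock)
```

In the orchestration flavour, the same three handlers would instead be invoked by one coordinator that decides which command to send next, which keeps the workflow in one place at the cost of a central component.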
Sagas replace ACID distributed transactions, which scale poorly across services, with eventual consistency plus compensation.
Why it matters
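Distributed transactions via two-phase commit (2PC) are often impractical across microservices: every participant must hold locks until the coordinator decides, and all of them must support the same commit protocol. The Saga pattern is a widely used way to maintain data consistency across service boundaries.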
✓ When to use
- Multi-step business transactions spanning multiple services
- Order processing: reserve inventory → charge payment → ship
- Any workflow where partial completion needs to be undone
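As one possible sketch of the order-processing case, the example below uses Python's `contextlib.ExitStack` to register a compensation after each successful step. The in-memory `inventory`, `payments`, and `shipments` stores, the function names, and the spending `limit` are assumptions standing in for three separate services.

```python
from contextlib import ExitStack

# In-memory stand-ins for three services' own data stores (hypothetical).
inventory = {"sku-1": 5}
payments = []
shipments = []


class PaymentDeclined(Exception):
    pass


def reserve_inventory(sku, qty):
    if inventory[sku] < qty:
        raise RuntimeError("out of stock")
    inventory[sku] -= qty


def release_inventory(sku, qty):
    inventory[sku] += qty


def charge_payment(order_id, amount, limit=100):
    if amount > limit:  # hypothetical decline rule
        raise PaymentDeclined(order_id)
    payments.append((order_id, amount))


def refund_payment(order_id, amount):
    payments.remove((order_id, amount))


def ship(order_id):
    shipments.append(order_id)


def place_order(order_id, sku, qty, amount) -> bool:
    try:
        with ExitStack() as undo:
            reserve_inventory(sku, qty)
            undo.callback(release_inventory, sku, qty)
            charge_payment(order_id, amount)
            undo.callback(refund_payment, order_id, amount)
            ship(order_id)
            undo.pop_all()  # success: discard the registered compensations
        return True
    except (PaymentDeclined, RuntimeError):
        # ExitStack already ran the compensations in reverse order.
        return False


ok1 = place_order("o-1", "sku-1", 2, 50)   # all three steps succeed
ok2 = place_order("o-2", "sku-1", 1, 500)  # payment declined, stock released
print(ok1, ok2, inventory, payments, shipments)
```

The second order reserves one unit, fails at payment, and ends with the stock back at its earlier level. In a real system each compensation is itself a remote call that can fail, so it would also need retries and idempotency.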
✗ When NOT to use
- Simple transactions within a single service/database
- When strong consistency is non-negotiable
Trade-offs
✓ No distributed locking: services stay independent
✗ Compensating transactions must be designed and maintained
✓ Works across different databases and services
✗ Debugging failures across steps is complex
In production
Order workflow: payment → inventory → shipping, each as a separate step with its own compensation
Saga orchestration framework widely used in fintech
Related principles
Microservices Architecture
Decompose an application into small, independently deployable services that communicate over a network.
Event-Driven Architecture
Services communicate by producing and consuming events asynchronously through a central event bus or message broker.
Event Sourcing
Store state as an immutable log of events rather than the current snapshot; rebuild state by replaying events.
CQRS
Separate the model used to read data (Query) from the model used to write data (Command), so that each can be optimised independently.