
Designing the Write Path in CQRS

Where the System Actually Begins


In a typical monolith, the read and write sides of the application are entangled — reads feed off the same models the writes mutate. But in CQRS, the write side stands alone. It’s the source of truth. The heartbeat. The origin from which all downstream models flow.

If the write path fails — nothing else matters.
If the write path is designed poorly — everything downstream inherits its flaws.

So before we talk about syncing events or denormalized reads, we need to zoom in on this foundational piece.

This section kicks off with:

  • What the write path really is in CQRS

  • The expectations placed on it (consistency, idempotency, isolation)

  • How commands differ from CRUD, and why this subtlety matters

  • Why the write path isn't just “the old system without the queries”

Let’s get into it.


The Traits of a Good Write Path

In CQRS, the write path is not just the original system with its SELECTs removed. It’s a precision-built component whose job is to accept commands, validate intent, persist durable change, and emit events for everything else to catch up.

A strong write path is shaped by five core traits:


✅ 1. Intent-first, not Data-first

You don’t say “insert a row in the orders table.”
You say: “PlaceOrder.”
The system — not the user — decides how that maps to persistence.

This keeps the model safe from leakage, and your invariants protected.


✅ 2. Idempotency is Non-Negotiable

Whether it’s retries from clients or message duplication from queues, every write operation must do the same thing every time for the same command.

Idempotency ≠ “ignore duplicates.”
It means: the end state is the same as if the command had been processed exactly once, no matter how many times it is delivered.
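One common way to get there is to record each command's unique ID and skip anything you have already handled. A minimal sketch, assuming commands carry a client-supplied ID — in a real system the "processed" record lives in the same durable store (and transaction) as the state change, not in process memory:

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Sketch: deduplicate by command ID. Production code persists this set
// alongside the state change so a restart cannot forget what was handled.
public class IdempotentHandler {
    private static final Set<String> processed = ConcurrentHashMap.newKeySet();

    /** Returns true if the command was applied, false if it was a duplicate. */
    public static boolean handle(String commandId) {
        if (!processed.add(commandId)) {
            return false;   // already seen: same end state, no second effect
        }
        // ... apply the state change here ...
        return true;
    }
}
```

A retried delivery of the same command ID falls through the duplicate branch and leaves the system exactly as the first delivery did.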


✅ 3. Transactional Boundaries Are Clear

A good write path knows its unit of work. You never half-update a customer and leave their invoice dangling.

Either the entire change goes through — or none of it does.
This makes rollback reasoning (and debugging) straightforward.


✅ 4. Event Emission Is a Core Concern

The write model doesn’t just write to the DB — it produces events that fuel the read model and other subsystems.

But these events aren’t side effects — they’re first-class citizens.
Their contracts must be stable, well-versioned, and auditable.


✅ 5. Backpressure-Aware and Operationally Lean

Your write path should fail fast, validate early, and shed load when overwhelmed.

It’s better to reject bad or excessive writes than silently clog queues and downstream processors.
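"Shed load instead of clogging" can be expressed very simply. A sketch only — a real service would also time-box validation and surface an explicit rejection (e.g., HTTP 429) to the caller — but the core idea is a bounded queue whose offer fails fast when the system is saturated:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Sketch: a write front door that rejects work instead of queueing unboundedly.
public class WriteFrontDoor {
    private final BlockingQueue<String> pending;

    public WriteFrontDoor(int capacity) {
        this.pending = new ArrayBlockingQueue<>(capacity);
    }

    /** Accepts a command if there is room; rejects immediately when saturated. */
    public boolean submit(String command) {
        // offer() returns false right away when the queue is full,
        // so the caller learns about overload now, not after a timeout.
        return pending.offer(command);
    }

    public int backlog() {
        return pending.size();
    }
}
```

With a capacity of 2, a third submit is rejected rather than queued, and downstream processors only ever see a bounded backlog.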


Designing Commands and Write Models

The write path in CQRS doesn’t deal with rows or columns — it deals with commands and aggregates. This is where design discipline kicks in: the separation of what is being requested vs how it should be applied safely and consistently.


Commands: Requests With Intent, Not Instructions

A command is a request to perform an action that mutates state. It’s not a DTO with raw data. It’s not a "please insert" instruction.

Example:

public record PlaceOrderCommand(
    UUID customerId,
    List<OrderItem> items,
    PaymentMethod paymentMethod
) {}

A good command object:

  • Is explicit: No overloaded flags or boolean toggles.

  • Is immutable: Prevents mid-flight tampering.

  • Models business language, not technical mechanics.


Aggregates: Gatekeepers of Invariants

The write model is built around aggregates — transactional consistency boundaries that enforce rules.

A single aggregate:

  • Owns its own lifecycle (create, update, delete)

  • Rejects illegal state transitions

  • Produces events to communicate state change

Example (simplified):

public class Order {
    private final UUID id;
    private OrderStatus status;
    private final List<OrderItem> items;

    public Order(PlaceOrderCommand cmd) {
        validate(cmd);
        this.id = UUID.randomUUID();
        this.items = List.copyOf(cmd.items()); // defensive copy: no outside mutation
        this.status = OrderStatus.CREATED;
        // Emit: OrderPlacedEvent
    }

    public void cancel() {
        if (this.status != OrderStatus.CREATED)
            throw new IllegalStateException("Cannot cancel after fulfillment");
        this.status = OrderStatus.CANCELLED;
        // Emit: OrderCancelledEvent
    }

    private static void validate(PlaceOrderCommand cmd) {
        if (cmd.items() == null || cmd.items().isEmpty())
            throw new IllegalArgumentException("An order must contain at least one item");
    }
}

🎯 Key Design Principles

  1. Aggregates enforce invariants locally — you don’t rely on DB constraints alone.

  2. All commands go through aggregates — no bypasses or direct repository hacks.

  3. Each command results in at most one state transition — no batch mutations inside one command.

  4. Emit events before persisting — to enable unit testing and auditability.
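Put together, a command handler becomes the only route from command to aggregate. The sketch below mirrors the PlaceOrderCommand/Order shapes from earlier in simplified form; the commented-out repository and publisher calls are stand-ins for whatever persistence and messaging you actually use:

```java
import java.util.List;
import java.util.UUID;

public class PlaceOrderHandler {
    enum OrderStatus { CREATED, CANCELLED }
    record PlaceOrderCommand(UUID customerId, List<String> items) {}
    record OrderPlacedEvent(UUID orderId, UUID customerId) {}

    // Minimal aggregate: validates, transitions state, records its event.
    static class Order {
        final UUID id = UUID.randomUUID();
        final OrderStatus status;
        final OrderPlacedEvent event;

        Order(PlaceOrderCommand cmd) {
            if (cmd.items().isEmpty())
                throw new IllegalArgumentException("Order must have items");
            this.status = OrderStatus.CREATED;
            this.event = new OrderPlacedEvent(id, cmd.customerId());
        }
    }

    /** One command, one aggregate, one state transition, one event. */
    public static OrderPlacedEvent handle(PlaceOrderCommand cmd) {
        Order order = new Order(cmd);       // aggregate enforces invariants
        // repository.save(order);          // persistence goes here
        // publisher.publish(order.event);  // event emission goes here
        return order.event;
    }
}
```

Note there is no path to persistence that does not pass through the aggregate's constructor, which is exactly the "no bypasses" rule above.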


Patterns of Persistence in the Write Path

You’ve validated the command. The aggregate accepted the state transition. Now comes the part that kills systems when done wrong: persistence.

The write path must persist:

  • The new application state (e.g., in a write-optimized DB)

  • The corresponding domain event(s)

And it must do this with consistency guarantees that avoid the classic “write succeeded, but event was lost” pitfall.

Let’s explore the dominant patterns.


1. Dual Writes (Anti-pattern)

What it is: Save state to DB → separately publish event to Kafka/message broker.

Why it breaks:

  • No transactional boundary between DB and broker.

  • If the second step fails, state is updated but the event is lost → read models go out of sync.

Still common? Unfortunately yes — especially in rushed microservices or legacy splits.


2. Transactional Outbox Pattern

What it is:

  • Instead of publishing directly to Kafka, you write the event to a dedicated outbox table in the same transaction as your domain update.

  • A separate relay process reads from the outbox and pushes to Kafka.

Why it works:

  • Strong consistency with app state.

  • Resilient to crashes and retries — you control reprocessing.

Downsides:

  • More infra (outbox relayer, deduplication keys).

  • Eventual dispatch still needs to be monitored.
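The heart of the pattern is that the state change and the outbox row commit or roll back together. Here is a minimal in-memory simulation of that contract; a real implementation performs two INSERTs inside one database transaction and a relay process tails the outbox table:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: state change + outbox entry succeed or fail as one unit.
public class OutboxDemo {
    static final List<String> stateStore = new ArrayList<>(); // stands in for the domain table
    static final List<String> outbox     = new ArrayList<>(); // stands in for the outbox table

    /** Apply a change and stage its event atomically: all or nothing. */
    public static void applyWithOutbox(String newState, String event, boolean failMidway) {
        List<String> stateSnapshot  = new ArrayList<>(stateStore);
        List<String> outboxSnapshot = new ArrayList<>(outbox);
        try {
            stateStore.add(newState);
            if (failMidway) throw new RuntimeException("crash between writes");
            outbox.add(event);
        } catch (RuntimeException e) {
            // "rollback": restore both, so state never exists without its event
            stateStore.clear(); stateStore.addAll(stateSnapshot);
            outbox.clear();     outbox.addAll(outboxSnapshot);
            throw e;
        }
    }
}
```

A crash between the two writes rolls both back, which is precisely what the dual-write approach above cannot guarantee.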


3. Event Sourcing (Special Case)

What it is:

  • The event is the state. You don’t store the final model — you persist the sequence of events that led to it.

Why it works:

  • Perfect alignment between state and events.

  • Historical replay, versioning, auditing become native features.

But:

  • Not always suitable — rebuild costs, event schema drift, and tooling limitations are real.
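The "event is the state" idea is easy to see in miniature. A sketch with a toy account balance; real event stores add snapshots, versioned schemas, and optimistic concurrency on top:

```java
import java.util.List;

// Sketch: current state is a pure fold over the event history.
public class EventSourcedAccount {
    sealed interface Event permits Deposited, Withdrawn {}
    record Deposited(long amount) implements Event {}
    record Withdrawn(long amount) implements Event {}

    /** Rebuild the balance by replaying every event from the beginning. */
    public static long replay(List<Event> history) {
        long balance = 0;
        for (Event e : history) {
            if (e instanceof Deposited d)      balance += d.amount();
            else if (e instanceof Withdrawn w) balance -= w.amount();
        }
        return balance;
    }
}
```

Auditing is just reading the list, and "what was the balance last Tuesday?" is replaying a prefix of it.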

4. Append-only Logs with Materialization

Used in high-throughput systems (e.g., order books, IoT ingestion):

  • Log all writes to a fast, immutable store (e.g., Kafka, EventStoreDB).

  • Materialize the current state asynchronously using event processors.

Advantage: Write speed, decoupling.
Tradeoff: Read-after-write consistency is sacrificed unless the reader is log-aware.


Real-World Decisions

Use Case                 | Recommended Pattern
E-commerce order system  | Transactional Outbox
Payment events           | Event Sourcing or Dual with audit
IoT ingestion            | Append-only logs + materializer
Legacy monolith split    | Dual Write (but beware)

Choosing the Right Database for the Write Path

CQRS doesn’t dictate what database you must use. It only says: pick the one that best fits the shape of your writes. And that’s where engineering rigor is either shown — or skipped.

Let’s unpack what this actually means.


Start With the Shape of the Workload

Not all writes are equal. Ask:

  1. How frequent are the writes?

    • 10/s or 10,000/s?

    • Are writes spiky (flash sales) or steady (IoT sensors)?

  2. What consistency guarantees do you need?

    • Is exactly-once required?

    • Is at-least-once tolerable?

  3. Is data mutable or append-only?

    • Orders mutate (status change).

    • Logs just grow.

  4. Do you need ACID?

    • Single-row vs multi-row vs distributed transactions.

  5. What is the write amplification cost?

    • Some systems update indexes, materialized views, constraints — all on write.

Write-Optimized DB Categories (and Their Strengths)

DB Type                            | Strengths                                   | Weaknesses
Relational (Postgres, MySQL)       | Transactions, constraints, familiar tooling | Vertical scaling, joins hurt at scale
Document DBs (Mongo, Couchbase)    | Flexible schemas, denormalized writes       | ACID limited to single doc, schema drift risks
Wide-Column (Cassandra, Scylla)    | High write throughput, predictable patterns | Poor ad-hoc querying, modeling must be done upfront
Key-Value Stores (DynamoDB, Redis) | Extremely fast, simple access paths         | No multi-key transactions, limited querying
Event Stores (EventStoreDB, Kafka) | Append-only writes, replayability           | Complex read-side modeling, harder to evolve schemas

Thinking Like an Architect

Instead of asking "which DB is fastest?" ask:

  • Can the DB maintain integrity under concurrency?

  • How does it behave when a partition occurs?

  • Is it observable under production pressure (e.g., write lag, tombstones)?

  • What’s the cost per 1000 writes under sustained load?

  • Can it support idempotent upserts, retries, and backpressure?


What to Avoid

  1. Defaulting to your favorite DB
    Just because you know Mongo doesn't mean it’s the right tool for payment mutations.

  2. Choosing based on read-side needs
    The write DB must be chosen for command handling, not analytics.

  3. Assuming eventual consistency means "it doesn’t matter"
    Eventual ≠ sloppy. You need consistency models you can reason about.


How to Choose a Write DB — 6 Real Systems, 6 Tradeoffs

🧾 Note:
These aren’t prescriptions — they’re mindset blueprints.
The right database depends on your app’s real constraints: latency, scale, consistency, and team skillsets. The examples here are to show how architects think, not what everyone should use.


Example 1: Online Retail — Orders, Payments, and Inventory

Workload Shape:

  • Moderate, bursty write traffic (flash sales, promos)

  • Strong need for consistency (order placement, stock availability)

  • Multi-entity transactions: orders, inventory, payments

  • Data is mutable: status updates, delivery tracking, cancellations

Database Chosen: PostgreSQL

Why:

  • Full ACID support for multi-table transactions

  • Strong integrity guarantees (foreign keys, constraints)

  • Can be scaled with read replicas + partitioning on large order volumes

  • Works well with event-based CDC tools for CQRS sync

What Was Rejected (and Why):

  • MongoDB: Easy to start, but handling multi-entity atomicity needs awkward denormalization or two-phase logic

  • DynamoDB: Great for high throughput, but lacks native ACID across multiple items unless you model very carefully

  • Cassandra: Too much modeling effort for something that needs relational joins internally

Notes for CQRS:

  • Write side uses Postgres to handle atomic commands (e.g., createOrder + reserveInventory + initiatePayment)

  • Read side can project to Redis or Elasticsearch for speed


Example 2: Ride-Sharing Platform — Real-Time Trip, Driver, and Location Events

Workload Shape:

  • Extremely high write throughput: location pings, trip state changes, fare estimates

  • Writes are small but frequent (e.g., every 3–5 seconds per driver)

  • Low write latency is critical — riders must see updates in near real-time

  • Reads and writes often target the same object (e.g., trip status), but read models can be async

Database Chosen: Apache Cassandra or DynamoDB

Why:

  • Optimized for high write ingestion at low latency

  • Scales horizontally — critical when tracking millions of concurrent trips

  • Tunable consistency levels — can relax reads for speed while guaranteeing writes

  • Write availability is prioritized over strict read accuracy

What Was Rejected (and Why):

  • Postgres/MySQL: Can’t handle write velocity without aggressive partitioning and connection pooling gymnastics

  • MongoDB: Better suited for semi-structured doc updates than time-series writes at this granularity

  • Elasticsearch: Good for analytics, but not built for fast primary writes

Notes for CQRS:

  • Write path uses Cassandra with partition keys tuned to trip IDs or driver IDs

  • Sync layer streams trip events to read DBs (e.g., Redis for live map updates, Elasticsearch for search)


Example 3: Gaming Server — Multiplayer Sessions, State Sync, and Leaderboards

Workload Shape:

  • Very high concurrency (thousands of players interacting live)

  • Rapid state mutations: health, ammo, position, cooldown timers

  • Requires fast reads and writes for in-game logic

  • Some data is ephemeral (e.g., session state), some needs durability (e.g., match history, leaderboards)

Database Chosen: Redis (for live state) + Postgres (for durable writes)

Why:

  • Redis offers in-memory speed for per-frame updates with predictable latency

  • Postgres handles transactional storage of completed matches, player stats, XP progression

  • This dual system splits the fast game loop from persistent storage — a practical mini-CQRS

What Was Rejected (and Why):

  • MongoDB: Good for semi-structured data, but RAM-bound working set limits live session scalability

  • Cassandra: Too eventual; game state needs tighter consistency during interactions

  • DynamoDB: Viable but can get expensive and requires careful tuning for sub-10ms latencies

Notes for CQRS:

  • Live state updates flow into Redis directly

  • Post-game events (kills, score, achievements) are synced to Postgres via event stream

  • Read model (e.g., leaderboard) is asynchronously projected into Redis or Elasticsearch


Example 4: IoT Fleet Management — Sensors, Telemetry, and Alerts

Workload Shape:

  • Devices push time-series sensor data every few seconds or minutes

  • High write frequency, low payload per write (e.g., location, battery, temp)

  • Read patterns include recent-device summaries, anomaly detection, and aggregates

  • Writes far outnumber reads, but alerts and dashboards must remain responsive

Database Chosen: TimescaleDB or InfluxDB

Why:

  • Purpose-built for time-series ingestion with efficient storage formats and rollups

  • Native support for downsampling, compression, and time-based retention policies

  • Can index on device ID and time, enabling fast recent-history lookups

  • Integrates well with Grafana and alerting pipelines

What Was Rejected (and Why):

  • Postgres/MySQL vanilla: Requires manual partitioning, indexing, and pruning

  • MongoDB: Flexible, but falls short for high-ingestion, time-series optimizations

  • Cassandra: Can ingest fast, but hard to query recent time slices efficiently

  • Redis: Too memory-bound; not sustainable for multi-TB time-series

Notes for CQRS:

  • Write path dumps device readings into TimescaleDB

  • Read model pulls from materialized aggregates (e.g., last 1h avg per region)

  • Alerting services consume from a Kafka stream for real-time reactions


Example 5: Social Media Platform — Posts, Likes, Follows, and Fanout Triggers

Write Workload Characteristics:

  • High velocity writes: user posts, comments, likes, follow/unfollow events

  • Some writes trigger large-scale fanout (e.g., one post → thousands of followers)

  • Append-mostly behavior but occasionally involves mutability (like unlikes, deletions)

  • Event order matters (e.g., a follow before a post should show the post in feed)

Database Chosen for Write Path:

  • Postgres (if you want strong consistency + relational integrity)

  • Cassandra (if you're optimizing for scale-first, especially write throughput)

Why These Work:

  • Postgres: Ideal for enforcing constraints (e.g., no double-likes) and ensuring follow graphs are correct. ACID guarantees help maintain consistency across related entities (e.g., post visibility + user status)

  • Cassandra: Handles massive write throughput with tunable consistency. Suitable for denormalized, write-once models like append-only activity logs

Why Others Were Rejected:

  • MongoDB: Subdocuments lead to bloated documents or unbounded growth (e.g., comments array)

  • DynamoDB: Requires overly careful schema planning with GSIs, LSIs, and time-based writes

  • Redis: Not sustainable as a primary store — memory-bound, asynchronous replication, weak durability guarantees

CQRS Hint:

  • Write events here often fan out to read models asynchronously, making decoupling essential.

  • The write DB’s role is to ensure integrity and durability — not to serve feeds.


Example 6: Financial Systems — Transactions, Balances, and Audit Trails

Write Workload Characteristics:

  • Every write mutates core state: balances, ledger entries, transaction logs

  • Precision is non-negotiable — no replays, no duplicates, no mismatched balances

  • Often governed by legal, compliance, or regulatory constraints

  • All changes must be traceable, timestamped, and ideally immutable (append-only)

Database Chosen for Write Path:

  • Postgres with audit extensions or double-entry schema

  • Optionally, CockroachDB or YugabyteDB for distributed ACID workloads

Why These Work:

  • Postgres: Strong ACID guarantees, transactional DDL, support for foreign keys and triggers. Native support for complex constraints, isolation levels, and stored procedures

  • CockroachDB / YugabyteDB: Scale-out, Postgres-compatible engines that retain serializability

Why Others Were Rejected:

  • MongoDB: Multi-document transaction support is recent and fragile at scale

  • Cassandra: No true ACID — would require complex compensating logic

  • DynamoDB: Difficult to enforce transactional flows, and audit trails are external

  • Redis: Snapshot-only durability by default, not even in the conversation

CQRS Hint:

  • The write system is your source of financial truth.

  • Reads for dashboards or reporting are derived later, through rigorously controlled pipelines — not via direct reads on the write DB.


Closing Thoughts

Picking a write-side database isn't about flavor-of-the-month tech. It’s about mapping your system’s write shape — the volume, structure, consistency demands, and lifecycle of each incoming event — to a storage engine that won’t choke when traffic spikes or edge cases hit.

In CQRS, the write path is the source of truth. If it leaks, lags, or locks under pressure, no read model can save you.

Every example above started with one question:

What is this system trying to persist, and what promises must it keep while doing that?

That question shapes everything else.


Up Next:

We’ve now chosen the right database to capture the truth.

But what happens when that truth needs to be read a million different ways — sliced, aggregated, ranked, or searched in milliseconds?

In the next post, we’ll walk through how teams choose the right read-side database — and why trying to “just reuse the write DB” often backfires.

Stay tuned…
