Idempotent Ledger in Rust: What the Compiler Enforces and What Go Leaves to You.

Backend developer. Expert in Rust and GoLang.

The failure scenario

At 14:03:42, Alice's app sends a £100 transfer to Bob. The server processes it, Alice's balance drops, Bob's rises. Then the network drops. The response never arrives. Alice's app retries at 14:03:45. The server processes it again. Alice has lost £200. Bob has gained £200. The ledger balances. No error was logged. Nobody knows.

This is not a theoretical edge case. It is the default behaviour of any transfer endpoint that does not explicitly defend against it. Every mobile network hiccup, every load balancer timeout, every client retry library is a trigger. The question is not whether your users will retry. The question is whether your system is ready for it.

What makes this particularly dangerous in a financial ledger is that the damage is silent. No exception is raised. No constraint is violated. The numbers add up. The bug only surfaces when Alice checks her statement, disputes the charge, and your support team starts manually reconciling entries.

Idempotency -- The Principle

An operation is idempotent if applying it multiple times produces the same result as applying it once.

Formally: f(f(x)) = f(x)

Intuitively: pressing a lift button twice does not make the lift arrive twice.

You already rely on idempotency everywhere:

  • HTTP GET is idempotent. Refreshing a page does not create a new resource.

  • Setting a value is idempotent. balance = 100 twice leaves balance = 100.

  • DELETE by ID is idempotent. Deleting something already gone is fine.

  • HTTP POST is not idempotent by default. Submitting a form twice creates two records.

The key insight: idempotency is not a property you add after the fact. It is a property you design in from the start. The mechanism is an idempotency key, a unique identifier the client generates and sends with every request. The server uses this key to detect replays and return the original result instead of re-executing.
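As a rough illustration of the mechanism, here is a minimal in-memory sketch. The `Server` type, its `debit` method, and the choice to store the resulting balance as the "response" are all hypothetical stand-ins; the real system described later stores the response in Postgres.

```rust
use std::collections::HashMap;

/// Hypothetical in-memory stand-in for a server that deduplicates by key.
pub struct Server {
    balance: i64,                    // sender's balance in minor units
    responses: HashMap<String, i64>, // idempotency key -> stored response
}

impl Server {
    pub fn new(balance: i64) -> Self {
        Self { balance, responses: HashMap::new() }
    }

    /// Debits `amount` at most once per key. A replay returns the stored
    /// response of the first execution instead of debiting again.
    pub fn debit(&mut self, key: &str, amount: i64) -> i64 {
        if let Some(&stored) = self.responses.get(key) {
            return stored; // replay: return the original result
        }
        self.balance -= amount;
        self.responses.insert(key.to_string(), self.balance);
        self.balance
    }

    pub fn balance(&self) -> i64 {
        self.balance
    }
}
```

Calling `debit("k1", 1_000)` twice on a server that starts at 10_000 leaves the balance at 9_000, and both calls return the same value. This sketch is single-threaded; the concurrent case is what the rest of the design has to handle.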


What a ledger is -- and why it raises the stakes

A ledger is an append-only record of every financial movement in a system. Every transfer produces two entries: a debit on the sender's account and a credit on the receiver's. The sum of all entries must always equal zero.

Three invariants must hold at all times:

  1. Balance integrity -- no account balance ever goes negative

  2. Ledger balance -- the sum of all entries across all accounts equals zero

  3. Transfer atomicity -- a transfer either fully completes or has zero effect

Violating any of these in a conventional application means corrupt data and manual reconciliation. This is why the ledger is the hardest place to get idempotency wrong, and the most instructive place to get it right.
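To make the second invariant concrete, here is a small sketch under simplifying assumptions: accounts are identified by a `u32`, entries live in a `Vec`, and the `Entry`, `transfer_entries`, and `ledger_is_balanced` names are invented for illustration.

```rust
/// One row of a hypothetical in-memory ledger, amount in minor units.
#[derive(Debug, Clone, Copy)]
pub struct Entry {
    pub account: u32,
    pub amount: i64, // negative = debit, positive = credit
}

/// A transfer always appends a debit and a matching credit.
pub fn transfer_entries(from: u32, to: u32, amount: i64) -> [Entry; 2] {
    [
        Entry { account: from, amount: -amount },
        Entry { account: to, amount },
    ]
}

/// Ledger balance invariant: all entries across all accounts sum to zero.
pub fn ledger_is_balanced(entries: &[Entry]) -> bool {
    entries.iter().map(|e| e.amount).sum::<i64>() == 0
}
```

A debit appended without its credit makes `ledger_is_balanced` return false, which is the kind of corruption the invariant exists to detect. Note that a duplicated transfer still sums to zero, which is why the retry bug in the opening scenario is silent.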


The Design

Schema


CREATE TABLE accounts (
    id         UUID        PRIMARY KEY DEFAULT gen_random_uuid(),
    owner      TEXT        NOT NULL,
    balance    BIGINT      NOT NULL DEFAULT 0,
    created_at TIMESTAMPTZ NOT NULL DEFAULT now(),
    CONSTRAINT balance_non_negative CHECK (balance >= 0)
);

CREATE TABLE ledger_entries (
    id          UUID        PRIMARY KEY DEFAULT gen_random_uuid(),
    account_id  UUID        NOT NULL REFERENCES accounts(id),
    amount      BIGINT      NOT NULL,
    transfer_id UUID        NOT NULL,
    created_at  TIMESTAMPTZ NOT NULL DEFAULT now()
);

-- idempotency_keys
-- The PRIMARY KEY constraint on `key` means a concurrent duplicate insert
-- gets a conflict rather than creating two transfers. ON CONFLICT DO NOTHING
-- in the application code handles this gracefully.
CREATE TABLE idempotency_keys (
    key        TEXT        PRIMARY KEY,
    response   JSONB       NOT NULL,
    created_at TIMESTAMPTZ NOT NULL DEFAULT now()
);

Every decision here is deliberate:

  • BIGINT for money -- never FLOAT or DECIMAL for storage. Floats introduce rounding errors. BIGINT stores cents. £1.00 is stored as 100. Arithmetic is exact.

  • CHECK (balance >= 0) -- the database enforces the balance invariant even if the application layer has a bug. This is the last line of defence.

  • idempotency_keys.key TEXT PRIMARY KEY -- the PRIMARY KEY constraint means a duplicate insert fails at the database level. Two concurrent requests with the same key cannot both succeed.

  • response JSONB -- the original response is stored verbatim. A replay returns exactly what the first request returned, not a recomputed result.

  • ledger_entries is append-only -- no UPDATE, no DELETE. Every financial movement is a permanent record.
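The rounding point about floats can be seen in a few lines. This is only a sketch with hypothetical helper names; it sums the same payment repeatedly as `f64` and as integer minor units.

```rust
/// Sums `n` payments of `amount` each using binary floating point.
pub fn sum_as_float(amount: f64, n: u32) -> f64 {
    (0..n).map(|_| amount).sum()
}

/// Sums `n` payments of `cents` each using integer minor units.
pub fn sum_as_cents(cents: i64, n: u32) -> i64 {
    (0..n).map(|_| cents).sum()
}
```

Three payments of 0.10 as `f64` do not compare equal to 0.30, because 0.1 has no exact binary representation. Three payments of 10 as `i64` are exactly 30.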

Failure modes

Failure | What the system does
Network drops before response arrives | Client retries -> idempotency key found -> original response returned
Server crashes mid-transaction | Postgres rolls back -> retry re-runs cleanly from scratch
Two concurrent identical requests | PRIMARY KEY constraint -> one insert succeeds, one silently ignored
Transfer exceeds balance | CHECK (balance >= 0) fires -> error returned, both balances unchanged
Wrong account ID | Foreign key constraint -> error returned before any money moves

The Implementation -- Five Phases

The system was built in five phases. Each phase adds exactly one guarantee and leaves the previous phases unchanged.

Phase 1 -- Types + schema    ->  invalid money is unrepresentable
Phase 2 -- DB layer          ->  SQL is verified at compile time
Phase 3 -- Core transfer     ->  balance never goes negative
Phase 4 -- Idempotency       ->  exactly one transfer per key
Phase 5 -- HTTP + tests      ->  proof under concurrent load

Phase 1 -- Types and schema: making invalid states unrepresentable

Before a single query is written, the type system does its first job.

pub struct Money(i64);

impl Money {
    pub fn from_cents(cents: i64) -> Result<Self, LedgerError> {
        if cents <= 0 {
            return Err(LedgerError::InvalidAmount);
        }
        Ok(Self(cents))
    }

    pub fn cents(&self) -> i64 { self.0 }
}

Money(i64) is a newtype. The inner i64 is private by default, so outside the defining module the only way to construct a Money value is through from_cents, which rejects zero and negative amounts at the boundary. A caller cannot pass a float: the compiler refuses it. A caller cannot pass -50: from_cents rejects it before the value reaches the database.
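A quick usage sketch of that boundary follows. The `money` module wrapper and the `parse_amount` helper are assumptions added for illustration (field privacy in Rust is per module, so the wrapper is what actually keeps the field out of reach), and `LedgerError` is cut down to the one variant needed here.

```rust
pub mod money {
    #[derive(Debug, PartialEq)]
    pub enum LedgerError {
        InvalidAmount,
    }

    #[derive(Debug, Clone, Copy, PartialEq)]
    pub struct Money(i64); // the field is private outside this module

    impl Money {
        pub fn from_cents(cents: i64) -> Result<Self, LedgerError> {
            if cents <= 0 {
                return Err(LedgerError::InvalidAmount);
            }
            Ok(Self(cents))
        }

        pub fn cents(&self) -> i64 {
            self.0
        }
    }
}

pub use money::{LedgerError, Money};

/// Hypothetical boundary helper: turns untrusted text into a validated Money.
pub fn parse_amount(raw: &str) -> Result<Money, LedgerError> {
    let cents = raw
        .trim()
        .parse::<i64>()
        .map_err(|_| LedgerError::InvalidAmount)?;
    // Writing `Money(cents)` here would not compile: the field is private.
    Money::from_cents(cents)
}
```

Everything downstream of `parse_amount` can take a `Money` parameter and rely on the amount being positive without re-checking it.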

pub struct AccountId(pub Uuid);
pub struct TransferId(pub Uuid);

AccountId and TransferId are both wrappers around Uuid, but they are distinct types at compile time. Passing an AccountId where a TransferId is expected is a compile error. In Go, both would typically be uuid.UUID; the distinction lives in variable names and developer discipline.
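A dependency-free sketch of the same idea, with `u128` standing in for `Uuid` and a hypothetical `ledger_row` function as the consumer:

```rust
// `u128` stands in for `Uuid` so the sketch needs no external crate.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct AccountId(pub u128);

#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct TransferId(pub u128);

/// Hypothetical consumer that needs one ID of each kind.
pub fn ledger_row(account: AccountId, transfer: TransferId) -> String {
    format!("account={} transfer={}", account.0, transfer.0)
}

// Swapping the arguments does not compile (mismatched types):
// ledger_row(TransferId(2), AccountId(1));
```

Both wrappers hold the same underlying value type, yet the compiler refuses to mix them up at the call site.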

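IdempotencyKey wraps a String and validates it in new: empty strings and keys longer than 255 bytes (String::len counts bytes, not characters) are rejected before they ever reach the database. Note that the derived Deserialize with #[serde(transparent)] fills the inner String directly without calling new, so JSON input only gets the same check if the attribute is swapped for #[serde(try_from = "String")] backed by a TryFrom impl. The type as written: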

#[derive(Debug, Deserialize, Clone)]
#[serde(transparent)]
pub struct IdempotencyKey(String);

impl IdempotencyKey {
    pub fn new(s: impl Into<String>) -> Result<Self, LedgerError> {
        let s = s.into();
        if s.is_empty() || s.len() > 255 {
            return Err(LedgerError::InvalidIdempotencyKey);
        }
        Ok(Self(s))
    }

    pub fn as_str(&self) -> &str { &self.0 }
}
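One way to keep a single validation path is to route every construction through `TryFrom<String>`. The sketch below is std-only and makes two assumptions: `LedgerError` is reduced to one variant, and it is given a `Display` impl (serde's `try_from` container attribute requires the conversion error to implement `Display`; whether the original `LedgerError` does is not shown in this article).

```rust
use std::convert::TryFrom;
use std::fmt;

#[derive(Debug, PartialEq)]
pub enum LedgerError {
    InvalidIdempotencyKey,
}

impl fmt::Display for LedgerError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            LedgerError::InvalidIdempotencyKey => write!(f, "invalid idempotency key"),
        }
    }
}

#[derive(Debug, Clone, PartialEq)]
pub struct IdempotencyKey(String);

impl IdempotencyKey {
    pub fn new(s: impl Into<String>) -> Result<Self, LedgerError> {
        let s = s.into();
        // `len` counts bytes, so the limit is 255 bytes, not 255 characters.
        if s.is_empty() || s.len() > 255 {
            return Err(LedgerError::InvalidIdempotencyKey);
        }
        Ok(Self(s))
    }

    pub fn as_str(&self) -> &str {
        &self.0
    }
}

/// Delegates to `new`, so conversions cannot bypass validation.
impl TryFrom<String> for IdempotencyKey {
    type Error = LedgerError;

    fn try_from(s: String) -> Result<Self, Self::Error> {
        Self::new(s)
    }
}
```

With this impl in place, a deserializer configured to go through `TryFrom<String>` would apply the same length check as a direct call to `new`.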

TransferRequest bundles everything needed to execute and deduplicate a transfer. transfer_id has a serde default so clients that don't supply one get a fresh Uuid generated server-side:

#[derive(Debug, Deserialize)]
pub struct TransferRequest {
    pub idempotency_key: IdempotencyKey,
    pub from_account:    AccountId,
    pub to_account:      AccountId,
    pub amount:          Money,
    #[serde(default = "default_transfer_id")]
    pub transfer_id:     TransferId,
}

TransferResult is the value that gets serialised into idempotency_keys.response and returned verbatim on replay; the client sees the same JSON whether the request is fresh or a duplicate:

#[derive(Debug, Deserialize, Serialize)]
pub struct TransferResult {
    pub from_account: AccountId,
    pub to_account:   AccountId,
    pub amount:       Money,
    pub transfer_id:  TransferId,
}

The error type is an enum:

pub enum LedgerError {
    InsufficientFunds { account_id: Uuid },
    InvalidAmount,
    InvalidIdempotencyKey,
    AccountNotFound(Uuid),
    TransferNotFound(Uuid),
    Database(sqlx::Error),
    Serialization(serde_json::Error),
}

Every match on LedgerError must be exhaustive. Add a new variant without handling it everywhere (and without a wildcard arm to hide it) and the code will not compile. In Go, a new error condition is a new sentinel value -- every if err == ErrX check that does not know about it silently falls through.
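A small sketch of what exhaustiveness buys. The enum below drops the `Database` and `Serialization` variants and uses `u128` in place of `Uuid` so it compiles without external crates. The `status_code` function and the specific status numbers are assumptions for illustration, not the article's mapping.

```rust
#[derive(Debug)]
pub enum LedgerError {
    InsufficientFunds { account_id: u128 },
    InvalidAmount,
    InvalidIdempotencyKey,
    AccountNotFound(u128),
    TransferNotFound(u128),
}

/// Hypothetical mapping from domain error to HTTP status code.
/// There is deliberately no `_ =>` arm: adding a variant to the enum
/// turns this function into a compile error until it is handled.
pub fn status_code(err: &LedgerError) -> u16 {
    match err {
        LedgerError::InvalidAmount | LedgerError::InvalidIdempotencyKey => 400,
        LedgerError::AccountNotFound(_) | LedgerError::TransferNotFound(_) => 404,
        LedgerError::InsufficientFunds { .. } => 422,
    }
}
```

The design choice that matters is the missing wildcard arm. A `_ => 500` fallback would compile after a new variant is added and would reintroduce the silent fall-through.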


Phase 2 -- Database layer: compile-time SQL verification

sqlx provides the query! macro, which validates SQL against a live database at compile time. If a column name is wrong or a parameter type does not match what Postgres expects, the code does not compile:

let row = sqlx::query!(
    "SELECT balance FROM accounts WHERE id = $1",
    account_id
)
.fetch_optional(&self.pool)
.await?;

For offline builds (CI, Docker), cargo sqlx prepare captures the query metadata into a .sqlx directory. SQLX_OFFLINE=true tells the compiler to use that cache. The CI pipeline verifies the cache is current with cargo sqlx prepare --check.

Transactions use ownership to enforce rollback. If commit() is never called, because an error was returned early or the ? operator short-circuited, tx is dropped and sqlx rolls back automatically. There is no defer tx.Rollback() to forget.

pub async fn with_transaction<'a, F, Fut, T>(&self, f: F) -> Result<T, LedgerError>
where
    F: FnOnce(Transaction<'a, Postgres>) -> Fut,
    Fut: Future<Output = Result<(T, Transaction<'a, Postgres>), LedgerError>>,
{
    let tx = self.pool.begin().await?;
    let (result, tx) = f(tx).await?;  // if f returns Err, tx is dropped -> rollback
    tx.commit().await?;
    Ok(result)
}

The FnOnce bound means the closure is consumed when it is called, so it can run at most once. The compiler rejects a second call.
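The rollback-on-drop behaviour can be imitated with a plain `Drop` impl. This is a std-only sketch of the pattern, not sqlx's implementation: the `Tx` type, the `Log` alias, and the `run` function are hypothetical, and "rollback" is just a string pushed to a log.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Shared event log so the effect of dropping a transaction is observable.
pub type Log = Rc<RefCell<Vec<&'static str>>>;

/// Hypothetical transaction guard: rolls back on drop unless committed.
pub struct Tx {
    log: Log,
    committed: bool,
}

impl Tx {
    pub fn begin(log: Log) -> Self {
        log.borrow_mut().push("begin");
        Self { log, committed: false }
    }

    /// Takes `self` by value, so a committed Tx cannot be used again.
    pub fn commit(mut self) {
        self.committed = true;
        self.log.borrow_mut().push("commit");
    }
}

impl Drop for Tx {
    fn drop(&mut self) {
        if !self.committed {
            self.log.borrow_mut().push("rollback");
        }
    }
}

/// An early return drops `tx` without commit, which triggers the rollback.
pub fn run(log: Log, fail: bool) -> Result<(), String> {
    let tx = Tx::begin(log);
    if fail {
        return Err("insufficient funds".to_string());
    }
    tx.commit();
    Ok(())
}
```

The failing path never mentions rollback, yet the log still records one. That is the property the text contrasts with a forgotten defer in Go.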

The four DB functions the transfer layer calls are lock_accounts, apply_entry, cache_result, and get_cached_result. Each runs inside the same transaction that with_transaction manages.

lock_accounts selects both account rows with FOR UPDATE in UUID order. Sorting before locking is the one line that prevents deadlocks when two concurrent transfers touch the same accounts in opposite directions:

pub async fn lock_accounts(
    tx: &mut Transaction<'_, Postgres>,
    a: Uuid,
    b: Uuid,
) -> Result<(), LedgerError> {
    let (first, second) = if a < b { (a, b) } else { (b, a) };

    sqlx::query!(
        "SELECT id FROM accounts WHERE id = ANY($1) ORDER BY id FOR UPDATE",
        &[first, second] as &[Uuid]
    )
    .fetch_all(&mut **tx)
    .await?;

    Ok(())
}

apply_entry does two things inside the transaction: it increments (or decrements) accounts.balance and inserts a row into ledger_entries. The balance CHECK constraint fires inside the UPDATE. The application does not check the balance beforehand; it lets the database reject the write and maps error code 23514 (check_violation) to LedgerError::InsufficientFunds:

pub async fn apply_entry(
    tx: &mut Transaction<'_, Postgres>,
    account_id: Uuid,
    transfer_id: Uuid,
    amount: i64,      // negative for debit, positive for credit
) -> Result<(), LedgerError> {
    sqlx::query!(
        r#"
        UPDATE accounts
        SET    balance = balance + $1
        WHERE  id = $2
        "#,
        amount,
        account_id
    )
    .execute(&mut **tx)
    .await
    .map_err(|e| {
        if let sqlx::Error::Database(ref db_err) = e {
            if db_err.code().as_deref() == Some("23514") {
                return LedgerError::InsufficientFunds { account_id };
            }
        }
        LedgerError::Database(e)
    })?;

    sqlx::query!(
        "INSERT INTO ledger_entries (account_id, amount, transfer_id) VALUES ($1, $2, $3)",
        account_id, amount, transfer_id
    )
    .execute(&mut **tx)
    .await?;

    Ok(())
}

cache_result writes the TransferResult into idempotency_keys as JSONB inside the same transaction. ON CONFLICT DO NOTHING means a replay that races with an in-flight first write is safe because the row is only ever written once:

pub async fn cache_result(
    tx: &mut Transaction<'_, Postgres>,
    key: &IdempotencyKey,
    result: &TransferResult,
) -> Result<(), LedgerError> {
    let response = serde_json::to_value(result)?;

    sqlx::query!(
        "INSERT INTO idempotency_keys (key, response) VALUES ($1, $2) ON CONFLICT (key) DO NOTHING",
        key.as_str(),
        response
    )
    .execute(&mut **tx)
    .await?;

    Ok(())
}

get_cached_result looks up an existing key inside the transfer's transaction (Phase 4 calls it after the account rows are locked). A Some return means the request is a replay: the stored JSONB is deserialised back into a TransferResult and returned immediately without touching the ledger:

pub async fn get_cached_result(
    tx: &mut Transaction<'_, Postgres>,
    key: &IdempotencyKey,
) -> Result<Option<TransferResult>, LedgerError> {
    let row = sqlx::query!(
        "SELECT response FROM idempotency_keys WHERE key = $1",
        key.as_str()
    )
    .fetch_optional(&mut **tx)
    .await?;

    match row {
        None => Ok(None),
        Some(r) => {
            let result: TransferResult = serde_json::from_value(r.response)?;
            Ok(Some(result))
        }
    }
}

Phase 3 -- Core transfer: atomicity and the balance invariant

The transfer does four things inside a single transaction:

1. Lock both accounts in UUID order     (prevents deadlocks)
2. Insert debit entry  (amount negative)  (sender loses money)
3. Insert credit entry (amount positive)  (receiver gains money)
4. Commit                               (all or nothing)

Lock ordering is the detail most implementations get wrong. If two concurrent transfers touch the same two accounts in opposite directions, they deadlock. The fix: always acquire locks in the same order, regardless of transfer direction.

pub async fn lock_accounts(
    tx: &mut Transaction<'_, Postgres>,
    a: Uuid,
    b: Uuid,
) -> Result<(), LedgerError> {
    let (first, second) = if a < b { (a, b) } else { (b, a) };

    sqlx::query!(
        "SELECT id FROM accounts WHERE id = ANY($1) ORDER BY id FOR UPDATE",
        &[first, second] as &[Uuid]
    )
    .fetch_all(&mut **tx)
    .await?;

    Ok(())
}

Both rows are locked in a single query, ordered by UUID. Any concurrent transfer involving the same accounts will wait at the lock rather than deadlock.
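The same ordering rule applies to any pair of locks. Here is a std-only analogue using two mutexes in place of row locks; the `Account` struct, `transfer`, and `run_opposing_transfers` are hypothetical names, and a `u32` id stands in for the UUID.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

pub struct Account {
    pub id: u32,
    pub balance: Mutex<i64>,
}

/// Moves `amount` between accounts, always locking the lower id first.
pub fn transfer(from: &Account, to: &Account, amount: i64) {
    if from.id == to.id {
        return; // locking the same mutex twice would self-deadlock
    }
    let from_first = from.id < to.id;
    let (first, second) = if from_first { (from, to) } else { (to, from) };
    let mut first_guard = first.balance.lock().unwrap();
    let mut second_guard = second.balance.lock().unwrap();
    if from_first {
        *first_guard -= amount;
        *second_guard += amount;
    } else {
        *second_guard -= amount;
        *first_guard += amount;
    }
}

/// Two threads transfer in opposite directions between the same accounts.
pub fn run_opposing_transfers(rounds: u32) -> (i64, i64) {
    let a = Arc::new(Account { id: 1, balance: Mutex::new(10_000) });
    let b = Arc::new(Account { id: 2, balance: Mutex::new(10_000) });
    let (a1, b1) = (Arc::clone(&a), Arc::clone(&b));
    let (a2, b2) = (Arc::clone(&a), Arc::clone(&b));
    let t1 = thread::spawn(move || for _ in 0..rounds { transfer(&a1, &b1, 1) });
    let t2 = thread::spawn(move || for _ in 0..rounds { transfer(&b2, &a2, 1) });
    t1.join().unwrap();
    t2.join().unwrap();
    let final_a = *a.balance.lock().unwrap();
    let final_b = *b.balance.lock().unwrap();
    (final_a, final_b)
}
```

If each thread instead locked `from` first and `to` second, the two threads could each hold one mutex while waiting for the other. With a fixed order, one thread simply waits.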

The balance invariant is enforced by the CHECK constraint, not application code. The application attempts the update and lets the database reject it:

.map_err(|e| {
    if let sqlx::Error::Database(ref db_err) = e {
        if db_err.code().as_deref() == Some("23514") {
            return LedgerError::InsufficientFunds { account_id };
        }
    }
    LedgerError::Database(e)
})?;

Error code 23514 is a Postgres check constraint violation. The application does not check balances before attempting the update; it attempts the update and maps the constraint violation to a domain error. This eliminates a check-then-act race where the balance could change between the check and the update.
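The "attempt the write, let the constraint reject it" shape can be sketched in memory. This is an analogy only: a `HashMap` stands in for the accounts table, cloning the map stands in for the transaction, and the cut-down `LedgerError` and both functions are assumptions made for the example.

```rust
use std::collections::HashMap;

#[derive(Debug, PartialEq)]
pub enum LedgerError {
    InsufficientFunds { account_id: u32 },
    AccountNotFound(u32),
}

/// Emulates `UPDATE ... SET balance = balance + $1` under `CHECK (balance >= 0)`:
/// the write itself is what fails, there is no separate balance check up front.
fn apply_entry(
    balances: &mut HashMap<u32, i64>,
    account_id: u32,
    amount: i64,
) -> Result<(), LedgerError> {
    let balance = balances
        .get_mut(&account_id)
        .ok_or(LedgerError::AccountNotFound(account_id))?;
    let updated = *balance + amount;
    if updated < 0 {
        return Err(LedgerError::InsufficientFunds { account_id });
    }
    *balance = updated;
    Ok(())
}

/// Applies debit and credit to a staged copy and only swaps it in when
/// both succeed, so a failure leaves every balance untouched.
pub fn transfer(
    balances: &mut HashMap<u32, i64>,
    from: u32,
    to: u32,
    amount: i64,
) -> Result<(), LedgerError> {
    let mut staged = balances.clone(); // the "transaction"
    apply_entry(&mut staged, from, -amount)?; // on Err the copy is discarded
    apply_entry(&mut staged, to, amount)?;
    *balances = staged; // the "commit"
    Ok(())
}
```

A transfer to a missing account fails after the debit was staged, and the sender's balance is still unchanged afterwards, which mirrors the rollback behaviour described above.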


Phase 4 -- Idempotency: exactly one transfer per key

The naive implementation has a race condition:

Thread A: check key -> not found
Thread B: check key -> not found     <- both pass the guard
Thread A: execute transfer
Thread B: execute transfer          <- duplicate
Thread A: store key
Thread B: store key

Moving the key check inside the transaction, after locking, closes the race:

pub async fn transfer(&self, req: TransferRequest) -> Result<TransferResult, LedgerError> {
    self.db
        .with_transaction(|mut tx| async {
            // lock accounts FIRST -- concurrent requests now wait here
            Db::lock_accounts(&mut tx, req.from_account.0, req.to_account.0).await?;

            // check idempotency key INSIDE the transaction, AFTER locking
            if let Some(cached) = Db::get_cached_result(&mut tx, &req.idempotency_key).await? {
                return Ok((cached, tx));  // replay -- return without executing
            }

            // new request -- execute and store atomically
            Db::apply_entry(&mut tx, req.from_account.0, req.transfer_id.0, -req.amount.cents()).await?;
            Db::apply_entry(&mut tx, req.to_account.0,   req.transfer_id.0,  req.amount.cents()).await?;

            let result = TransferResult { /* ... */ };

            Db::cache_result(&mut tx, &req.idempotency_key, &result).await?;

            Ok((result, tx))
        })
        .await
}

Why does this work under READ COMMITTED (Postgres default)?

Thread B blocks on the FOR UPDATE lock until Thread A commits. Under READ COMMITTED every statement runs against a fresh snapshot, so the idempotency key lookup that Thread B issues after acquiring the lock sees the key Thread A committed and returns the cached result. No duplicate transfer.

The cache_result function uses ON CONFLICT DO NOTHING as a secondary safety net:

sqlx::query!(
    "INSERT INTO idempotency_keys (key, response)
     VALUES ($1, $2)
     ON CONFLICT (key) DO NOTHING",
    key.as_str(),
    response
)
.execute(&mut **tx)
.await?;

If two requests somehow both reach the insert (which the lock prevents, but defence in depth), only one succeeds. The PRIMARY KEY constraint is the final word.
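The effect of checking the key only after taking the lock can be reproduced with threads and a mutex. This is an in-memory analogy, not the Postgres behaviour itself: the `State` struct, `transfer`, and `run_concurrent` are hypothetical, and the mutex stands in for the row lock.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};
use std::thread;

#[derive(Default)]
pub struct State {
    pub balance: i64,
    pub executed: u32,               // how many transfers really ran
    pub cache: HashMap<String, u64>, // key -> transfer id of the first run
}

/// Lock first, then check the key, then execute and store, all under one lock.
pub fn transfer(state: &Mutex<State>, key: &str, transfer_id: u64, amount: i64) -> u64 {
    let mut s = state.lock().unwrap(); // concurrent callers wait here
    if let Some(&cached) = s.cache.get(key) {
        return cached; // replay: return the first result, execute nothing
    }
    s.balance -= amount;
    s.executed += 1;
    s.cache.insert(key.to_string(), transfer_id);
    transfer_id
}

/// Fires `n` threads with the same key and a different candidate id each.
pub fn run_concurrent(n: u64) -> (Vec<u64>, i64, u32) {
    let state = Arc::new(Mutex::new(State { balance: 10_000, ..Default::default() }));
    let handles: Vec<_> = (0..n)
        .map(|i| {
            let state = Arc::clone(&state);
            thread::spawn(move || transfer(&state, "same-key", i, 1_000))
        })
        .collect();
    let ids: Vec<u64> = handles.into_iter().map(|h| h.join().unwrap()).collect();
    let s = state.lock().unwrap();
    (ids, s.balance, s.executed)
}
```

Whichever thread wins the lock decides the transfer id, and every other thread returns that same id. Moving the cache lookup above the `lock()` call would recreate the race shown in the Thread A and Thread B trace.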

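The full request flow:

begin transaction -> lock both accounts (UUID order) -> look up idempotency key
  key found     -> return cached result -> commit
  key not found -> debit sender -> credit receiver -> store result under key -> commit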


Phase 5 -- HTTP layer and tests

The HTTP layer uses axum's extractor pattern. Types drive what gets pulled from the request:

async fn transfer_handler(
    State(service): State<Arc<LedgerService>>,
    Json(req): Json<TransferRequest>,
) -> Result<Json<TransferResult>, LedgerError> {
    let result = service.transfer(req).await?;
    Ok(Json(result))
}

LedgerError implements IntoResponse, so the handler can propagate domain errors with ? and axum turns the returned Err into an HTTP response with the matching status code. No error-mapping middleware to maintain.


Proving it works -- three adversarial tests

The tests do not test the happy path. They test the invariants.

Test 1 -- Concurrent same key produces exactly one transfer

#[tokio::test]
async fn same_key_concurrent_produces_one_transfer() {
    let (ledger, alice, bob) = setup().await;
    let key = IdempotencyKey::new(format!("test-{}", Uuid::new_v4())).unwrap();

    let mut set = JoinSet::new();
    for _ in 0..10 {
        let ledger = ledger.clone();
        let key = key.clone();
        set.spawn(async move {
            ledger.transfer(TransferRequest {
                idempotency_key: key,
                from_account: alice,
                to_account: bob,
                amount: Money::from_cents(1000).unwrap(),
                transfer_id: TransferId(Uuid::new_v4()),
            }).await
        });
    }

    let mut results = Vec::new();
    while let Some(r) = set.join_next().await {
        results.push(r);
    }

    // all 10 calls must return the same transfer_id
    let ids: HashSet<_> = results.iter()
        .filter_map(|r| r.as_ref().ok())
        .filter_map(|r| r.as_ref().ok())
        .map(|t| t.transfer_id)
        .collect();

    assert_eq!(ids.len(), 1, "expected exactly one unique transfer_id");

    // alice lost exactly 1000 cents -- not 10 000
    let balance = ledger.get_balance(alice).await.unwrap();
    assert_eq!(balance, 9000);
}

Ten concurrent tasks fire the same idempotency key at the same time. The HashSet collapsing to a single element is the first proof: every successful response carries the same transfer_id. The balance assertion is the second proof: exactly one debit occurred.

If the idempotency check were outside the transaction, all ten would pass the guard simultaneously and all ten would debit Alice. The balance would be 0. The HashSet would contain ten distinct IDs. The test would fail loudly.

Test 2 -- Insufficient funds leaves balances unchanged

#[tokio::test]
async fn transfer_exceeding_balance_is_rejected() {
    let (ledger, alice, bob) = setup().await;

    let result = ledger.transfer(TransferRequest {
        amount: Money::from_cents(99_999).unwrap(), // more than 10_000
        // ...
    }).await;

    assert!(matches!(result, Err(LedgerError::InsufficientFunds { .. })));

    let balance = ledger.get_balance(alice).await.unwrap();
    assert_eq!(balance, 10_000);  // unchanged
}

This test proves atomicity. If the debit entry were written before the constraint check, even partially, Alice's balance would change despite the error. It does not.

Test 3 -- Sequential transfers maintain the balance invariant

#[tokio::test]
async fn sequential_transfers_maintain_balance_invariant() {
    let (ledger, alice, bob) = setup().await;

    for i in 0..5 {
        ledger.transfer(TransferRequest {
            idempotency_key: IdempotencyKey::new(format!("seq-{}", i)).unwrap(),
            amount: Money::from_cents(1000).unwrap(),
            // ...
        }).await.unwrap();
    }

    let alice_balance = ledger.get_balance(alice).await.unwrap();
    let bob_balance   = ledger.get_balance(bob).await.unwrap();

    assert_eq!(alice_balance, 5_000);
    assert_eq!(bob_balance,   5_000);
    assert_eq!(alice_balance + bob_balance, 10_000);  // money is conserved
}

Money is conserved. 10_000 cents entered the system. After five transfers, 10_000 cents remain -- distributed differently, but not created or destroyed.

Running the tests

docker compose up -d postgres
sqlx migrate run
cargo test -- --nocapture
test transfer_exceeding_balance_is_rejected          ... ok
test sequential_transfers_maintain_balance_invariant ... ok
test same_key_concurrent_produces_one_transfer       ... ok

test result: ok. 3 passed; 0 failed; 0 ignored; 0 measured; finished in 0.78s

The Go comparison

Both languages can implement this system correctly. They reach correctness differently.

Guarantee | Rust mechanism | Go mechanism
Money type safety | Money(i64) newtype, private field | type Money int64 -- convention only
Invalid amount rejected | from_cents returns Result -- compiler enforces handling | NewMoney returns error -- caller can ignore it
Wrong ID type | AccountId vs TransferId -- distinct types, compile error | Both uuid.UUID -- variable name discipline
Error handling | Result<T, E> with ? -- cannot be silently ignored | error return value -- if err != nil can be omitted
Exhaustive errors | enum LedgerError -- unhandled variant = compile error | Sentinel errors -- new value silently falls through
Transaction rollback | tx dropped on error -- automatic | defer tx.Rollback() -- must be written, easy to forget
SQL correctness | sqlx::query! -- compile-time verification | database/sql -- runtime errors

What Rust pushes to compile time

These are runtime bugs in Go that Rust catches at compile time or at the type boundary:

  • Passing a zero or negative amount as Money -- from_cents rejects it at construction, so no invalid Money value exists downstream

  • Passing an AccountId where a TransferId is expected -- distinct types

  • Ignoring the Result returned by a database operation -- Result is #[must_use], so the compiler warns, and ? makes propagation explicit

  • Adding a new LedgerError variant without handling it in every match -- exhaustive enums

What Go does better

Go's HTTP handler signatures are simpler. func(w http.ResponseWriter, r *http.Request) requires no framework knowledge. Axum's extractor pattern is powerful but takes time to learn. Go also compiles faster, produces smaller binaries, and is easier to pick up overall.

When to choose each

Choose Rust when the cost of a runtime bug is high and the team can invest in the type system, e.g. financial systems, infrastructure tooling, anything where correctness is non-negotiable. Choose Go when development velocity matters more, or when the domain is well-understood and the error surface is small. The correctness guarantees Rust provides are real, but they come at a cost: slower compile times, a steeper learning curve, and more upfront design work.


What I learned

The most surprising thing was not the Rust borrow checker; I expected that to be the hard part. The hard part was a single SQL clause: ON CONFLICT DO NOTHING.

The first implementation checked for the idempotency key before opening the transaction, then stored it after. It looked correct. It passed every sequential test. It failed the concurrent test -- ten tasks with the same key would all pass the guard simultaneously, all execute the transfer, and Alice would lose £100 instead of £10. The fix was moving the key check inside the transaction, after acquiring the FOR UPDATE lock. The waiting transaction takes a fresh snapshot after the lock, sees the committed key, and returns the cached result.

The lesson: when a constraint must hold across concurrent operations, the database is the right place to enforce it. Application-level checks are subject to race conditions. The lock and the constraint together are not.


Where to go next


Part of a series on building backend systems in Rust and Go.
