Building an Idempotent Ledger in Go

The failure scenario
At 14:03:42, Alice's app sends a £100 transfer to Bob. The server processes it, Alice's balance drops, Bob's rises. Then the network drops. The response never arrives. Alice's app retries at 14:03:45. The server processes it again. Alice has lost £200. Bob has £200. The ledger balances. No error was logged. Nobody knows.
This is not a theoretical edge case. It is the default behaviour of any transfer endpoint that does not explicitly defend against it. Every mobile network hiccup, every load balancer timeout, every client retry library is a trigger. The question is not whether your users will retry -- they will. The question is whether your system is ready for it.
What makes this particularly dangerous in a financial ledger is that the damage is silent. No exception is raised. No constraint is violated. The numbers add up. The bug only surfaces when Alice checks her statement, disputes the charge, and your support team starts manually reconciling entries.
Idempotency -- the principle
An operation is idempotent if applying it multiple times produces the same result as applying it once.
Formally: f(f(x)) = f(x)
Intuitively: pressing a lift button twice does not make the lift arrive twice.
You already rely on idempotency everywhere:
HTTP GET is idempotent. Refreshing a page does not create a new resource.
Setting a value is idempotent. balance = 100 twice leaves balance = 100.
DELETE by ID is idempotent. Deleting something already gone is fine.
HTTP POST is not idempotent by default. Submitting a form twice creates two records.
The key insight: idempotency is not a property you add after the fact. It is a property you design in from the start. The mechanism is an idempotency key -- a unique identifier the client generates and sends with every request. The server uses this key to detect replays and return the original result instead of re-executing.
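Before any database is involved, the mechanism can be sketched in memory. This is a minimal illustration, not the ledger's code: `replayStore`, `Do`, and `newKey` are hypothetical names, a map stands in for the `idempotency_keys` table, and concurrency is deliberately ignored here because the later phases deal with it.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
)

// replayStore remembers the result of each operation by idempotency key.
// It is a hypothetical stand-in for a database table and is not safe for
// concurrent use.
type replayStore struct {
	results map[string]string
}

func newReplayStore() *replayStore {
	return &replayStore{results: make(map[string]string)}
}

// Do runs op the first time a key is seen and stores its result.
// A later call with the same key returns the stored result without running op.
func (s *replayStore) Do(key string, op func() string) string {
	if res, ok := s.results[key]; ok {
		return res
	}
	res := op()
	s.results[key] = res
	return res
}

// newKey generates a random 128-bit key, hex encoded, as a client might.
func newKey() (string, error) {
	b := make([]byte, 16)
	if _, err := rand.Read(b); err != nil {
		return "", err
	}
	return hex.EncodeToString(b), nil
}

func main() {
	store := newReplayStore()
	key, err := newKey()
	if err != nil {
		panic(err)
	}

	executions := 0
	transfer := func() string {
		executions++
		return fmt.Sprintf("transfer #%d completed", executions)
	}

	first := store.Do(key, transfer)
	retry := store.Do(key, transfer) // same key: replayed, not re-executed

	fmt.Println(first == retry, executions)
}
```

The client generates the key once per logical request and reuses it on every retry; a new key means a new transfer.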
What a ledger is -- and why it raises the stakes
A ledger is an append-only record of every financial movement in a system. Every transfer produces two entries: a debit on the sender's account and a credit on the receiver's. The sum of all entries must always equal zero.
Three invariants must hold at all times:
Balance integrity -- no account balance ever goes negative
Ledger balance -- the sum of all entries across all accounts equals zero
Transfer atomicity -- a transfer either fully completes or has zero effect
Violating any of these means corrupt data and manual reconciliation. In a payment system it means regulatory exposure. This is why the ledger is the hardest place to get idempotency wrong, and the most instructive place to get it right.
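The double-entry rule is easy to state in code. The sketch below is an in-memory illustration only; `entry`, `transferEntries`, and `ledgerSum` are hypothetical names, and a slice stands in for the ledger table.

```go
package main

import "fmt"

// entry mirrors one ledger row: a signed amount in cents on one account.
type entry struct {
	account string
	amount  int64
}

// transferEntries returns the debit and credit produced by one transfer.
func transferEntries(from, to string, cents int64) []entry {
	return []entry{
		{account: from, amount: -cents}, // debit the sender
		{account: to, amount: cents},    // credit the receiver
	}
}

// ledgerSum adds up every entry; a consistent double-entry ledger sums to zero.
func ledgerSum(entries []entry) int64 {
	var sum int64
	for _, e := range entries {
		sum += e.amount
	}
	return sum
}

func main() {
	var ledger []entry
	ledger = append(ledger, transferEntries("alice", "bob", 10_000)...)
	ledger = append(ledger, transferEntries("bob", "carol", 2_500)...)

	fmt.Println(ledgerSum(ledger))
}
```

A duplicated transfer still sums to zero, which is exactly why the double-debit in the opening scenario goes unnoticed: this invariant alone cannot detect replays.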
The design
Schema
CREATE TABLE accounts (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
owner TEXT NOT NULL,
balance BIGINT NOT NULL DEFAULT 0,
created_at TIMESTAMPTZ NOT NULL DEFAULT now(),
CONSTRAINT balance_non_negative CHECK (balance >= 0)
);
CREATE TABLE ledger_entries (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
account_id UUID NOT NULL REFERENCES accounts(id),
amount BIGINT NOT NULL,
transfer_id UUID NOT NULL,
created_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
CREATE TABLE idempotency_keys (
key TEXT PRIMARY KEY,
response JSONB NOT NULL,
created_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
Every decision here is deliberate:
BIGINT for money -- never FLOAT or DECIMAL for storage. Floats introduce rounding errors. BIGINT stores cents. £1.00 is stored as 100. Arithmetic is exact.
CHECK (balance >= 0) -- the database enforces the balance invariant even if the application layer has a bug. This is the last line of defence.
idempotency_keys.key TEXT PRIMARY KEY -- the PRIMARY KEY constraint means a duplicate insert fails at the database level. Two concurrent requests with the same key cannot both succeed.
response JSONB -- the original response is stored verbatim. A replay returns exactly what the first request returned, not a recomputed result.
ledger_entries is append-only -- no UPDATE, no DELETE. Every financial movement is a permanent record.
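The rounding problem behind the BIGINT choice is easy to reproduce. The helper names below are hypothetical and exist only for this illustration.

```go
package main

import "fmt"

// floatsAddUp reports whether a+b equals want using float64 arithmetic.
func floatsAddUp(a, b, want float64) bool { return a+b == want }

// centsAddUp reports whether a+b equals want using integer cents.
func centsAddUp(a, b, want int64) bool { return a+b == want }

func main() {
	// 0.10 and 0.20 have no exact binary representation, so their
	// float64 sum is not the same value as the float64 nearest to 0.30.
	fmt.Println(floatsAddUp(0.10, 0.20, 0.30))

	// The same amounts as integer cents add exactly.
	fmt.Println(centsAddUp(10, 20, 30))
}
```

The arguments are passed through function parameters on purpose: Go evaluates constant expressions such as `0.1 + 0.2 == 0.3` with exact arithmetic at compile time, which would hide the problem.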
Failure modes
| Failure | What the system does |
|---|---|
| Network drops before response arrives | Client retries -> idempotency key found -> original response returned |
| Server crashes mid-transaction | Postgres rolls back -> retry re-runs cleanly from scratch |
| Two concurrent identical requests | PRIMARY KEY constraint -> one insert succeeds, one silently ignored |
| Transfer exceeds balance | CHECK (balance >= 0) fires -> error returned, both balances unchanged |
| Wrong account ID | Foreign key constraint -> error returned before any money moves |
The implementation -- five phases
The system is built in five phases. Each phase adds exactly one guarantee and leaves the previous phases unchanged.
Phase 1 -- Types + schema -> invalid money is unrepresentable
Phase 2 -- DB layer -> balance constraint enforced at the database level
Phase 3 -- Core transfer -> balance never goes negative
Phase 4 -- Idempotency -> exactly one transfer per key
Phase 5 -- HTTP + tests -> proof under concurrent load
Phase 1 -- Types and schema: making invalid states hard to construct
Before a single query is written, the type system does what it can.
type Money struct {
cents int64
}
func FromCents(cents int64) (Money, error) {
if cents <= 0 {
return Money{}, fmt.Errorf("amount must be a positive number of cents")
}
return Money{cents: cents}, nil
}
func (m Money) Cents() int64 { return m.cents }
Money is a struct with a private cents field. Outside its package, the only way to construct a non-zero Money is through FromCents, which rejects zero and negative amounts at the boundary. A caller cannot pass a float64: the type mismatch is caught at compile time. A caller cannot pass -50: FromCents rejects it before the value reaches the database.
type AccountID uuid.UUID
type TransferID uuid.UUID
AccountID and TransferID are distinct named types. Passing one where the other is expected is a compile error. They are both backed by uuid.UUID, but the compiler treats them as separate types. This is not as strong as Rust's newtype pattern: an explicit conversion such as AccountID(someTransferID) still compiles.
IdempotencyKey validates length at construction:
type IdempotencyKey struct {
value string
}
func NewIdempotencyKey(s string) (IdempotencyKey, error) {
if len(s) == 0 || len(s) > 255 {
return IdempotencyKey{}, fmt.Errorf("idempotency key must be 1 to 255 characters")
}
return IdempotencyKey{value: s}, nil
}
An empty or oversized key cannot be constructed. The HTTP handler calls NewIdempotencyKey before any database interaction, so the validation happens at the boundary.
The error type:
type LedgerError struct {
kind ledgerErrorKind
message string
}
type ledgerErrorKind int
const (
kindInsufficientFunds ledgerErrorKind = iota
kindDuplicateKey
kindInvalidAmount
kindInvalidIdempotencyKey
kindAccountNotFound
kindTransferNotFound
kindDatabase
kindSerialization
)
Callers use Is* helpers to distinguish variants without a type switch:
func IsInsufficientFunds(err error) bool { return isKind(err, kindInsufficientFunds) }
func IsAccountNotFound(err error) bool { return isKind(err, kindAccountNotFound) }
And HTTPStatus maps each variant to the correct status code in one place:
func HTTPStatus(err error) int {
var le *LedgerError
if !errors.As(err, &le) {
return http.StatusInternalServerError
}
switch le.kind {
case kindInsufficientFunds:
return http.StatusUnprocessableEntity
case kindInvalidAmount, kindInvalidIdempotencyKey:
return http.StatusBadRequest
case kindAccountNotFound, kindTransferNotFound:
return http.StatusNotFound
default:
return http.StatusInternalServerError
}
}
No handler decides its own status code. The error carries the information; HTTPStatus reads it.
Phase 2 -- Database layer: queries and the balance invariant
LockAccounts acquires row locks on both accounts inside the current transaction. Accounts are always locked in ascending UUID order:
func LockAccounts(ctx context.Context, tx pgx.Tx, a, b uuid.UUID) error {
first, second := a, b
if a.String() > b.String() {
first, second = b, a
}
rows, err := tx.Query(ctx,
"SELECT id FROM accounts WHERE id = ANY($1) ORDER BY id FOR UPDATE",
[]uuid.UUID{first, second},
)
if err != nil {
return ledgererrors.Database(err)
}
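rows.Close()
// Close drains the result set; any error hit while reading surfaces via Err.
if err := rows.Err(); err != nil {
return ledgererrors.Database(err)
}
return nil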
}
Both rows are locked in a single query, ordered by UUID. Any concurrent transfer involving the same accounts waits at the lock rather than deadlocking.
ApplyEntry updates the balance and inserts a ledger record atomically within the transaction:
func ApplyEntry(ctx context.Context, tx pgx.Tx, accountID, transferID uuid.UUID, amount int64) error {
_, err := tx.Exec(ctx,
"UPDATE accounts SET balance = balance + $1 WHERE id = $2",
amount, accountID,
)
if err != nil {
if isCheckViolation(err) {
return ledgererrors.InsufficientFunds(accountID)
}
return ledgererrors.Database(err)
}
_, err = tx.Exec(ctx,
"INSERT INTO ledger_entries (account_id, amount, transfer_id) VALUES ($1, $2, $3)",
accountID, amount, transferID,
)
if err != nil {
return ledgererrors.Database(err)
}
return nil
}
The application does not check the balance before attempting the update. It attempts the update and maps the constraint violation to a domain error. This eliminates a check-then-act race where the balance could change between the check and the update.
Error code 23514 is a Postgres check constraint violation:
func isCheckViolation(err error) bool {
var pgErr *pgconn.PgError
return errors.As(err, &pgErr) && pgErr.Code == "23514"
}
Phase 3 -- Core transfer: atomicity and the balance invariant
The transfer does four things inside a single transaction:
1. Lock both accounts in UUID order (prevents deadlocks)
2. Insert debit entry (amount negative) (sender loses money)
3. Insert credit entry (amount positive) (receiver gains money)
4. Commit (all or nothing)
func (s *LedgerService) Transfer(ctx context.Context, req types.TransferRequest) (types.TransferResult, error) {
tx, err := s.db.BeginTx(ctx)
if err != nil {
return types.TransferResult{}, err
}
defer tx.Rollback(ctx)
if err := db.LockAccounts(ctx, tx, uuid.UUID(req.FromAccount), uuid.UUID(req.ToAccount)); err != nil {
return types.TransferResult{}, err
}
if err := db.ApplyEntry(ctx, tx, uuid.UUID(req.FromAccount), uuid.UUID(req.TransferID), -req.Amount.Cents()); err != nil {
return types.TransferResult{}, err
}
if err := db.ApplyEntry(ctx, tx, uuid.UUID(req.ToAccount), uuid.UUID(req.TransferID), req.Amount.Cents()); err != nil {
return types.TransferResult{}, err
}
if err := tx.Commit(ctx); err != nil {
return types.TransferResult{}, err
}
result := types.TransferResult{ /* ... */ }
return result, nil
}
defer tx.Rollback(ctx) is the transaction safety net. If any error causes an early return before tx.Commit, the deferred rollback fires. Once Commit succeeds, the subsequent Rollback is a no-op: pgx returns ErrTxClosed, which the defer discards.
Phase 4 -- Idempotency: exactly one transfer per key
The naive implementation has a race condition:
Goroutine A: check key -> not found
Goroutine B: check key -> not found <- both pass the guard
Goroutine A: execute transfer
Goroutine B: execute transfer <- duplicate
Goroutine A: store key
Goroutine B: store key
Moving the key check inside the transaction, after locking, closes the race:
func (s *LedgerService) Transfer(ctx context.Context, req types.TransferRequest) (types.TransferResult, error) {
tx, err := s.db.BeginTx(ctx)
if err != nil {
return types.TransferResult{}, err
}
defer tx.Rollback(ctx)
// lock accounts FIRST -- concurrent requests on the same accounts wait here
if err := db.LockAccounts(ctx, tx, uuid.UUID(req.FromAccount), uuid.UUID(req.ToAccount)); err != nil {
return types.TransferResult{}, err
}
// check idempotency key INSIDE the transaction, AFTER locking
cached, err := db.GetCachedResult(ctx, tx, req.IdempotencyKey)
if err != nil {
return types.TransferResult{}, err
}
if cached != nil {
// replay -- return cached result without re-executing
if err := tx.Commit(ctx); err != nil {
return types.TransferResult{}, err
}
return *cached, nil
}
// new request -- execute and store atomically
if err := db.ApplyEntry(ctx, tx, uuid.UUID(req.FromAccount), uuid.UUID(req.TransferID), -req.Amount.Cents()); err != nil {
return types.TransferResult{}, err
}
if err := db.ApplyEntry(ctx, tx, uuid.UUID(req.ToAccount), uuid.UUID(req.TransferID), req.Amount.Cents()); err != nil {
return types.TransferResult{}, err
}
result := types.TransferResult{ /* ... */ }
if err := db.CacheResult(ctx, tx, req.IdempotencyKey, result); err != nil {
return types.TransferResult{}, err
}
if err := tx.Commit(ctx); err != nil {
return types.TransferResult{}, err
}
return result, nil
}
Why does this work under READ COMMITTED (Postgres default)?
Goroutine B's FOR UPDATE query blocks until goroutine A commits and releases its row locks. Under READ COMMITTED every statement runs against a fresh snapshot, so when B then runs GetCachedResult it sees the idempotency key that A committed and returns the cached result. No duplicate transfer.
CacheResult uses ON CONFLICT DO NOTHING as a secondary safety net:
_, err = tx.Exec(ctx,
`INSERT INTO idempotency_keys (key, response)
VALUES ($1, $2)
ON CONFLICT (key) DO NOTHING`,
key.String(), responseJSON,
)
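If two requests somehow both reach the insert (the lock prevents this for genuine retries, but defence in depth), only one row is stored. The PRIMARY KEY constraint is the final word. Note that DO NOTHING skips the conflicting insert without raising an error, so for the second request to be stopped rather than committed, CacheResult would also need to check the rows-affected count.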
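CacheResult and GetCachedResult also have to turn a result into JSON for the response column and back again on a replay. A minimal round trip might look like the following; the fields of `transferResult` are assumptions, since the article elides the real struct.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// transferResult is a hypothetical response shape; the real fields are
// elided in the article.
type transferResult struct {
	TransferID string `json:"transfer_id"`
	Amount     int64  `json:"amount"`
}

// encodeResult produces the bytes that would go into the JSONB response column.
func encodeResult(r transferResult) ([]byte, error) {
	return json.Marshal(r)
}

// decodeResult rebuilds the result from the stored bytes on a replay.
func decodeResult(stored []byte) (transferResult, error) {
	var r transferResult
	err := json.Unmarshal(stored, &r)
	return r, err
}

func main() {
	original := transferResult{TransferID: "t-123", Amount: 1_000}

	stored, err := encodeResult(original)
	if err != nil {
		panic(err)
	}
	replayed, err := decodeResult(stored)
	if err != nil {
		panic(err)
	}

	// The replayed value is the stored one, not a recomputed one.
	fmt.Println(replayed == original)
}
```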
Phase 5 -- HTTP layer and tests
The HTTP handler decodes the request, validates inputs, and delegates to the service:
func (h *Handler) HandleTransfer(w http.ResponseWriter, r *http.Request) {
var body struct {
IdempotencyKey string `json:"idempotency_key"`
FromAccount uuid.UUID `json:"from_account"`
ToAccount uuid.UUID `json:"to_account"`
Amount int64 `json:"amount"`
}
if err := json.NewDecoder(r.Body).Decode(&body); err != nil {
http.Error(w, "invalid request body", http.StatusBadRequest)
return
}
key, err := types.NewIdempotencyKey(body.IdempotencyKey)
if err != nil {
http.Error(w, err.Error(), http.StatusBadRequest)
return
}
amount, err := types.FromCents(body.Amount)
if err != nil {
http.Error(w, err.Error(), http.StatusBadRequest)
return
}
result, err := h.svc.Transfer(r.Context(), types.TransferRequest{
IdempotencyKey: key,
FromAccount: types.AccountID(body.FromAccount),
ToAccount: types.AccountID(body.ToAccount),
Amount: amount,
TransferID: types.NewTransferID(),
})
if err != nil {
http.Error(w, err.Error(), ledgererrors.HTTPStatus(err))
return
}
w.Header().Set("Content-Type", "application/json")
json.NewEncoder(w).Encode(result)
}
All validation happens before Transfer is called. Transfer never receives invalid inputs. HTTPStatus(err) maps domain errors to status codes without any if err == ErrX chains in the handler.
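The boundary validation can be exercised with net/http/httptest and no database. The handler below is a trimmed, hypothetical version of HandleTransfer: the checks are inlined, the service call is replaced by a stub response, and the /transfers path is an assumption.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// handleTransfer keeps only the boundary validation of HandleTransfer.
func handleTransfer(w http.ResponseWriter, r *http.Request) {
	var body struct {
		IdempotencyKey string `json:"idempotency_key"`
		Amount         int64  `json:"amount"`
	}
	if err := json.NewDecoder(r.Body).Decode(&body); err != nil {
		http.Error(w, "invalid request body", http.StatusBadRequest)
		return
	}
	if n := len(body.IdempotencyKey); n == 0 || n > 255 {
		http.Error(w, "idempotency key must be 1 to 255 characters", http.StatusBadRequest)
		return
	}
	if body.Amount <= 0 {
		http.Error(w, "amount must be a positive number of cents", http.StatusBadRequest)
		return
	}
	// Stub standing in for h.svc.Transfer.
	w.Header().Set("Content-Type", "application/json")
	json.NewEncoder(w).Encode(map[string]string{"status": "accepted"})
}

// post sends a JSON body to the handler and returns the status code.
func post(body string) int {
	req := httptest.NewRequest(http.MethodPost, "/transfers", strings.NewReader(body))
	rec := httptest.NewRecorder()
	handleTransfer(rec, req)
	return rec.Code
}

func main() {
	fmt.Println(post(`{"idempotency_key":"k1","amount":1000}`)) // valid request
	fmt.Println(post(`{"idempotency_key":"","amount":1000}`))   // empty key
	fmt.Println(post(`{"idempotency_key":"k1","amount":-5}`))   // negative amount
	fmt.Println(post(`not json`))                               // undecodable body
}
```

Only the first request reaches the stub; the other three are rejected with 400 before any service code would run.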
Proving it works -- three adversarial tests
The tests do not test the happy path. They test the invariants.
Test 1 -- Concurrent same key produces exactly one transfer
func TestSameKeyConcurrentProducesOneTransfer(t *testing.T) {
svc, alice, bob := setup(t)
ctx := context.Background()
key := mustKey(t, fmt.Sprintf("test-%s", uuid.New()))
amount := mustMoney(t, 1000)
results := make([]result, 10)
var wg sync.WaitGroup
for i := range 10 {
wg.Add(1)
go func(i int) {
defer wg.Done()
req := types.TransferRequest{
IdempotencyKey: key,
FromAccount: alice,
ToAccount: bob,
Amount: amount,
TransferID: types.NewTransferID(),
}
res, err := svc.Transfer(ctx, req)
results[i] = result{res, err}
}(i)
}
wg.Wait()
ids := make(map[types.TransferID]struct{})
for _, r := range results {
if r.err == nil {
ids[r.res.TransferID] = struct{}{}
}
}
if len(ids) != 1 {
t.Errorf("expected exactly 1 unique transfer_id, got %d", len(ids))
}
balance, _ := svc.GetBalance(ctx, alice)
if balance != 9000 {
t.Errorf("alice balance = %d, want 9000", balance)
}
}
Ten concurrent goroutines fire the same idempotency key at the same time. The map collapsing to a single element is the proof: every response carries the same TransferID. The balance assertion is the second proof: exactly one debit occurred.
If the idempotency check were outside the transaction, all ten could pass the guard at the same time and each would debit Alice. In the worst case the balance would be 0 and the map would contain ten distinct IDs. The test would fail loudly.
Test 2 -- Insufficient funds leaves balances unchanged
func TestTransferExceedingBalanceIsRejected(t *testing.T) {
svc, alice, _ := setup(t)
ctx := context.Background()
_, err := svc.Transfer(ctx, types.TransferRequest{
IdempotencyKey: mustKey(t, fmt.Sprintf("test-%s", uuid.New())),
FromAccount: alice,
ToAccount: types.AccountID(uuid.New()),
Amount: mustMoney(t, 99_999), // more than 10 000
TransferID: types.NewTransferID(),
})
if !ledgererrors.IsInsufficientFunds(err) {
t.Fatalf("expected InsufficientFunds, got %v", err)
}
balance, _ := svc.GetBalance(ctx, alice)
if balance != 10_000 {
t.Errorf("alice balance = %d, want 10000 (unchanged)", balance)
}
}
This test proves atomicity. If the debit entry were written before the constraint check fired, even partially, Alice's balance would change despite the error. It does not.
Test 3 -- Sequential transfers maintain the balance invariant
func TestSequentialTransfersMaintainBalanceInvariant(t *testing.T) {
svc, alice, bob := setup(t)
ctx := context.Background()
for i := range 5 {
_, err := svc.Transfer(ctx, types.TransferRequest{
IdempotencyKey: mustKey(t, fmt.Sprintf("seq-%d", i)),
FromAccount: alice,
ToAccount: bob,
Amount: mustMoney(t, 1000),
TransferID: types.NewTransferID(),
})
if err != nil {
t.Fatalf("transfer %d: %v", i, err)
}
}
aliceBal, _ := svc.GetBalance(ctx, alice)
bobBal, _ := svc.GetBalance(ctx, bob)
if aliceBal != 5_000 {
t.Errorf("alice = %d, want 5000", aliceBal)
}
if bobBal != 5_000 {
t.Errorf("bob = %d, want 5000", bobBal)
}
if aliceBal+bobBal != 10_000 {
t.Errorf("total balance = %d, want 10000", aliceBal+bobBal)
}
}
Money is conserved. 10_000 cents entered the system. After five transfers, 10_000 cents remain, distributed differently, but not created or destroyed.
Running the tests
docker compose up -d postgres
DATABASE_URL=postgres://postgres:password@localhost:5432/ledger \
go test ./tests/... -v
--- PASS: TestTransferExceedingBalanceIsRejected (0.09s)
--- PASS: TestSequentialTransfersMaintainBalanceInvariant (0.11s)
--- PASS: TestSameKeyConcurrentProducesOneTransfer (0.14s)
PASS
ok github.com/lethuzulu/idempotent-ledger-go/tests
The Go vs Rust comparison
Both languages can implement this system correctly. They reach correctness differently.
| Guarantee | Go mechanism | Rust mechanism |
|---|---|---|
| Money type safety | Money struct, private cents field | Money(i64) newtype, private field |
| Invalid amount rejected | FromCents returns error -- caller can _ it | from_cents returns Result -- compiler enforces handling |
| Wrong ID type | AccountID vs TransferID -- distinct named types, cast required | AccountId vs TransferId -- distinct types, compile error, no cast |
| Error handling | error return value -- if err != nil can be omitted | Result<T, E> with ? -- cannot be silently ignored |
| Exhaustive errors | switch le.kind -- new kindX constant silently falls through default | enum LedgerError -- unhandled variant = compile error |
| Transaction rollback | defer tx.Rollback(ctx) -- must be written, easy to forget | tx dropped on error -- automatic rollback, nothing to forget |
| SQL correctness | Runtime errors from pgx | sqlx::query! -- compile-time verification |
What Go gets right here
Go's explicit if err != nil is verbose, but it makes error paths visible at every call site. The flow of a Transfer function in Go reads top-to-bottom -- you can see exactly where each error comes from. Rust's ? operator is terser but requires understanding the early-return semantics.
Go's defer tx.Rollback(ctx) pattern is a convention that must be learned and consistently applied. Once learned, it is mechanical: every function that opens a transaction adds the same line immediately after. The risk is a new team member who skips it.
Go compiles in seconds. Rust takes minutes on a cold build.
What the type system cannot enforce
In Go, a caller can write:
// this compiles -- the cast is explicit but not prohibited
db.ApplyEntry(ctx, tx, uuid.UUID(req.ToAccount), uuid.UUID(req.TransferID), ...)
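Because ApplyEntry takes plain uuid.UUID parameters, the conversions erase the distinction between the two ID types, and swapping the account and transfer arguments is a bug the compiler does not catch. In Rust, AccountId and TransferId are genuinely different types, and no cast syntax exists to confuse them. The test suite catches this class of bug in Go; the compiler catches it in Rust.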
Similarly, if err != nil can be omitted. FromCents returns an error. A caller can write money, _ := types.FromCents(amount) and silently discard the validation. Go provides no mechanism to make error handling mandatory.
What I learned building this
The most surprising thing was not the concurrency problem itself, which I expected to be the hard part. The hard part was a single detail: the placement of the idempotency check relative to the FOR UPDATE lock.
The first implementation checked for the idempotency key before opening the transaction, then stored it after. It looked correct. It passed every sequential test. It failed the concurrent test: ten goroutines with the same key would all pass the guard simultaneously, all execute the transfer, and Alice would lose £100 instead of £10. The map in the test collected ten distinct transfer IDs. The fix was one change: move the GetCachedResult call to after LockAccounts, inside the transaction.
The lesson: when a constraint must hold across concurrent operations, the database is the right place to enforce it. Application-level checks are subject to race conditions. The lock and the constraint together are not.
defer tx.Rollback(ctx) is the second lesson. The pattern is simple: open transaction, immediately defer rollback, proceed. If Commit is never reached, because of any early return, the rollback fires. Once Commit succeeds, the deferred rollback is a no-op. Writing it wrong (forgetting the defer, or placing it after other calls) is a class of bug that Rust's ownership model eliminates entirely.
Where to go next
Postgres transaction isolation levels -- understanding READ COMMITTED vs SERIALIZABLE and when each matters
pgx documentation -- particularly transaction handling and error inspection
Stripe's idempotency keys -- how a production payment system implements the same pattern at scale
Source code -- the full implementation
Part of a series on building backend systems in Go and Rust.




