Early Learnings Building Apps on Postgres in the Claude Code Era: The Importance of the Iteration Loop

Paul Whalen

whoami

  • Senior Staff Software Engineer @ PEAK6 Capital Management
    • Proprietary options trading firm in Chicago
  • ~12 YOE as primarily-backend application engineer (Java, Go, Python)
    • Data engineering experience as well, with lots of Kafka and SQL
  • Database nerd

This is a presentation about AI

What this presentation is not

  • … about building software that uses LLMs
    • It’s about building deterministic software with LLMs
  • … about administering/tuning Postgres (mostly)
    • It’s about building apps on it
  • … informed by experience with coding agents other than Claude
    • I’m sure Codex and the others are great too, but my experience is with Claude
  • … given by an expert
    • Is anyone an expert in this stuff yet?
  • … particularly long
    • Let’s leave time for discussion and experience sharing!

Agenda

  • Agentic Engineering and Claude Code
  • Claude Code + Postgres Experiences
  • Discussion

Agentic Engineering

Coding with AI 18 Months Ago

  • Ask the model questions via chat
    • Output code snippets
    • No context about your codebase
    • Little ability to verify correctness
  • “Fancy auto-complete”

Coding with AI Today

  • Prompts don’t just trigger a text response, they start an “agentic loop”:
    • Think about it
    • Do something
    • Check whether it worked
  • “Vibe coding”
    • Agent writes code, cross fingers, don’t read output
  • “Agentic Engineering”
    • Agent writes code, engineer evaluates, reviews, steers

Coding with AI Tomorrow

Levels of Agentic Engineering


Claude Code

  • Claude Code is an agentic harness around Claude models that implements the agentic loop


Claude Code Concepts

  • Tools
    • How Claude does stuff
    • Read/Write files
    • Execute programs
    • Search Web
  • Instructions and memory
    • What you tell Claude about your project
    • CLAUDE.md
    • Skills
    • Rules
  • Context
    • Claude’s memory of the current task

The Iteration Loop

  • Claude will not stop until it thinks it has given correct output
  • You define what correct is
  • Correct can be about:
    • Passing tests
    • Adhering to style/conventions
    • Running an app end to end
  • Claude is dumb but persistent: use this to your advantage

Claude Code + Postgres Experiences

Four Anecdotes

A stylized narrative of my experiences with Claude in the last 4 months

  1. Core data modeling
  2. Simple SQL migration
  3. Complex constraints
  4. Plan testing (❗)

Core data modeling

Core data modeling

  • Prompt:
    • “design a SQL schema to capture over-the-counter (OTC) traded instruments. it should track what positions you have in what specific accounts, and should track greek risk over time.”
  • This is a well-known problem space that the model definitely has a lot of knowledge about
  • 12 months ago, the output would have been bad; these days the output is “good”

Core data modeling schema

Six tables capture the full lifecycle from trade execution through risk reporting:

  • counterparty — legal entities you face (LEI, name)
  • account — books/funds that hold positions
  • instrument — master contract record per OTC deal
  • trade — individual fills that build positions
  • position — end-of-day snapshots per account
  • greek_risk — timestamped sensitivities per position

Supports swaps, forwards, options, CDS, swaptions, caps/floors, NDFs, and commodity derivatives.

-- Instruments carry product-specific nullable fields
CREATE TABLE instrument (
    instrument_type  instrument_type NOT NULL,
    notional         NUMERIC(20,4)   NOT NULL,
    maturity_date    DATE            NOT NULL,
    strike           NUMERIC(20,8),  -- options only
    fixed_rate       NUMERIC(10,6),  -- swaps / CDS
    floating_index   TEXT,           -- e.g. 'SOFR'
    counterparty_id  BIGINT NOT NULL REFERENCES ...
);

-- Greek snapshots track risk evolution over time
CREATE TABLE greek_risk (
    position_id      BIGINT      NOT NULL,
    as_of_timestamp  TIMESTAMPTZ NOT NULL,
    delta            NUMERIC(20,6),
    gamma            NUMERIC(20,6),
    vega             NUMERIC(20,6),
    theta            NUMERIC(20,6),
    dv01             NUMERIC(20,6),
    cs01             NUMERIC(20,6),
    pricing_model    TEXT,
    model_params     JSONB  -- full reproducibility
);
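As a usage sketch against the generated schema (column names as above; the timestamp is illustrative), the time dimension makes point-in-time risk queries natural:

```sql
-- Latest greek snapshot per position as of a given timestamp
SELECT DISTINCT ON (position_id)
       position_id, as_of_timestamp, delta, vega, dv01
FROM   greek_risk
WHERE  as_of_timestamp <= '2025-06-30 21:00:00+00'
ORDER  BY position_id, as_of_timestamp DESC;
```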

Core data modeling: “issues”

  • The schema includes a “trade” table, but in our case that’s unnecessary complexity
  • The greek risk values are NUMERIC rather than FLOAT, implying false precision of modeled values
  • Position table includes both quantity tracking and profit/loss, which are decoupled in our system
  • These are not dealbreakers, but they would not pass code review: the human needs to be in the loop

Core data modeling: learnings

  • The model understands most domains superficially
  • But real domain understanding is about your specific system and organization’s needs
  • When designing a data model and schema, these details are important and hard to outsource to the model

Simple SQL migration

Simple SQL migration

  • Adding a feature to a Java application
  • Prompt:
    • “add a client_id to the ‘accounts’ table in a flyway migration. include the client ID on outgoing orders for that account”
ALTER TABLE accounts
    ADD COLUMN IF NOT EXISTS client_id TEXT CHECK (client_id <> '');

Testcontainers + Flyway (⛭ harness engineering ⛭ )

  • Testcontainers: Docker containers in JUnit tests
    • A real instance of postgres running in your unit tests
  • Flyway: SQL migration management
    • The real version of your schema, applied to that postgres
  • If you let Claude run (and create) these tests on its own, you get extraordinary confidence that it’s solving the problem fully
    • The key insight: these are “wider” tests than you might normally write with JUnit, but are still really easy to run

Best Practices and Style

  • The model may have been trained on “something”, but you probably need to tell it directly what you care about
    • Claude just wants to finish, so the first pass might not be great
  • In the client_id migration example, it chose a non-nullable column with an empty string default
  • There are lots of “skills” out there to improve postgres output:
    • pg-aiguide from Timescale/Tiger Data (time series postgres)
    • ai-rules from Neon (serverless postgres)
    • agent-skills from ParadeDB (full text search postgres)
  • So far I’ve been happy with the prompt “review these changes and make sure they adhere to https://wiki.postgresql.org/wiki/Don't_Do_This”
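For the client_id example, the difference the review pass makes can be sketched like this (the first-pass DDL here is illustrative of what Claude reached for unprompted, per the point above):

```sql
-- First pass: non-nullable column with an empty-string default
ALTER TABLE accounts
    ADD COLUMN client_id VARCHAR(50) NOT NULL DEFAULT '';

-- After the "Don't Do This" review: nullable, no fake default,
-- TEXT instead of varchar(n), empty string disallowed outright
ALTER TABLE accounts
    ADD COLUMN IF NOT EXISTS client_id TEXT CHECK (client_id <> '');
```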

Simple SQL migration: learnings

  • Harness engineering: Flyway + Testcontainers is a superpower
  • Be specific about the “best practices” you care about
  • Ask Claude to review its own output, in different ways

Complex constraints

Complex constraints

  • Prompt (shortened):
    • “the dsl column in the custom_datapoint table can refer to rows in the datapoint table. add a join table and check constraint to capture this mapping and ensure that no rows get deleted from datapoint if they are referenced by the DSL definition of a row in custom_datapoint.”
  • The DSL definition is just saved as TEXT in the database - Claude needs to understand the DSL
CREATE TABLE datapoint (
    id   BIGINT PRIMARY KEY,
    name TEXT NOT NULL UNIQUE
);

CREATE TABLE custom_datapoint (
    id  BIGINT PRIMARY KEY,
    name TEXT NOT NULL,
    -- references datapoint rows,
    -- but only as free text!
    dsl TEXT NOT NULL
    -- e.g. 'price("AAPL") * 2
    --        + price("GOOG")'
);

-- No FK possible — the references
-- are buried inside the DSL string

Complex constraints: solution

-- Join table: extracted references from DSL text
CREATE TABLE custom_datapoint_reference (
    custom_datapoint_id BIGINT NOT NULL REFERENCES custom_datapoint(id),
    datapoint_id        BIGINT NOT NULL REFERENCES datapoint(id) ON DELETE RESTRICT,
    PRIMARY KEY (custom_datapoint_id, datapoint_id)
);

-- Trigger: parse DSL and populate join table on insert/update
CREATE FUNCTION extract_dsl_references()
RETURNS TRIGGER AS $$
BEGIN
    DELETE FROM custom_datapoint_reference
        WHERE custom_datapoint_id = NEW.id;

    INSERT INTO custom_datapoint_reference
        (custom_datapoint_id, datapoint_id)
    SELECT NEW.id, d.id FROM datapoint d
    WHERE NEW.dsl ~ ('\m' || d.name || '\M');
        -- regex word-boundary match (much more complicated in my version)

    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER trg_extract_dsl_refs
    AFTER INSERT OR UPDATE OF dsl ON custom_datapoint
    FOR EACH ROW EXECUTE FUNCTION extract_dsl_references();
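A quick smoke test of the constraint (hypothetical row values, matching the tables above):

```sql
INSERT INTO datapoint (id, name) VALUES (1, 'AAPL_price');
INSERT INTO custom_datapoint (id, name, dsl)
    VALUES (10, 'double_aapl', 'AAPL_price * 2');

-- The trigger has populated custom_datapoint_reference,
-- so this delete now fails with a foreign key violation:
DELETE FROM datapoint WHERE id = 1;
```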

Real data (⛭ harness engineering ⛭)

  • This worked because I gave Claude 500 real examples of the DSL definitions
  • It was able to craft a relatively complicated function through trial and error with the data
    • Claude is dumb but persistent
  • I also told it a little bit more about the DSL: structure, keywords, etc.
  • And it was all possible because Claude could fully validate with the testcontainers + flyway tests

Complex constraints: learnings

  • Claude can write postgres functions mind-blowingly better than I can
  • Testcontainers + Flyway is still a superpower!
    • The only way I got confidence it worked is by hooking it into application unit tests
  • Claude solves database problems better (or at all) when you give examples of the real data

Plan testing

Restore stats

  • New functions in Postgres 18: pg_restore_relation_stats() and pg_restore_attribute_stats()
  • Load prod statistics into dev (or testcontainers!) database without needing the full dataset
SELECT pg_restore_attribute_stats(
    'schemaname', 'public',
    'relname', 'test_orders',
    'attname', 'status',
    'inherited', false::boolean,
    'null_frac', 0.0::real,
    'avg_width', 9::integer,
    'n_distinct', 5::real,
    'most_common_vals',
      '{delivered,shipped,cancelled,
        pending,returned}'::text,
    'most_common_freqs',
      '{0.95,0.015,0.015,
        0.015,0.005}'::real[]
);
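The attribute stats only help if the planner also believes the table is big; a companion call restores the relation-level counts (a sketch, using the same hypothetical test_orders table; the page and row counts are illustrative):

```sql
-- Relation-level stats: tell the planner the table "contains"
-- 1M rows across 10,000 pages, without loading any data
SELECT pg_restore_relation_stats(
    'schemaname',    'public',
    'relname',       'test_orders',
    'relpages',      10000::integer,
    'reltuples',     1000000::real,
    'relallvisible', 0::integer
);
```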


Plan testing ideas

  • Claude could…
    • Verify in tests that a certain execution plan is used for a certain query, even when new indexes are added
    • Add indexes to target an execution plan you’re looking for
    • Search for distributions of data that would result in a change to a bad plan
    • … all at development time, before production!
  • Gotchas from my experiences playing around with this on a side project:
    • Execution plan is not purely a function of objects, query, parameters, and stats
    • Postgres will still do sequential scans of empty or very small tables - you can’t totally fool it
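One way to express the “verify a certain execution plan” idea as a test (a sketch; the table, column, and index names are hypothetical):

```sql
-- After restoring prod stats, ask the planner for its plan.
-- In a Testcontainers test, run this and assert on the output text.
EXPLAIN (FORMAT TEXT)
SELECT * FROM test_orders WHERE status = 'returned';

-- Expect a line like:
--   Index Scan using idx_orders_status on test_orders
-- If a Seq Scan appears instead, the test fails and Claude iterates.
```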

In Closing

My biggest learnings so far

  • Build a strong iteration loop for Claude and tell it what correct is:
    • Give it a way to test your full application
    • Give it real data
    • Tell it what your best practices look like, really specifically
  • You know your domain and data model best

Thanks For Listening!

What have your biggest learnings with Claude + Postgres been so far? Let’s discuss!

pgwhalen.com/early-learnings-with-postgres-and-claude