Early Learnings Building Apps on Postgres in the Claude Code Era: The Importance of the Iteration Loop

Paul Whalen

whoami

  • Senior Staff Software Engineer @ PEAK6 Capital Management
    • Proprietary options trading firm in Chicago
  • ~12 YOE as primarily-backend application engineer (Java, Go, Python)
    • Data engineering experience as well, with lots of Kafka and SQL
  • Database nerd

This is a presentation about AI

What this presentation is not

  • … about building software that uses LLMs
    • It’s about building deterministic software with LLMs
  • … about administering/tuning Postgres (mostly)
    • It’s about building apps on it
  • … informed by experience with coding agents other than Claude
    • I’m sure Codex and the others are great too, but my experience is with Claude
  • … given by an expert
    • Is anyone an expert in this stuff yet?
  • … particularly long
    • Let’s leave time for discussion and experience sharing!

Agenda

  • Agentic Engineering and Claude Code
  • Claude Code + Postgres Experiences
  • Discussion

Agentic Engineering

Coding with AI 18 Months Ago

  • Ask the model questions via chat
    • Output code snippets
    • No context about your codebase
    • Little ability to verify correctness
  • “Fancy auto-complete”

Coding with AI Today

  • Prompts don’t just trigger a text response, they start an “agentic loop”:
    • Think about it
    • Do something
    • Check whether it worked
  • “Vibe coding”
    • Agent writes code, cross fingers, don’t read output
  • “Agentic Engineering”
    • Agent writes code, engineer evaluates, reviews, steers

Coding with AI Tomorrow

Levels of Agentic Engineering


Claude Code

  • Claude Code is an agentic harness around Claude models that implements the agentic loop


Claude Code Concepts

  • Tools
    • How Claude does stuff
    • Read/Write files
    • Execute programs
    • Search Web
  • Instructions and memory
    • What you tell Claude about your project
    • CLAUDE.md
    • Skills
    • Rules
  • Context
    • Claude’s memory of the current task

The Iteration Loop

  • Claude will not stop until it thinks it has given correct output
  • You define what correct is
  • Correct can be about:
    • Passing tests
    • Adhering to style/conventions
    • Running an app end to end
  • Claude is dumb but persistent: use this to your advantage

Claude Code + Postgres Experiences

Four Anecdotes

A stylized narrative of my experiences with Claude in the last 4 months

  1. Core data modeling
  2. Simple SQL migration
  3. Complex constraints
  4. Plan testing (❗)

Core data modeling

Core data modeling

  • Prompt:
    • “design a SQL schema to capture over-the-counter (OTC) traded instruments. it should track what positions you have in what specific accounts, and should track greek risk over time.”
  • This is a well-known problem space that the model definitely has a lot of knowledge about
  • 12 months ago, the output would have been bad; these days the output is “good”

Core data modeling schema

Six tables capture the full lifecycle from trade execution through risk reporting:

  • counterparty — legal entities you face (LEI, name)
  • account — books/funds that hold positions
  • instrument — master contract record per OTC deal
  • trade — individual fills that build positions
  • position — end-of-day snapshots per account
  • greek_risk — timestamped sensitivities per position

Supports swaps, forwards, options, CDS, swaptions, caps/floors, NDFs, and commodity derivatives.

-- Instruments carry product-specific nullable fields
CREATE TABLE instrument (
    instrument_type  instrument_type NOT NULL,
    notional         NUMERIC(20,4)   NOT NULL,
    maturity_date    DATE            NOT NULL,
    strike           NUMERIC(20,8),  -- options only
    fixed_rate       NUMERIC(10,6),  -- swaps / CDS
    floating_index   TEXT,           -- e.g. 'SOFR'
    counterparty_id  BIGINT NOT NULL REFERENCES ...
);

-- Greek snapshots track risk evolution over time
CREATE TABLE greek_risk (
    position_id      BIGINT      NOT NULL,
    as_of_timestamp  TIMESTAMPTZ NOT NULL,
    delta            NUMERIC(20,6),
    gamma            NUMERIC(20,6),
    vega             NUMERIC(20,6),
    theta            NUMERIC(20,6),
    dv01             NUMERIC(20,6),
    cs01             NUMERIC(20,6),
    pricing_model    TEXT,
    model_params     JSONB  -- full reproducibility
);
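As a usage sketch against the generated schema (column names as above; the timestamp is illustrative), the time dimension makes point-in-time risk queries natural:

```sql
-- Latest greek snapshot per position as of a given timestamp
SELECT DISTINCT ON (position_id)
       position_id, as_of_timestamp, delta, vega, dv01
FROM   greek_risk
WHERE  as_of_timestamp <= '2025-06-30 21:00:00+00'
ORDER  BY position_id, as_of_timestamp DESC;
```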

Core data modeling: “issues”

  • The schema includes a “trade” table, but in our case that’s unnecessary complexity
  • The greek risk values are NUMERIC rather than FLOAT, implying false precision of modeled values
  • Position table includes both quantity tracking and profit/loss, which are decoupled in our system
  • These are not dealbreakers, but they would not pass code review: the human needs to be in the loop

Core data modeling: learnings

  • The model understands most domains superficially
  • But real domain understanding is about your specific system and organization’s needs
  • When designing a data model and schema, these details are important and hard to outsource to the model

Simple SQL migration

Simple SQL migration

  • Adding a feature to a Java application
  • Prompt:
    • “add a client_id to the ‘accounts’ table in a flyway migration. include the client ID on outgoing orders for that account”
ALTER TABLE accounts
    ADD COLUMN IF NOT EXISTS client_id TEXT CHECK (client_id <> '');

Testcontainers + Flyway (⛭ harness engineering ⛭ )

  • Testcontainers: Docker containers in JUnit tests
    • A real instance of postgres running in your unit tests
  • Flyway: SQL migration management
    • The real version of your schema, applied to that postgres
  • If you let Claude run (and create) these tests on its own, you get extraordinary confidence that it’s solving the problem fully
    • The key insight: these are “wider” tests than you might normally write with JUnit, but are still really easy to run

Best Practices and Style

  • The model may have been trained on “something”, but you probably need to tell it directly what you care about
    • Claude just wants to finish, so the first pass might not be great
  • In the client_id migration example, it chose a non-nullable column with an empty string default
  • There are lots of “skills” out there to improve postgres output:
    • pg-aiguide from Timescale/Tiger Data (time series postgres)
    • ai-rules from Neon (serverless postgres)
    • agent-skills from ParadeDB (full text search postgres)
  • So far I’ve been happy with the prompt “review these changes and make sure they adhere to https://wiki.postgresql.org/wiki/Don't_Do_This”
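For the client_id example, the difference the review pass makes can be sketched like this (the first-pass DDL here is illustrative of what Claude reached for unprompted, per the point above):

```sql
-- First pass: non-nullable column with an empty-string default
ALTER TABLE accounts
    ADD COLUMN client_id VARCHAR(50) NOT NULL DEFAULT '';

-- After the "Don't Do This" review: nullable, no fake default,
-- TEXT instead of varchar(n), empty string disallowed outright
ALTER TABLE accounts
    ADD COLUMN IF NOT EXISTS client_id TEXT CHECK (client_id <> '');
```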

Simple SQL migration: learnings

  • Harness engineering: Flyway + Testcontainers is a superpower
  • Be specific about the “best practices” you care about
  • Ask Claude to review its own output, in different ways

Complex constraints

Complex constraints

  • Prompt (shortened):
    • “the dsl column in the custom_datapoint table can refer to rows in the datapoint table. add a join table and check constraint to capture this mapping and ensure that no rows get deleted from datapoint if they are referenced by the DSL definition of a row in custom_datapoint.”
  • The DSL definition is just saved as TEXT in the database - Claude needs to understand the DSL
CREATE TABLE datapoint (
    id   BIGINT PRIMARY KEY,
    name TEXT NOT NULL UNIQUE
);

CREATE TABLE custom_datapoint (
    id  BIGINT PRIMARY KEY,
    name TEXT NOT NULL,
    -- references datapoint rows,
    -- but only as free text!
    dsl TEXT NOT NULL
    -- e.g. 'price("AAPL") * 2
    --        + price("GOOG")'
);

-- No FK possible — the references
-- are buried inside the DSL string

Complex constraints: solution

-- Join table: extracted references from DSL text
CREATE TABLE custom_datapoint_reference (
    custom_datapoint_id BIGINT NOT NULL REFERENCES custom_datapoint(id),
    datapoint_id        BIGINT NOT NULL REFERENCES datapoint(id) ON DELETE RESTRICT,
    PRIMARY KEY (custom_datapoint_id, datapoint_id)
);

-- Trigger: parse DSL and populate join table on insert/update
CREATE FUNCTION extract_dsl_references()
RETURNS TRIGGER AS $$
BEGIN
    DELETE FROM custom_datapoint_reference
        WHERE custom_datapoint_id = NEW.id;

    INSERT INTO custom_datapoint_reference
        (custom_datapoint_id, datapoint_id)
    SELECT NEW.id, d.id FROM datapoint d
    WHERE NEW.dsl ~ ('\m' || d.name || '\M');
        -- regex word-boundary match (much more complicated in my version)

    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER trg_extract_dsl_refs
    AFTER INSERT OR UPDATE OF dsl ON custom_datapoint
    FOR EACH ROW EXECUTE FUNCTION extract_dsl_references();
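A quick smoke test of the constraint (hypothetical row values, matching the tables above):

```sql
INSERT INTO datapoint (id, name) VALUES (1, 'AAPL_price');
INSERT INTO custom_datapoint (id, name, dsl)
    VALUES (10, 'double_aapl', 'AAPL_price * 2');

-- The trigger has populated custom_datapoint_reference,
-- so this delete now fails with a foreign key violation:
DELETE FROM datapoint WHERE id = 1;
```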

Real data (⛭ harness engineering ⛭)

  • This worked because I gave Claude 500 real examples of the DSL definitions
  • It was able to craft a relatively complicated function through trial and error with the data
    • Claude is dumb but persistent
  • I also told it a little bit more about the DSL: structure, keywords, etc.
  • And it was all possible because Claude could fully validate with the testcontainers + flyway tests

Complex constraints: learnings

  • Claude can write postgres functions mind-blowingly better than I can
  • Testcontainers + Flyway is still a superpower!
    • The only way I got confidence it worked is by hooking it into application unit tests
  • Claude solves database problems better (or at all) when you give examples of the real data

Plan testing

Restore stats

  • New functions in Postgres 18: pg_restore_relation_stats() and pg_restore_attribute_stats()
  • Load prod statistics into dev (or testcontainers!) database without needing the full dataset
SELECT pg_restore_attribute_stats(
    'schemaname', 'public',
    'relname', 'test_orders',
    'attname', 'status',
    'inherited', false::boolean,
    'null_frac', 0.0::real,
    'avg_width', 9::integer,
    'n_distinct', 5::real,
    'most_common_vals',
      '{delivered,shipped,cancelled,
        pending,returned}'::text,
    'most_common_freqs',
      '{0.95,0.015,0.015,
        0.015,0.005}'::real[]
);
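The attribute stats only help if the planner also believes the table is big; a companion call restores the relation-level counts (a sketch, using the same hypothetical test_orders table; the page and row counts are illustrative):

```sql
-- Relation-level stats: tell the planner the table "contains"
-- 1M rows across 10,000 pages, without loading any data
SELECT pg_restore_relation_stats(
    'schemaname',    'public',
    'relname',       'test_orders',
    'relpages',      10000::integer,
    'reltuples',     1000000::real,
    'relallvisible', 0::integer
);
```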


Plan testing ideas

  • Claude could…
    • Verify in tests that a certain execution plan is used for a certain query, even when new indexes are added
    • Add indexes to target an execution plan you’re looking for
    • Search for distributions of data that would result in a change to a bad plan
    • … all at development time, before production!
  • Gotchas from my experiences playing around with this on a side project:
    • Execution plan is not purely a function of objects, query, parameters, and stats
    • Postgres will still do sequential scans of empty or very small tables - you can’t totally fool it
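One way to express the “verify a certain execution plan” idea as a test (a sketch; the table, column, and index names are hypothetical):

```sql
-- After restoring prod stats, ask the planner for its plan.
-- In a Testcontainers test, run this and assert on the output text.
EXPLAIN (FORMAT TEXT)
SELECT * FROM test_orders WHERE status = 'returned';

-- Expect a line like:
--   Index Scan using idx_orders_status on test_orders
-- If a Seq Scan appears instead, the test fails and Claude iterates.
```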

In Closing

My biggest learnings so far

  • Build a strong iteration loop for Claude and tell it what correct is:
    • Give it a way to test your full application
    • Give it real data
    • Tell it what your best practices look like, really specifically
  • You know your domain and data model best

Thanks For Listening!

What have your biggest learnings with Claude + Postgres been so far? Let’s discuss!

pgwhalen.com/early-learnings-with-postgres-and-claude