Schema-baked read models with SQLite generated columns

The projection you don't build

Geschrieben von Timo Rieber am 23. Dezember 2025

In cloudapps and mainzelmen, every aggregate lives as a JSON blob. A registration, a receivable, an attachment outcome - each serialized into a single state or data column. Repositories write the blob whole and never touch individual fields. The write model is one column wide, by design.

The read side was less deliberate. When I needed to list registrations with filtering by course, sorting by name, and payment JOINs across bounded contexts, every operation went through json_extract:

LEFT JOIN receivable rcv
    ON json_extract(rcv.state, '$.correlation.correlation')
     = json_extract(r.state, '$.uuid')
WHERE json_extract(r.state, '$.course_id') = ?
ORDER BY lower(json_extract(r.state, '$.person.last_name'))
Sql

SQLite evaluates each call at query time, on every row. No index can reach inside JSON. A full-text search concatenating six json_extract calls inside a LIKE was the worst of it - not slow with a few hundred registrations, but the kind of code where you're building a query engine by hand without noticing.

Let the schema do the extraction

SQLite's STORED generated columns compute a value from another column and materialize it on disk, updated on every write. One migration recreated four tables - registration, receivable, refund, course - with 19 generated columns between them:

CREATE TABLE registration (
    id TEXT PRIMARY KEY,
    state JSON,
    reg_first_name TEXT GENERATED ALWAYS AS (
        lower(json_extract(state, '$.person.first_name'))
    ) STORED,
    reg_email TEXT GENERATED ALWAYS AS (
        lower(json_extract(state, '$.person.email'))
    ) STORED,
    -- ... 8 more generated columns
);
Sql

The lower() call happens once, at write time. Composite indexes on generated columns handle the rest - a three-column index on first name, last name, and email covers the search that previously concatenated six json_extract calls. The same join from earlier:

LEFT JOIN receivable rcv ON rcv.rcv_correlation = r.reg_uuid
WHERE r.reg_course_id = ?
ORDER BY r.reg_last_name
Sql

Plain column access, fully indexable. The query doesn't know the data originates from JSON. The payment overdue detection had been the ugliest spot: a CASE expression with nested json_extract calls reaching into due dates, paid amounts, and cancellation flags across two tables. After the migration, four column comparisons.

One catch: SQLite can't add STORED generated columns via ALTER TABLE. The migration creates a new table with the full schema, copies the data in, drops the original, renames. Four tables, four times that dance. Verbose, but it runs once.

Where constraints replace code

In mainzelmen, an email attachment processor, the same pattern pulled its weight differently. I needed to guarantee that no two outcomes share a message ID and filename:

CREATE TABLE attachment_outcomes (
    id TEXT PRIMARY KEY,
    data TEXT NOT NULL,
    message_id TEXT GENERATED ALWAYS AS (
        json_extract(data, '$.message_id')
    ) STORED NOT NULL,
    filename TEXT GENERATED ALWAYS AS (
        json_extract(data, '$.filename')
    ) STORED NOT NULL,
    outcome TEXT GENERATED ALWAYS AS (
        json_extract(data, '$.outcome')
    ) STORED NOT NULL,
    UNIQUE(message_id, filename)
);
Sql

Without generated columns, enforcing that constraint means a SELECT before every INSERT. The generated column turns it into a schema rule. SQLite rejects the duplicate at write time; the application never runs the check.

Across both projects, 23 generated columns now serve as the read model. The extraction that queries used to do at read time, the schema does at write time - same transaction, no pipeline between them.