Data Model

The AMTP schema is managed by Flyway and consists of six versioned migrations (V1–V6). All tables reside in the public schema of the amtp database. Reference: migrations/flyway.conf.

Entity Relationship Summary #

projects PK: project_id
1 ──► N
runs PK: run_id FK: project_id
1 ──► N
stages PK: stage_id FK: run_id
runs ↓ 1:N (mandatory)
artifacts FK: run_id FK: stage_id (nullable)
approvals FK: run_id FK: stage_id (nullable)
stages ↓ 1:N (optional)

artifacts.stage_id and approvals.stage_id are both nullable. A row always belongs to a run; optional stage ownership is set when the artifact or decision is scoped to a specific Temporal activity invocation.

All primary keys are UUID v4 generated by gen_random_uuid(). All timestamps are TIMESTAMPTZ (time-zone-aware).

V1 — projects #

Registers a GitHub repository as an AMTP project.

CREATE TABLE projects (
    project_id            UUID        PRIMARY KEY DEFAULT gen_random_uuid(),
    name                  TEXT        NOT NULL UNIQUE,
    github_repo_full_name TEXT        NOT NULL,
    description           TEXT,
    created_at            TIMESTAMPTZ NOT NULL DEFAULT now()
);
migrations/sql/V1__projects.sql
Column Type Constraints Notes
project_id UUID PRIMARY KEY, DEFAULT gen_random_uuid() Stable identifier for the project.
name TEXT NOT NULL, UNIQUE Human-readable project name; must be unique across all projects.
github_repo_full_name TEXT NOT NULL GitHub repository in owner/name format, e.g. acme/api.
description TEXT nullable Optional free-text description.
created_at TIMESTAMPTZ NOT NULL, DEFAULT now() Row creation timestamp.

V2 — runs #

Represents a single end-to-end pipeline execution. The status column is the authoritative lifecycle state for a run; it is managed by the Temporal orchestrator and the webhook receiver.

CREATE TABLE runs (
    run_id      UUID        PRIMARY KEY DEFAULT gen_random_uuid(),
    project_id  UUID        NOT NULL REFERENCES projects(project_id) ON DELETE CASCADE,
    status      TEXT        NOT NULL DEFAULT 'pending'
                            CHECK (status IN ('pending','running','passed','failed','cancelled')),
    depth_level TEXT        NOT NULL CHECK (depth_level IN ('smoke','core','standard','deep')),
    branch_name TEXT        NOT NULL,
    commit_sha  TEXT,
    started_at  TIMESTAMPTZ,
    finished_at TIMESTAMPTZ,
    created_at  TIMESTAMPTZ NOT NULL DEFAULT now()
);
migrations/sql/V2__runs.sql
Column Type Constraints Notes
run_id UUID PRIMARY KEY Correlation key carried through all agent activities and Valkey keys.
project_id UUID NOT NULL, FK → projects, CASCADE Deleting a project cascades to all its runs.
status TEXT CHECK: pending | running | passed | failed | cancelled See run lifecycle.
depth_level TEXT CHECK: smoke | core | standard | deep Controls crawl breadth passed to the Repo Crawler agent.
branch_name TEXT NOT NULL Target branch to crawl and use as the PR base branch.
commit_sha TEXT nullable Populated by the Repo Crawler after crawling; the exact ref crawled.
started_at TIMESTAMPTZ nullable Set when the Temporal workflow starts.
finished_at TIMESTAMPTZ nullable Set on terminal state (passed, failed, cancelled).
created_at TIMESTAMPTZ NOT NULL, DEFAULT now() Row creation timestamp (webhook receipt time).

V3 — stages #

Maps to individual Temporal activity invocations within a run. The awaiting_approval status is used when a stage is blocked on a human decision (e.g. BranchProtectionViolation).

CREATE TABLE stages (
    stage_id    UUID        PRIMARY KEY DEFAULT gen_random_uuid(),
    run_id      UUID        NOT NULL REFERENCES runs(run_id) ON DELETE CASCADE,
    name        TEXT        NOT NULL,
    sequence    INTEGER     NOT NULL,
    attempt     INTEGER     NOT NULL DEFAULT 1,
    status      TEXT        NOT NULL DEFAULT 'pending'
                            CHECK (status IN
                              ('pending','running','awaiting_approval',
                               'passed','failed','skipped','cancelled')),
    started_at  TIMESTAMPTZ,
    finished_at TIMESTAMPTZ,
    created_at  TIMESTAMPTZ NOT NULL DEFAULT now(),
    UNIQUE (run_id, sequence, attempt)
);
migrations/sql/V3__stages.sql
Column Type Constraints Notes
stage_id UUID PRIMARY KEY
run_id UUID NOT NULL, FK → runs, CASCADE
name TEXT NOT NULL Human-readable name, e.g. CrawlRepo, GenerateTestCases.
sequence INTEGER NOT NULL Execution order within the run (1-based).
attempt INTEGER NOT NULL DEFAULT 1 Retry attempt number; a new row is inserted for each retry.
status TEXT CHECK: 7 values awaiting_approval gates workflow on a human decision. See stage lifecycle.
started_at / finished_at TIMESTAMPTZ nullable Activity-level timing.
UNIQUE(run_id, sequence, attempt) Prevents duplicate stage records for the same retry of the same step.

V4 — artifacts #

Stores structured JSON output from each agent activity. The content JSONB column is the source of truth for agent handoffs; Valkey never stores handoff payloads.

CREATE TABLE artifacts (
    artifact_id UUID        PRIMARY KEY DEFAULT gen_random_uuid(),
    run_id      UUID        NOT NULL REFERENCES runs(run_id)    ON DELETE CASCADE,
    stage_id    UUID                 REFERENCES stages(stage_id) ON DELETE SET NULL,
    kind        TEXT        NOT NULL,
    content     JSONB       NOT NULL DEFAULT '{}'::jsonb
                            CHECK (jsonb_typeof(content) = 'object'),
    storage_uri TEXT,
    created_at  TIMESTAMPTZ NOT NULL DEFAULT now()
);
migrations/sql/V4__artifacts.sql
Column Type Constraints Notes
artifact_id UUID PRIMARY KEY
run_id UUID NOT NULL, FK → runs, CASCADE
stage_id UUID nullable, FK → stages, SET NULL Nullable so that run-level artifacts (e.g. failure_report) do not require a stage.
kind TEXT NOT NULL Semantic type: repo_crawler_output, test_case_generator_output, test_engineer_output, failure_report.
content JSONB NOT NULL, CHECK: must be object Validated agent output or structured failure details. GIN-indexed (V6) for containment queries.
storage_uri TEXT nullable Garage S3 URI for large binary artifacts stored externally, e.g. s3://amtp-bundles/{run_id}/bundle.zip.
created_at TIMESTAMPTZ NOT NULL, DEFAULT now()

JSONB constraint. jsonb_typeof(content) = 'object' prevents accidental insertion of JSON arrays or scalar values. Agent output schemas all produce top-level JSON objects; this is enforced at both the schema-validation layer and the database level.

V5 — approvals #

Records human or automated decision gates. Created when a stage transitions to awaiting_approval (e.g. on BranchProtectionViolation).

CREATE TABLE approvals (
    approval_id UUID        PRIMARY KEY DEFAULT gen_random_uuid(),
    run_id      UUID        NOT NULL REFERENCES runs(run_id)    ON DELETE CASCADE,
    stage_id    UUID                 REFERENCES stages(stage_id) ON DELETE CASCADE,
    approver    TEXT        NOT NULL,
    decision    TEXT        NOT NULL DEFAULT 'pending'
                            CHECK (decision IN ('pending','approved','rejected')),
    comment     TEXT,
    decided_at  TIMESTAMPTZ,
    created_at  TIMESTAMPTZ NOT NULL DEFAULT now()
);
migrations/sql/V5__approvals.sql
Column Type Constraints Notes
approval_id UUID PRIMARY KEY
run_id UUID NOT NULL, FK → runs, CASCADE
stage_id UUID nullable, FK → stages, CASCADE Nullable for run-level approval records not tied to a specific stage.
approver TEXT NOT NULL Identity of the decision maker: a user ID, system name (e.g. branch-protection), or external agent.
decision TEXT CHECK: pending | approved | rejected Initial value is always pending; updated externally when a decision is made.
comment TEXT nullable Free-text decision rationale or GitHub API error body (for BranchProtectionViolation).
decided_at TIMESTAMPTZ nullable Set when decision transitions from pending.
created_at TIMESTAMPTZ NOT NULL

V6 — Indexes #

All indexes are created in a separate migration executed outside a transaction (migrations/sql/V6__indexes.sql.conf sets executeInTransaction=false), because CREATE INDEX cannot run inside a transaction block in Postgres.

-- FK lookup indexes
CREATE INDEX idx_runs_project_id       ON runs(project_id);
CREATE INDEX idx_stages_run_id         ON stages(run_id);
CREATE INDEX idx_artifacts_run_id      ON artifacts(run_id);
CREATE INDEX idx_artifacts_stage_id    ON artifacts(stage_id);
CREATE INDEX idx_approvals_run_id      ON approvals(run_id);
CREATE INDEX idx_approvals_stage_id    ON approvals(stage_id);

-- Query-path indexes
CREATE INDEX idx_runs_status_created   ON runs(status, created_at DESC);
CREATE INDEX idx_runs_depth_level      ON runs(depth_level);
CREATE INDEX idx_stages_status         ON stages(status);
CREATE INDEX idx_artifacts_kind        ON artifacts(kind);
CREATE INDEX idx_approvals_decision    ON approvals(decision);

-- JSONB GIN indexes for containment/key queries
CREATE INDEX idx_artifacts_content_gin ON artifacts USING GIN (content jsonb_path_ops);
migrations/sql/V6__indexes.sql

Index strategy notes #

Index Type Purpose
idx_runs_project_id
idx_stages_run_id
idx_artifacts_run_id
idx_approvals_run_id
B-tree Foreign-key lookup performance. Without these, cascading deletes and JOIN queries on the FK columns perform full sequential scans.
idx_runs_status_created B-tree (composite) Supports the common query pattern: list active runs ordered by creation time. Composite order matches the ORDER BY status, created_at DESC pattern.
idx_artifacts_content_gin GIN (jsonb_path_ops) Supports JSONB containment (@>) and key-exists (?) queries on artifacts.content. jsonb_path_ops is smaller and faster than jsonb_ops for containment queries.

Flyway Conventions #

Convention Value / Behavior
Migration file naming V{n}__{description}.sql — double underscore separator. Version numbers are integers, strictly increasing.
Checksum validation validateOnMigrate=true. Applied migration files must never be modified after deployment.
cleanDisabled true. flyway clean is permanently disabled; prevents accidental schema wipe in any environment.
baselineOnMigrate true. Allows Flyway to baseline a pre-existing schema on first run without requiring an empty database.
Non-transactional migrations V6__indexes.sql.conf sets executeInTransaction=false. Required for CREATE INDEX on Postgres.
Schema public. All tables reside in the default schema.
Flyway history table flyway_schema_history (default). Created automatically on first run.

Documented Future Schema #

-- Future V7 migration (not yet applied)
CREATE TABLE activity_idempotency (
    idem_key           TEXT        PRIMARY KEY,
    run_id             UUID        NOT NULL REFERENCES runs(run_id) ON DELETE CASCADE,
    activity           TEXT        NOT NULL,
    result_artifact_id UUID        REFERENCES artifacts(artifact_id) ON DELETE SET NULL,
    created_at         TIMESTAMPTZ NOT NULL DEFAULT now()
);
Proposed activity_idempotency table — not yet a migration file.

The idem_key is derived as: sha256(run_id || ":" || activity_name || ":" || canonical_json(input)). On activity retry, the worker checks for an existing row before re-executing the LLM call or GitHub API write.