Data Model
The AMTP schema is managed by
Flyway and consists of
six versioned migrations (V1–V6). All tables reside in the
public schema of the amtp database. Reference:
migrations/flyway.conf.
Entity Relationship Summary #
PK: project_id
PK: run_id
FK: project_id
PK: stage_id
FK: run_id
FK: run_id
FK: stage_id (nullable)
FK: run_id
FK: stage_id (nullable)
artifacts.stage_id and
approvals.stage_id are both nullable. A row always
belongs to a run; optional stage ownership is
set when the artifact or decision is scoped to a specific
Temporal activity invocation.
- A project registers a GitHub repository; it has many runs.
- A run represents one full pipeline execution; it has many stages, artifacts, and approvals.
- A stage maps to one Temporal activity invocation (e.g. CrawlRepo, GenerateTestCases). It optionally owns artifacts and approvals.
- An artifact stores structured JSON output persisted between agent boundaries.
- An approval records a human or automated decision gate for a stage.
All primary keys are UUID v4 generated by
gen_random_uuid(). All timestamps are
TIMESTAMPTZ (time-zone-aware).
V1 — projects
#
Registers a GitHub repository as an AMTP project.
CREATE TABLE projects (
project_id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
name TEXT NOT NULL UNIQUE,
github_repo_full_name TEXT NOT NULL,
description TEXT,
created_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
| Column | Type | Constraints | Notes |
|---|---|---|---|
project_id |
UUID |
PRIMARY KEY, DEFAULT gen_random_uuid() |
Stable identifier for the project. |
name |
TEXT |
NOT NULL, UNIQUE | Human-readable project name; must be unique across all projects. |
github_repo_full_name |
TEXT |
NOT NULL |
GitHub repository in owner/name format, e.g.
acme/api.
|
description |
TEXT |
nullable | Optional free-text description. |
created_at |
TIMESTAMPTZ |
NOT NULL, DEFAULT now() |
Row creation timestamp. |
V2 — runs
#
Represents a single end-to-end pipeline execution. The
status column is the authoritative lifecycle state for a
run; it is managed by the Temporal orchestrator and the webhook
receiver.
CREATE TABLE runs (
run_id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
project_id UUID NOT NULL REFERENCES projects(project_id) ON DELETE CASCADE,
status TEXT NOT NULL DEFAULT 'pending'
CHECK (status IN ('pending','running','passed','failed','cancelled')),
depth_level TEXT NOT NULL CHECK (depth_level IN ('smoke','core','standard','deep')),
branch_name TEXT NOT NULL,
commit_sha TEXT,
started_at TIMESTAMPTZ,
finished_at TIMESTAMPTZ,
created_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
| Column | Type | Constraints | Notes |
|---|---|---|---|
run_id |
UUID |
PRIMARY KEY | Correlation key carried through all agent activities and Valkey keys. |
project_id |
UUID |
NOT NULL, FK → projects, CASCADE | Deleting a project cascades to all its runs. |
status |
TEXT |
CHECK:
pending | running | passed | failed | cancelled
|
See run lifecycle. |
depth_level |
TEXT |
CHECK: smoke | core | standard | deep |
Controls crawl breadth passed to the Repo Crawler agent. |
branch_name |
TEXT |
NOT NULL | Target branch to crawl and use as the PR base branch. |
commit_sha |
TEXT |
nullable | Populated by the Repo Crawler after crawling; the exact ref crawled. |
started_at |
TIMESTAMPTZ |
nullable | Set when the Temporal workflow starts. |
finished_at |
TIMESTAMPTZ |
nullable |
Set on terminal state (passed,
failed, cancelled).
|
created_at |
TIMESTAMPTZ |
NOT NULL, DEFAULT now() |
Row creation timestamp (webhook receipt time). |
V3 — stages
#
Maps to individual Temporal activity invocations within a run. The
awaiting_approval status is used when a stage is blocked
on a human decision (e.g. BranchProtectionViolation).
CREATE TABLE stages (
stage_id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
run_id UUID NOT NULL REFERENCES runs(run_id) ON DELETE CASCADE,
name TEXT NOT NULL,
sequence INTEGER NOT NULL,
attempt INTEGER NOT NULL DEFAULT 1,
status TEXT NOT NULL DEFAULT 'pending'
CHECK (status IN
('pending','running','awaiting_approval',
'passed','failed','skipped','cancelled')),
started_at TIMESTAMPTZ,
finished_at TIMESTAMPTZ,
created_at TIMESTAMPTZ NOT NULL DEFAULT now(),
UNIQUE (run_id, sequence, attempt)
);
| Column | Type | Constraints | Notes |
|---|---|---|---|
stage_id |
UUID |
PRIMARY KEY | |
run_id |
UUID |
NOT NULL, FK → runs, CASCADE | |
name |
TEXT |
NOT NULL |
Human-readable name, e.g. CrawlRepo,
GenerateTestCases.
|
sequence |
INTEGER |
NOT NULL | Execution order within the run (1-based). |
attempt |
INTEGER |
NOT NULL DEFAULT 1 | Retry attempt number; a new row is inserted for each retry. |
status |
TEXT |
CHECK: 7 values |
awaiting_approval gates workflow on a human
decision. See
stage lifecycle.
|
started_at / finished_at |
TIMESTAMPTZ |
nullable | Activity-level timing. |
UNIQUE(run_id, sequence, attempt) |
Prevents duplicate stage records for the same retry of the same step. | ||
V4 — artifacts
#
Stores structured JSON output from each agent activity. The
content JSONB column is the source of truth for agent
handoffs; Valkey never stores handoff payloads.
CREATE TABLE artifacts (
artifact_id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
run_id UUID NOT NULL REFERENCES runs(run_id) ON DELETE CASCADE,
stage_id UUID REFERENCES stages(stage_id) ON DELETE SET NULL,
kind TEXT NOT NULL,
content JSONB NOT NULL DEFAULT '{}'::jsonb
CHECK (jsonb_typeof(content) = 'object'),
storage_uri TEXT,
created_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
| Column | Type | Constraints | Notes |
|---|---|---|---|
artifact_id |
UUID |
PRIMARY KEY | |
run_id |
UUID |
NOT NULL, FK → runs, CASCADE | |
stage_id |
UUID |
nullable, FK → stages, SET NULL |
Nullable so that run-level artifacts (e.g.
failure_report) do not require a stage.
|
kind |
TEXT |
NOT NULL |
Semantic type: repo_crawler_output,
test_case_generator_output,
test_engineer_output,
failure_report.
|
content |
JSONB |
NOT NULL, CHECK: must be object | Validated agent output or structured failure details. GIN-indexed (V6) for containment queries. |
storage_uri |
TEXT |
nullable |
Garage S3 URI for large binary artifacts stored externally,
e.g. s3://amtp-bundles/{run_id}/bundle.zip.
|
created_at |
TIMESTAMPTZ |
NOT NULL, DEFAULT now() |
JSONB constraint.
jsonb_typeof(content) = 'object' prevents accidental
insertion of JSON arrays or scalar values. Agent output schemas all
produce top-level JSON objects; this is enforced at both the
schema-validation layer and the database level.
V5 — approvals
#
Records human or automated decision gates. Created when a stage
transitions to
awaiting_approval (e.g. on
BranchProtectionViolation).
CREATE TABLE approvals (
approval_id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
run_id UUID NOT NULL REFERENCES runs(run_id) ON DELETE CASCADE,
stage_id UUID REFERENCES stages(stage_id) ON DELETE CASCADE,
approver TEXT NOT NULL,
decision TEXT NOT NULL DEFAULT 'pending'
CHECK (decision IN ('pending','approved','rejected')),
comment TEXT,
decided_at TIMESTAMPTZ,
created_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
| Column | Type | Constraints | Notes |
|---|---|---|---|
approval_id |
UUID |
PRIMARY KEY | |
run_id |
UUID |
NOT NULL, FK → runs, CASCADE | |
stage_id |
UUID |
nullable, FK → stages, CASCADE | Nullable for run-level approval records not tied to a specific stage. |
approver |
TEXT |
NOT NULL |
Identity of the decision maker: a user ID, system name (e.g.
branch-protection), or external agent.
|
decision |
TEXT |
CHECK: pending | approved | rejected |
Initial value is always pending; updated
externally when a decision is made.
|
comment |
TEXT |
nullable |
Free-text decision rationale or GitHub API error body (for
BranchProtectionViolation).
|
decided_at |
TIMESTAMPTZ |
nullable |
Set when decision transitions from
pending.
|
created_at |
TIMESTAMPTZ |
NOT NULL |
V6 — Indexes #
All indexes are created in a separate migration executed outside a
transaction (migrations/sql/V6__indexes.sql.conf sets
executeInTransaction=false), because
CREATE INDEX cannot run inside a transaction block in
Postgres.
-- FK lookup indexes
CREATE INDEX idx_runs_project_id ON runs(project_id);
CREATE INDEX idx_stages_run_id ON stages(run_id);
CREATE INDEX idx_artifacts_run_id ON artifacts(run_id);
CREATE INDEX idx_artifacts_stage_id ON artifacts(stage_id);
CREATE INDEX idx_approvals_run_id ON approvals(run_id);
CREATE INDEX idx_approvals_stage_id ON approvals(stage_id);
-- Query-path indexes
CREATE INDEX idx_runs_status_created ON runs(status, created_at DESC);
CREATE INDEX idx_runs_depth_level ON runs(depth_level);
CREATE INDEX idx_stages_status ON stages(status);
CREATE INDEX idx_artifacts_kind ON artifacts(kind);
CREATE INDEX idx_approvals_decision ON approvals(decision);
-- JSONB GIN indexes for containment/key queries
CREATE INDEX idx_artifacts_content_gin ON artifacts USING GIN (content jsonb_path_ops);
Index strategy notes #
| Index | Type | Purpose |
|---|---|---|
idx_runs_project_ididx_stages_run_ididx_artifacts_run_ididx_approvals_run_id
|
B-tree | Foreign-key lookup performance. Without these, cascading deletes and JOIN queries on the FK columns perform full sequential scans. |
idx_runs_status_created |
B-tree (composite) |
Supports the common query pattern:
list active runs ordered by creation time. Composite
order matches the
ORDER BY status, created_at DESC pattern.
|
idx_artifacts_content_gin |
GIN (jsonb_path_ops) |
Supports JSONB containment (@>) and key-exists
(?) queries on artifacts.content.
jsonb_path_ops is smaller and faster than
jsonb_ops for containment queries.
|
Flyway Conventions #
| Convention | Value / Behavior |
|---|---|
| Migration file naming |
V{n}__{description}.sql — double underscore
separator. Version numbers are integers, strictly increasing.
|
| Checksum validation |
validateOnMigrate=true. Applied migration files
must never be modified after deployment.
|
cleanDisabled |
true. flyway clean is permanently
disabled; prevents accidental schema wipe in any environment.
|
baselineOnMigrate |
true. Allows Flyway to baseline a pre-existing
schema on first run without requiring an empty database.
|
| Non-transactional migrations |
V6__indexes.sql.conf sets
executeInTransaction=false. Required for
CREATE INDEX on Postgres.
|
| Schema |
public. All tables reside in the default schema.
|
| Flyway history table |
flyway_schema_history (default). Created
automatically on first run.
|
Documented Future Schema #
-- Future V7 migration (not yet applied)
CREATE TABLE activity_idempotency (
idem_key TEXT PRIMARY KEY,
run_id UUID NOT NULL REFERENCES runs(run_id) ON DELETE CASCADE,
activity TEXT NOT NULL,
result_artifact_id UUID REFERENCES artifacts(artifact_id) ON DELETE SET NULL,
created_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
activity_idempotency table — not yet a
migration file.
The idem_key is derived as:
sha256(run_id || ":" || activity_name || ":" ||
canonical_json(input)). On activity retry, the worker checks for an existing row before
re-executing the LLM call or GitHub API write.