Agent API

Stable, dual-auth API + CLI for terminals, scripts, and agents. Base URL: https://api.getpromethic.com/api/v2/public

Quickstart

Issue a key in the web app at app.getpromethic.com → Settings → Developer Keys, then:

curl -H "X-API-Key: pmk_..." \
  https://api.getpromethic.com/api/v2/public/prompts

Or via the CLI:

npm install -g @soulwarestudio/promethic-cli
promethic auth login                              # paste pmk_... key
promethic prompts list
promethic run <prompt-id> --input "summarize this article"

For write-scope keys, an agent can author a whole prompt declaratively via a YAML manifest. Note the nested parameters object — the server requires { model_id, parameters } shape so per-model parameter values stay distinct from envelope fields.

Step one: discover a real model_id via GET /api/v2/public/models. Catalog IDs are opaque (e.g., gpt54nano_2c6f9b4d); the wire name isn't the marketing name. Pick one from the response:

curl -H "X-API-Key: pmk_..." \
  https://api.getpromethic.com/api/v2/public/models \
  | jq '.recommended_defaults'
# → {"model_id": "gpt55_8e2b1d4f", "reasoning_effort": "medium", ...}

Step two: paste it directly into your manifest — the catalog response is round-trippable. Copy model_id into modelSettings.model_id; pick parameter values from each param's values / min / max / provider_default.

# prompt.yaml
name: Article summarizer
promptText: |
  Summarize the input in three bullets.
modelSettings:
  model_id: gpt54nano_2c6f9b4d   # ← any model_id returned by GET /models
  parameters:
    reasoning_effort: low
attachments:
  - type: text
    file: ./examples.txt

promethic prompts create --manifest prompt.yaml

Authentication

All requests carry an API key in the X-API-Key header. Keys start with pmk_ and are scoped to one user. Three scopes are available in V1.0.1:

Scope	Grants
read	list/get prompts, versions, records, attachments, record images, models catalog
execute	run, revise (runId), finalize
write	create + update + delete prompts, versions, attachments, records (V1.0.1 + V1.1 Phases 2 & 3)

Scopes are checked literally; an execute-only key cannot list prompts. To do all three, request all three: scopes: ["read", "execute", "write"]. DELETE has been part of write since V1.1; per-prompt grants limit the blast radius of a leaked key.

Per-prompt grants V1.1

On top of scopes, each key can be optionally restricted to a specific set of prompts. Managed via the web app at app.getpromethic.com → Settings → Developer Keys → Manage; agents do not configure their own restrictions.

Unrestricted (default) — a freshly-minted key has zero per-prompt grants and can access ALL of your prompts, gated only by its scopes.
Restricted — once you add ANY prompt to a key's grant list, the key is restricted: only listed prompts are accessible. Calls to other prompts return 403 grant_required.
Removing the last prompt from a restricted key returns it to unrestricted (the web UI confirms this transition).

Enforcement is per-call: every endpoint that resolves to a single prompt (run, revise on runId, finalize, prompt and version reads, prompt PATCH/version create/attachment upload, record GET/image/DELETE/PATCH) re-checks the grant list at the time of the call. If you revoke a grant mid-workflow, the next call to the affected prompt 403s.
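The unrestricted/restricted rule can be modelled client-side in a few lines of Python. This is only a sketch for reasoning about what a key should be able to reach: the function name and the idea of holding a copy of the grant list locally are assumptions, and the server remains the authority because it re-checks on every call.

```python
from typing import Iterable


def prompt_accessible(grant_list: Iterable[str], prompt_id: str) -> bool:
    """Return True if a key with this grant list may touch prompt_id.

    Scopes are a separate check and are not modelled here.
    """
    grants = set(grant_list)
    if not grants:
        # Zero grants means the key is unrestricted.
        return True
    # Any grant at all makes the key restricted to the listed prompts;
    # the server would answer 403 grant_required for anything else.
    return prompt_id in grants


print(prompt_accessible([], "prompt_a"))
print(prompt_accessible(["prompt_a"], "prompt_b"))
```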

Mixed credentials: a request carrying both a session token and an API key MUST resolve to the same user, else the server returns 400 mixed_credentials_principal_mismatch.

Endpoints

Timestamps: all *AtUtc fields are UTC ISO 8601 strings (e.g. "2026-05-29T14:32:00Z"). Parse with new Date(str) or a library date parser; do not strip the Z suffix.
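For Python callers, a minimal sketch of parsing an *AtUtc value while keeping its UTC meaning. The helper name parse_utc is hypothetical; the Z-to-+00:00 substitution is there only so the code also runs on Python versions before 3.11, whose datetime.fromisoformat does not accept a trailing Z.

```python
from datetime import datetime, timezone


def parse_utc(value: str) -> datetime:
    """Parse an ISO 8601 UTC string such as "2026-05-29T14:32:00Z"."""
    # Map the Z suffix to an explicit offset instead of dropping it, so the
    # result is timezone-aware rather than a naive local time.
    if value.endswith("Z"):
        value = value[:-1] + "+00:00"
    parsed = datetime.fromisoformat(value)
    if parsed.tzinfo is None:
        raise ValueError("timestamp carries no UTC designator")
    return parsed.astimezone(timezone.utc)


created = parse_utc("2026-05-29T14:32:00Z")
print(created.isoformat())
```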

Read (`read` scope)

GET	`/prompts?limit=&cursor=`	list (slim DTO)
GET	`/prompts/{id}`	+ current version
GET	`/prompts/{id}/versions`	paginated history — sorted by `versionNumber` descending (highest first); cursor is version-number-based, not timestamp-based
GET	`/prompts/{id}/versions/{vid}`	single version
GET	`/prompts/{id}/attachments`	prompt attachments
GET	`/attachments/{id}`	attachment download
GET	`/records?promptId=&versionId=&source=&createdBy=&limit=&maxOutputChars=&maxInputChars=`	list (cursor-paginated); each record includes `inputText` (truncated at `maxInputChars`, default 4096, 0–32768) and `outputText` (truncated at `maxOutputChars`); `truncated` (output) / `inputTruncated` booleans indicate truncation; `source` is a string label — `"API"` (MCP `run_prompt` or REST API key execution), `"App"` (Avalonia desktop client), `"Manual"` (created via `create_record`, no LLM — `versionId` is `null`), `"Headless"` (Expo mobile client — not MCP)
GET	`/records/{id}`	slim record DTO
GET	`/records/{id}/image?index=N`	image PNG (binary)
GET	`/models`	catalog: model_id, supportedOutputModalities, costs (V1.0.1)

Execute (`execute` scope)

POST	`/prompts/{id}/run`	creates RunSession; SSE
POST	`/runs/{runId}/revise`	append turn; SSE
POST	`/runs/{runId}/finalize`	session→record (V1.1: `?persist=` removed; always commits)
GET	`/runs/{runId}/images/{N}`	session image (active sessions)
POST	`/records/{id}/revise` V1.2	rehydrate fresh run from record + revise (replaces V1.1 `/revise-again`); per-record advisory lock; SSE
DELETE	`/records/{id}` V1.1	self-delete (ApiKey-owned, 24h window)
PATCH	`/records/{id}` V1.1	amend notes/tag (RFC 7396; no time window)

Run lifecycle body shapes V1.1

The execute surface mirrors the desktop Avalonia app's Convert / Revise / Copy / Copy & Tag flow. Body shapes for each call:

`POST /runs/{runId}/revise`

{
  "instruction": "make it more concise",        // required
  "intermediateOutput": "edited prior output"   // optional
}

intermediateOutput is the user's edited prior-turn output. When passed, the model sees this (instead of the prior turn's actual output) as context for this revision, AND it lands in the resulting ConversionDelta.IntermediateOutput. Mirrors the Avalonia "user edited the textbox before hitting Revise" flow. Omit for vanilla revise-from-prior-output.

Constraints: image-modality runs reject this field (400 intermediate_output_not_supported_image); image revisions always source from the model's actual prior output. Pass null or omit to use the prior output; an empty string or whitespace-only value is rejected (400 invalid_params). 32 KB cap.

Record impact: because get_record reconstructs each turn's output as "what the next turn saw as prior context," passing intermediateOutput means turns[N-1].output in the resulting record will reflect your override value, not the raw model output from that turn. Only use this when you have genuinely edited the prior output and want the record to capture that edited state as the correction baseline.

Session state: Finalized sessions are transparently reopened. The server transitions Finalized → Active before the LLM call, so there is no need to switch to POST /records/{id}/revise after a finalize. Expired or Failed runs cannot be revised; start a fresh /run. An Abandoned run returns 404 run_not_found.
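A client can catch most of these rejections before sending anything. The Python sketch below builds a revise body under the rules stated above; the function name, the image_run flag, and the reading of "32 KB" as 32 × 1024 UTF-8 bytes are assumptions rather than documented behaviour.

```python
from typing import Optional

# Assumption: the 32 KB cap is counted in UTF-8 bytes, with 1 KB = 1024 bytes.
INTERMEDIATE_OUTPUT_CAP_BYTES = 32 * 1024


def build_revise_body(instruction: str,
                      intermediate_output: Optional[str] = None,
                      image_run: bool = False) -> dict:
    """Build the JSON body for POST /runs/{runId}/revise, or raise ValueError."""
    if not instruction:
        raise ValueError("instruction is required")
    body = {"instruction": instruction}
    if intermediate_output is None:
        # Omitted: the model sees the prior turn's actual output.
        return body
    if image_run:
        raise ValueError("intermediate_output_not_supported_image")
    if not intermediate_output.strip():
        raise ValueError("invalid_params")
    if len(intermediate_output.encode("utf-8")) > INTERMEDIATE_OUTPUT_CAP_BYTES:
        raise ValueError("intermediateOutput exceeds the 32 KB cap")
    body["intermediateOutput"] = intermediate_output
    return body


print(build_revise_body("make it more concise", "edited prior output"))
```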

`POST /runs/{runId}/finalize`

{
  "finalText": "edited final output",   // optional; creates edit delta if differs from model output
  "tag": "exemplar",                    // optional; attaches to edit delta (requires finalText)
  "notes": "context for this run"       // optional; record-level
}

Three-axis surface that mirrors Avalonia's Copy / Copy & Tag semantics exactly:

Plain finalize (no body): equivalent to Avalonia Copy with no edits. Saves the record from the model's last output. No edit delta.
finalText only: Avalonia Copy after editing the output box. If finalText differs from the model's last output, server creates an edit delta with empty tag. If it matches, no edit delta (treated as a clean Copy).
finalText + tag: Avalonia Copy & Tag. Tag attaches to the edit delta — the Refine wizard signal. Server returns 400 tag_without_delta if finalText is omitted or matches the model output (no edit delta to tag). Avalonia disables the Copy & Tag button in the same situation.
notes: independent of the above. Record-level free text. Available without finalText on plain finalize, or alongside finalText/tag.

Image-modality runs reject finalText (400 final_text_not_supported_image) — record.finalCopiedOutput for image records is server-derived from the per-turn effectivePromptForImage accumulation chain (training-data invariant; CLAUDE.md "Image Records: FinalCopiedOutput as Accumulated Prompt").

Caps: finalText 256 KB (413 final_text_too_large); if provided, finalText cannot be an empty string or whitespace-only value (400 invalid_params); whitespace-only tag is also rejected with 400 invalid_params; notes 64 KB (413 notes_too_large); tag 64 KB (413 tag_too_large).
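The finalize rules and caps above can be collected into one pre-flight check. This Python sketch returns the error code the documented rules predict, or None when the body looks acceptable. The function name and parameters are hypothetical, sizes are assumed to be UTF-8 bytes with 1 KB = 1024 bytes, and the order in which the server evaluates the checks is not documented.

```python
from typing import Optional

KB = 1024  # assumption: caps are counted in UTF-8 bytes


def check_finalize_body(final_text: Optional[str] = None,
                        tag: Optional[str] = None,
                        notes: Optional[str] = None,
                        model_output: str = "",
                        image_run: bool = False) -> Optional[str]:
    """Predict the rejection code for a /finalize body, or None if valid."""
    if final_text is not None:
        if image_run:
            return "final_text_not_supported_image"
        if not final_text.strip():
            return "invalid_params"
        if len(final_text.encode("utf-8")) > 256 * KB:
            return "final_text_too_large"
    if tag is not None:
        if not tag.strip():
            return "invalid_params"
        if len(tag.encode("utf-8")) > 64 * KB:
            return "tag_too_large"
        # A tag attaches to the edit delta, which exists only when finalText
        # is present and differs from the model's last output.
        if final_text is None or final_text == model_output:
            return "tag_without_delta"
    if notes is not None and len(notes.encode("utf-8")) > 64 * KB:
        return "notes_too_large"
    return None


print(check_finalize_body(tag="exemplar"))
```

The example call passes a tag with no finalText, so it reports tag_without_delta.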

Record DTO `turns[]` field V1.2

Records returned by GET /records, GET /records/{id}, and POST /finalize include a turns[] array that is a synthesized linear history of the record's states (run + revisions + optional edit). Each entry has a stable index matching the fromTurn parameter accepted by /revise and /finalize.

{
  "id": "...",
  "promptId": "...",
  "inputText": "Summarize this article: ...",
  "finalCopiedOutput": "Y_edited",
  "turns": [
    { "index": 0, "kind": "run",      "input": "Summarize this article: ...", "output": "X" },
    { "index": 1, "kind": "revision", "instruction": "make formal",
      "intermediateOutput": "X", "output": "Y", "modelId": "...", "costMilliCents": 1234 },
    { "index": 2, "kind": "edit",     "intermediateOutput": "Y",
      "output": "Y_edited", "tag": "user-edit" }
  ],
  ...
}

Three kinds:

kind: "run" — always index 0. Carries input (the prompt's user input). NOTE: output for the run turn is "what the next turn saw as its prior context, OR the record's finalCopiedOutput if no later turn." If a user edited the textbox client-side before pressing Revise, the edit was committed forward as the next revision's intermediateOutput; the model's literal output text is not preserved.
kind: "revision" — each /revise call appends one. instruction is the user's revise instruction. intermediateOutput is what the model saw as prior context.
kind: "edit" — at most one per record; always the last entry. Created when /finalize was called with finalText differing from the model's last output. intermediateOutput is the model's actual last output before the edit; output is the user's edited text (= record's finalCopiedOutput).

Use turns[].index to pass fromTurn on /revise or /finalize for rewind-and-redo (V1.2). Index space: fromTurn is 0-based and uses the same index as turns[].index in get_record. The valid range is 0 through turns.length - 2 inclusive; the last turn index (turns.length - 1) is not accepted, because passing it would revert the record to an empty state. from_turn_invalid is returned for negative values; from_turn_out_of_range is returned if the value exceeds the valid range.
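As an illustration of the fromTurn index rule, the Python sketch below maps a candidate value to the error code described above. The helper name is hypothetical, and it only mirrors the stated range (0 up to turns.length - 2); the server remains the authority.

```python
from typing import Optional


def check_from_turn(from_turn: int, turn_count: int) -> Optional[str]:
    """Return the predicted error code for fromTurn, or None if it is valid.

    turn_count is len(record["turns"]).
    """
    if from_turn < 0:
        return "from_turn_invalid"
    if from_turn > turn_count - 2:
        # The last turn index (turn_count - 1) is outside the valid range.
        return "from_turn_out_of_range"
    return None


turns = [
    {"index": 0, "kind": "run"},
    {"index": 1, "kind": "revision"},
    {"index": 2, "kind": "edit"},
]
for candidate in (-1, 0, 1, 2):
    print(candidate, check_from_turn(candidate, len(turns)))
```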

Record self-management (V1.1)

DELETE /api/v2/public/records/{id} hard-deletes a record if and only if (a) the caller is an API key, (b) the record was created by the same API key (credentialPrincipalType=ApiKey + matching id), and (c) the record is less than 24h old (anchored to record.createdAt). Returns 204 No Content on success, or 403 record_not_owned_by_api_key / 409 record_self_delete_window_expired / 404 record_not_found otherwise. Delete is hard (cascade to deltas; FK ON DELETE SET NULL on RunSession.FinalizedRecordId auto-clears any finalize-replay rendezvous so the next /finalize replay returns 410 record_was_deleted). Image bytes are best-effort cleaned from blob storage. Retries after a successful DELETE return 404 — the row is gone, so HTTP-level idempotency is by-construction (the second call returns 404 record_not_found). Note: the MCP delete_record tool uses soft-delete and is idempotent — retries return {status:"deleted"} instead of a 404.
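As a rough illustration, the three DELETE conditions can be written as a predictor of the status code a caller should expect. The function and parameter names are hypothetical, the order in which the server evaluates the checks is an assumption, and the 404 case for a missing record is left out.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

SELF_DELETE_WINDOW = timedelta(hours=24)


def predict_delete_status(created_at: datetime,
                          creator_key_id: Optional[str],
                          caller_key_id: Optional[str],
                          now: datetime) -> int:
    """Predict the HTTP status of DELETE /records/{id} for an existing record."""
    # (a) and (b): the caller must be an API key, and the same key that
    # created the record.
    if caller_key_id is None or creator_key_id != caller_key_id:
        return 403  # record_not_owned_by_api_key
    # (c): the record must be less than 24h old, anchored to createdAt.
    if now - created_at >= SELF_DELETE_WINDOW:
        return 409  # record_self_delete_window_expired
    return 204


created = datetime(2026, 5, 29, 14, 32, tzinfo=timezone.utc)
print(predict_delete_status(created, "key_a", "key_a", created + timedelta(hours=23)))
```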

PATCH /api/v2/public/records/{id} updates mutable fields on a record. Same ApiKey-owned check as DELETE, but no time window. Body is RFC 7396 JSON Merge Patch (Content-Type: application/merge-patch+json): missing key = unchanged, explicit null = clear, value = set.

Behavior depends on the record's source field:

Manual records (source = "Manual", created via create_record): notes, input, and output are freely editable with no delta ever created. tag and fromTurn return 400 invalid_request — manual records are pure seed-data pairs with no delta chain.

Non-manual records (App / API / Headless): writing output auto-creates or updates the record's single edit delta. intermediateOutput is set once on first edit and is immutable (anchors the model's last returned output); subsequent output patches update finalCopiedOutput only. Revert to model output: if output equals the existing edit delta's intermediateOutput (the model's last returned output), the edit delta is removed and editCount returns to 0 — you are undoing the correction. Providing tag alongside a reverting output returns 400 tag_would_be_lost_on_revert (the delta is being removed; omit tag to revert cleanly, or use a different output to keep the delta and anchor the tag). tag is optional annotation on the delta and requires output; write it as a generalizable rule or vibe that applies to other inputs too (e.g. “Less formal”, “Remove em-dashes”), not a description of this specific diff — it feeds prompt refinement; if an edit delta already exists and output is unchanged, tag-only update is allowed. Pass fromTurn: N to revert to delta[N].intermediateOutput and soft-delete all later deltas. fromTurn: 0 restores the state before the first surviving delta: the original model output on unrevised records, or the pre-first-revision state on revised records (which may include prior user edits). Intermediate edit states between patches are not recoverable. Mutually exclusive with output and tag. notes is a record-level label, always valid on all record types. Revisions squash edit deltas: when revise_run / POST /runs/{runId}/revise finalizes, all prior deltas (including any edit delta from patch_record) are replaced by the new revision delta set. After a revision, fromTurn: 0 therefore restores the pre-revision desired output, not the original conversion output.

Response: { id, notes, tag, input, output, lastPatchedAtUtc }. HIPAA §164.312(b) audit row is written with field presence/length/SHA-256 prefix metadata only — never raw content.
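The merge-patch rule used by PATCH (missing key = unchanged, explicit null = clear, value = set) is the generic RFC 7396 algorithm. A minimal Python sketch of that algorithm follows; it models only the generic merge on a plain dict, not the server's delta handling or field validation, and the sample record is made up.

```python
from typing import Any


def merge_patch(target: Any, patch: Any) -> Any:
    """Apply an RFC 7396 JSON Merge Patch and return the merged value."""
    if not isinstance(patch, dict):
        # A non-object patch replaces the target wholesale.
        return patch
    result = dict(target) if isinstance(target, dict) else {}
    for key, value in patch.items():
        if value is None:
            # Explicit null clears the member.
            result.pop(key, None)
        else:
            # Objects merge recursively; other values overwrite.
            result[key] = merge_patch(result.get(key), value)
    return result


record = {"notes": "first draft", "tag": "exemplar"}
print(merge_patch(record, {"notes": None, "tag": "Less formal"}))
```

In this generic form a cleared member disappears from the dict; whether the API reports a cleared field as null or omits it is not modelled here.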

Write (`write` scope) V1.0.1

All mutating POSTs honour the Idempotency-Key header — see Idempotency. PATCH /prompts/{id} follows RFC 7396 JSON Merge Patch: missing keys leave server state untouched; explicit null clears.

POST	`/prompts`	create prompt + initial version
PATCH	`/prompts/{id}`	RFC 7396 merge patch — `application/merge-patch+json`
PUT	`/prompts/{id}/current-version`	switch which version is "current"; returns `{ currentVersionId }` reflecting actual post-update DB state (not just echoing the request)
POST	`/prompts/{id}/versions`	append new version (auto-increments versionNumber)
PATCH	`/prompts/{id}/versions/{vid}` V1.1	RFC 7396 merge-patch of an existing version — `promptText`, `modelSettingsJson`, `versionDescription`, `description`, `descriptionMode`. Versions are mutable in place; create a new version only when you want an explicit audit checkpoint. Writing `description` auto-flips to Manual; pass `""` to clear and revert to Auto. Optional `If-Match` header (hex Ticks of `UpdatedAt`) for optimistic concurrency.
POST	`/prompts/{id}/attachments`	upload file (multipart/form-data)
DELETE	`/prompts/{id}` V1.1	soft-delete prompt; cascades hide records/versions/attachments under it; idempotent on already-deleted — returns `{promptId, status:"deleted"}`; subsequent reads of a deleted prompt return `prompt_not_found`
DELETE	`/prompts/{id}/versions/{vid}` V1.1	soft-delete version; if the version is current, auto-switches `currentVersionId` to the lowest-VersionNumber remaining version and returns `newCurrentVersionId` in the response; rejects only if it is the sole remaining version (`409 cannot_delete_only_version`)
DELETE	`/attachments/{id}` V1.1	soft-delete + storage refund; idempotent — double-delete of an already-deleted attachment returns 204 (only first delete decrements storage); 409 `attachment_referenced_by_active_run` if any active run's snapshot references the blob

Field caps: name ≤ 256 chars, promptText ≤ 256 KB, modelSettings JSON ≤ 64 KB, text attachments ≤ 5 MB, image attachments ≤ 10 MB, ≤ 20 attachments per prompt, ≤ 50 MB per prompt, 1 GB per user.

Cursor pagination

List endpoints return { items: [...], nextCursor?: string }. nextCursor is present only when more pages exist — it is absent (not null) on the final page. Stop paginating when the field is not in the response. Cursors are signed (HMAC-SHA256) with a server-side root key; tampered cursors return 400 invalid_cursor; cursors issued for one user can't be replayed by another (400 invalid_cursor); changing query filters mid-pagination returns 400 cursor_filter_mismatch.
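A minimal Python sketch of the stop condition: keep requesting pages until nextCursor is absent from the response. The fetch_page callable is a hypothetical stand-in for an HTTP call to a list endpoint; here it is backed by an in-memory dict so the loop can be run without network access.

```python
from typing import Callable, Iterator, Optional


def iter_items(fetch_page: Callable[[Optional[str]], dict]) -> Iterator[dict]:
    """Yield every item across pages, following nextCursor until it is absent."""
    cursor = None
    while True:
        page = fetch_page(cursor)
        yield from page["items"]
        # The final page omits the field entirely, so test for the key
        # rather than comparing its value with None.
        if "nextCursor" not in page:
            return
        cursor = page["nextCursor"]


# Made-up pages standing in for GET /prompts?cursor=...
FAKE_PAGES = {
    None: {"items": [{"id": "a"}, {"id": "b"}], "nextCursor": "c1"},
    "c1": {"items": [{"id": "c"}]},
}
ids = [item["id"] for item in iter_items(lambda cursor: FAKE_PAGES[cursor])]
print(ids)
```

Keep the query filters identical across calls; changing them mid-pagination is what triggers cursor_filter_mismatch.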

Page size (limit query parameter): The REST API accepts limit in the range 1–500 (default 100). Values below 1 return 400 param_out_of_range; values above 500 are silently clamped to 500. The MCP tool accepts limit in the range 1–100 (default 25); values outside that range return 400 param_out_of_range.
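The two limit policies differ in how they treat oversized values, which is easy to miss. The Python sketch below restates them side by side; both function names are hypothetical and the code simply mirrors the ranges and defaults given above.

```python
from typing import Optional


def rest_limit(requested: Optional[int] = None) -> int:
    """Effective page size on the REST API: default 100, clamp above 500."""
    if requested is None:
        return 100
    if requested < 1:
        raise ValueError("param_out_of_range")
    return min(requested, 500)


def mcp_limit(requested: Optional[int] = None) -> int:
    """Effective page size on the MCP tool: default 25, strict 1-100 range."""
    if requested is None:
        return 25
    if not 1 <= requested <= 100:
        raise ValueError("param_out_of_range")
    return requested


print(rest_limit(1000), mcp_limit(50))
```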

SSE protocol

All run-producing endpoints (/run, /runs/{runId}/revise, /records/{id}/revise) return 200 OK with Content-Type: text/event-stream. Errors are in-stream events, not HTTP status codes — clients should NOT branch on HTTP status for these endpoints.

Event taxonomy (protocol v1)

Event	When	Payload
`run_session`	First event	`{protocolVersion, runId, turnIndex, modelId, outputModality?, seededFromRecordId?}`
`client_hint` V1.1 Phase 0	Image-modality runs only, immediately after `run_session`	`{filter: "partial_image_b64", reason: "image_modality_inline_payloads", canonicalAccess: "GET /api/v2/public/runs/{runId}/images/{n}"}`
(upstream events)	Middle	OpenAI's `response.output_text.delta` etc., passed through verbatim
`run_completed`	Success terminator	`{runId, turnIndex, modelId, costMilliCents, imageCount?}`
`run_failed`	Failure terminator	`{runId, reasonCode, message?, charged, costMilliCents?, usageLogId?}`
`run_replayed` V1.1	Idempotency-Key replay terminator	`{runId, turnIndex, modelId, outputModality?, state, streamingInProgress, recordId?, hint}`
`record_finalized` V1.2	Chained auto-finalize succeeded; emitted AFTER `run_completed` on the same stream when `?autoFinalize=true`	`{runId, recordId, turns, costMilliCents?}`
`record_finalize_failed` V1.2	Chained auto-finalize failed; the upstream run already succeeded. If `retryable`, agent calls `POST /finalize` manually.	`{runId, reasonCode, retryable}`
`record_finalize_skipped` V1.2	Informational; emitted AFTER `run_failed` when `?autoFinalize=true` was set. Agents MUST NOT trigger separate failure handling — the failure is already reported in `run_failed`.	`{runId, reason: "run_failed", reasonCode}`

Forward-compat: clients MUST ignore unknown event names. Servers MUST NOT remove or rename existing event names within a protocol version. Adding new event names is OK. Check protocolVersion in run_session against your parser's expected version.
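To make the event handling concrete, here is a small Python sketch of a stream consumer that ignores unknown event names and records the terminator. It assumes standard text/event-stream framing (event: and data: lines separated by blank lines) with JSON payloads; the sample stream, including the protocolVersion value, is made up for illustration.

```python
import json
from typing import Iterable, Iterator, Tuple

TERMINATORS = {"run_completed", "run_failed", "run_replayed"}


def parse_sse(lines: Iterable[str]) -> Iterator[Tuple[str, dict]]:
    """Yield (event_name, payload) pairs from SSE lines."""
    event, data = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line dispatches the buffered event.
            if data:
                yield event, json.loads("\n".join(data))
            event, data = "message", []
        elif line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data.append(line[len("data:"):].lstrip())
        # Comment lines and other fields are ignored.


def summarize(events: Iterable[Tuple[str, dict]]) -> dict:
    """Collect the run id and terminator; unknown event names are skipped."""
    outcome = {"runId": None, "status": "incomplete", "terminal": None}
    for name, payload in events:
        if name == "run_session":
            outcome["runId"] = payload["runId"]
            outcome["protocolVersion"] = payload["protocolVersion"]
        elif name in TERMINATORS:
            # Keep reading: record_finalized may follow run_completed.
            outcome["status"] = name
            outcome["terminal"] = payload
    return outcome


SAMPLE = [
    "event: run_session",
    'data: {"protocolVersion": 1, "runId": "r1", "turnIndex": 0, "modelId": "m"}',
    "",
    "event: some_future_event",
    'data: {"anything": true}',
    "",
    "event: run_completed",
    'data: {"runId": "r1", "turnIndex": 0, "modelId": "m", "costMilliCents": 503}',
    "",
]
outcome = summarize(parse_sse(SAMPLE))
print(outcome["status"], outcome["terminal"]["costMilliCents"])
```

Success or failure is decided from the terminator event, not from the HTTP status.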

Image-modality runs:

HTTP callers (direct REST to /api/v2/public): the SSE stream includes inline base64 image payloads on response.image_generation_call.partial_image and response.output_item.done. These can be tens of KB to several MB per frame. Most clients will want to filter them out and fetch the canonical bytes via GET /api/v2/public/runs/{runId}/images/{n} after run_completed. A client_hint event is emitted right after run_session to flag this.
MCP callers (mcp.getpromethic.com/v1): the MCP transport drops partial_image_* events and redacts large base64 from kept frames automatically. Final image bytes are returned inline as MCP image content blocks in the tools/call result alongside the text transcript — no follow-up fetch needed. Clients that don't render image content can still call promethic_get_run_image by index, or GET /records/{id}/image?index=N after finalize.
run_completed.imageCount reports how many images THIS turn produced (per-turn, not session-aggregate). Iterate n in [0, imageCount). A text-only revise on an image session emits imageCount: 0.
Pre-finalize, GET /runs/{runId}/images/{n} reads the LATEST turn's images. To access prior-turn images after the run ends, finalize first and read record.imageStoredPath via GET /records/{id}/image?index=N — every per-turn image is preserved on the record.
If image generation completed upstream but blob storage failed (3× retries exhausted), the run terminates with run_failed { reasonCode: "image_upload_failed", charged: true }. Replay with the same Idempotency-Key returns this same failure (no re-attempt, no double-bill).
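A short Python sketch of the per-turn image iteration: given the runId and the imageCount from run_completed, build the canonical image URLs for n in [0, imageCount). The function name is hypothetical; the URL shape is the one given for GET /runs/{runId}/images/{n}.

```python
from typing import List

BASE_URL = "https://api.getpromethic.com/api/v2/public"


def run_image_urls(run_id: str, image_count: int) -> List[str]:
    """URLs for the images produced by the latest turn of an active run."""
    return [f"{BASE_URL}/runs/{run_id}/images/{n}" for n in range(image_count)]


for url in run_image_urls("r1", 2):
    print(url)
```

An imageCount of 0 (for example a text-only revise on an image session) yields an empty list, so the loop fetches nothing.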

run_replayed does not reflect post-replay state. When you retry a /run or /revise with the same Idempotency-Key, the server returns the runId from the original call (the billable work is already in flight or done) and terminates this stream with run_replayed instead of run_completed. The payload's state field is a snapshot from when the replay was recorded — it does NOT track later state changes on the same run. For the live state of a replayed run, the agent must derive it from its own bookkeeping of the original call (and, once GET /api/v2/public/runs/{runId} ships in V1.2, poll that). The CLI's RunCallResult surfaces this as {succeeded: false, reasonCode: "replayed_state_unknown"} rather than masquerading as success.

Charge visibility

run_failed carries charged: true when the upstream call was billed despite the local failure (usage_log_write_failed or session_lost_mid_* after a successful upstream call). Agents should record the cost from costMilliCents for reconciliation.

Cost units V1.1

All cost fields on the public API surface use millicents (1 cent = 1,000 millicents; 1 USD = 100,000 millicents). The integer wire format preserves sub-cent precision for image-token / reasoning-heavy calls that previously truncated. Render as cents-with-decimals via costMilliCents / 1000; as USD via costMilliCents / 100000; e.g. costMilliCents: 503 = 0.503¢ = $0.00503. Field was renamed from costMicros (2026-05-11) to costMicroCents, then again to costMilliCents (2026-05-19) — the stored unit is 1/1000 of a cent (millicents), and the name now matches the value.

Affected fields: record.costMilliCents, run_completed.costMilliCents, run_failed.costMilliCents, record_finalized.costMilliCents. The pre-V1.1 costCents field is removed from public DTOs; pre-V1.1 records backfill via costCents * 1000 so historical rows still surface a cost.
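The unit conversions are simple enough to check by hand, but a sketch in Python makes the divisors explicit. Decimal is used so sub-cent values do not pick up floating-point noise; the function names are hypothetical.

```python
from decimal import Decimal


def millicents_to_cents(cost_milli_cents: int) -> Decimal:
    """1 cent = 1,000 millicents."""
    return Decimal(cost_milli_cents) / Decimal(1000)


def millicents_to_usd(cost_milli_cents: int) -> Decimal:
    """1 USD = 100,000 millicents."""
    return Decimal(cost_milli_cents) / Decimal(100000)


def backfill_from_cents(cost_cents: int) -> int:
    """Pre-V1.1 costCents expressed in millicents."""
    return cost_cents * 1000


print(millicents_to_cents(503), millicents_to_usd(503))
```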

Rate-limit headers V1.1

Every public-API response (success + 429) carries:

X-RateLimit-Limit: 60
X-RateLimit-Remaining: 47
X-RateLimit-Reset: 1746201600
X-RateLimit-Bucket: key=47/60,user=120/300

The standard pair (Limit / Remaining) reports the most-restrictive bucket — the per-key bucket when the caller is API-key-attributed (always smaller than per-user), else the per-user bucket. The diagnostic X-RateLimit-Bucket reports both (key=N/L,user=N/L) so agents can observe per-key vs per-user pressure separately. Session-only callers see X-RateLimit-Bucket: user=N/L.
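A minimal Python sketch for reading the diagnostic header. It assumes that in key=N/L the first number is the remaining count and the second is the limit (which matches the sample headers above, where Remaining is 47 and Limit is 60), and that X-RateLimit-Reset is a Unix timestamp in seconds. Both readings are assumptions, as are the function names.

```python
from typing import Dict, Tuple


def parse_bucket(header: str) -> Dict[str, Tuple[int, int]]:
    """Parse "key=47/60,user=120/300" into {"key": (47, 60), "user": (120, 300)}."""
    buckets = {}
    for part in header.split(","):
        name, _, counts = part.strip().partition("=")
        remaining, _, limit = counts.partition("/")
        buckets[name] = (int(remaining), int(limit))
    return buckets


def seconds_until_reset(reset_epoch: int, now_epoch: int) -> int:
    """Seconds to wait before the window resets; never negative."""
    return max(0, reset_epoch - now_epoch)


print(parse_bucket("key=47/60,user=120/300"))
```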

CLI

The promethic Node CLI (@soulwarestudio/promethic-cli on npm) ships every public-API endpoint behind ergonomic commands.

npm install -g @soulwarestudio/promethic-cli

promethic auth login                       # paste pmk_... key
promethic auth status
promethic auth logout

promethic prompts list [--limit N] [--cursor C] [--json]
promethic prompts get <id> [--json]
promethic prompts delete <id>                                                # V1.1
promethic prompts delete-version <promptId> <versionId>                      # V1.1

promethic run <prompt-id> [--input "..."] [--input-file path] [--no-accept] [--auto-finalize true|false] [--json]   # V1.2: --auto-finalize
promethic revise <handle> --instruction "..." [--intermediate-output "..."] [--from-turn N] [--no-accept] [--json]   # V1.2: handle is run-id or record-id; --from-turn rewinds
promethic finalize <run-id> [--final-text "..."] [--tag "..."] [--notes "..."] [--from-turn N] [--json]   # V1.2: --from-turn rewinds; finalize on Finalized session amends in place

promethic records list [--prompt <id>] [--source API] [--json]
promethic records get <id> [--json]
promethic records image <id> --index N --output path.png
promethic records delete <id>                                                # V1.1
promethic records patch <id> [--notes "..."|--clear-notes] [--tag "..."|--clear-tag] [--json]   # V1.1

promethic run <promptId> --image <file> [--image <file>]                     # V1.1 Phase 3 — vision input

promethic attachments add <promptId> <file> [--type image|text] [--filename ...] [--json]
promethic attachments list <promptId> [--json]                               # V1.1 Phase 3
promethic attachments get <attachmentId> <outputPath>                        # V1.1 Phase 3
promethic attachments delete <id>                                            # V1.1

Agent tools (hosted MCP) V1.3

Promethic exposes its agent tool surface through the hosted MCP server at mcp.getpromethic.com/v1: remote, OAuth, no install. It exposes 27 core tools covering the full Avalonia desktop / Expo web workspace surface (plus any per-prompt synthesized tools you've enabled). (The CLI's local-stdio MCP server was retired in CLI 0.7.0; the hosted MCP is the single agent surface.)

read: list_prompts (returns name + description + outputModality + mcpToolName — non-null when the prompt is exposed as a per-prompt synthesized tool; agents can call it directly by that name without run_prompt), get_prompt, list_records, get_record (full detail: inputText, outputText, notes, modelId, token counts, costMilliCents, revisionCount, imageCount, isOneShot, and turns[] — the complete per-turn history reconstructed from the delta chain; turns[].kind: "run" / "revision" / "edit"), list_versions, get_version, list_attachments, get_attachment, get_catalog
execute: run_prompt (V1.2: optional autoFinalize auto-creates a record on success), revise_run (V1.2: accepts runId — auto-finalized runs can be revised by the same runId, no need to set autoFinalize=false to "keep editing"; finalized sessions are reopened transparently; fromTurn rewinds), finalize_run (V1.2: fromTurn rewinds; can amend a Finalized session; returns {recordId, turns, costMilliCents?}), get_run_image, delete_record, patch_record (notes / input / output / tag / fromTurn — manual records allow notes/input/output only; non-manual records auto-create edit delta on output change), create_record (V1.4: agent-curated training data — {promptId, input, output, notes?, idempotencyKey?} creates a finalized record with no LLM call, no spend; input and output must be non-empty, non-whitespace strings — whitespace-only values are rejected with invalid_params; optional idempotencyKey (string) enables safe retry — same key + same payload returns the original record instead of creating a duplicate, scoped per-user with 24h TTL; for "voice prompt" workflows where the user collaborates with the agent to seed examples before running Generate Prompt on the desktop)
write (V1.1 Phase 2 + Phase 3): create_prompt, update_prompt, delete_prompt, create_version (with optional setAsCurrent in one transaction), update_version (versionDescription + description), switch_current_version, delete_version, upload_attachment, delete_attachment

All tools mirror the workspace flow agents would otherwise need a desktop or web browser to drive: author prompts, manage versions, attach reference files, run with vision, revise, finalize, edit notes/tags.

Connect the hosted MCP at https://mcp.getpromethic.com/v1 with your pmk_… key as a Bearer token (or via OAuth where the client supports it). See getpromethic.com/agents for client-specific config (Claude Desktop / Cursor JSON, Codex TOML, and the custom-connector UI for iOS / web / ChatGPT).

Image input/output V1.1 Phase 3

Output: image-modality runs return image bytes inline as base64 (≤ 16 MB raw) or as a local-file path (larger). Same shape via get_run_image for in-flight runs and get_attachment for prompt attachments.

Input: run_prompt.images accepts the SAME media-ref shape as the output. Agents can pipe a prior run's image straight back in:

{
  "images": [
    { "inline": true,  "base64": "...",      "mimeType": "image/png" },
    { "inline": false, "localPath": "/tmp/photo.png", "mimeType": "image/png" }
  ]
}

Up to 16 images per run, 10 MB each. Requires the prompt's model to declare input_image capability (GPT-5.x via Responses API; gpt-image-1.x as edit inputs). Trust note: localPath is read by the MCP server with the user's permissions — only use paths the agent is authorized to read.
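As a sketch of building the images argument in Python, the helpers below wrap raw bytes as an inline media ref and enforce the stated limits before sending. The function names are hypothetical, and "10 MB" is read as 10 × 1024 × 1024 bytes of raw (pre-base64) data, which is an assumption.

```python
import base64
from typing import List

MAX_IMAGES_PER_RUN = 16
MAX_IMAGE_BYTES = 10 * 1024 * 1024  # assumption: cap applies to raw bytes


def inline_image_ref(data: bytes, mime_type: str = "image/png") -> dict:
    """Wrap raw image bytes in the inline media-ref shape."""
    if len(data) > MAX_IMAGE_BYTES:
        raise ValueError("image exceeds the 10 MB per-image cap")
    return {
        "inline": True,
        "base64": base64.b64encode(data).decode("ascii"),
        "mimeType": mime_type,
    }


def images_argument(refs: List[dict]) -> dict:
    """Build the {"images": [...]} fragment for run_prompt."""
    if len(refs) > MAX_IMAGES_PER_RUN:
        raise ValueError("at most 16 images per run")
    return {"images": refs}


payload = images_argument([inline_image_ref(b"\x89PNG\r\n\x1a\n")])
print(payload["images"][0]["mimeType"])
```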

Attachment management

upload_attachment takes either inline base64 or a localPath; idempotency keys are derived from (promptId, filename, content) so retries replay the original upload (no double-billing). list_attachments, get_attachment, and delete_attachment round out the surface. Per-file: 10 MB image / 5 MB text. Per-prompt: 50 MB total. Path traversal in filenames: filenames with .. are intercepted by the Cloudflare WAF with a 403 before reaching the app (which would reject them as invalid_filename anyway); the error shape is a raw 403, not a JSON envelope.

Cancelled runs release their RunSession TTL naturally; the server expires them after 1 h of inactivity.

The same hosted-MCP connection pattern works for Cursor, Zed, Continue, Cline, and any other desktop-class MCP-aware client.

Hosted MCP V1.3

For agents that can't (or shouldn't) run a local stdio server — Claude iOS, claude.ai web, sandboxed automations — Promethic hosts the same 27-core-tool surface at https://mcp.getpromethic.com/v1 speaking MCP Streamable HTTP transport (spec 2025-03-26). Same tools, same scopes, same pmk_ keys.

Why hosted MCP exists: an agent calling tools via stdio needs a local CLI install + a long-lived process. A hosted endpoint replaces both with one HTTP URL — Claude iOS just adds a connector, no local binary. The wire shape is identical to the local CLI, so existing scripts don't change.

Connect Claude Desktop / Cursor

{
  "mcpServers": {
    "promethic": {
      "url": "https://mcp.getpromethic.com/v1",
      "headers": {
        "Authorization": "Bearer pmk_..."
      }
    }
  }
}

Claude Desktop reads this from claude_desktop_config.json. Cursor: settings JSON, same shape. The pmk_ key supplies auth; the server exchanges it at initialize for a connection-bound mcps_ session token (24-hour idle TTL, renewed on each tool call). pmk_ keys never expire.

Connect Claude Code (VS Code / Cursor extension)

Claude Code stores MCP servers in ~/.claude.json. Add the hosted server via the CLI:

claude mcp add promethic --transport http https://mcp.getpromethic.com/v1 \
  --header "Authorization: Bearer pmk_..."

Or add it manually to ~/.claude.json under mcpServers:

{
  "mcpServers": {
    "promethic": {
      "type": "http",
      "url": "https://mcp.getpromethic.com/v1",
      "headers": {
        "Authorization": "Bearer pmk_..."
      }
    }
  }
}

Reload the VS Code / Cursor window after editing. Tools appear under the mcp__promethic__* namespace.

Anthropic API — agents via MCP Connector beta

Programmatic agents using the Anthropic Messages API can connect to the hosted MCP server without any local setup:

import anthropic

client = anthropic.Anthropic()

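response = client.beta.messages.create(
    model="claude-opus-4-7",
    max_tokens=4096,
    messages=[{"role": "user", "content": "List my prompts and run one"}],
    mcp_servers=[{
        "type": "url",
        "url": "https://mcp.getpromethic.com/v1",
        "name": "promethic",
        "authorization_token": "pmk_..."   # pmk_ key, no "Bearer" prefix here
    }],
    # The 2025-11-20 connector beta enables a server's tools through an
    # mcp_toolset entry that references the server by name.
    tools=[{"type": "mcp_toolset", "mcp_server_name": "promethic"}],
    betas=["mcp-client-2025-11-20"],
)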

Claude discovers all tools automatically and calls them to satisfy the request. No separate tools/list call needed.

Connect Claude iOS / claude.ai web / ChatGPT (OAuth)

These clients use OAuth 2.1 + PKCE instead of bearer-key paste, because their connector UI doesn't accept a static token. Add a custom connector pointing at https://mcp.getpromethic.com/v1 and leave the OAuth fields blank; the client auto-discovers them via /.well-known/oauth-protected-resource. On Connect, a popup opens to the Promethic consent screen: sign in, click Allow, and the connector activates. Revoke any time at app.getpromethic.com → Settings → Connected Apps.

OAuth access tokens (pmoa_) have a 30-day TTL; refresh tokens (pmor_) also expire after 30 days. After expiry the client prompts for re-authorization automatically.

The full 27-core-tool surface (plus per-prompt synthesized tools) appears in the agent's tool tray. Tools are namespaced promethic_<name> on the wire (underscore separator per Anthropic's Tool API name regex).

Image-modality runs over MCP

promethic_run_prompt on an image-modality prompt returns the generated image bytes inline as MCP image content blocks in the tools/call result, alongside the text transcript. Claude.ai / ChatGPT render these directly. The text transcript shows [image bytes elided ...] for events that originally carried base64 — those bytes live in the image content blocks instead. Use promethic_get_run_image only as a fallback (e.g., when re-fetching after losing the original tool result).

Per-tool grants (opt-in)

Hosted MCP supports an optional per-tool allow-list on top of the read/execute/write scopes. New keys are unconfigured by default and can call any tool the key's scopes permit — useful while the per-tool config UI is being built. Once you opt in (set an explicit allow-list on the key), only those tool names succeed; anything else returns tool_grant_required. The wildcard ["*"] explicitly reverts to allow-all, and the empty array [] blocks every non-discovery tool.

Discovery surfaces (list_prompts, get_catalog) always bypass the per-tool gate so agents can always discover what tools exist.
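The gate described above can be modelled as a small predicate. This is only a sketch of the documented semantics, not the server's implementation: the function name tool_allowed is hypothetical, scope checks are assumed to happen separately, and the handling of a list that mixes "*" with other names is an assumption.

```python
from typing import Optional, Sequence

# Discovery tools named in the docs; they always bypass the per-tool gate.
DISCOVERY_TOOLS = {"list_prompts", "get_catalog"}

def tool_allowed(tool: str, allow_list: Optional[Sequence[str]]) -> bool:
    """Model the per-tool gate only; read/execute/write scopes are checked elsewhere."""
    if tool in DISCOVERY_TOOLS:
        return True                  # discovery always passes
    if allow_list is None:
        return True                  # unconfigured key: scopes alone decide
    if list(allow_list) == ["*"]:
        return True                  # explicit wildcard reverts to allow-all
    # An explicit list allows only the named tools, so [] blocks everything else.
    # A list mixing "*" with other names is treated literally here (assumption).
    return tool in allow_list

print(tool_allowed("run_prompt", []), tool_allowed("list_prompts", []))
```

A call that fails this check corresponds to the tool_grant_required error mentioned above.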

Differences from local CLI MCP

upload_attachment requires inline bytes_base64; localPath is rejected because the hosted server cannot read the agent's filesystem. The 10 MB raw cap matches the local CLI; chunked upload (upload_id / chunk_index / chunk_total) is reserved for V2.
Idempotency keys: vendor-prefixed _meta namespace — _meta["com.getpromethic/idempotency-key"] on the JSON-RPC request. Same byte-identity replay semantics as the HTTP Idempotency-Key header.
Streaming tools (run_prompt, revise_run) use notifications/progress (MCP spec) for live model output. Path B durable resume: if the live stream drops, GET /v1/sessions/{sid}/calls/{toolCallId} returns the final result once available.
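To make the first two differences concrete, here is a sketch of a hosted upload_attachment call carrying inline bytes and an idempotency key. Only bytes_base64 and the _meta key name come from these docs. The promptId and filename argument names, the placement of _meta under params (the usual MCP convention), and reading "10 MB" as 10 × 1024 × 1024 bytes are assumptions. Depending on the client, the wire tool name may carry the promethic_ prefix, so take it from tools/list.

```python
import base64
import json
import uuid

MAX_RAW_BYTES = 10 * 1024 * 1024  # 10 MB raw cap; binary megabytes are an assumption

def build_upload_call(prompt_id: str, filename: str, data: bytes, request_id: int = 1) -> dict:
    """Build a JSON-RPC tools/call payload for upload_attachment (not sent here)."""
    if len(data) > MAX_RAW_BYTES:
        raise ValueError("attachment exceeds the 10 MB raw cap")
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "upload_attachment",
            "arguments": {
                "promptId": prompt_id,   # hypothetical argument name
                "filename": filename,    # hypothetical argument name
                "bytes_base64": base64.b64encode(data).decode("ascii"),
            },
            # Vendor-prefixed idempotency key, as described above.
            "_meta": {"com.getpromethic/idempotency-key": str(uuid.uuid4())},
        },
    }

call = build_upload_call("prompt-123", "examples.txt", b"hello")
print(json.dumps(call["params"]["_meta"]))
```

Reusing the same payload, key included, on a retry is what makes the replay byte-identical; building a new payload mints a new key.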

Run lifecycle (sessions vs records, auto-finalize)

A run is a transient session: the model receives your input, streams an output, and the server tracks state in RunSession (1h sliding TTL). A record is the persisted artifact: input + final output + any revision turns + edits + cost — the structured data Promethic uses to refine your prompt over time.

To go from session to record you call finalize_run. To do nothing and let the session expire naturally, simply don't finalize — the TTL reaper sweeps it after 1 h.

Auto-finalize (default ON since V1.2) chains a finalize_run after a successful run_prompt on the same SSE stream. One call, one round trip, one saved record. The new recordId arrives via the record_finalized SSE event. This is what most agents want — call run_prompt, use the returned record.

Pass autoFinalize: false on a single run_prompt call to opt out — useful when you want to revise_run the output before saving, or just inspect it before committing. Then revise with the runId and finalize manually with --final-text / --tag / --notes when ready. Finalized runs can be revised again by the same runId — the session reopens transparently.

To set the persistent default (so all your runs from any client behave the same way), use one of:

Avalonia desktop: Settings → Appearance → Hosted MCP toggle.
Expo (web / iOS): Settings → toggle "Auto-save MCP runs as records".
CLI: promethic config set auto-finalize-mcp-runs <true|false> (V1.3+).

Per-call autoFinalize always overrides the persistent default. The persistent default in turn overrides the V1.2 server default of true.
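The precedence above reduces to a three-step fallback. A minimal sketch, with None standing for "not set" at a given level; the function name effective_auto_finalize is hypothetical and only models the documented ordering.

```python
from typing import Optional

SERVER_DEFAULT_AUTO_FINALIZE = True  # V1.2 server default, per the text above

def effective_auto_finalize(per_call: Optional[bool], persistent: Optional[bool]) -> bool:
    """Resolve autoFinalize: per-call value, then persistent default, then server default."""
    if per_call is not None:
        return per_call
    if persistent is not None:
        return persistent
    return SERVER_DEFAULT_AUTO_FINALIZE

# A per-call opt-out wins even when the persistent default is on.
print(effective_auto_finalize(per_call=False, persistent=True))
```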

--no-accept on the CLI saves a JSON artifact at ~/.promethic/runs/<runId>.json (mode 0600) so the runId and state survive across shell sessions.

Override the API URL

export PROMETHIC_API_URL=http://localhost:8080
promethic auth status

Only https://..., http://localhost, and http://127.0.0.1 are accepted — the CLI refuses to send your key to other http:// hosts.
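Scripts that set PROMETHIC_API_URL themselves may want to apply the same rule before exporting it. The check below is a sketch of the stated policy, not the CLI's actual code; api_url_allowed is a hypothetical helper name.

```python
from urllib.parse import urlsplit

def api_url_allowed(url: str) -> bool:
    """Accept any https:// URL, and http:// only for localhost or 127.0.0.1."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        return bool(parts.hostname)
    if parts.scheme == "http":
        # Compare the parsed hostname so "localhost.example.com" does not pass.
        return parts.hostname in ("localhost", "127.0.0.1")
    return False

for candidate in ("http://localhost:8080", "http://example.com"):
    print(candidate, api_url_allowed(candidate))
```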

Errors

All 4xx/5xx responses follow RFC 7807 application/problem+json. Designed for self-healing agents — every error names what went wrong, what to do about it, and (where applicable) which exact field tripped:

{
  "type": "https://api.getpromethic.com/problems/invalid_model_settings",
  "title": "Model settings reference an unknown or inactive model.",
  "status": 400,
  "detail": "The model_id is not in the catalog, or it has been retired.",
  "reason_code": "invalid_model_settings",
  "action_hint": "List models via GET /api/v2/public/models, then retry with a current model_id.",
  "request_id": "req_01HX...",
  "invalid_params": [
    { "name": "modelSettings.model_id", "reason": "unknown_or_inactive_model" }
  ]
}

Each type URL is also a working redirect: GET /problems/{reason_code} 302s to the matching section of these docs (e.g. /problems/idempotency_key_reused). Agents that follow the link land on a human description plus the resolution steps for that specific reason.
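An agent can pull the self-healing fields out of a problem document with a few lines. The sketch below parses the example body shown above. The Problem dataclass and parse_problem helper are illustrative names, and the fallback to a camelCase reasonCode key is there because the idempotency_outcome_unknown body later in these docs uses that spelling.

```python
import json
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Problem:
    status: int
    reason_code: str
    action_hint: Optional[str]
    request_id: Optional[str]
    invalid_fields: List[str] = field(default_factory=list)

def parse_problem(body: str) -> Problem:
    """Extract the fields an agent needs to decide its next step."""
    doc = json.loads(body)
    return Problem(
        status=doc["status"],
        # Accept either spelling of the reason code key.
        reason_code=doc.get("reason_code", doc.get("reasonCode", "")),
        action_hint=doc.get("action_hint"),
        request_id=doc.get("request_id"),
        invalid_fields=[p["name"] for p in doc.get("invalid_params", [])],
    )

example = """{
  "type": "https://api.getpromethic.com/problems/invalid_model_settings",
  "title": "Model settings reference an unknown or inactive model.",
  "status": 400,
  "reason_code": "invalid_model_settings",
  "action_hint": "List models via GET /api/v2/public/models, then retry with a current model_id.",
  "invalid_params": [
    { "name": "modelSettings.model_id", "reason": "unknown_or_inactive_model" }
  ]
}"""

problem = parse_problem(example)
print(problem.reason_code, problem.invalid_fields)
```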

Reason codes — each row's id is the redirect target for /problems/{reason_code}:

HTTP	reason_code	Meaning
400	`invalid_request`	shape error — see `invalid_params` array for field-level detail (subsumes null-not-allowed via `required_field_clear` reason)
400	`invalid_params`	also a top-level `reason_code` (e.g. out-of-range `limit` on list endpoints, wrong-type parameter). Distinct from the `invalid_params` array sub-field that appears on `invalid_request` responses.
400	`invalid_filename`	filename contains path separators (`/`, `\`), null bytes, dots-only names (`.`, `..`), whitespace-only value, or leading/trailing whitespace. Use a plain trimmed name like `report.txt`. Note: a completely missing or empty filename returns `invalid_params` ("filename is required"), not this code.
400	`image_dimensions_too_small`	uploaded image dimensions are below the minimum required (128×128 px). Resize or use a larger image.
400	`invalid_image_format`	uploaded bytes are not a recognized image format. Only PNG, JPEG, and WebP are accepted. Re-encode the file before uploading.
400	`invalid_model_settings`	unknown / inactive model_id, or missing `parameters` object; `action_hint` tells you to `GET /api/v2/public/models`
400/413	`field_too_large`	see V1.0.1 caps in the Write section
400	`idempotency_key_invalid`	missing, > 255 chars, comma-joined, or non-visible-ASCII
400	`stream_required`	POST /run/revise must include `stream: true`
400	`invalid_cursor`	tampered or cross-user cursor
400	`cursor_filter_mismatch`	filter params changed mid-pagination
400	`persist_query_param_removed`	V1.1 — `?persist` query param removed; `/finalize` always commits. Use `DELETE /records/{id}` within 24h to undo.
412	`precondition_failed`	REST-R04-001/002 (2026-05-24) — `If-Match` token mismatch or malformed token. Re-fetch the resource to obtain a fresh `updatedAtUtc`-derived token. Accepts both RFC 7232 quoted form (`"abc123"`) and bare hex (`abc123`).
410	`record_was_deleted`	V1.1 — this run finalized to a record that has since been self-deleted. Mint a new run.
500	`snapshot_corrupt`	V1.1 — server-side data integrity: the run session's version snapshot failed to parse. Mint a new run.
409	`reopen_limit_exceeded`	V1.1 — session has been reopened more than 100 times via /revise after /finalize. Mint a new run.
409	`no_prior_image_for_revision`	image-edit revisions require at least one prior image from a completed run. Call `run_prompt` (or POST /run) on this session first to generate a base image, then call `revise_run`.
400	`intermediate_output_not_supported_image`	V1.1 — image-modality runs reject `intermediateOutput`. Mirrors desktop: image revisions always use the model's actual output.
413	`intermediate_output_too_large`	V1.1 — `intermediateOutput` exceeds 32 KB per-turn cap.
413	`final_text_too_large`	V1.1 — `finalText` exceeds 256 KB cap.
413	`notes_too_large`	V1.1 — `notes` exceeds 64 KB cap.
413	`tag_too_large`	V1.1 — `tag` exceeds 64 KB cap.
413	`input_too_large`	PATCH /records — `input` exceeds the allowed character cap.
413	`output_too_large`	PATCH /records — `output` exceeds the allowed character cap.
500	`internal_error`	Unexpected server error. Retry; if the error persists, contact support.
400	`tag_without_delta`	V1.1 — /finalize received `tag` but no edit delta was produced (finalText omitted or matches model output). Use `notes` for record-level labels, or PATCH /records/{id} to re-tag.
403	`record_not_owned_by_api_key`	V1.1 — DELETE/PATCH /records/{id} on the public API is restricted to the API key that created the record. Mutate via the Promethic web/desktop app, or use the original API key.
409	`record_self_delete_window_expired`	V1.1 — DELETE /records/{id} on the public API is restricted to the first 24h after record creation. Delete via the Promethic web/desktop app instead.
409	`record_no_edit_delta`	V1.1 — /finalize or PATCH /records/{id} attempted to set `tag` but no edit delta exists to anchor it. For /finalize: include `finalText`. For MCP `patch_record`: provide `output` alongside `tag` to create the edit delta first.
400	`tag_requires_distinct_output`	MCP / REST `patch_record` — `tag` was provided with `output`, but `output` equals the record's current value and no edit delta exists yet. Provide a different `output` value, or use `notes` for a label with no edit.
400	`tag_requires_output`	MCP / REST `patch_record` — `tag` was provided without `output` and no edit delta exists yet — there is nothing to annotate. Provide an `output` value to create an edit delta first, or use `notes` for a record-level label with no edit.
400	`tag_would_be_lost_on_revert`	MCP `patch_record` — `output` matches the model's last returned output, which removes the edit delta. A `tag` was also provided but has nothing to anchor. Omit `tag` to revert cleanly, or provide a different `output` to keep the edit delta and anchor the tag.
400	`param_out_of_range`	A numeric parameter (e.g. `limit`) is outside the allowed range. Check the tool description for valid bounds.
403	`grant_required`	V1.1 — API key is restricted to a specific set of prompts and the requested prompt is not in that set. Manage at `Settings → Developer Keys → Manage prompts`, or use an unrestricted key.
409	`version_is_current`	Retired 2026-05-19. DELETE /versions/{vid} no longer rejects the current version — it auto-switches to the lowest-VersionNumber remaining version instead. This code is no longer emitted.
409	`cannot_delete_only_version`	DELETE /prompts/{id}/versions/{vid} attempted on the only remaining version. Create another version before deleting.
409	`version_switch_conflict`	Concurrent request changed version state mid-delete; retry delete_version.
409	`attachment_referenced_by_active_run`	V1.1 — DELETE /attachments/{id} blocked because an active RunSession's snapshot references this attachment. Finalize the listed run(s) first, or wait for the 1h TTL to expire.
409	`prompt_referenced_by_active_run`	V1.1 (MCP only) — MCP `delete_prompt` blocked because at least one non-terminal RunSession still references the prompt. The REST DELETE endpoint force-terminates active sessions automatically and does not return this code. If returned from MCP, finalize the listed run(s) first.
409	`version_referenced_by_active_run`	V1.1 (MCP only) — MCP `delete_version` blocked because at least one non-terminal RunSession pins this version. The REST DELETE endpoint force-terminates active sessions automatically and does not return this code. If returned from MCP, finalize the listed run(s) first.
409 / lifted	`image_runs_not_supported_v1`	V1.1 Phase 7: lifted. Image-modality runs via API key are now supported on /run, /revise, /revise-again. The accumulated effective prompt is persisted per-turn and surfaces as `record.finalCopiedOutput` after /finalize. The reason code is kept in the table for back-compat with old SDKs but is no longer emitted.
400	`final_text_not_supported_image`	V1.1 — image-modality runs reject `finalText`. record.FinalCopiedOutput for image records is server-derived from the image-prompt accumulation chain (training-data invariant).
413	`session_deltas_too_large`	V1.1 — total `session.Deltas` jsonb exceeds the 2 MB cap. Finalize and start fresh.
413	`cost_incurred_no_delta_persisted`	V1.1 — upstream model call billed but the resulting turn couldn't persist (post-upstream cap exceeded). UsageLog has the charge.
400	`invalid_image_base64`	bad base64 in `images[].data`
400	`invalid_index`	index must be ≥ 0 (e.g. `?index=` on record image)
400	`invalid_source`	`?source=` not in {App, Manual, API, Headless} (case-insensitive)
400	`invalid_type`	MCP `report_issue`: `type` not one of `"bug"` / `"feature_request"` (MCP-only, not on REST surface)
400	`invalid_report`	MCP `report_issue`: `report` body is empty (MCP-only, not on REST surface)
400	`instruction_required`	POST /revise needs a non-empty instruction
400	`from_turn_invalid`	V1.2 — `fromTurn` not a valid non-negative integer
400	`from_turn_out_of_range`	V1.2 — `fromTurn` exceeds current turn count; re-read `turns[]`
400	`mixed_credentials_principal_mismatch`	session + key resolve to different users
400	`mixed_credentials_key_mismatch`	two API keys present that don't match
401	`key_unauthorized`	missing / invalid / expired / revoked API key
403	`scope_required`	key lacks the required scope
403	`api_key_not_permitted`	endpoint requires a session, not a key
404	`prompt_not_found`	no prompt with that id is visible to this caller
404	`version_not_found`	no matching version on the prompt; also returned when attempting to revise from a record whose original prompt version has been deleted — this is expected behavior, not a bug. Use `run_prompt` to start a fresh run on the current version, or `create_record` to seed new training examples.
409	`current_version_missing`	prompt's currentVersionId points to a deleted version; no fallback could be self-healed
409	`manual_record_modality_not_supported`	`create_record` supports text-modality prompts only; this prompt produces image or structured output
409	`manual_record_modality_undetermined`	`create_record`: prompt's modelSettingsJson is unparseable, so output modality cannot be determined
409	`prompt_record_cap_reached`	`create_record`: this prompt has reached the per-prompt manual-record cap
404	`record_not_found`	no record with that id is visible to this caller
404	`run_not_found`	run expired, never existed, or not yours
404	`attachment_not_found`	no attachment with that id is visible
404	`no_image_stored`	this record has no stored images
404	`image_index_out_of_range`	`?index` past the count of stored images
404	`invalid_image_reference`	defense-in-depth validation refused the path
409	`idempotency_key_reused`	same key on a different body — won't replay; mint a new key
409	`idempotency_in_flight`	same key still being processed; retry after `Retry-After`
409	`session_busy`	another /revise or /finalize in flight
409	`session_not_active`	rare CAS-race; re-fetch run state
409	`session_already_finalized`	V1.2+: `/revise` no longer fires this — it reopens finalized sessions (Finalized → Active) automatically. This code is now only returned in edge cases where a finalized session cannot be reopened (deleted record, expired, reopen-limit exceeded). `/finalize` with the same `Idempotency-Key` always replays the original 200 + `recordId`. With a new key on a Finalized session: passing `notes` alone patches the record's notes directly (no reopen, not idempotent per call); passing `finalText`, `tag`, or `fromTurn` reopens the session and re-finalizes.
409	`session_expired`	past 1h TTL
409	`session_failed`	terminal — see `reason_code`
409	`session_abandoned`	session was abandoned by API key revocation (bulk abandon via RevokeAsync)
409	`revision_chain_too_long`	25 turns/session cap
409	`record_revise_in_progress`	V1.2 — another caller holds the per-record rehydrate lock for this recordId; retry after a short backoff
409	`snapshot_modality_unreadable`	internal snapshot data unreadable
409	`finalize_completion_failed`	internal: finalize transaction failed
500	`finalize_conflict`	unexpected record conflict during finalize; session reset to Active — retry /finalize
500	`image_upload_failed`	V1.1 Phase 0 — image generated upstream but blob storage write failed after retries; upstream charged (charged: true); replay returns this failure for the Idempotency-Key
500	`image_extraction_overflow`	V1.1 Phase 0 — upstream produced more images than the 16-per-turn cap; reduce `n` or split runs
409	`run_already_terminal`	the run is in a non-interactable terminal state. Returned by: `/revise` on Expired, Failed, or Abandoned runs; `/finalize` and `GET /run-image` on Abandoned runs. Finalized is NOT terminal for `/revise` — the server reopens it transparently.
409	`version_create_contention`	concurrent version inserts; retry
413	`storage_quota_exceeded`	per-prompt or per-user storage cap reached
429	`rate_limited`	per-key or per-user bucket overflow; honour `Retry-After`
501	`not_implemented`	endpoint is documented but not yet implemented in this API version; check `action_hint` for the planned version and any workaround
500	`stream_setup_failed`	SSE response failed to initialize before the proxy call
503	`auth_store_unavailable`	transient idempotency-store race; retry
500	`idempotency_outcome_unknown`	V1.3 Phase 4b — process died mid-flight after possibly committing the domain mutation but before recording Complete; retries of the same Idempotency-Key replay this body until 24h TTL. Verify via GET before retry — see Recovery from idempotency_outcome_unknown for per-tool recipes.
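One way an agent might act on the table above is to map reason codes to a coarse next step. The grouping below is a sketch drawn from the Meaning column for a handful of codes. The Action names and the next_action helper are illustrative, and any code not listed falls back to reading action_hint.

```python
from enum import Enum

class Action(Enum):
    WAIT_AND_RETRY = "wait for Retry-After, then retry"
    RETRY = "retry the same request"
    NEW_RUN = "mint a new run"
    NEW_KEY = "mint a new idempotency key"
    VERIFY_FIRST = "verify via GET before any retry"
    FOLLOW_ACTION_HINT = "read action_hint and fix the request"

_ACTIONS = {
    "rate_limited": Action.WAIT_AND_RETRY,
    "idempotency_in_flight": Action.WAIT_AND_RETRY,
    "version_create_contention": Action.RETRY,
    "version_switch_conflict": Action.RETRY,
    "record_revise_in_progress": Action.RETRY,
    "auth_store_unavailable": Action.RETRY,
    "finalize_conflict": Action.RETRY,
    "internal_error": Action.RETRY,
    "record_was_deleted": Action.NEW_RUN,
    "snapshot_corrupt": Action.NEW_RUN,
    "reopen_limit_exceeded": Action.NEW_RUN,
    "idempotency_key_reused": Action.NEW_KEY,
    "idempotency_outcome_unknown": Action.VERIFY_FIRST,
}

def next_action(reason_code: str) -> Action:
    """Return a coarse recovery step; unknown codes defer to action_hint."""
    return _ACTIONS.get(reason_code, Action.FOLLOW_ACTION_HINT)

print(next_action("rate_limited").value)
```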

Idempotency V1.0.1

Every mutating POST (the five Write endpoints above, plus /prompts/{id}/run when its body is the same as a prior attempt) accepts an Idempotency-Key header. This is a Stripe-style guarantee: a network glitch mid-call is safe to retry — the server replays the original response byte-identically instead of double-applying the side effect.

Contract

Header value: 1–255 visible-ASCII characters (0x21–0x7E), no commas, sent at most once.
Same key + same body + same route → server replays the original status, headers, and body.
Same key + different body → 409 idempotency_key_reused (the agent picked a key it already used for a different request — generate a new one).
Same key, original still in flight → 409 idempotency_in_flight + Retry-After: 1.
Records expire 24 h after the original call completes (Stripe parity). After expiry the same key is fresh again.
Replay returns the response shape from the original call. If we ship a new field on (e.g.) POST /prompts between your first call and your retry, the retry returns the OLD shape — not the new one. This is intentional Stripe parity: replays are byte-identical snapshots. The 24 h TTL bounds staleness; for the freshest shape, mint a new key.
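The header rules in the contract above are easy to check client-side before sending. A minimal sketch, assuming the limits exactly as listed (1 to 255 characters, each in 0x21 to 0x7E, no commas); valid_idempotency_key is a hypothetical helper name.

```python
import uuid

def valid_idempotency_key(value: str) -> bool:
    """Check length and character rules for an Idempotency-Key header value."""
    if not 1 <= len(value) <= 255:
        return False
    # Visible ASCII only (0x21-0x7E), and commas are rejected.
    return all(0x21 <= ord(ch) <= 0x7E and ch != "," for ch in value)

key = str(uuid.uuid4())  # UUIDv4 strings satisfy the rules above
print(valid_idempotency_key(key))
```

A key that fails this check would be expected to come back as 400 idempotency_key_invalid.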

How the CLI uses it

The CLI auto-generates a fresh UUIDv4 per invocation by default — each promethic prompts create call is a distinct attempt. Pass --idempotency-key <uuid> to pin one if you want a manual retry to be a no-op. upload_attachment deduplicates on content: re-uploading identical bytes under the same filename returns the existing attachment without creating a duplicate or consuming quota — safe to retry on network failure. Pass an explicit idempotencyKey for stronger cross-session replay stability (same key returns the byte-identical original response for 24 hours regardless of content changes).

Filename participates in attachment identity. Both the CLI's deterministic key and the server's body hash include the filename. A retry of the same content under a different filename (e.g. --filename newname.txt) is treated as a fresh upload and consumes storage twice. If you want to rename an existing attachment, delete the original through the web app first (DELETE on attachments is V1.1; see "Not in V1.1" below).

Recovery from `idempotency_outcome_unknown` V1.3 Phase 4b

If the server process dies between committing the domain mutation and recording the idempotency Complete, a sweep flips the row to state=failed with a synthetic body:

{
  "type": "https://api.getpromethic.com/errors/idempotency_outcome_unknown",
  "title": "Idempotent run outcome unknown",
  "status": 500,
  "reasonCode": "idempotency_outcome_unknown",
  "detail": "The original request died mid-flight (process crash or lease expired without heartbeat). The domain change MAY OR MAY NOT have landed. Verify via a GET before any retry — replaying the same Idempotency-Key returns this body verbatim, and a NEW key may duplicate the original mutation.",
  "route": "POST /api/v2/public/prompts"
}

Replays of the same key continue to return this body until the row's 24 h TTL expires. The atomicity refactor in PR #18 (per-endpoint BeginTransactionAsync wrapping the domain mutation + Complete) makes this case much rarer post-2026-05-09 — for tools that landed BEFORE PR #18 (or future tools added without the wrapper), this recovery is still load-bearing.

Per-tool verify recipes (use these BEFORE retrying with the same OR a new key):

Phase 6 timing note (2026-Q3): the run_prompt recipe below relies on a future clientIdempotencyKey field on the record DTO as the authoritative disambiguator. That field has not shipped yet, which is why the recipe says "(when available)". Until Phase 6 lands, agents fall back to the heuristic match (createdAt + inputText). The heuristic is unreliable for repeated identical inputs in the same window: verify carefully, or mint a new key and accept the duplicate cost when in doubt.

Tool / route	Verify recipe
`POST /prompts` + MCP `create_prompt`	`GET /api/v2/public/prompts` (or MCP `list_prompts`) then match by the `name` field you submitted — names are user-chosen + likely unique within your set. If found: the create succeeded; do NOT retry. If not found: safe to mint a new key + retry.
`POST /prompts/{id}/versions` + MCP `create_version`	`GET /api/v2/public/prompts/{id}/versions` (or MCP `list_versions`) then match by `versionNumber` = (highest from your pre-call read) + 1. If a version with that number exists with your prompt text: succeeded. If not: safe to retry with a new key.
`POST /prompts/{id}/attachments` + MCP `upload_attachment`	`GET /api/v2/public/prompts/{id}/attachments` (or MCP `list_attachments`) then match by `filename` + `sizeBytes`. If found: succeeded. If not: safe to retry. Note: storage quota was reserved at Begin time; a lost call leaves the quota reserved until the idempotency row's 24 h TTL refunds it via the orphan-blob sweep.
`POST /prompts/{id}/run` + MCP `run_prompt`	Run records auto-finalize by default. Authoritative disambiguator (when available): filter `list_records` by `clientIdempotencyKey` — every record carries the originating Idempotency-Key from the call that created it. If found: the run succeeded and the record exists. Cost was billed; you've paid for it. If not found: the run did not complete; safe to mint a new key + retry. Heuristic fallback (use ONLY when the authoritative path isn't available — e.g., a tool that doesn't yet expose `clientIdempotencyKey`): match by `createdAt` in your call window AND `inputText`. Be aware that an agent calling `run_prompt` with the same input multiple times in 24h cannot disambiguate via `output` / `cost_micros` alone — those are nearly identical for deterministic prompts. The heuristic is a guess; do not blindly retry on a match-of-many.
MCP `finalize_run` / `revise_run`	These take a `runId`. Step 1: GET the run state via `GET /api/v2/public/runs/{runId}` (or let MCP `list_records` filter by `runSessionId`). If a record exists with your `runId`: finalize succeeded. If `RunSession.State == Finalized` with a `finalizedRecordId`: ditto, succeeded. If `State == Active`: the run is back to a state you can retry from — mint a new key + retry. If `State ∈ {Running, Finalizing}`: a concurrent attempt is in flight or recovering — wait + re-poll. If `State == Failed`: terminal; do not retry. Do NOT mint a new key without GET-checking state first — retrying a fresh-key finalize against a Finalizing session 409s (`session_busy`) or races the Phase 4b finalize-failure→Active reset.
MCP `delete_*` / `patch_record`	`GET` the resource by id. If 404 (delete) or fields match your patch (patch): succeeded. Otherwise safe to retry with a new key.

Do NOT blindly retry with a new key. The crash body explicitly says "verify before retry" because domain state may have landed. A naive retry-with-new-key duplicates whatever did land — the exact silent-double-execute footgun the flip-to-failed semantic exists to surface.
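As an illustration of "verify before retry", the finalize_run / revise_run recipe above can be written as a decision function over the run state returned by the GET. The state names come from the recipe; the return labels and the function name finalize_recovery_step are hypothetical.

```python
from typing import Optional

def finalize_recovery_step(state: str, finalized_record_id: Optional[str] = None) -> str:
    """Decide the next step after idempotency_outcome_unknown on finalize/revise."""
    if state == "Finalized" and finalized_record_id:
        return "done"                 # the finalize landed; do not retry
    if state == "Active":
        return "retry_with_new_key"   # safe to mint a new key and retry
    if state in ("Running", "Finalizing"):
        return "wait_and_repoll"      # a concurrent attempt is in flight or recovering
    if state == "Failed":
        return "stop"                 # terminal; do not retry
    return "inspect_manually"         # states the recipe does not cover

print(finalize_recovery_step("Finalizing"))
```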

Rate limits

Per-minute fixed-window buckets, evaluated AFTER auth (so an unauthenticated burst can't drain a per-user bucket the caller doesn't own):

Scope	Per key	Per user
read	60/min	300/min
execute	30/min	90/min

On overflow: 429 with a Retry-After header and an RFC 7807 problem document carrying reason_code: "rate_limited" plus an action_hint describing whether the key or the user bucket overflowed.
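A client honouring the 429 needs to turn Retry-After into a sleep duration. The docs only show the seconds form (Retry-After: 1); the HTTP-date branch below is included because the HTTP spec also permits it, not because this API is known to send it. The helper name and the 1-second default are assumptions.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Optional

def retry_after_seconds(header: Optional[str], now: Optional[datetime] = None,
                        default: float = 1.0) -> float:
    """Convert a Retry-After header (delta-seconds or HTTP-date) into seconds to wait."""
    if header is None:
        return default
    value = header.strip()
    if value.isdigit():
        return float(value)
    try:
        when = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return default  # unparseable header: fall back to the default delay
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return max(0.0, (when - now).total_seconds())

print(retry_after_seconds("1"))
```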

Versioning

The URL path carries the major version (/api/v2/public). The SSE protocol carries an in-band protocolVersion for forward-compat extension within the same path version.

Removing or renaming an existing endpoint or event = major bump.
Adding a new endpoint, event, or response field = minor (no bump).
Changing field semantics on an existing field = major bump.

Catalog stability

GET /api/v2/public/models is an agent-facing contract. What's safe (no major bump) for us to do:

Add a new model.
Add a new value to a parameter's values enum (e.g. reasoning_effort: ["none","low","medium","high"] → [..., "xhigh"]). Strict-validating agents should treat unknown enum values as forward-compat additions, not errors.
Add a new capability bit, parameter, or cost field.
Retire a model. Once retired the model_id is no longer in the catalog and any prompt referencing it gets 400 invalid_model_settings with action_hint directing the agent to fetch the live catalog and pick a current model.
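An agent can stay forward-compatible with these catalog changes by selecting values from what the catalog offers rather than validating the catalog against a hard-coded list. The sketch below assumes each parameter spec is a dict with optional values, min, max, and provider_default keys; that shape and the helper name choose_param_value are assumptions based on the field names mentioned in the Quickstart.

```python
from typing import Any, Dict

def choose_param_value(spec: Dict[str, Any], preferred: Any) -> Any:
    """Pick a parameter value the catalog currently accepts, else the provider default."""
    values = spec.get("values")
    if values is not None:
        # Entries this agent does not recognise (e.g. a newly added "xhigh")
        # are ignored rather than treated as errors.
        return preferred if preferred in values else spec.get("provider_default")
    lo, hi = spec.get("min"), spec.get("max")
    if isinstance(preferred, (int, float)) and lo is not None and hi is not None:
        return min(max(preferred, lo), hi)  # clamp into the advertised range
    return spec.get("provider_default")

effort = {"values": ["none", "low", "medium", "high", "xhigh"], "provider_default": "medium"}
print(choose_param_value(effort, "low"))
```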

Known V1.0.1 limitations resolved in V1.1

cost_cents on records was integer-truncated, so sub-cent costs rounded to 0. Resolved in V1.1 Phase 8: public DTOs now expose costMilliCents (millicents, 1/1000 cent) for sub-cent precision. costCents is removed from the public surface; render cents as costMilliCents / 1000.
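The conversion is plain arithmetic: one millicent is 1/1000 of a cent, so 1/100,000 of a dollar. A small sketch using Decimal to avoid float rounding when displaying costs; the helper names are hypothetical, and treating the currency as US dollars is an assumption.

```python
from decimal import Decimal

def millicents_to_cents(cost_milli_cents: int) -> Decimal:
    """Convert costMilliCents to cents (1 cent = 1000 millicents)."""
    return Decimal(cost_milli_cents) / Decimal(1000)

def millicents_to_dollars(cost_milli_cents: int) -> Decimal:
    """Convert costMilliCents to dollars (1 dollar = 100,000 millicents)."""
    return Decimal(cost_milli_cents) / Decimal(100_000)

# 30 millicents is 0.03 cents, a cost that integer cents would have shown as 0.
print(millicents_to_cents(30), millicents_to_dollars(30))
```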

Not in V1.1

DELETE on prompts / versions / records / attachments — leaked-key blast radius too high without per-prompt grants. Resolved in V1.1 Phase 5 (records) + Phase 6b (prompts/versions/attachments): per-prompt grants gate every mutation; record self-delete restricted to the originating API key + 24h window; attachment delete blocked while an active RunSession references the blob.
Image-output runs for API-key callers — 409 image_runs_not_supported_v1. Resolved in V1.1 Phase 7 + Phase 0: image-modality runs are supported on /run, /revise, /revise-again. Per-turn effectivePromptForImage accumulation persists into record.finalCopiedOutput on /finalize, restoring the desktop accumulated-prompt invariant. V1.1 Phase 0 wired the actual blob upload (Phase 7 lifted the gate but left ImageBlobKeys: null hardcoded — pre-Phase-0 records came back with imageStoredPath: null). All images now persist to blob storage, retrievable via GET /runs/{runId}/images/{n} in-flight and GET /records/{id}/image?index=N post-finalize. Records preserve every per-turn image (re-finalize merges, never shrinks).
GET /api/v2/public/runs/{runId} polling endpoint — V1.2. Until then, agents derive run state from their own bookkeeping of the original call. The run_replayed event on idempotent retries surfaces {succeeded: false, reasonCode: "replayed_state_unknown"} rather than masquerading as success.
CLI run --output-dir <dir> — V1.2. Image bytes are fetchable today via GET /runs/{runId}/images/{N} (in-flight) or GET /records/{id}/image?index=N (post-finalize); the auto-save UX is a CLI ergonomics improvement.
CLI grants management (keys grants list/add/remove) — V1.2. Per-prompt restrictions are configured by the user via Settings → Developer Keys → Manage in the web/desktop apps; agents do not configure their own restrictions.
Searchable prompt picker in Manage view — V1.2. V1.1 ships a plain scrollable checkbox list; search arrives once a user has 30+ prompts.
Webhooks, OAuth, PAT, team keys — V2.
Streaming on the CLI — CLI internally buffers SSE for revise chain. V2 may surface raw streaming.
?fromTurn=N rewind — RunSession.Deltas is turn-indexed today; surface in V2. Resolved in V1.2: /runs/{runId}/revise and /runs/{runId}/finalize accept fromTurn in the request body. Drops session.Deltas entries with turnIndex > fromTurn before applying the operation. Image blobs orphaned by the rewind enqueue into blob_cleanup_queue (drained by a background worker with reference-count guard).

Changelog

v1.3 — 2026-05-11 — BREAKING (per-prompt MCP tools)

Prompt-level description dropped from every wire surface. POST /api/v2/public/prompts no longer accepts description; PATCH /api/v2/public/prompts/{id} accepts only name and abbreviation. PublicPromptCreatedResponse drops the field. The cloud_prompts.Description column is dropped from the database with no data preservation. Capability descriptions live on versions only.
Version description is the agent-facing capability description. Every version carries a one-sentence summary that describes what the prompt does — what it expects as input and what it returns. Surfaced in tools/list synthesized tool descriptions and list_prompts, so agents can pick a prompt in one round-trip.
New descriptionMode field on versions. Numeric on the wire (0=Auto, 1=Manual). In Auto, the server regenerates Description with gpt-5.4-nano on every PUT version that changes promptText (fire-and-forget worker, ~$0.0003 per fire, conditional UPDATE that no-ops on stale starts). In Manual, the user/agent owns the field.
Description-write rule. Writing description at PATCH /api/v2/public/prompts/{id}/versions/{vid} (or update_version on MCP, or PUT version on the private cloud surface) is treated as the caller taking ownership — Mode auto-flips to Manual if it isn't already. Pass descriptionMode: 0 in the same request to revert to Auto and let the server worker resume regenerating; explicit descriptionMode wins over the implicit description-presence flip. JSON null for description is a no-op (send "" to clear deliberately). Earlier (pre-2026-05-13) silent-ignore-in-Auto + explicit-only-flip rule was a footgun and is no longer in force.
If-Match precondition (optional) on PUT/PATCH version endpoints. Token format: UpdatedAt.Ticks as lowercase hex (e.g. If-Match: 8db7e12c0e7c100). Mismatch returns 412 Precondition Failed. Absent header keeps last-write-wins legacy semantics. PUT version now returns 200 OK + VersionResponse (was 204) so the client gets the new UpdatedAt for the next If-Match token. Future PUT/PATCH endpoints will follow this convention.
Per-prompt MCP tools (opt-in). Toggle via POST /api/v2/prompts/{id}/mcp-toggle with { "expose": true }. Each exposed prompt appears in your agent's MCP tools/list as promethic_{slug} (e.g. promethic_clay_cuties). Agents invoke by name in one round-trip — no list_prompts + get_prompt + run_prompt dance. Cap = 50 per account; cap-hit returns 409 mcp_tool_cap_reached. Tool name is stable across prompt renames so hardcoded agent code keeps working. To re-derive the tool name from the new prompt name, call POST /api/v2/prompts/{id}/mcp-rename; collision returns 409 tool_name_taken or 409 tool_name_reserved.
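For the If-Match token format in the v1.3 notes above, a client that only has a version's updated-at timestamp could derive the token itself. This is a sketch under stated assumptions: Ticks follows the .NET convention (100-nanosecond intervals since 0001-01-01T00:00:00), and the timestamp arrives as an ISO 8601 UTC string ending in Z with up to seven fractional digits. The helper names are hypothetical; where the server hands back a token directly, prefer that.

```python
from datetime import datetime

TICKS_PER_SECOND = 10_000_000  # one tick = 100 ns

def ticks_from_iso(ts: str) -> int:
    """Convert an ISO 8601 UTC timestamp (trailing Z) to .NET-style ticks."""
    if ts.endswith("Z"):
        ts = ts[:-1]
    whole, _, frac = ts.partition(".")
    # Keep exactly seven fractional digits; datetime alone would drop the seventh.
    frac = (frac + "0000000")[:7]
    delta = datetime.strptime(whole, "%Y-%m-%dT%H:%M:%S") - datetime(1, 1, 1)
    seconds = delta.days * 86_400 + delta.seconds
    return seconds * TICKS_PER_SECOND + int(frac)

def if_match_token(ts: str) -> str:
    """Format the ticks value as bare lowercase hex for the If-Match header."""
    return format(ticks_from_iso(ts), "x")

print(if_match_token("2026-05-11T12:00:00.1234567Z"))
```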

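For the If-Match precondition in v1.3, a client has to turn UpdatedAt into lowercase-hex ticks. The sketch below assumes that Ticks follows the .NET DateTime.Ticks convention (100 ns intervals since 0001-01-01T00:00:00) and that UpdatedAt arrives as a UTC ISO-8601 string with up to seven fractional digits. Neither is confirmed by this changelog, and the helper names are hypothetical.

```python
import re
from datetime import datetime, timedelta, timezone

# Assumption: ticks count 100 ns intervals from this epoch (.NET convention).
DOTNET_EPOCH = datetime(1, 1, 1, tzinfo=timezone.utc)
TICKS_PER_SECOND = 10_000_000

_ISO_UTC = re.compile(
    r"(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})(?:\.(\d{1,7}))?(?:Z|\+00:00)"
)


def ticks_from_iso(timestamp: str) -> int:
    """Convert a UTC ISO-8601 timestamp to ticks without losing 100 ns digits."""
    match = _ISO_UTC.fullmatch(timestamp)
    if match is None:
        raise ValueError(f"unsupported timestamp format: {timestamp!r}")
    whole = datetime.strptime(match.group(1), "%Y-%m-%dT%H:%M:%S")
    whole = whole.replace(tzinfo=timezone.utc)
    seconds = (whole - DOTNET_EPOCH) // timedelta(seconds=1)
    # Pad the fraction to 7 digits so it is expressed directly in ticks.
    fraction = (match.group(2) or "").ljust(7, "0")
    return seconds * TICKS_PER_SECOND + int(fraction)


def if_match_token(ticks: int) -> str:
    """Format ticks as the lowercase-hex token used in the If-Match header."""
    return format(ticks, "x")


headers = {"If-Match": if_match_token(0x8DB7E12C0E7C100)}
```

The fraction is parsed by hand because Python's datetime keeps only microseconds, which would drop the seventh digit and produce a token the server would reject with 412.
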
v1.2 — 2026-05-07

Auto-finalize on /run: with ?autoFinalize=true (now the default; toggle per-user via the autoFinalizeMcpRuns setting), the server chains an internal /finalize after a successful run. The new recordId arrives via the record_finalized SSE event on the same stream as run_completed. Three new SSE events: record_finalized; record_finalize_failed (chain failed; the agent decides whether to call POST /finalize manually based on retryable); and record_finalize_skipped (informational, after run_failed).
fromTurn rewind primitive: /runs/{runId}/revise and /runs/{runId}/finalize accept fromTurn. Drops session turns > fromTurn, then applies the operation. Negative → 400 from_turn_invalid; out-of-range → 400 from_turn_out_of_range.
Finalize-on-Finalized amend: calling /finalize with finalText, tag, or fromTurn on a Finalized session reopens the session, bumps RunGeneration, and re-finalizes — same record ID, same handle. Fresh idempotency boundary for the new gen. Passing notes alone is a distinct case: it patches the record's notes directly without reopening (no new generation, not idempotent per call — each call updates the stored notes value). Passing nothing (no body / all fields absent) replays the most recent finalize response unchanged.
Unified turns[] on PublicRecordResponse: every record DTO carries a synthesized turns array (run / revision / edit, indexed contiguously) reconstructed from the input + delta chain + final-copied-output. Resolves the V1.1 stitching gap where agents had to mentally combine inputText + finalCopiedOutput + deltas[].
POST /records/{id}/revise replaces /revise-again: rehydrate a fresh RunSession from a finalized record's snapshot and revise. Same body shape as /runs/{runId}/revise (carries intermediateOutput + fromTurn). Per-record advisory lock serializes concurrent rehydrate attempts (409 record_revise_in_progress on contention). Old /revise-again route HARD-REMOVED.
Image blob cleanup queue: fromTurn rewinds legitimately shrink record image history. Dropped per-turn blobs enqueue into blob_cleanup_queue (background worker, single-leader via pg_advisory_lock(3), reference-count guard against both ImageStoredPath storage formats before S3 DELETE).
Spend audit discriminator: UsageLog.Discriminator column reserved for billing-eligibility tagging; SpendQueryFilters.Billable drives all SUM rollups (admin LIST endpoints intentionally show every row for audit visibility).
MCP CLI surface: revise_again tool COLLAPSED into revise_run (accepts runId; finalized sessions reopen transparently). run_prompt grows autoFinalize?: boolean. finalize_run + revise_run grow fromTurn?: number. Tool count: 25 → 24.
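
One way an agent might react to the three auto-finalize events is sketched below. The event names, recordId, and retryable come from the entry above; the action labels and the function itself are hypothetical, and the payload is assumed to be already decoded into a dict.

```python
def next_step(event: str, data: dict) -> tuple:
    """Map an auto-finalize SSE event to a (action, record_id) pair."""
    if event == "record_finalized":
        # The chained /finalize succeeded; the record ID is on this event.
        return ("done", data.get("recordId"))
    if event == "record_finalize_failed":
        # The chain failed; retryable tells the agent whether a manual
        # POST /finalize is worth attempting.
        if data.get("retryable"):
            return ("finalize_manually", None)
        return ("give_up", None)
    if event == "record_finalize_skipped":
        # Informational: the run itself failed, so there is nothing to finalize.
        return ("no_record", None)
    return ("ignore", None)
```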

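The fromTurn rewind in v1.2 can be modeled locally as a list truncation. This sketch assumes zero-based turn indices and treats any index past the last turn as out of range; the changelog does not state either detail. The error codes are the ones listed above, and the class and function names are hypothetical.

```python
class FromTurnError(ValueError):
    """Carries the error code the API is documented to return with a 400."""

    def __init__(self, code: str):
        super().__init__(code)
        self.code = code


def rewind(turns: list, from_turn: int) -> list:
    """Drop every turn with an index greater than from_turn."""
    if from_turn < 0:
        raise FromTurnError("from_turn_invalid")
    # Assumption: indices are zero-based, so the last valid value is len - 1.
    if from_turn >= len(turns):
        raise FromTurnError("from_turn_out_of_range")
    return turns[: from_turn + 1]
```

Under these assumptions, `rewind(["run", "revision", "edit"], 1)` keeps the first two turns, after which the revise or finalize operation would be applied.
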
v1.1 — 2026-05-03

Per-prompt grants (Phase 6a): API keys can be restricted to a specific set of prompts. Configured via Settings → Developer Keys → Manage in the web/desktop apps. Three new session-only endpoints: GET/POST/DELETE /api/v2/keys/{keyId}/grants. Restricted-key access to non-granted prompts returns 403 grant_required.
Server-stateful runs: RunSession table replaces the V1 echo-back signed-blob model. Agents hold an opaque runId; the server keeps prompt + version snapshot frozen at /run time, immune to mid-flight prompt edits. POST /runs/{runId}/revise, /finalize; GET /runs/{runId}/images/{N} for in-flight image fetch.
Idempotency-Key on the full execute surface (Phase 3e/3f): /run, /revise, /finalize all replay byte-identically on retry. Route-signature composition with @gen{N} on /finalize so reopen-on-revise creates a fresh idempotency boundary. SSE replay protocol via the new run_replayed event (terminal-with-info).
Record self-management (Phase 5): DELETE /records/{id} (24h, ApiKey-owned, hard-delete + cascade) and PATCH /records/{id} (notes + tag, no time window). HIPAA §164.312(b) audit row on every mutation with PHI-aware presence/length/SHA-256-prefix metadata.
DELETE on prompts/versions/attachments (Phase 6b): write-scope + grant check. DELETE /attachments/{id} blocks if any active RunSession references the blob → 409 attachment_referenced_by_active_run. DELETE /prompts/{id}/versions/{vid} rejects current version atomically. DELETE /prompts/{id} + versions/{vid} force-terminate any active RunSessions before proceeding (no 409 for active runs on the REST surface; MCP tools return prompt_referenced_by_active_run / version_referenced_by_active_run instead).
Image-output runs for API-key callers (Phase 7): 409 image_runs_not_supported_v1 gate lifted on /run, /revise, /revise-again. Per-turn effectivePromptForImage accumulation persists into record.finalCopiedOutput on /finalize, restoring the desktop accumulated-prompt invariant.
Catalog enforcement (Phase 4): ModelSettingsValidator wired ADDITIVELY into POST /prompts + POST /prompts/{id}/versions + PATCH /prompts/{id}. Out-of-enum values like reasoning_effort: "xtreme" now 400 invalid_model_settings at write time instead of silently failing at /run.
Observability + cost precision (Phase 8): X-RateLimit-* headers on every response (both buckets reported). cost_micros (1/1000 cent) replaces cost_cents on public DTOs for sub-cent precision. Fixes reasoning_tokens on the response.usage SSE event for the Responses API shape.
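
Because /run, /revise, and /finalize replay on retry, a client gets the benefit only if it reuses the same Idempotency-Key across attempts. The sketch below shows that pattern in isolation. `send` is a hypothetical callable standing in for whatever HTTP client issues the request, and retrying only on ConnectionError is an assumption.

```python
import uuid


def call_with_idempotency(send, max_attempts: int = 3, key: str = None):
    """Call send(headers) up to max_attempts times with one stable key."""
    # Generate the key once per logical operation, never once per attempt.
    headers = {"Idempotency-Key": key or str(uuid.uuid4())}
    last_error = None
    for _ in range(max_attempts):
        try:
            return send(headers)
        except ConnectionError as exc:
            # The outcome is unknown, so retry with the same key and let the
            # server replay the stored response if the first attempt landed.
            last_error = exc
    raise last_error
```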

v1.0.1 — 2026-04-29

Write scope + 5 new endpoints: prompt create / patch (RFC 7396) / current-version switch, version create, attachment upload.
Idempotency-Key header on all mutating POSTs (Stripe parity, 24 h TTL).
RFC 7807 problem+json errors with action_hint + invalid_params for self-healing agents.
GET /models — slim catalog endpoint with supportedOutputModalities.
CLI: prompts create / prompts patch / prompts switch-current / versions create / attachments add, plus YAML manifest mode.
Developer Keys management UI in the web + desktop apps.
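
A self-healing agent might reduce a problem+json body to a short message it can act on. In this sketch, status, title, and detail are standard RFC 7807 members, and action_hint and invalid_params are the extensions named above. The name/reason shape of each invalid_params entry is an assumption borrowed from the RFC's own example, and the function name is hypothetical.

```python
def summarize_problem(problem: dict) -> str:
    """Render a problem+json document as a few readable lines."""
    lines = [f"{problem.get('status', '?')} {problem.get('title', 'Unknown error')}"]
    if problem.get("detail"):
        lines.append(problem["detail"])
    # Assumption: each entry looks like {"name": ..., "reason": ...}.
    for param in problem.get("invalid_params") or []:
        lines.append(f"  - {param.get('name')}: {param.get('reason')}")
    if problem.get("action_hint"):
        lines.append(f"hint: {problem['action_hint']}")
    return "\n".join(lines)
```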

v1 (alpha) — 2026-04-27

Initial public surface: read + execute scopes.
SSE protocol v1 with run_session / run_completed / run_failed taxonomy.
promethic CLI alpha (Node 18+).
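
The event taxonomy rides on standard Server-Sent Events framing, so a small parser is enough to consume it. This sketch assumes each event's data field holds JSON, which the changelog does not confirm. The payload keys in the usage example are hypothetical.

```python
import json


def parse_sse(lines):
    """Yield (event_name, payload) pairs from an iterable of SSE text lines."""
    event, data = None, []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if event is not None or data:
                payload = json.loads("\n".join(data)) if data else None
                yield (event or "message", payload)
            event, data = None, []
        elif line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        elif line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            value = line[len("data:"):]
            # Per the SSE format, drop one leading space after the colon.
            data.append(value[1:] if value.startswith(" ") else value)


stream = [
    "event: run_session\n",
    'data: {"runId": "r1"}\n',
    "\n",
    "event: run_completed\n",
    "data: {}\n",
    "\n",
]
events = list(parse_sse(stream))
```

Here `events` holds one pair per frame, in order, with the event name taken from the `event:` field.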

