Agent API
Stable, dual-auth API + CLI for terminals, scripts, and agents.
Base URL: https://api.getpromethic.com/api/v2/public
Quickstart
Issue a key in the web app at app.getpromethic.com → Settings → Developer Keys, then:
curl -H "X-API-Key: pmk_..." \
https://api.getpromethic.com/api/v2/public/prompts
Or via the CLI:
npm install -g @soulwarestudio/promethic-cli
promethic auth login # paste pmk_... key
promethic prompts list
promethic run <prompt-id> --input "summarize this article"
For write-scope keys, an agent can author a whole prompt
declaratively via a YAML manifest. Note the nested
parameters object: the server requires the
{ model_id, parameters } shape so that per-model
parameter values stay distinct from envelope fields.
Step one: discover a real model_id via
GET /api/v2/public/models. Catalog IDs are
opaque (e.g., gpt54nano_2c6f9b4d); the wire
name isn't the marketing name. Pick one from the response:
curl -H "X-API-Key: pmk_..." \
https://api.getpromethic.com/api/v2/public/models \
| jq '.recommended_defaults'
# → {"model_id": "gpt55_8e2b1d4f", "reasoning_effort": "medium", ...}
Step two: paste it directly into your manifest — the
catalog response is round-trippable. Copy model_id
into modelSettings.model_id; pick parameter
values from each param's values /
min / max /
provider_default.
# prompt.yaml
name: Article summarizer
promptText: |
Summarize the input in three bullets.
modelSettings:
model_id: gpt54nano_2c6f9b4d # ← a catalog ID from the /models response
parameters:
reasoning_effort: low
attachments:
- type: text
file: ./examples.txt
promethic prompts create --manifest prompt.yaml
Authentication
All requests carry an API key in the X-API-Key header. Keys
start with pmk_ and are scoped to one user. Three scopes are
available in V1.0.1:
| Scope | Grants |
|---|---|
| read | list/get prompts, versions, records, attachments, record images, models catalog |
| execute | run, revise (runId), finalize |
| write | create + update + delete prompts, versions, attachments, records (V1.0.1 + V1.1 Phases 2 & 3) |
Scopes are checked literally; an execute-only key cannot
list prompts. To do all three, request all three:
scopes: ["read", "execute", "write"]. DELETE
is part of write as of V1.1; per-prompt grants limit the
blast radius of a leaked key.
Per-prompt grants V1.1
On top of scopes, each key can be optionally restricted to a specific set of prompts. Managed via the web app at app.getpromethic.com → Settings → Developer Keys → Manage; agents do not configure their own restrictions.
- Unrestricted (default) — a freshly-minted key has zero per-prompt grants and can access ALL of your prompts, gated only by its scopes.
- Restricted — once you add ANY prompt to a key's grant list, the key is restricted: only listed prompts are accessible. Calls to other prompts return 403 grant_required.
- Removing the last prompt from a restricted key returns it to unrestricted (the web UI confirms this transition).
Enforcement is per-call: every endpoint that resolves to a single prompt (run, revise on runId, finalize, prompt and version reads, prompt PATCH/version create/attachment upload, record GET/image/DELETE/PATCH) re-checks the grant list at the time of the call. If you revoke a grant mid-workflow, the next call to the affected prompt 403s.
400 mixed_credentials_principal_mismatch.
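The per-prompt grant rules above can be modeled in a few lines. This is only a client-side sketch of the documented behavior, not the server's implementation; the function name `check_prompt_access` and the tuple return shape are hypothetical, and scopes are assumed to be checked separately.

```python
from typing import Iterable, Optional, Tuple


def check_prompt_access(granted_prompt_ids: Iterable[str],
                        prompt_id: str) -> Tuple[bool, Optional[str]]:
    """Return (allowed, error_code) under the documented grant rules."""
    grants = set(granted_prompt_ids)
    if not grants:
        # Zero grants means the key is unrestricted (gated only by scopes).
        return True, None
    if prompt_id in grants:
        return True, None
    # A restricted key calling an unlisted prompt gets 403 grant_required.
    return False, "grant_required"


print(check_prompt_access([], "p1"))      # unrestricted key
print(check_prompt_access(["p2"], "p1"))  # restricted key, prompt not listed
```

Note that removing the last grant makes the set empty again, which is why the key falls back to unrestricted access.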
Endpoints
Timestamps: all *AtUtc fields are UTC ISO 8601 strings (e.g. "2026-05-29T14:32:00Z"). Parse with new Date(str) or a library date parser; do not strip the Z suffix.
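For Python callers, a minimal sketch of parsing an *AtUtc value without dropping the Z suffix is shown below. The helper name `parse_utc_timestamp` is hypothetical. It maps the Z to an explicit UTC offset because `datetime.fromisoformat` only accepts a trailing Z on Python 3.11 and later.

```python
from datetime import datetime, timezone


def parse_utc_timestamp(value: str) -> datetime:
    """Parse a UTC ISO 8601 string such as "2026-05-29T14:32:00Z"."""
    # Keep the UTC information: turn the Z suffix into "+00:00" rather
    # than stripping it, so the result is timezone-aware.
    if value.endswith("Z"):
        value = value[:-1] + "+00:00"
    parsed = datetime.fromisoformat(value)
    if parsed.tzinfo is None:
        raise ValueError("expected a UTC timestamp with a Z suffix or offset")
    return parsed.astimezone(timezone.utc)


created = parse_utc_timestamp("2026-05-29T14:32:00Z")
print(created.isoformat())  # → 2026-05-29T14:32:00+00:00
```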
Read (read scope)
| Method | Path | Description |
|---|---|---|
| GET | /prompts?limit=&cursor= | list (slim DTO) |
| GET | /prompts/{id} | + current version |
| GET | /prompts/{id}/versions | paginated history — sorted by versionNumber descending (highest first); cursor is version-number-based, not timestamp-based |
| GET | /prompts/{id}/versions/{vid} | single version |
| GET | /prompts/{id}/attachments | prompt attachments |
| GET | /attachments/{id} | attachment download |
| GET | /records?promptId=&versionId=&source=&createdBy=&limit=&maxOutputChars=&maxInputChars= | list (cursor-paginated); each record includes inputText (truncated at maxInputChars, default 4096, 0–32768) and outputText (truncated at maxOutputChars); truncated (output) / inputTruncated booleans indicate truncation; source is a string label — "API" (MCP run_prompt or REST API key execution), "App" (Avalonia desktop client), "Manual" (created via create_record, no LLM — versionId is null), "Headless" (Expo mobile client — not MCP) |
| GET | /records/{id} | slim record DTO |
| GET | /records/{id}/image?index=N | image PNG (binary) |
| GET | /models | catalog: model_id, supportedOutputModalities, costs (V1.0.1) |
Execute (execute scope)
| Method | Path | Description |
|---|---|---|
| POST | /prompts/{id}/run | creates RunSession; SSE |
| POST | /runs/{runId}/revise | append turn; SSE |
| POST | /runs/{runId}/finalize | session→record (V1.1: ?persist= removed; always commits) |
| GET | /runs/{runId}/images/{N} | session image (active sessions) |
| POST | /records/{id}/revise V1.2 | rehydrate fresh run from record + revise (replaces V1.1 /revise-again); per-record advisory lock; SSE |
| DELETE | /records/{id} V1.1 | self-delete (ApiKey-owned, 24h window) |
| PATCH | /records/{id} V1.1 | amend notes/tag (RFC 7396; no time window) |
Run lifecycle body shapes V1.1
The execute surface mirrors the desktop Avalonia app's Convert / Revise / Copy / Copy & Tag flow. Body shapes for each call:
POST /runs/{runId}/revise
{
"instruction": "make it more concise", // required
"intermediateOutput": "edited prior output" // optional
}
intermediateOutput is the user's edited
prior-turn output. When passed, the model sees this (instead
of the prior turn's actual output) as context for this
revision, AND it lands in the resulting
ConversionDelta.IntermediateOutput. Mirrors the
Avalonia "user edited the textbox before hitting Revise"
flow. Omit for vanilla revise-from-prior-output. Image-
modality runs reject this field (400 intermediate_output_not_supported_image) —
image revisions always source from the model's actual prior
output. Pass null or omit to use the prior output; an empty string or whitespace-only value is rejected (400 invalid_params). 32 KB cap.
Record impact: because get_record reconstructs each turn's output as "what the next turn saw as prior context," passing intermediateOutput means turns[N-1].output in the resulting record will reflect your override value, not the raw model output from that turn. Only use this when you have genuinely edited the prior output and want the record to capture that edited state as the correction baseline.
Finalized sessions are transparently reopened — the server transitions Finalized → Active before the LLM call; no need to switch to POST /records/{id}/revise after a finalize. Expired or Failed runs cannot be revised; start a fresh /run. If Abandoned, returns 404 run_not_found.
POST /runs/{runId}/finalize
{
"finalText": "edited final output", // optional; creates edit delta if differs from model output
"tag": "exemplar", // optional; attaches to edit delta (requires finalText)
"notes": "context for this run" // optional; record-level
}
Three-axis surface that mirrors Avalonia's Copy / Copy & Tag semantics exactly:
- Plain finalize (no body): equivalent to Avalonia Copy with no edits. Saves the record from the model's last output. No edit delta.
- finalText only: Avalonia Copy after editing the output box. If finalText differs from the model's last output, server creates an edit delta with empty tag. If it matches, no edit delta (treated as a clean Copy).
- finalText + tag: Avalonia Copy & Tag. Tag attaches to the edit delta — the Refine wizard signal. Server returns 400 tag_without_delta if finalText is omitted or matches the model output (no edit delta to tag). Avalonia disables the Copy & Tag button in the same situation.
- notes: independent of the above. Record-level free text. Available without finalText on plain finalize, or alongside finalText/tag.
Image-modality runs reject finalText
(400 final_text_not_supported_image) —
record.finalCopiedOutput for image records is
server-derived from the per-turn effectivePromptForImage
accumulation chain (training-data invariant; CLAUDE.md
"Image Records: FinalCopiedOutput as Accumulated Prompt").
Caps: finalText 256 KB
(413 final_text_too_large); if provided, finalText cannot be an empty string or whitespace-only value (400 invalid_params); whitespace-only tag is also rejected with 400 invalid_params;
notes 64 KB
(413 notes_too_large);
tag 64 KB
(413 tag_too_large).
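An agent can catch most of these rejections before sending the request. The following is a rough client-side pre-check for text-modality runs, not the server's validator: the function name `precheck_finalize` is hypothetical, the order of checks is a guess, and it assumes the caps are measured in UTF-8 bytes with 1 KB = 1024 bytes, which the text above does not state.

```python
from typing import Optional

KB = 1024  # assumption: caps are UTF-8 byte counts, 1 KB = 1024 bytes


def precheck_finalize(body: dict, last_model_output: str) -> Optional[str]:
    """Return the error code the documented rules imply, or None if OK."""
    final_text = body.get("finalText")
    tag = body.get("tag")
    notes = body.get("notes")

    if final_text is not None:
        if not final_text.strip():
            return "invalid_params"          # empty or whitespace-only
        if len(final_text.encode("utf-8")) > 256 * KB:
            return "final_text_too_large"
    if tag is not None:
        if not tag.strip():
            return "invalid_params"          # whitespace-only tag
        if len(tag.encode("utf-8")) > 64 * KB:
            return "tag_too_large"
        # A tag needs an edit delta, which exists only when finalText
        # is present and differs from the model's last output.
        if final_text is None or final_text == last_model_output:
            return "tag_without_delta"
    if notes is not None and len(notes.encode("utf-8")) > 64 * KB:
        return "notes_too_large"
    return None


print(precheck_finalize({"tag": "exemplar"}, "X"))
print(precheck_finalize({"finalText": "Y", "tag": "exemplar"}, "X"))
```

The first call reports tag_without_delta because no finalText accompanies the tag; the second passes. Image-modality runs, which reject finalText outright, are not modeled here.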
Record DTO turns[] field V1.2
Records returned by GET /records, GET /records/{id},
and POST /finalize include a turns[] array that is
a synthesized linear history of the record's states (run + revisions + optional
edit). Each entry has a stable index matching the
fromTurn parameter accepted by /revise and /finalize.
{
"id": "...",
"promptId": "...",
"inputText": "Summarize this article: ...",
"finalCopiedOutput": "Y_edited",
"turns": [
{ "index": 0, "kind": "run", "input": "Summarize this article: ...", "output": "X" },
{ "index": 1, "kind": "revision", "instruction": "make formal",
"intermediateOutput": "X", "output": "Y", "modelId": "...", "costMilliCents": 1234 },
{ "index": 2, "kind": "edit", "intermediateOutput": "Y",
"output": "Y_edited", "tag": "user-edit" }
],
...
}
Three kinds:
- kind: "run" — always index 0. Carries input (the prompt's user input). NOTE: output for the run turn is "what the next turn saw as its prior context, OR the record's finalCopiedOutput if no later turn." If a user edited the textbox client-side before pressing Revise, the edit was committed forward as the next revision's intermediateOutput; the model's literal output text is not preserved.
- kind: "revision" — each /revise call appends one. instruction is the user's revise instruction. intermediateOutput is what the model saw as prior context.
- kind: "edit" — at most one per record; always the last entry. Created when /finalize was called with finalText differing from the model's last output. intermediateOutput is the model's actual last output before the edit; output is the user's edited text (= record's finalCopiedOutput).
Use turns[].index to pass fromTurn on /revise or
/finalize for rewind-and-redo (V1.2).
Index space: fromTurn is 0-based and uses the
same index as turns[].index in get_record.
Valid range is 0 to turns.length - 2, inclusive. The last turn index
(turns.length - 1) is excluded because passing it would revert the record to an empty state.
from_turn_invalid is returned for negative values; from_turn_out_of_range is returned if the value exceeds the valid range.
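The range rule can be checked locally before calling /revise or /finalize. This is a small sketch under the rules stated above; `check_from_turn` is a hypothetical helper, and `turn_count` stands for the length of the record's turns[] array.

```python
from typing import Optional


def check_from_turn(from_turn: int, turn_count: int) -> Optional[str]:
    """Return the documented error code for a fromTurn value, or None."""
    if from_turn < 0:
        return "from_turn_invalid"
    # Highest accepted value is turn_count - 2; the last turn index
    # (turn_count - 1) is not a valid rewind target.
    if from_turn > turn_count - 2:
        return "from_turn_out_of_range"
    return None


# A record with three turns (run, revision, edit) accepts 0 and 1 only.
for candidate in (-1, 0, 1, 2):
    print(candidate, check_from_turn(candidate, turn_count=3))
```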
Record self-management (V1.1)
DELETE /api/v2/public/records/{id} hard-deletes a record
if and only if (a) the caller is an API key, (b) the record was
created by the same API key (credentialPrincipalType=ApiKey +
matching id), and (c) the record is less than 24h old (anchored to
record.createdAt). Returns 204 No Content on
success, or 403 record_not_owned_by_api_key /
409 record_self_delete_window_expired /
404 record_not_found otherwise. Delete is hard
(cascade to deltas; FK ON DELETE SET NULL on
RunSession.FinalizedRecordId auto-clears any
finalize-replay rendezvous so the next /finalize replay returns
410 record_was_deleted). Image bytes are best-effort
cleaned from blob storage. Retries after a successful DELETE
return 404 — the row is gone, so HTTP-level
idempotency is by-construction (the second call returns 404 record_not_found).
Note: the MCP delete_record tool uses soft-delete and is idempotent —
retries return {status:"deleted"} instead of a 404.
PATCH /api/v2/public/records/{id} updates mutable fields
on a record. Same ApiKey-owned check as DELETE, but no time
window. Body is RFC 7396 JSON Merge Patch
(Content-Type: application/merge-patch+json):
missing key = unchanged, explicit null = clear, value = set.
Behavior depends on the record's source field:
Manual records (source = "Manual", created via create_record): notes, input, and output are freely editable with no delta ever created. tag and fromTurn return 400 invalid_request — manual records are pure seed-data pairs with no delta chain.
Non-manual records (App / API / Headless): writing output auto-creates or updates the record's single edit delta. intermediateOutput is set once on first edit and is immutable (anchors the model's last returned output); subsequent output patches update finalCopiedOutput only.
- Revert to model output: if output equals the existing edit delta's intermediateOutput (the model's last returned output), the edit delta is removed and editCount returns to 0 — you are undoing the correction. Providing tag alongside a reverting output returns 400 tag_would_be_lost_on_revert (the delta is being removed; omit tag to revert cleanly, or use a different output to keep the delta and anchor the tag).
- tag is optional annotation on the delta and requires output; write it as a generalizable rule or vibe that applies to other inputs too (e.g. “Less formal”, “Remove em-dashes”), not a description of this specific diff — it feeds prompt refinement; if an edit delta already exists and output is unchanged, tag-only update is allowed.
- Pass fromTurn: N to revert to delta[N].intermediateOutput and soft-delete all later deltas. fromTurn: 0 restores the state before the first surviving delta: the original model output on unrevised records, or the pre-first-revision state on revised records (which may include prior user edits). Intermediate edit states between patches are not recoverable. Mutually exclusive with output and tag.
- notes is a record-level label, always valid on all record types.
- Revisions squash edit deltas: when revise_run / POST /runs/{runId}/revise finalizes, all prior deltas (including any edit delta from patch_record) are replaced by the new revision delta set. After a revision, fromTurn: 0 therefore restores the pre-revision desired output, not the original conversion output.
Response: { id, notes, tag, input, output, lastPatchedAtUtc }.
HIPAA §164.312(b) audit row is written with field
presence/length/SHA-256 prefix metadata only —
never raw content.
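To make the merge-patch semantics concrete, here is a compact rendering of the RFC 7396 algorithm in Python. It illustrates the standard only; how the server represents a cleared field in its response is not stated above, so the local removal of the key is just the RFC's behavior.

```python
import copy
from typing import Any


def merge_patch(target: Any, patch: Any) -> Any:
    """Apply an RFC 7396 JSON Merge Patch and return the new document."""
    if not isinstance(patch, dict):
        # A non-object patch replaces the target wholesale.
        return copy.deepcopy(patch)
    result = copy.deepcopy(target) if isinstance(target, dict) else {}
    for key, value in patch.items():
        if value is None:
            result.pop(key, None)  # explicit null = clear
        else:
            result[key] = merge_patch(result.get(key), value)  # value = set
    return result  # keys missing from the patch are left unchanged


record = {"notes": "first draft", "tag": "exemplar"}
print(merge_patch(record, {"notes": None}))
# → {'tag': 'exemplar'}
print(merge_patch(record, {"notes": "reviewed"}))
# → {'notes': 'reviewed', 'tag': 'exemplar'}
```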
Write (write scope) V1.0.1
All mutating POSTs honour the Idempotency-Key header — see
Idempotency. PATCH /prompts/{id}
follows RFC 7396 JSON Merge Patch:
missing keys leave server state untouched; explicit null clears.
| Method | Path | Description |
|---|---|---|
| POST | /prompts | create prompt + initial version |
| PATCH | /prompts/{id} | RFC 7396 merge patch — application/merge-patch+json |
| PUT | /prompts/{id}/current-version | switch which version is "current"; returns { currentVersionId } reflecting actual post-update DB state (not just echoing the request) |
| POST | /prompts/{id}/versions | append new version (auto-increments versionNumber) |
| PATCH | /prompts/{id}/versions/{vid} V1.1 | RFC 7396 merge-patch of an existing version — promptText, modelSettingsJson, versionDescription, description, descriptionMode. Versions are mutable in place; create a new version only when you want an explicit audit checkpoint. Writing description auto-flips to Manual; pass "" to clear and revert to Auto. Optional If-Match header (hex Ticks of UpdatedAt) for optimistic concurrency. |
| POST | /prompts/{id}/attachments | upload file (multipart/form-data) |
| DELETE | /prompts/{id} V1.1 | soft-delete prompt; cascades hide records/versions/attachments under it; idempotent on already-deleted — returns {promptId, status:"deleted"}; subsequent reads of a deleted prompt return prompt_not_found |
| DELETE | /prompts/{id}/versions/{vid} V1.1 | soft-delete version; if the version is current, auto-switches currentVersionId to the lowest-VersionNumber remaining version and returns newCurrentVersionId in the response; rejects only if it is the sole remaining version (409 cannot_delete_only_version) |
| DELETE | /attachments/{id} V1.1 | soft-delete + storage refund; idempotent — double-delete of an already-deleted attachment returns 204 (only first delete decrements storage); 409 attachment_referenced_by_active_run if any active run's snapshot references the blob |
Field caps: name ≤ 256 chars, promptText ≤ 256 KB,
modelSettings JSON ≤ 64 KB, text attachments ≤ 5 MB,
image attachments ≤ 10 MB, ≤ 20 attachments per prompt, ≤ 50 MB per prompt,
1 GB per user.
Cursor pagination
List endpoints return { items: [...], nextCursor?: string }. nextCursor is present only when more pages exist — it is absent (not null) on the final page. Stop paginating when the field is not in the response.
Cursors are signed (HMAC-SHA256) with a server-side root key; tampered
cursors return 400 invalid_cursor; cursors issued for one
user can't be replayed by another (400 invalid_cursor);
changing query filters mid-pagination returns 400 cursor_filter_mismatch.
Page size (limit query parameter):
The REST API accepts limit in the range 1–500
(default 100). Values below 1 return 400 param_out_of_range;
values above 500 are silently clamped to 500.
The MCP tool accepts limit in the range 1–100
(default 25); values outside that range return 400 param_out_of_range.
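A pagination loop that follows the stop rule above might look like the sketch below. The `fetch_page` callable is a stand-in for whatever HTTP client the agent uses (it is hypothetical, as is the in-memory `_PAGES` fixture); only the items / nextCursor shape comes from this document.

```python
from typing import Callable, Dict, Iterator, Optional


def iter_items(fetch_page: Callable[[Optional[str]], Dict]) -> Iterator[dict]:
    """Yield every item across pages until nextCursor is absent."""
    cursor: Optional[str] = None
    while True:
        page = fetch_page(cursor)
        yield from page["items"]
        # The final page omits nextCursor entirely (it is not null),
        # so test for the key rather than for a falsy value.
        if "nextCursor" not in page:
            return
        cursor = page["nextCursor"]


# Hypothetical two-page fixture standing in for GET /prompts responses.
_PAGES = {
    None: {"items": [{"id": "a"}, {"id": "b"}], "nextCursor": "c1"},
    "c1": {"items": [{"id": "c"}]},
}


def fake_fetch(cursor: Optional[str]) -> Dict:
    return _PAGES[cursor]


print([item["id"] for item in iter_items(fake_fetch)])  # → ['a', 'b', 'c']
```

Keep the query filters fixed for the whole loop, since changing them mid-pagination returns 400 cursor_filter_mismatch.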
SSE protocol
All run-producing endpoints (/run,
/runs/{runId}/revise,
/records/{id}/revise) return 200 OK with
Content-Type: text/event-stream. Errors are in-stream
events, not HTTP status codes — clients should NOT branch on
HTTP status for these endpoints.
Event taxonomy (protocol v1)
| Event | When | Payload |
|---|---|---|
| run_session | First event | {protocolVersion, runId, turnIndex, modelId, outputModality?, seededFromRecordId?} |
| client_hint V1.1 Phase 0 | Image-modality runs only, immediately after run_session | {filter: "partial_image_b64", reason: "image_modality_inline_payloads", canonicalAccess: "GET /api/v2/public/runs/{runId}/images/{n}"} |
| (upstream events) | Middle | OpenAI's response.output_text.delta etc., passed through verbatim |
| run_completed | Success terminator | {runId, turnIndex, modelId, costMilliCents, imageCount?} |
| run_failed | Failure terminator | {runId, reasonCode, message?, charged, costMilliCents?, usageLogId?} |
| run_replayed V1.1 | Idempotency-Key replay terminator | {runId, turnIndex, modelId, outputModality?, state, streamingInProgress, recordId?, hint} |
| record_finalized V1.2 | Chained auto-finalize succeeded; emitted AFTER run_completed on the same stream when ?autoFinalize=true | {runId, recordId, turns, costMilliCents?} |
| record_finalize_failed V1.2 | Chained auto-finalize failed; the upstream run already succeeded. If retryable, agent calls POST /finalize manually. | {runId, reasonCode, retryable} |
| record_finalize_skipped V1.2 | Informational; emitted AFTER run_failed when ?autoFinalize=true was set. Agents MUST NOT trigger separate failure handling — the failure is already reported in run_failed. | {runId, reason: "run_failed", reasonCode} |
Check protocolVersion in run_session against your
parser's expected version.
- HTTP callers (direct REST to /api/v2/public): the SSE stream includes inline base64 image payloads on response.image_generation_call.partial_image and response.output_item.done. These can be tens of KB to several MB per frame. Most clients will want to filter them out and fetch the canonical bytes via GET /api/v2/public/runs/{runId}/images/{n} after run_completed. A client_hint event is emitted right after run_session to flag this.
- MCP callers (mcp.getpromethic.com/v1): the MCP transport drops partial_image_* events and redacts large base64 from kept frames automatically. Final image bytes are returned inline as MCP image content blocks in the tools/call result alongside the text transcript — no follow-up fetch needed. Clients that don't render image content can still call promethic_get_run_image by index, or GET /records/{id}/image?index=N after finalize.
- run_completed.imageCount reports how many images THIS turn produced (per-turn, not session-aggregate). Iterate n in [0, imageCount). A text-only revise on an image session emits imageCount: 0.
- Pre-finalize, GET /runs/{runId}/images/{n} reads the LATEST turn's images. To access prior-turn images after the run ends, finalize first and read record.imageStoredPath via GET /records/{id}/image?index=N — every per-turn image is preserved on the record.
- If image generation completed upstream but blob storage failed (3× retries exhausted), the run terminates with run_failed { reasonCode: "image_upload_failed", charged: true }. Replay with the same Idempotency-Key returns this same failure (no re-attempt, no double-bill).
run_replayed does not reflect post-replay state.
When you retry a /run or /revise with the
same Idempotency-Key, the server returns the
runId from the original call (the billable work is
already in flight or done) and terminates this stream with
run_replayed instead of run_completed.
The payload's state field is a snapshot from when
the replay was recorded — it does NOT track later state
changes on the same run. For the live state of a replayed
run, the agent must derive it from its own bookkeeping of
the original call (and, once GET /api/v2/public/runs/{runId}
ships in V1.2, poll that). The CLI's RunCallResult
surfaces this as {succeeded: false, reasonCode: "replayed_state_unknown"}
rather than masquerading as success.
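Because failures arrive as in-stream events, a client has to read the stream to its terminator instead of branching on the HTTP status. The sketch below parses standard SSE framing and maps the three run terminators to an outcome. It assumes event names travel in the SSE `event:` field and that every `data:` payload is JSON, neither of which is confirmed above; `parse_sse`, `run_outcome`, and the `stream_ended_without_terminator` code are hypothetical names.

```python
import json
from typing import Dict, Iterable, Iterator, List, Tuple


def parse_sse(lines: Iterable[str]) -> Iterator[Tuple[str, Dict]]:
    """Yield (event_name, payload) pairs from SSE text lines."""
    event, data = "message", []  # type: str, List[str]
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":  # a blank line dispatches the buffered event
            if data:
                yield event, json.loads("\n".join(data))
            event, data = "message", []
        elif line.startswith(":"):
            continue  # SSE comment / keep-alive
        elif line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data.append(line[len("data:"):].strip())
    if data:  # stream ended without a trailing blank line
        yield event, json.loads("\n".join(data))


def run_outcome(events: Iterable[Tuple[str, Dict]]) -> Dict:
    """Reduce a stream to the first run terminator it contains."""
    for name, payload in events:
        if name == "run_completed":
            return {"succeeded": True,
                    "costMilliCents": payload.get("costMilliCents")}
        if name == "run_failed":
            return {"succeeded": False,
                    "reasonCode": payload["reasonCode"],
                    "charged": payload["charged"]}
        if name == "run_replayed":
            # The replay snapshot may be stale, so do not report success.
            return {"succeeded": False,
                    "reasonCode": "replayed_state_unknown"}
    return {"succeeded": False,
            "reasonCode": "stream_ended_without_terminator"}


sample = [
    "event: run_session", 'data: {"runId": "r1"}', "",
    "event: run_failed",
    'data: {"runId": "r1", "reasonCode": "image_upload_failed", "charged": true}',
    "",
]
print(run_outcome(parse_sse(sample)))
```

This sketch stops at the run terminator; a caller using ?autoFinalize=true would keep reading for the record_finalized, record_finalize_failed, or record_finalize_skipped event that follows.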
Charge visibility
run_failed carries charged: true when the
upstream call was billed despite the local failure
(usage_log_write_failed or session_lost_mid_*
after a successful upstream call). Agents should record the cost from
costMilliCents for reconciliation.
Cost units V1.1
All cost fields on the public API surface use millicents
(1 cent = 1,000 millicents; 1 USD = 100,000 millicents).
The integer wire format preserves sub-cent precision for
image-token / reasoning-heavy calls that previously truncated.
Render as cents-with-decimals via costMilliCents / 1000;
as USD via costMilliCents / 100000; e.g.
costMilliCents: 503 = 0.503¢ = $0.00503.
Field was renamed from costMicros (2026-05-11) to
costMicroCents, then again to costMilliCents
(2026-05-19) — the stored unit is 1/1000 of a cent (millicents),
and the name now matches the value.
Affected fields: record.costMilliCents,
run_completed.costMilliCents,
run_failed.costMilliCents,
record_finalized.costMilliCents. The pre-V1.1
costCents field is removed from public DTOs;
pre-V1.1 records backfill via costCents * 1000
so historical rows still surface a cost.
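The conversions above are easy to get wrong with floats, so a short sketch using `Decimal` may help. The helper names are hypothetical; the divisors come directly from the unit definitions in this section.

```python
from decimal import Decimal


def millicents_to_cents(cost_milli_cents: int) -> Decimal:
    """1 cent = 1,000 millicents."""
    return Decimal(cost_milli_cents) / Decimal(1000)


def millicents_to_usd(cost_milli_cents: int) -> Decimal:
    """1 USD = 100,000 millicents."""
    return Decimal(cost_milli_cents) / Decimal(100000)


print(millicents_to_cents(503))  # → 0.503
print(millicents_to_usd(503))    # → 0.00503
```

Summing the integer costMilliCents values first and converting once at display time avoids accumulating rounding error during reconciliation.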
Rate-limit headers V1.1
Every public-API response (success + 429) carries:
X-RateLimit-Limit: 60
X-RateLimit-Remaining: 47
X-RateLimit-Reset: 1746201600
X-RateLimit-Bucket: key=47/60,user=120/300
The standard pair (Limit / Remaining)
reports the most-restrictive bucket — the per-key bucket
when the caller is API-key-attributed (always smaller than
per-user), else the per-user bucket. The diagnostic
X-RateLimit-Bucket reports both
(key=N/L,user=N/L) so agents can observe per-key
vs per-user pressure separately. Session-only callers
see X-RateLimit-Bucket: user=N/L.
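An agent that wants to watch per-key and per-user pressure separately can split the diagnostic header as sketched below. Reading N as the remaining count and L as the limit is an inference from the sample headers (Remaining 47, Limit 60, key=47/60), and `parse_bucket_header` is a hypothetical helper.

```python
from typing import Dict, Tuple


def parse_bucket_header(value: str) -> Dict[str, Tuple[int, int]]:
    """Parse "key=47/60,user=120/300" into {name: (remaining, limit)}."""
    buckets: Dict[str, Tuple[int, int]] = {}
    for part in value.split(","):
        name, _, counts = part.strip().partition("=")
        remaining, _, limit = counts.partition("/")
        buckets[name] = (int(remaining), int(limit))
    return buckets


print(parse_bucket_header("key=47/60,user=120/300"))
# → {'key': (47, 60), 'user': (120, 300)}
print(parse_bucket_header("user=120/300"))
# → {'user': (120, 300)}
```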
CLI
The promethic Node CLI (@soulwarestudio/promethic-cli
on npm) ships every public-API endpoint behind ergonomic commands.
npm install -g @soulwarestudio/promethic-cli
promethic auth login # paste pmk_... key
promethic auth status
promethic auth logout
promethic prompts list [--limit N] [--cursor C] [--json]
promethic prompts get <id> [--json]
promethic prompts delete <id> # V1.1
promethic prompts delete-version <promptId> <versionId> # V1.1
promethic run <prompt-id> [--input "..."] [--input-file path] [--no-accept] [--auto-finalize true|false] [--json] # V1.2: --auto-finalize
promethic revise <handle> --instruction "..." [--intermediate-output "..."] [--from-turn N] [--no-accept] [--json]
# V1.2: handle is run-id or record-id; --from-turn rewinds
promethic finalize <run-id> [--final-text "..."] [--tag "..."] [--notes "..."] [--from-turn N] [--json]
# V1.2: --from-turn rewinds; finalize on Finalized session amends in place
promethic records list [--prompt <id>] [--source API] [--json]
promethic records get <id> [--json]
promethic records image <id> --index N --output path.png
promethic records delete <id> # V1.1
promethic records patch <id> [--notes "..."|--clear-notes] [--tag "..."|--clear-tag] [--json] # V1.1
promethic run <promptId> --image <file> [--image <file>] # V1.1 Phase 3 — vision input
promethic attachments add <promptId> <file> [--type image|text] [--filename ...] [--json]
promethic attachments list <promptId> [--json] # V1.1 Phase 3
promethic attachments get <attachmentId> <outputPath> # V1.1 Phase 3
promethic attachments delete <id> # V1.1
Agent tools (hosted MCP) V1.3
Promethic exposes its agent tool surface through the hosted
MCP server at
mcp.getpromethic.com/v1 — remote, OAuth, no install. It exposes
26 core tools covering the full Avalonia desktop / Expo web
workspace surface (plus any per-prompt synthesized tools you've enabled).
(The CLI's local-stdio MCP server was retired in CLI 0.7.0; the hosted MCP is
the single agent surface.)
- read: list_prompts (returns name + description + outputModality + mcpToolName — non-null when the prompt is exposed as a per-prompt synthesized tool; agents can call it directly by that name without run_prompt), get_prompt, list_records, get_record (full detail: inputText, outputText, notes, modelId, token counts, costMilliCents, revisionCount, imageCount, isOneShot, and turns[] — the complete per-turn history reconstructed from the delta chain; turns[].kind: "run" / "revision" / "edit"), list_versions, get_version, list_attachments, get_attachment, get_catalog
- execute: run_prompt (V1.2: optional autoFinalize auto-creates a record on success), revise_run (V1.2: accepts runId — auto-finalized runs can be revised by the same runId, no need to set autoFinalize=false to "keep editing"; finalized sessions are reopened transparently; fromTurn rewinds), finalize_run (V1.2: fromTurn rewinds; can amend a Finalized session; returns {recordId, turns, costMilliCents?}), get_run_image, delete_record, patch_record (notes / input / output / tag / fromTurn — manual records allow notes/input/output only; non-manual records auto-create edit delta on output change), create_record (V1.4: agent-curated training data — {promptId, input, output, notes?, idempotencyKey?} creates a finalized record with no LLM call, no spend; input and output must be non-empty, non-whitespace strings — whitespace-only values are rejected with invalid_params; optional idempotencyKey (string) enables safe retry — same key + same payload returns the original record instead of creating a duplicate, scoped per-user with 24h TTL; for "voice prompt" workflows where the user collaborates with the agent to seed examples before running Generate Prompt on the desktop)
- write (V1.1 Phase 2 + Phase 3): create_prompt, update_prompt, delete_prompt, create_version (with optional setAsCurrent in one transaction), update_version (versionDescription + description), switch_current_version, delete_version, upload_attachment, delete_attachment
All tools mirror the workspace flow agents would otherwise need a desktop or web browser to drive: author prompts, manage versions, attach reference files, run with vision, revise, finalize, edit notes/tags.
Connect the hosted MCP at https://mcp.getpromethic.com/v1 with
your pmk_… key as a Bearer token (or via OAuth where the client
supports it). See getpromethic.com/agents
for client-specific config (Claude Desktop / Cursor JSON, Codex TOML, and the
custom-connector UI for iOS / web / ChatGPT).
Image input/output V1.1 Phase 3
Output: image-modality runs return image bytes inline as base64
(≤ 16 MB raw) or as a local-file path (larger). Same shape via
get_run_image for in-flight runs and
get_attachment for prompt attachments.
Input: run_prompt.images accepts the SAME
media-ref shape as the output. Agents can pipe a prior run's image straight
back in:
{
"images": [
{ "inline": true, "base64": "...", "mimeType": "image/png" },
{ "inline": false, "localPath": "/tmp/photo.png", "mimeType": "image/png" }
]
}
Up to 16 images per run, 10 MB each. Requires the prompt's model to declare
input_image capability (GPT-5.x via Responses API; gpt-image-1.x
as edit inputs). Trust note: localPath is read by the MCP server
with the user's permissions — only use paths the agent is authorized to read.
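Building the images argument from raw bytes could look like the sketch below, which covers only the inline form. The helper names are hypothetical, and treating 10 MB as 10 * 1024 * 1024 bytes is an assumption; the field names and the 16-image limit come from the text above.

```python
import base64
from typing import Dict, List

MAX_IMAGES = 16
MAX_IMAGE_BYTES = 10 * 1024 * 1024  # assumption: 10 MB in binary units


def inline_image_ref(data: bytes, mime_type: str = "image/png") -> Dict:
    """Wrap raw image bytes in the inline media-ref shape."""
    if len(data) > MAX_IMAGE_BYTES:
        raise ValueError("image exceeds the 10 MB per-image cap")
    return {
        "inline": True,
        "base64": base64.b64encode(data).decode("ascii"),
        "mimeType": mime_type,
    }


def build_images_argument(blobs: List[bytes]) -> Dict:
    """Return the {"images": [...]} fragment for run_prompt."""
    if len(blobs) > MAX_IMAGES:
        raise ValueError("at most 16 images per run")
    return {"images": [inline_image_ref(blob) for blob in blobs]}


payload = build_images_argument([b"\x89PNG"])  # placeholder bytes, not a real image
print(payload["images"][0]["mimeType"], len(payload["images"]))
```

Since the output media-ref has the same shape, the base64 string from a prior run's image can be passed through unchanged instead of being re-encoded.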
Attachment management
upload_attachment takes either inline base64 or a
localPath; idempotency keys are derived from
(promptId, filename, content) so retries replay the original
upload (no double-billing). list_attachments,
get_attachment, and delete_attachment round out
the surface. Per-file: 10 MB image / 5 MB text. Per-prompt: 50 MB total.
Path traversal in filenames: filenames with ..
are intercepted by the Cloudflare WAF with a 403 before reaching the app
(which would reject them as invalid_filename anyway); the
error shape is a raw 403, not a JSON envelope.
Cancelled runs release their RunSession TTL naturally; the
server expires them after 1 h of inactivity.
Same pattern works for Cursor, Zed, Continue, Cline — any desktop-class MCP-aware client.
Hosted MCP V1.3
For agents that can't (or shouldn't) run a local stdio server
— Claude iOS, claude.ai web, sandboxed automations — Promethic
hosts the same 27-core-tool surface at https://mcp.getpromethic.com/v1
speaking MCP
Streamable HTTP transport (spec 2025-03-26). Same tools,
same scopes, same pmk_ keys.
Why hosted MCP exists: an agent calling tools via stdio needs a local CLI install + a long-lived process. A hosted endpoint replaces both with one HTTP URL — Claude iOS just adds a connector, no local binary. The wire shape is identical to the local CLI, so existing scripts don't change.
Connect Claude Desktop / Cursor
{
"mcpServers": {
"promethic": {
"url": "https://mcp.getpromethic.com/v1",
"headers": {
"Authorization": "Bearer pmk_..."
}
}
}
}
Claude Desktop config path is the same as the stdio install
above (claude_desktop_config.json). Cursor: settings
JSON, same shape. The pmk_ key supplies auth; the
server exchanges it at initialize for a connection-bound
mcps_ session token (24-hour idle TTL, renewed on each
tool call). pmk_ keys never expire.
Connect Claude Code (VS Code / Cursor extension)
Claude Code stores MCP servers in ~/.claude.json.
Add the hosted server via the CLI:
claude mcp add promethic --transport http https://mcp.getpromethic.com/v1 \
--header "Authorization: Bearer pmk_..."
Or add it manually to ~/.claude.json under
mcpServers:
{
"mcpServers": {
"promethic": {
"type": "http",
"url": "https://mcp.getpromethic.com/v1",
"headers": {
"Authorization": "Bearer pmk_..."
}
}
}
}
Reload the VS Code / Cursor window after editing. Tools appear under
the mcp__promethic__* namespace.
Anthropic API — agents via MCP Connector beta
Programmatic agents using the Anthropic Messages API can connect to the hosted MCP server without any local setup:
import anthropic
client = anthropic.Anthropic()
response = client.beta.messages.create(
model="claude-opus-4-7",
max_tokens=4096,
messages=[{"role": "user", "content": "List my prompts and run one"}],
mcp_servers=[{
"type": "url",
"url": "https://mcp.getpromethic.com/v1",
"name": "promethic",
"authorization_token": "pmk_..." # pmk_ key, no "Bearer" prefix here
}],
betas=["mcp-client-2025-11-20"]
)
Claude discovers all tools automatically and calls them to satisfy
the request. No separate tools/list call needed.
Connect Claude iOS / claude.ai web / ChatGPT (OAuth)
These clients use OAuth 2.1 + PKCE instead of bearer-key paste —
their connector UI doesn't accept a static token. Add a custom
connector pointing at https://mcp.getpromethic.com/v1
and leave the OAuth fields blank; the client auto-discovers them
via /.well-known/oauth-protected-resource. On Connect,
a popup opens to the Promethic consent screen — sign in, click
Allow, the connector activates. Revoke any time at
app.getpromethic.com →
Settings → Connected Apps.
OAuth access tokens (pmoa_) have a 30-day TTL; refresh
tokens (pmor_) also expire after 30 days. After expiry
the client will prompt for re-authorization automatically. The full 27-core-tool surface (plus per-prompt synthesized tools) appears in
the agent's tool tray. Tools are namespaced
promethic_<name> on the wire (underscore separator
per Anthropic's Tool API name regex).
Image-modality runs over MCP
promethic_run_prompt on an image-modality prompt returns
the generated image bytes inline as MCP image
content blocks in the tools/call result, alongside
the text transcript. Claude.ai / ChatGPT render these directly. The
text transcript shows [image bytes elided ...] for events
that originally carried base64 — those bytes live in the
image content blocks instead. Use
promethic_get_run_image only as a fallback (e.g., when
re-fetching after losing the original tool result).
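A minimal sketch of pulling the image bytes out of a tools/call result. It assumes the result follows the standard MCP content-block shape (`type`, `data`, `mimeType`); the `sample` payload is hypothetical and exists only to exercise the function.

```python
import base64


def extract_images(tool_result: dict) -> list[tuple[str, bytes]]:
    """Return (mimeType, raw bytes) for every image content block."""
    images = []
    for block in tool_result.get("content", []):
        if block.get("type") == "image":
            mime = block.get("mimeType", "application/octet-stream")
            images.append((mime, base64.b64decode(block["data"])))
    return images


# Hypothetical result: one text transcript block plus one image block.
sample = {
    "content": [
        {"type": "text", "text": "run completed [image bytes elided ...]"},
        {
            "type": "image",
            "data": base64.b64encode(b"\x89PNG fake").decode("ascii"),
            "mimeType": "image/png",
        },
    ]
}
```

Text blocks are skipped, so the elided-bytes placeholder in the transcript never reaches the image list.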
Per-tool grants (opt-in)
Hosted MCP supports an optional per-tool allow-list on top of the
read/execute/write scopes.
New keys are unconfigured by default and can call any tool the key's
scopes permit — useful while the per-tool config UI is being built.
Once you opt in (set an explicit allow-list on the key), only those
tool names succeed; anything else returns
tool_grant_required. The wildcard ["*"]
explicitly reverts to allow-all, and the empty array []
blocks every non-discovery tool.
Discovery surfaces (list_prompts, get_catalog)
always bypass the per-tool gate so agents can always discover what
tools exist.
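The gate described above can be sketched as a small decision function. This is an illustration of the documented rules, not the server's implementation; the scope check (read/execute/write) is assumed to run separately, and treating any list containing "*" as allow-all is an assumption.

```python
# Discovery surfaces always bypass the per-tool gate.
DISCOVERY_TOOLS = {"list_prompts", "get_catalog"}


def tool_allowed(allow_list, tool_name: str) -> bool:
    """allow_list is None for an unconfigured key, else a list of tool names."""
    if tool_name in DISCOVERY_TOOLS:
        return True
    if allow_list is None:  # unconfigured: anything the scopes permit
        return True
    if "*" in allow_list:  # explicit wildcard reverts to allow-all (assumed for mixed lists too)
        return True
    return tool_name in allow_list  # [] therefore blocks every non-discovery tool
```

A caller that gets `False` here would see `tool_grant_required` from the hosted server.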
Differences from local CLI MCP
- upload_attachment requires inline bytes_base64; localPath is rejected (the server has no access to the agent's filesystem). 10 MB raw cap matches the local CLI; chunked upload (upload_id / chunk_index / chunk_total) is reserved for V2.
- Idempotency keys: vendor-prefixed _meta namespace, _meta["com.getpromethic/idempotency-key"] on the JSON-RPC request. Same byte-identity replay semantics as the HTTP Idempotency-Key header.
- Streaming tools (run_prompt, revise_run) use notifications/progress (MCP spec) for live model output. Path B durable resume: if the live stream drops, GET /v1/sessions/{sid}/calls/{toolCallId} returns the final result once available.
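As a rough sketch, a hosted upload_attachment call with an idempotency key might be assembled like this. The argument names `promptId` and `filename`, the placement of `_meta` under `params`, the wire name `promethic_upload_attachment`, and reading the 10 MB cap as 10 × 1024 × 1024 bytes are all assumptions; only `bytes_base64` and the `_meta` key name come from the text above.

```python
import base64
import uuid

MAX_RAW_BYTES = 10 * 1024 * 1024  # assumed reading of the "10 MB raw cap"


def build_upload_request(request_id, prompt_id, filename, data, idempotency_key=None):
    """Build a JSON-RPC tools/call payload carrying the file inline."""
    if len(data) > MAX_RAW_BYTES:
        raise ValueError("attachment exceeds the raw size cap")
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "promethic_upload_attachment",  # assumed wire name
            "arguments": {
                "promptId": prompt_id,  # hypothetical argument name
                "filename": filename,  # hypothetical argument name
                "bytes_base64": base64.b64encode(data).decode("ascii"),
            },
            "_meta": {
                "com.getpromethic/idempotency-key": idempotency_key or str(uuid.uuid4()),
            },
        },
    }
```

Reusing the same key on a retry of the same payload is what lets the server replay the original result instead of uploading twice.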
Run lifecycle (sessions vs records, auto-finalize)
A run is a transient session: the model receives your
input, streams an output, and the server tracks state in
RunSession (1h sliding TTL). A record is the
persisted artifact: input + final output + any revision turns + edits +
cost — the structured data Promethic uses to refine your prompt over time.
To go from session to record you call finalize_run. To do
nothing and let the session expire naturally, simply don't finalize — the TTL
reaper sweeps it after 1 h.
Auto-finalize (default ON since V1.2) chains a
finalize_run after a successful run_prompt on
the same SSE stream. One call, one round trip, one saved record. The
new recordId arrives via the record_finalized
SSE event. This is what most agents want — call run_prompt,
use the returned record.
Pass autoFinalize: false on a single run_prompt
call to opt out — useful when you want to revise_run the
output before saving, or just inspect it before committing. Then revise with the
runId and finalize manually with
--final-text / --tag / --notes
when ready. Finalized runs can be revised again by the same runId —
the session reopens transparently.
To set the persistent default (so all your runs from any client behave the same way), use one of:
- Avalonia desktop: Settings → Appearance → Hosted MCP toggle.
- Expo (web / iOS): Settings → toggle "Auto-save MCP runs as records".
- CLI: promethic config set auto-finalize-mcp-runs <true|false> (V1.3+).
Per-call autoFinalize always overrides the persistent
default. The persistent default in turn overrides the V1.2 server
default of true.
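The precedence chain can be sketched as follows; the function name is hypothetical and `None` stands for "not set".

```python
SERVER_DEFAULT = True  # V1.2 server default


def resolve_auto_finalize(per_call=None, persistent_default=None) -> bool:
    """Per-call value wins, then the persistent user default, then the server default."""
    if per_call is not None:
        return per_call
    if persistent_default is not None:
        return persistent_default
    return SERVER_DEFAULT
```

For example, a user who has switched the persistent default off can still pass `autoFinalize: true` on one call to save that run as a record.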
--no-accept on the CLI saves a JSON artifact at
~/.promethic/runs/<runId>.json (mode 0600) so the
runId + state survives across shell sessions.
Override the API URL
export PROMETHIC_API_URL=http://localhost:8080
promethic auth status
Only https://..., http://localhost, and
http://127.0.0.1 are accepted — the CLI refuses to send
your key to other http:// hosts.
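A sketch of the acceptance rule, for scripts that want to pre-check a URL before exporting it. This is not the CLI's own code; it assumes the rule is applied to the parsed hostname rather than to a string prefix.

```python
from urllib.parse import urlsplit

LOOPBACK_HOSTS = {"localhost", "127.0.0.1"}


def api_url_allowed(url: str) -> bool:
    """https anywhere; plain http only for localhost / 127.0.0.1."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        return bool(parts.hostname)
    if parts.scheme == "http":
        return parts.hostname in LOOPBACK_HOSTS
    return False
```

Comparing the parsed hostname avoids accepting look-alikes such as `http://localhost.example.com`, which a naive prefix match would let through.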
Errors
All 4xx/5xx responses follow
RFC 7807
application/problem+json. Designed for self-healing
agents — every error names what went wrong, what to do about it,
and (where applicable) which exact field tripped:
{
  "type": "https://api.getpromethic.com/problems/invalid_model_settings",
  "title": "Model settings reference an unknown or inactive model.",
  "status": 400,
  "detail": "The model_id is not in the catalog, or it has been retired.",
  "reason_code": "invalid_model_settings",
  "action_hint": "List models via GET /api/v2/public/models, then retry with a current model_id.",
  "request_id": "req_01HX...",
  "invalid_params": [
    { "name": "modelSettings.model_id", "reason": "unknown_or_inactive_model" }
  ]
}
Each type URL is also a working redirect:
GET /problems/{reason_code} 302s to the matching
section of these docs (e.g.
/problems/idempotency_key_reused). Agents that
follow the link land on a human description plus the resolution
steps for that specific reason.
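A minimal sketch of how an agent might read a problem document into a typed value. The `Problem` class and `parse_problem` function are hypothetical helpers; the field names come from the example body above, and the `reasonCode` fallback is an assumption made to tolerate camelCase bodies.

```python
import json
from dataclasses import dataclass, field


@dataclass
class Problem:
    status: int
    reason_code: str
    action_hint: str
    invalid_fields: list = field(default_factory=list)


def parse_problem(body: str) -> Problem:
    """Extract the fields a self-healing agent typically acts on."""
    doc = json.loads(body)
    return Problem(
        status=doc.get("status", 0),
        reason_code=doc.get("reason_code") or doc.get("reasonCode", ""),
        action_hint=doc.get("action_hint", ""),
        invalid_fields=[p["name"] for p in doc.get("invalid_params", [])],
    )
```

With the example body, `invalid_fields` names `modelSettings.model_id`, which tells the agent exactly which manifest key to repair before retrying.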
Reason codes — each row's id is the redirect target for /problems/{reason_code}:
| HTTP | reason_code | Meaning |
|---|---|---|
| 400 | invalid_request | shape error — see invalid_params array for field-level detail (subsumes null-not-allowed via required_field_clear reason) |
| 400 | invalid_params | also a top-level reason_code (e.g. out-of-range limit on list endpoints, wrong-type parameter). Distinct from the invalid_params array sub-field that appears on invalid_request responses. |
| 400 | invalid_filename | filename contains path separators (/, \), null bytes, dots-only names (., ..), whitespace-only value, or leading/trailing whitespace. Use a plain trimmed name like report.txt. Note: a completely missing or empty filename returns invalid_params ("filename is required"), not this code. |
| 400 | image_dimensions_too_small | uploaded image dimensions are below the minimum required (128×128 px). Resize or use a larger image. |
| 400 | invalid_image_format | uploaded bytes are not a recognized image format. Only PNG, JPEG, and WebP are accepted. Re-encode the file before uploading. |
| 400 | invalid_model_settings | unknown / inactive model_id, or missing parameters object; action_hint tells you to GET /api/v2/public/models |
| 400/413 | field_too_large | see V1.0.1 caps in the Write section |
| 400 | idempotency_key_invalid | missing, > 255 chars, comma-joined, or non-visible-ASCII |
| 400 | stream_required | POST /run/revise must include stream: true |
| 400 | invalid_cursor | tampered or cross-user cursor |
| 400 | cursor_filter_mismatch | filter params changed mid-pagination |
| 400 | persist_query_param_removed | V1.1 — ?persist query param removed; /finalize always commits. Use DELETE /records/{id} within 24h to undo. |
| 412 | precondition_failed | REST-R04-001/002 (2026-05-24) — If-Match token mismatch or malformed token. Re-fetch the resource to obtain a fresh updatedAtUtc-derived token. Accepts both RFC 7232 quoted form ("abc123") and bare hex (abc123). |
| 410 | record_was_deleted | V1.1 — this run finalized to a record that has since been self-deleted. Mint a new run. |
| 500 | snapshot_corrupt | V1.1 — server-side data integrity: the run session's version snapshot failed to parse. Mint a new run. |
| 409 | reopen_limit_exceeded | V1.1 — session has been reopened more than 100 times via /revise after /finalize. Mint a new run. |
| 409 | no_prior_image_for_revision | image-edit revisions require at least one prior image from a completed run. Call run_prompt (or POST /run) on this session first to generate a base image, then call revise_run. |
| 400 | intermediate_output_not_supported_image | V1.1 — image-modality runs reject intermediateOutput. Mirrors desktop: image revisions always use the model's actual output. |
| 413 | intermediate_output_too_large | V1.1 — intermediateOutput exceeds 32 KB per-turn cap. |
| 413 | final_text_too_large | V1.1 — finalText exceeds 256 KB cap. |
| 413 | notes_too_large | V1.1 — notes exceeds 64 KB cap. |
| 413 | tag_too_large | V1.1 — tag exceeds 64 KB cap. |
| 413 | input_too_large | PATCH /records — input exceeds the allowed character cap. |
| 413 | output_too_large | PATCH /records — output exceeds the allowed character cap. |
| 500 | internal_error | Unexpected server error. Retry; if the error persists, contact support. |
| 400 | tag_without_delta | V1.1 — /finalize received tag but no edit delta was produced (finalText omitted or matches model output). Use notes for record-level labels, or PATCH /records/{id} to re-tag. |
| 403 | record_not_owned_by_api_key | V1.1 — DELETE/PATCH /records/{id} on the public API is restricted to the API key that created the record. Mutate via the Promethic web/desktop app, or use the original API key. |
| 409 | record_self_delete_window_expired | V1.1 — DELETE /records/{id} on the public API is restricted to the first 24h after record creation. Delete via the Promethic web/desktop app instead. |
| 409 | record_no_edit_delta | V1.1 — /finalize or PATCH /records/{id} attempted to set tag but no edit delta exists to anchor it. For /finalize: include finalText. For MCP patch_record: provide output alongside tag to create the edit delta first. |
| 400 | tag_requires_distinct_output | MCP / REST patch_record — tag was provided with output, but output equals the record's current value and no edit delta exists yet. Provide a different output value, or use notes for a label with no edit. |
| 400 | tag_requires_output | MCP / REST patch_record — tag was provided without output and no edit delta exists yet — there is nothing to annotate. Provide an output value to create an edit delta first, or use notes for a record-level label with no edit. |
| 400 | tag_would_be_lost_on_revert | MCP patch_record — output matches the model's last returned output, which removes the edit delta. A tag was also provided but has nothing to anchor. Omit tag to revert cleanly, or provide a different output to keep the edit delta and anchor the tag. |
| 400 | param_out_of_range | A numeric parameter (e.g. limit) is outside the allowed range. Check the tool description for valid bounds. |
| 403 | grant_required | V1.1 — API key is restricted to a specific set of prompts and the requested prompt is not in that set. Manage at Settings → Developer Keys → Manage prompts, or use an unrestricted key. |
| 409 | version_is_current | Retired 2026-05-19. DELETE /versions/{vid} no longer rejects the current version — it auto-switches to the lowest-VersionNumber remaining version instead. This code is no longer emitted. |
| 409 | cannot_delete_only_version | DELETE /prompts/{id}/versions/{vid} attempted on the only remaining version. Create another version before deleting. |
| 409 | version_switch_conflict | Concurrent request changed version state mid-delete; retry delete_version. |
| 409 | attachment_referenced_by_active_run | V1.1 — DELETE /attachments/{id} blocked because an active RunSession's snapshot references this attachment. Finalize the listed run(s) first, or wait for the 1h TTL to expire. |
| 409 | prompt_referenced_by_active_run | V1.1 (MCP only) — MCP delete_prompt blocked because at least one non-terminal RunSession still references the prompt. The REST DELETE endpoint force-terminates active sessions automatically and does not return this code. If returned from MCP, finalize the listed run(s) first. |
| 409 | version_referenced_by_active_run | V1.1 (MCP only) — MCP delete_version blocked because at least one non-terminal RunSession pins this version. The REST DELETE endpoint force-terminates active sessions automatically and does not return this code. If returned from MCP, finalize the listed run(s) first. |
| 409 / lifted | image_runs_not_supported_v1 | V1.1 Phase 7: lifted. Image-modality runs via API key are now supported on /run, /revise, /revise-again. The accumulated effective prompt is persisted per-turn and surfaces as record.finalCopiedOutput after /finalize. The reason code is kept in the table for back-compat with old SDKs but is no longer emitted. |
| 400 | final_text_not_supported_image | V1.1 — image-modality runs reject finalText. record.FinalCopiedOutput for image records is server-derived from the image-prompt accumulation chain (training-data invariant). |
| 413 | session_deltas_too_large | V1.1 — total session.Deltas jsonb exceeds the 2 MB cap. Finalize and start fresh. |
| 413 | cost_incurred_no_delta_persisted | V1.1 — upstream model call billed but the resulting turn couldn't persist (post-upstream cap exceeded). UsageLog has the charge. |
| 400 | invalid_image_base64 | bad base64 in images[].data |
| 400 | invalid_index | index ≥ 0 violation (e.g. ?index= on record image) |
| 400 | invalid_source | ?source= not in {App, Manual, API, Headless} (case-insensitive) |
| 400 | invalid_type | MCP report_issue: type not one of "bug" / "feature_request" (MCP-only, not on REST surface) |
| 400 | invalid_report | MCP report_issue: report body is empty (MCP-only, not on REST surface) |
| 400 | instruction_required | POST /revise needs a non-empty instruction |
| 400 | from_turn_invalid | V1.2 — fromTurn not a valid non-negative integer |
| 400 | from_turn_out_of_range | V1.2 — fromTurn exceeds current turn count; re-read turns[] |
| 400 | mixed_credentials_principal_mismatch | session + key resolve to different users |
| 400 | mixed_credentials_key_mismatch | two API keys present that don't match |
| 401 | key_unauthorized | missing / invalid / expired / revoked API key |
| 403 | scope_required | key lacks the required scope |
| 403 | api_key_not_permitted | endpoint requires a session, not a key |
| 404 | prompt_not_found | no prompt with that id is visible to this caller |
| 404 | version_not_found | no matching version on the prompt; also returned when attempting to revise from a record whose original prompt version has been deleted — this is expected behavior, not a bug. Use run_prompt to start a fresh run on the current version, or create_record to seed new training examples. |
| 409 | current_version_missing | prompt's currentVersionId points to a deleted version; no fallback could be self-healed |
| 409 | manual_record_modality_not_supported | create_record supports text-modality prompts only; this prompt produces image or structured output |
| 409 | manual_record_modality_undetermined | create_record: prompt's modelSettingsJson is unparseable, so output modality cannot be determined |
| 409 | prompt_record_cap_reached | create_record: this prompt has reached the per-prompt manual-record cap |
| 404 | record_not_found | no record with that id is visible to this caller |
| 404 | run_not_found | run expired, never existed, or not yours |
| 404 | attachment_not_found | no attachment with that id is visible |
| 404 | no_image_stored | this record has no stored images |
| 404 | image_index_out_of_range | ?index past the count of stored images |
| 404 | invalid_image_reference | defense-in-depth validation refused the path |
| 409 | idempotency_key_reused | same key on a different body — won't replay; mint a new key |
| 409 | idempotency_in_flight | same key still being processed; retry after Retry-After |
| 409 | session_busy | another /revise or /finalize in flight |
| 409 | session_not_active | rare CAS-race; re-fetch run state |
| 409 | session_already_finalized | V1.2+: /revise no longer fires this — it reopens finalized sessions (Finalized → Active) automatically. This code is now only returned in edge cases where a finalized session cannot be reopened (deleted record, expired, reopen-limit exceeded). /finalize with the same Idempotency-Key always replays the original 200 + recordId. With a new key on a Finalized session: passing notes alone patches the record's notes directly (no reopen, not idempotent per call); passing finalText, tag, or fromTurn reopens the session and re-finalizes. |
| 409 | session_expired | past 1h TTL |
| 409 | session_failed | terminal — see reason_code |
| 409 | session_abandoned | session was abandoned by API key revocation (bulk abandon via RevokeAsync) |
| 409 | revision_chain_too_long | 25 turns/session cap |
| 409 | record_revise_in_progress | V1.2 — another caller holds the per-record rehydrate lock for this recordId; retry after a short backoff |
| 409 | snapshot_modality_unreadable | internal snapshot data unreadable |
| 409 | finalize_completion_failed | internal: finalize transaction failed |
| 500 | finalize_conflict | unexpected record conflict during finalize; session reset to Active — retry /finalize |
| 500 | image_upload_failed | V1.1 Phase 0 — image generated upstream but blob storage write failed after retries; upstream charged (charged: true); replay returns this failure for the Idempotency-Key |
| 500 | image_extraction_overflow | V1.1 Phase 0 — upstream produced more images than the 16-per-turn cap; reduce n or split runs |
| 409 | run_already_terminal | the run is in a non-interactable terminal state. Returned by: /revise on Expired, Failed, or Abandoned runs; /finalize and GET /run-image on Abandoned runs. Finalized is NOT terminal for /revise — the server reopens it transparently. |
| 409 | version_create_contention | concurrent version inserts; retry |
| 413 | storage_quota_exceeded | per-prompt or per-user storage cap reached |
| 429 | rate_limited | per-key or per-user bucket overflow; honour Retry-After |
| 501 | not_implemented | endpoint is documented but not yet implemented in this API version; check action_hint for the planned version and any workaround |
| 500 | stream_setup_failed | SSE response failed to initialize before the proxy call |
| 503 | auth_store_unavailable | transient idempotency-store race; retry |
| 500 | idempotency_outcome_unknown | V1.3 Phase 4b — process died mid-flight after possibly committing the domain mutation but before recording Complete; retries of the same Idempotency-Key replay this body until 24h TTL. Verify via GET before retry — see Recovery from idempotency_outcome_unknown for per-tool recipes. |
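
One way an agent might turn the table into a recovery policy is a small lookup. The groupings below only cover codes whose Meaning column states a recovery step; the action labels and the 1-second default wait are assumptions, and anything unlisted falls back to the response's action_hint.

```python
WAIT_THEN_RETRY = {"rate_limited", "idempotency_in_flight"}
RETRY = {
    "version_create_contention",
    "version_switch_conflict",
    "record_revise_in_progress",
    "finalize_conflict",
    "auth_store_unavailable",
    "internal_error",
}
NEW_RUN = {"record_was_deleted", "snapshot_corrupt", "reopen_limit_exceeded"}


def next_step(reason_code: str, retry_after_seconds=None):
    """Map a reason_code to (action, wait_seconds)."""
    if reason_code in WAIT_THEN_RETRY:
        wait = retry_after_seconds if retry_after_seconds is not None else 1
        return ("wait_then_retry", wait)
    if reason_code in RETRY:
        return ("retry", None)
    if reason_code in NEW_RUN:
        return ("mint_new_run", None)
    if reason_code == "idempotency_key_reused":
        return ("mint_new_key", None)
    if reason_code == "idempotency_outcome_unknown":
        return ("verify_via_get", None)
    return ("follow_action_hint", None)
```

Keeping the default branch conservative means a newly added reason code is surfaced to the caller rather than retried blindly.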
Idempotency V1.0.1
Every mutating POST (the five Write endpoints
above, plus /prompts/{id}/run when its body is the same
as a prior attempt) accepts an Idempotency-Key header.
This is a Stripe-style guarantee: a network glitch mid-call is safe
to retry — the server replays the original response byte-identically
instead of double-applying the side effect.
Contract
- Header value: 1–255 visible-ASCII characters (0x21–0x7E), no commas, sent at most once.
- Same key + same body + same route → server replays the original status, headers, and body.
- Same key + different body → 409 idempotency_key_reused (the agent picked a key it already used for a different request; generate a new one).
- Same key, original still in flight → 409 idempotency_in_flight + Retry-After: 1.
- Records expire 24 h after the original call completes (Stripe parity). After expiry the same key is fresh again.
- Replay returns the response shape from the original call. If we ship a new field on (e.g.) POST /prompts between your first call and your retry, the retry returns the OLD shape, not the new one. This is intentional Stripe parity: replays are byte-identical snapshots. The 24 h TTL bounds staleness; for the freshest shape, mint a new key.
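A client can check the header-value rule locally before sending. The sketch below mirrors the first contract item; the function names are hypothetical, and a UUIDv4 is one convenient key that always passes.

```python
import uuid


def is_valid_idempotency_key(key: str) -> bool:
    """1-255 visible-ASCII characters (0x21-0x7E), no commas."""
    if not 1 <= len(key) <= 255:
        return False
    return all(0x21 <= ord(ch) <= 0x7E and ch != "," for ch in key)


def new_idempotency_key() -> str:
    """Mint a fresh key; a UUIDv4 string is 36 visible-ASCII characters."""
    return str(uuid.uuid4())
```

Spaces fall outside 0x21–0x7E, so a key such as `retry 1` would be rejected with `idempotency_key_invalid`.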
How the CLI uses it
The CLI auto-generates a fresh UUIDv4 per invocation by default —
each promethic prompts create call is a distinct
attempt. Pass --idempotency-key <uuid> to pin
one if you want a manual retry to be a no-op.
upload_attachment deduplicates on content:
re-uploading identical bytes under the same filename returns the existing attachment
without creating a duplicate or consuming quota — safe to retry on network failure.
Pass an explicit idempotencyKey for stronger cross-session replay
stability (same key returns the byte-identical original response for 24 hours
regardless of content changes).
Uploading the same bytes under a different filename (e.g. --filename newname.txt)
is treated as a fresh upload and consumes storage twice.
If you want to rename an existing attachment, delete the
original through the web app first (DELETE on attachments
is V1.1; see "Not in V1.0.1" below).
Recovery from idempotency_outcome_unknown V1.3 Phase 4b
If the server process dies between committing the domain mutation
and recording the idempotency Complete, a sweep flips the row to
state=failed with a synthetic body:
{
  "type": "https://api.getpromethic.com/errors/idempotency_outcome_unknown",
  "title": "Idempotent run outcome unknown",
  "status": 500,
  "reasonCode": "idempotency_outcome_unknown",
  "detail": "The original request died mid-flight (process crash or lease expired without heartbeat). The domain change MAY OR MAY NOT have landed. Verify via a GET before any retry — replaying the same Idempotency-Key returns this body verbatim, and a NEW key may duplicate the original mutation.",
  "route": "POST /api/v2/public/prompts"
}
Replays of the same key continue to return this body
until the row's 24 h TTL expires. The atomicity refactor
in PR #18 (per-endpoint BeginTransactionAsync wrapping
the domain mutation + Complete) makes this case much rarer post-2026-05-09
— for tools that landed BEFORE PR #18 (or future tools added
without the wrapper), this recovery is still load-bearing.
Per-tool verify recipes (use these BEFORE retrying with the same OR a new key):
The run_prompt recipe below uses a future clientIdempotencyKey
field on the record DTO as the authoritative disambiguator. That
field is not yet shipped, which is why the recipe marks that path
"(when available)". Until Phase 6 lands, agents
fall back to the heuristic match (createdAt +
inputText). The heuristic is unreliable for repeated identical
inputs in the same window: verify carefully, or mint a new key and accept the
duplicate cost when in doubt.
| Tool / route | Verify recipe |
|---|---|
| POST /prompts + MCP create_prompt | GET /api/v2/public/prompts (or MCP list_prompts), then match by the name field you submitted; names are user-chosen + likely unique within your set. If found: the create succeeded; do NOT retry. If not found: safe to mint a new key + retry. |
| POST /prompts/{id}/versions + MCP create_version | GET /api/v2/public/prompts/{id}/versions (or MCP list_versions), then match by versionNumber = (highest from your pre-call read) + 1. If a version with that number exists with your prompt text: succeeded. If not: safe to retry with a new key. |
| POST /prompts/{id}/attachments + MCP upload_attachment | GET /api/v2/public/prompts/{id}/attachments (or MCP list_attachments), then match by filename + sizeBytes. If found: succeeded. If not: safe to retry. Note: storage quota was reserved at Begin time; a lost call leaves the quota reserved until the idempotency row's 24 h TTL refunds it via the orphan-blob sweep. |
| POST /prompts/{id}/run + MCP run_prompt | Run records auto-finalize by default. Authoritative disambiguator (when available): filter list_records by clientIdempotencyKey; every record carries the originating Idempotency-Key from the call that created it. If found: the run succeeded and the record exists. Cost was billed; you've paid for it. If not found: the run did not complete; safe to mint a new key + retry. Heuristic fallback (use ONLY when the authoritative path isn't available, e.g., a tool that doesn't yet expose clientIdempotencyKey): match by createdAt in your call window AND inputText. Be aware that an agent calling run_prompt with the same input multiple times in 24h cannot disambiguate via output / cost_micros alone; those are nearly identical for deterministic prompts. The heuristic is a guess; do not blindly retry on a match-of-many. |
| MCP finalize_run / revise_run | These take a runId. Step 1: GET the run state via GET /api/v2/public/runs/{runId} (or let MCP list_records filter by runSessionId). If a record exists with your runId: finalize succeeded. If RunSession.State == Finalized with a finalizedRecordId: ditto, succeeded. If State == Active: the run is back to a state you can retry from; mint a new key + retry. If State ∈ {Running, Finalizing}: a concurrent attempt is in flight or recovering; wait + re-poll. If State == Failed: terminal; do not retry. Do NOT mint a new key without GET-checking state first: retrying a fresh-key finalize against a Finalizing session 409s (session_busy) or races the Phase 4b finalize-failure→Active reset. |
| MCP delete_* / patch_record | GET the resource by id. If 404 (delete) or fields match your patch (patch): succeeded. Otherwise safe to retry with a new key. |
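
The finalize_run / revise_run recipe reduces to a state lookup. A sketch, assuming the state names quoted in the recipe arrive as plain strings; the function name and the returned action labels are hypothetical.

```python
def recovery_action(state: str, finalized_record_id=None) -> str:
    """Decide what to do after idempotency_outcome_unknown on finalize/revise."""
    if state == "Finalized" and finalized_record_id:
        return "done"  # the original call landed; do not retry
    if state == "Active":
        return "retry_with_new_key"
    if state in {"Running", "Finalizing"}:
        return "wait_and_repoll"  # a concurrent attempt is in flight or recovering
    if state == "Failed":
        return "stop"  # terminal
    return "inspect_manually"  # any state the recipe does not cover
```

The function is only meaningful after the GET check; calling it on a guessed state defeats the purpose of verifying before retrying.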
Rate limits
Per-minute fixed-window buckets, evaluated AFTER auth (so an unauthenticated burst can't drain a per-user bucket the caller doesn't own):
| Scope | Per key | Per user |
|---|---|---|
| read | 60/min | 300/min |
| execute | 30/min | 90/min |
On overflow: 429 with a Retry-After header and
an RFC 7807 problem document carrying reason_code: "rate_limited"
plus an action_hint describing whether the key or the user
bucket overflowed.
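For intuition, a per-minute fixed-window bucket can be sketched as below. This is an illustration, not the server's limiter; aligning windows to multiples of 60 seconds and the exact Retry-After value are assumptions.

```python
import math


class FixedWindowCounter:
    """Counts requests per fixed window; resets when a new window starts."""

    def __init__(self, limit: int, window_seconds: int = 60):
        self.limit = limit
        self.window = window_seconds
        self.window_start = None
        self.count = 0

    def try_acquire(self, now: float):
        """Return (allowed, retry_after_seconds)."""
        start = now - (now % self.window)
        if start != self.window_start:
            self.window_start = start  # new window: reset the counter
            self.count = 0
        if self.count >= self.limit:
            return (False, math.ceil(start + self.window - now))
        self.count += 1
        return (True, 0)
```

With the limits in the table, a read request would have to pass both a 60/min counter for the key and a 300/min counter for the user.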
Versioning
The URL path carries the major version (/api/v2/public).
The SSE protocol carries an in-band protocolVersion for
forward-compat extension within the same path version.
- Removing or renaming an existing endpoint or event = major bump.
- Adding a new endpoint, event, or response field = minor (no bump).
- Changing field semantics on an existing field = major bump.
Catalog stability
GET /api/v2/public/models is an agent-facing contract.
What's safe (no major bump) for us to do:
- Add a new model.
- Add a new value to a parameter's values enum (e.g. reasoning_effort: ["none","low","medium","high"] → [..., "xhigh"]). Strict-validating agents should treat unknown enum values as forward-compat additions, not errors.
- Add a new capability bit, parameter, or cost field.
- Retire a model. Once retired the model_id is no longer in the catalog and any prompt referencing it gets 400 invalid_model_settings with action_hint directing the agent to fetch the live catalog and pick a current model.
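A sketch of forward-compatible enum handling on the agent side: pick a value the agent understands from whatever the catalog offers and ignore the rest. The helper name and the fallback ordering are hypothetical.

```python
def choose_value(catalog_values, preferred, fallbacks=()):
    """Return the preferred value if the catalog offers it, else the first
    offered fallback, else None. Unknown catalog values are ignored, not rejected."""
    offered = set(catalog_values)
    for candidate in (preferred, *fallbacks):
        if candidate in offered:
            return candidate
    return None
```

An agent written against `["none","low","medium","high"]` keeps working when the catalog later adds `"xhigh"`, because the extra value is never treated as a validation failure.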
Known V1.0.1 limitations resolved in V1.1
- cost_cents on records is integer-truncated; sub-cent costs round to 0. Resolved in V1.1 Phase 8: public DTOs now expose costMilliCents (millicents, 1/1000 cent) for sub-cent precision. costCents is removed from the public surface; render via costMilliCents / 1000.
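The conversion is plain arithmetic; a sketch using Decimal so sub-cent values are not lost to float rounding. The function names are hypothetical.

```python
from decimal import Decimal


def milli_cents_to_cents(cost_milli_cents: int) -> Decimal:
    """1 cent = 1000 millicents."""
    return Decimal(cost_milli_cents) / Decimal(1000)


def milli_cents_to_dollars(cost_milli_cents: int) -> Decimal:
    """1 dollar = 100 cents = 100,000 millicents."""
    return Decimal(cost_milli_cents) / Decimal(100_000)
```

A run costing 30 millicents is 0.03 cents, a value the old integer-truncated field would have reported as 0.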
Not in V1.1
- DELETE on prompts / versions / records / attachments: leaked-key blast radius too high without per-prompt grants. Resolved in V1.1 Phase 5 (records) + Phase 6b (prompts/versions/attachments): per-prompt grants gate every mutation; record self-delete restricted to the originating API key + 24h window; attachment delete blocked while an active RunSession references the blob.
- Image-output runs for API-key callers: 409 image_runs_not_supported_v1. Resolved in V1.1 Phase 7 + Phase 0: image-modality runs are supported on /run, /revise, /revise-again. Per-turn effectivePromptForImage accumulation persists into record.finalCopiedOutput on /finalize, restoring the desktop accumulated-prompt invariant. V1.1 Phase 0 wired the actual blob upload (Phase 7 lifted the gate but left ImageBlobKeys: null hardcoded, so pre-Phase-0 records came back with imageStoredPath: null). All images now persist to blob storage, retrievable via GET /runs/{runId}/images/{n} in-flight and GET /records/{id}/image?index=N post-finalize. Records preserve every per-turn image (re-finalize merges, never shrinks).
- GET /api/v2/public/runs/{runId} polling endpoint: V1.2. Until then, agents derive run state from their own bookkeeping of the original call. The run_replayed event on idempotent retries surfaces {succeeded: false, reasonCode: "replayed_state_unknown"} rather than masquerading as success.
- CLI run --output-dir <dir>: V1.2. Image bytes are fetchable today via GET /runs/{runId}/images/{N} (in-flight) or GET /records/{id}/image?index=N (post-finalize); the auto-save UX is a CLI ergonomics improvement.
- CLI grants management (keys grants list/add/remove): V1.2. Per-prompt restrictions are configured by the user via Settings → Developer Keys → Manage in the web/desktop apps; agents do not configure their own restrictions.
- Searchable prompt picker in Manage view: V1.2. V1.1 ships a plain scrollable checkbox list; search arrives once a user has 30+ prompts.
- Webhooks, OAuth, PAT, team keys: V2.
- Streaming on the CLI: the CLI internally buffers SSE for the revise chain. V2 may surface raw streaming.
- ?fromTurn=N rewind: RunSession.Deltas is turn-indexed today; surface in V2. Resolved in V1.2: /runs/{runId}/revise and /runs/{runId}/finalize accept fromTurn in the request body. Drops session.Deltas entries with turnIndex > fromTurn before applying the operation. Image blobs orphaned by the rewind enqueue into blob_cleanup_queue (drained by a background worker with reference-count guard).
Changelog
v1.3 — 2026-05-11 — BREAKING (per-prompt MCP tools)
- Prompt-level description dropped from every wire surface. POST /api/v2/public/prompts no longer accepts description; PATCH /api/v2/public/prompts/{id} accepts only name and abbreviation. PublicPromptCreatedResponse drops the field. The cloud_prompts.Description column is dropped from the database with no data preservation. Capability descriptions live on versions only.
- Version description is the agent-facing capability description. Every version carries a one-sentence summary that describes what the prompt does: what it expects as input and what it returns. Surfaced in tools/list synthesized tool descriptions and list_prompts, so agents can pick a prompt in one round-trip.
- New descriptionMode field on versions. Numeric on the wire (0=Auto, 1=Manual). In Auto, the server regenerates Description with gpt-5.4-nano on every PUT version that changes promptText (fire-and-forget worker, ~$0.0003 per fire, conditional UPDATE that no-ops on stale starts). In Manual, the user/agent owns the field.
- Description-write rule. Writing description at PATCH /api/v2/public/prompts/{id}/versions/{vid} (or update_version on MCP, or PUT version on the private cloud surface) is treated as the caller taking ownership: Mode auto-flips to Manual if it isn't already. Pass descriptionMode: 0 in the same request to revert to Auto and let the server worker resume regenerating; explicit descriptionMode wins over the implicit description-presence flip. JSON null for description is a no-op (send "" to clear deliberately). The earlier (pre-2026-05-13) silent-ignore-in-Auto + explicit-only-flip rule was a footgun and is no longer in force.
- If-Match precondition (optional) on PUT/PATCH version endpoints. Token format: UpdatedAt.Ticks as lowercase hex (e.g. If-Match: 8db7e12c0e7c100). Mismatch returns 412 Precondition Failed. Absent header keeps last-write-wins legacy semantics. PUT version now returns 200 OK + VersionResponse (was 204) so the client gets the new UpdatedAt for the next If-Match token. Future PUT/PATCH endpoints will follow this convention.
- Per-prompt MCP tools (opt-in). Toggle via POST /api/v2/prompts/{id}/mcp-toggle with { "expose": true }. Each exposed prompt appears in your agent's MCP tools/list as promethic_{slug} (e.g. promethic_clay_cuties). Agents invoke by name in one round-trip, with no list_prompts + get_prompt + run_prompt dance. Cap = 50 per account; cap-hit returns 409 mcp_tool_cap_reached. Tool name is stable across prompt renames so hardcoded agent code keeps working. To re-derive the tool name from the new prompt name, call POST /api/v2/prompts/{id}/mcp-rename; collision returns 409 tool_name_taken or 409 tool_name_reserved.
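A rough sketch of deriving an If-Match token from a timestamp, assuming UpdatedAt.Ticks follows the .NET convention (100-nanosecond intervals since 0001-01-01T00:00:00) and that the timestamp arrives as a UTC ISO-8601 string with up to seven fractional digits. Both points are assumptions; when the server hands back a token or timestamp, prefer that value verbatim.

```python
from datetime import datetime, timedelta

_TICKS_EPOCH = datetime(1, 1, 1)
_TICKS_PER_SECOND = 10_000_000


def ticks_from_iso(timestamp: str) -> int:
    """Convert e.g. '2026-05-11T12:00:00.1234567Z' to .NET-style ticks."""
    whole, _, frac = timestamp.rstrip("Z").partition(".")
    base = datetime.strptime(whole, "%Y-%m-%dT%H:%M:%S")
    seconds = (base - _TICKS_EPOCH) // timedelta(seconds=1)
    frac_ticks = int((frac + "0000000")[:7]) if frac else 0  # pad/trim to 100 ns
    return seconds * _TICKS_PER_SECOND + frac_ticks


def if_match_token(timestamp: str) -> str:
    """Lowercase hex of the tick count, the format the bullet above describes."""
    return format(ticks_from_iso(timestamp), "x")
```

The fractional part is parsed by hand because Python's datetime stores microseconds only and would drop the seventh digit.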
v1.2 — 2026-05-07
- Auto-finalize on /run: pass ?autoFinalize=true (now the default; toggle per-user via the autoFinalizeMcpRuns setting) and the server chains an internal /finalize after a successful run. The new recordId arrives via the record_finalized SSE event on the same stream as run_completed. Three new SSE events: record_finalized, record_finalize_failed (chain failed; agent decides whether to call POST /finalize manually based on retryable), record_finalize_skipped (informational, after run_failed).
- fromTurn rewind primitive: /runs/{runId}/revise and /runs/{runId}/finalize accept fromTurn. Drops session turns > fromTurn, then applies the operation. Negative → 400 from_turn_invalid; out-of-range → 400 from_turn_out_of_range.
- Finalize-on-Finalized amend: calling /finalize with finalText, tag, or fromTurn on a Finalized session reopens the session, bumps RunGeneration, and re-finalizes (same record ID, same handle). Fresh idempotency boundary for the new gen. Passing notes alone is a distinct case: it patches the record's notes directly without reopening (no new generation, not idempotent per call; each call updates the stored notes value). Passing nothing (no body / all fields absent) replays the most recent finalize response unchanged.
- Unified turns[] on PublicRecordResponse: every record DTO carries a synthesized turns array (run / revision / edit, indexed contiguously) reconstructed from the input + delta chain + final-copied-output. Resolves the V1.1 stitching gap where agents had to mentally combine inputText + finalCopiedOutput + deltas[].
- POST /records/{id}/revise replaces /revise-again: rehydrate a fresh RunSession from a finalized record's snapshot and revise. Same body shape as /runs/{runId}/revise (carries intermediateOutput + fromTurn). Per-record advisory lock serializes concurrent rehydrate attempts (409 record_revise_in_progress on contention). Old /revise-again route HARD-REMOVED.
- Image blob cleanup queue: fromTurn rewinds legitimately shrink record image history. Dropped per-turn blobs enqueue into blob_cleanup_queue (background worker, single-leader via pg_advisory_lock(3), reference-count guard against both ImageStoredPath storage formats before S3 DELETE).
- Spend audit discriminator: UsageLog.Discriminator column reserved for billing-eligibility tagging; SpendQueryFilters.Billable drives all SUM rollups (admin LIST endpoints intentionally show every row for audit visibility).
- MCP CLI surface: revise_again tool COLLAPSED into revise_run (accepts runId; finalized sessions reopen transparently). run_prompt grows autoFinalize?: boolean. finalize_run + revise_run grow fromTurn?: number. Tool count: 25 → 24.
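A minimal sketch of how an agent might react to the three auto-finalize SSE events listed above. The event names, recordId, and retryable come from the changelog; the assumption that each event's data is a single JSON object, and the function names and return labels, are illustrative only.

```python
import json
from typing import Iterable, Iterator, Optional, Tuple


def parse_sse(lines: Iterable[str]) -> Iterator[Tuple[str, dict]]:
    """Yield (event_name, payload) pairs from raw SSE lines."""
    event, data = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates one event.
            if data:
                yield event, json.loads("\n".join(data))
            event, data = "message", []
        elif line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            # Leading whitespace is dropped; assumed harmless for JSON payloads.
            data.append(line[len("data:"):].lstrip())


def finalize_outcome(events: Iterable[Tuple[str, dict]]) -> Tuple[str, Optional[str]]:
    """Map the auto-finalize events to a next action and an optional record ID."""
    for name, payload in events:
        if name == "record_finalized":
            return "done", payload.get("recordId")
        if name == "record_finalize_failed":
            # The agent decides whether to call POST /finalize manually.
            action = "retry_finalize" if payload.get("retryable") else "give_up"
            return action, None
        if name == "record_finalize_skipped":
            return "skipped", None
    return "no_finalize_event", None
```

Events other than the three finalize events (for example run_completed) are read and ignored, so the same loop can sit behind whatever code already consumes the run stream.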
v1.1 — 2026-05-03
- Per-prompt grants (Phase 6a): API keys can be restricted to a specific set of prompts. Configured via Settings → Developer Keys → Manage in the web/desktop apps. Three new session-only endpoints: GET/POST/DELETE /api/v2/keys/{keyId}/grants. Restricted-key access to non-granted prompts returns 403 grant_required.
- Server-stateful runs: RunSession table replaces the V1 echo-back signed-blob model. Agents hold an opaque runId; the server keeps prompt + version snapshot frozen at /run time, immune to mid-flight prompt edits. POST /runs/{runId}/revise, /finalize; GET /runs/{runId}/images/{N} for in-flight image fetch.
- Idempotency-Key on the full execute surface (Phase 3e/3f): /run, /revise, /finalize all replay byte-identically on retry. Route-signature composition with @gen{N} on /finalize so reopen-on-revise creates a fresh idempotency boundary. SSE replay protocol via the new run_replayed event (terminal-with-info).
- Record self-management (Phase 5): DELETE /records/{id} (24h, ApiKey-owned, hard-delete + cascade) and PATCH /records/{id} (notes + tag, no time window). HIPAA §164.312(b) audit row on every mutation with PHI-aware presence/length/SHA-256-prefix metadata.
- DELETE on prompts/versions/attachments (Phase 6b): write-scope + grant check. DELETE /attachments/{id} blocks if any active RunSession references the blob → 409 attachment_referenced_by_active_run. DELETE /prompts/{id}/versions/{vid} rejects current version atomically. DELETE /prompts/{id} + versions/{vid} force-terminate any active RunSessions before proceeding (no 409 for active runs on the REST surface; MCP tools return prompt_referenced_by_active_run / version_referenced_by_active_run instead).
- Image-output runs for API-key callers (Phase 7): 409 image_runs_not_supported_v1 gate lifted on /run, /revise, /revise-again. Per-turn effectivePromptForImage accumulation persists into record.finalCopiedOutput on /finalize, restoring the desktop accumulated-prompt invariant.
- Catalog enforcement (Phase 4): ModelSettingsValidator wired ADDITIVELY into POST /prompts + POST /prompts/{id}/versions + PATCH /prompts/{id}. Out-of-enum values like reasoning_effort: "xtreme" now return 400 invalid_model_settings at write time instead of silently failing at /run.
- Observability + cost precision (Phase 8): X-RateLimit-* headers on every response (both buckets reported). cost_micros (1/1000 cent) replaces cost_cents on public DTOs for sub-cent precision. response.usage SSE event reasoning_tokens fix for the Responses API shape.
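The cost_micros unit above (1/1000 cent) can be converted for display as in the sketch below. The constants follow directly from that definition; the function names are hypothetical, and Decimal is used here only to avoid floating-point rounding when summing many small costs.

```python
from decimal import Decimal
from typing import Iterable

# Per the changelog, one cost_micros unit is 1/1000 of a cent.
UNITS_PER_CENT = 1000
UNITS_PER_DOLLAR = UNITS_PER_CENT * 100


def cost_micros_to_cents(cost_micros: int) -> Decimal:
    """Convert a cost_micros value to cents, keeping sub-cent precision."""
    return Decimal(cost_micros) / UNITS_PER_CENT


def cost_micros_to_dollars(cost_micros: int) -> Decimal:
    """Convert a cost_micros value to dollars."""
    return Decimal(cost_micros) / UNITS_PER_DOLLAR


def total_dollars(costs: Iterable[int]) -> Decimal:
    """Sum integer cost_micros values first, then convert once."""
    return cost_micros_to_dollars(sum(costs))
```

Summing the integer values before converting keeps the arithmetic exact, which matters when many runs each cost a fraction of a cent.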
v1.0.1 — 2026-04-29
- Write scope + 5 new endpoints: prompt create / patch (RFC 7396) / current-version switch, version create, attachment upload.
- Idempotency-Key header on all mutating POSTs (Stripe parity, 24 h TTL).
- RFC 7807 problem+json errors with action_hint + invalid_params for self-healing agents.
- GET /models: slim catalog endpoint with supportedOutputModalities.
- CLI: prompts create / prompts patch / prompts switch-current / versions create / attachments add, plus YAML manifest mode.
- Developer Keys management UI in the web + desktop apps.
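Since prompt patch is described above as RFC 7396, the following sketch shows generic JSON Merge Patch semantics applied to a prompt-shaped document. It illustrates the standard only, not the server's implementation; the field names are borrowed from the manifest example, and individual fields may be handled differently server-side.

```python
import copy
from typing import Any


def merge_patch(target: Any, patch: Any) -> Any:
    """Apply an RFC 7396 JSON Merge Patch and return the patched document."""
    if not isinstance(patch, dict):
        # Non-object patches (scalars, arrays) replace the target wholesale.
        return copy.deepcopy(patch)
    # Shallow copy so the caller's top-level dict is not mutated.
    result = dict(target) if isinstance(target, dict) else {}
    for key, value in patch.items():
        if value is None:
            result.pop(key, None)  # null removes the member
        else:
            result[key] = merge_patch(result.get(key), value)
    return result


prompt = {
    "name": "Article summarizer",
    "modelSettings": {
        "model_id": "gpt54nano_2c6f9b4d",
        "parameters": {"reasoning_effort": "low"},
    },
}
patch = {"modelSettings": {"parameters": {"reasoning_effort": "medium"}}}
patched = merge_patch(prompt, patch)
```

Under these semantics only the members named in the patch change, nested objects merge recursively, and arrays are replaced as a whole rather than merged element by element.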
v1 (alpha) — 2026-04-27
- Initial public surface: read + execute scopes.
- SSE protocol v1 with run_session / run_completed / run_failed taxonomy.
- promethic CLI alpha (Node 18+).