Syncropel Docs

api.syncropic.com — managed inference

OpenAI-compatible inference gateway for Syncropel customers. Use any OpenAI SDK with your Syncropel API key — one endpoint, multiple models, audit on your own records, subscription-aware quotas.

TL;DR

api.syncropic.com is an OpenAI-compatible inference gateway operated by Syncropic. Point any OpenAI SDK at https://api.syncropic.com/v1 with your Syncropel API key and you get:

  • One endpoint, multiple frontier models, no per-provider API keys
  • OpenAI Chat Completions wire shape — every major SDK works without changes
  • Subscription-aware quotas that compose with your hosted instance billing
  • Per-call audit records on your own instance — full cost + token + tier visibility

Most Syncropel customers don't call api.syncropic.com directly — your hosted instance does it for you when your CLI or workspace runs an LLM-backed task. This page is for customers who want to call it programmatically (CLI scripts, application backends, custom agents).

What it is

A managed inference gateway: one HTTPS endpoint with an OpenAI-compatible wire shape and multiple frontier models behind it. You configure your client once, and the gateway routes each request to the backing model that matches the model name you ask for.

                    ┌────────────────────────┐
your client / app ─→│ api.syncropic.com      │─→ backing model
                    │ auth · quota · audit   │
                    └────────────────────────┘

You don't manage per-model API keys. Your Syncropel subscription gates the call, the gateway picks the appropriate backend, and you get a normal OpenAI-shaped response back. Audit records land on your instance so cost and usage are queryable.

Authentication

Send a bearer token on the Authorization header:

Authorization: Bearer <your-syncropel-api-key>

Three valid bearer types:

Bearer              | Issued by                      | Use case
--------------------|--------------------------------|-----------------------------------------------
Per-instance bearer | Hosted instance provisioning   | Your hosted instance uses this automatically
User-issued API key | syncropel.com/account/api-keys | Your scripts and apps; revocable per-key
Self-host bearer    | Your own spl serve             | Self-host customers running their own instance

Mint a user-issued API key at syncropel.com/account/api-keys. Store the value in a secret manager — the plaintext is shown once at mint time.

For local dev:

export SYNCROPEL_API_KEY=sk-...your-key...

Quick start (curl)

curl https://api.syncropic.com/v1/chat/completions \
  -H "Authorization: Bearer $SYNCROPEL_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-haiku-4-5",
    "messages": [
      {"role": "user", "content": "Say hello in one sentence."}
    ],
    "max_tokens": 1500
  }'

Response shape is OpenAI-compatible:

{
  "id": "chatcmpl-...",
  "object": "chat.completion",
  "model": "claude-haiku-4-5",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello — happy to help."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 12,
    "completion_tokens": 7,
    "total_tokens": 19
  }
}

Response headers expose Syncropel-specific cost data:

Header                 | Meaning
-----------------------|-------------------------------------------------------------
X-Syncropic-Cost-Cents | Cost of this call in USD cents (with markup applied)
X-Syncropic-Tier       | Tier the call was served from (paid / anonymous / self-host)
X-Syncropic-Request-ID | Trace ID for support; include when filing tickets
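
As a minimal sketch of reading these headers in client code, the helper below pulls the three values out of any header mapping. The function name `parse_syncropic_headers` is hypothetical, and it assumes the cost header carries a plain decimal number as a string; the source does not specify the exact value format.

```python
from typing import Mapping, Optional


def parse_syncropic_headers(headers: Mapping[str, str]) -> dict:
    """Extract the Syncropel-specific response headers into a plain dict."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}

    # Assumption: the cost value is a decimal number encoded as text.
    raw_cost = lowered.get("x-syncropic-cost-cents")
    cost_cents: Optional[float] = float(raw_cost) if raw_cost is not None else None

    return {
        "cost_cents": cost_cents,
        "tier": lowered.get("x-syncropic-tier"),
        "request_id": lowered.get("x-syncropic-request-id"),
    }


# Offline example using the header names from the table above.
example = parse_syncropic_headers({
    "X-Syncropic-Cost-Cents": "0.42",
    "X-Syncropic-Tier": "paid",
    "X-Syncropic-Request-ID": "req_abc123",
})
print(example)
```

With the openai Python package, the raw-response variant `client.chat.completions.with_raw_response.create(...)` exposes a `.headers` mapping that could be passed to a helper like this, while `.parse()` still returns the usual completion object.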

Quick start (any OpenAI SDK)

Because the wire shape is OpenAI-compatible, any OpenAI SDK works by changing the base URL:

Python (openai package)

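import os

from openai import OpenAI

# Read the key from the environment (see the export above) rather than hard-coding it.
client = OpenAI(
    api_key=os.environ["SYNCROPEL_API_KEY"],
    base_url="https://api.syncropic.com/v1",
)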

response = client.chat.completions.create(
    model="claude-haiku-4-5",
    messages=[{"role": "user", "content": "Hello"}],
)
print(response.choices[0].message.content)

TypeScript / Node

import OpenAI from "openai";

const client = new OpenAI({
  apiKey: process.env.SYNCROPEL_API_KEY,
  baseURL: "https://api.syncropic.com/v1",
});

const response = await client.chat.completions.create({
  model: "claude-haiku-4-5",
  messages: [{ role: "user", content: "Hello" }],
});
console.log(response.choices[0].message.content);

LangChain / LlamaIndex / etc.

Any framework that lets you set the OpenAI base URL works. Set:

  • Base URL: https://api.syncropic.com/v1
  • API key: your Syncropel API key

Streaming responses

The gateway supports streaming Server-Sent Events (text/event-stream) when the request includes "stream": true. Set it on any chat completion request and consume the stream the same way you would against OpenAI directly.

curl https://api.syncropic.com/v1/chat/completions \
  -H "Authorization: Bearer $SYNCROPEL_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-haiku-4-5",
    "stream": true,
    "messages": [
      {"role": "user", "content": "Write a one-line haiku about clean code."}
    ]
  }'

Python (openai SDK):

stream = client.chat.completions.create(
    model="claude-haiku-4-5",
    stream=True,
    messages=[{"role": "user", "content": "Hello"}],
)
for chunk in stream:
    print(chunk.choices[0].delta.content or "", end="", flush=True)

Notes:

  • Tokens-on-delivery accounting. You're billed for what was actually streamed to your client, not what the backing model produced. If you cancel the connection mid-stream, the audit record reflects the partial token count.
  • Audit records carry a stream_state field: "completed" for a normal stream end, or "client_disconnect" if your client closed the connection before the model finished.
  • Tier gates run up-front. Subscription, rate limit, and bearer cap are evaluated before the upstream call begins; a 429 or 402 surfaces as a normal HTTP response, not as an SSE error frame.
  • Anonymous tier supports streaming on the included model. Same per-IP throttle as non-streaming.

Available models

Use the OpenAI-compatible model name as listed below. Pricing reflects the gateway's customer price (per 1M tokens) — gateway markup is included.

Model             | Tier required    | Approx. cost (per 1M tokens)
------------------|------------------|-----------------------------
claude-haiku-4-5  | Paid             | $1.30 in / $6.50 out
claude-sonnet-4-6 | Paid             | $3.90 in / $19.50 out
gpt-4o            | Paid             | $3.25 in / $13.00 out
gemini-flash-1.5  | Anonymous + Paid | $0.10 in / $0.39 out
gemini-pro        | Paid             | $1.63 in / $6.50 out

The full live model list, with current pricing, is available via the capabilities endpoint:

curl https://api.syncropic.com/v1/capabilities \
  -H "Authorization: Bearer $SYNCROPEL_API_KEY"

If your subscription includes inference credit, that credit is consumed first; overage rolls into your monthly invoice at the listed price.
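
To make the per-1M-token pricing concrete, here is a small sketch that estimates the cost of a single call from its `usage` counts. The prices are copied from the table above and should be treated as a snapshot, since the capabilities endpoint is the source of current pricing; `estimate_cost_usd` is a hypothetical helper, not part of any SDK.

```python
from decimal import Decimal

# USD per 1M tokens as (input, output), copied from the table above.
# Treat these as a snapshot; live prices come from /v1/capabilities.
PRICES_PER_MILLION = {
    "claude-haiku-4-5": (Decimal("1.30"), Decimal("6.50")),
    "claude-sonnet-4-6": (Decimal("3.90"), Decimal("19.50")),
    "gpt-4o": (Decimal("3.25"), Decimal("13.00")),
    "gemini-flash-1.5": (Decimal("0.10"), Decimal("0.39")),
    "gemini-pro": (Decimal("1.63"), Decimal("6.50")),
}

MILLION = Decimal(1_000_000)


def estimate_cost_usd(model: str, prompt_tokens: int, completion_tokens: int) -> Decimal:
    """Estimate the listed-price cost of one call in USD."""
    price_in, price_out = PRICES_PER_MILLION[model]
    return (price_in * prompt_tokens + price_out * completion_tokens) / MILLION


# The quick-start response above reported 12 prompt and 7 completion tokens.
print(estimate_cost_usd("claude-haiku-4-5", 12, 7))
```

For that quick-start call the arithmetic is (12 × 1.30 + 7 × 6.50) / 1,000,000 = 61.10 / 1,000,000, or about $0.0000611. Included subscription credit is not modeled here.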

Subscription tiers

Three tiers gate access:

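Tier      | Bearer type                                           | Models                | Rate limit | Monthly cap
----------|-------------------------------------------------------|-----------------------|------------|---------------------------------------------------
Paid      | Per-instance + User-issued (with active subscription) | All models            | 60 req/min | Subscription-included credit, then metered overage
Anonymous | Anonymous (no Syncropel account)                      | Single included model | 5 req/min  | Hard cap: 100 messages/day per user
Self-host | Self-host bearer                                      | All models            | 60 req/min | Pay-as-you-go via your subscription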

Anonymous tier is meant for evaluation — you don't need a Syncropel subscription to try the gateway, but you're limited to a single low-cost model. Sign up for a paid plan to unlock the full model list.

See pricing for current subscription prices and what's included.

Rate limits and errors

Rate limits return 429 Too Many Requests with a Retry-After header (seconds):

HTTP/1.1 429 Too Many Requests
Retry-After: 30
Content-Type: application/json

{
  "error": {
    "code": "rate_limited",
    "kind": "usage",
    "message": "60 requests per minute on paid tier; retry in 30s",
    "retry_after_seconds": 30,
    "request_id": "req_abc123"
  }
}

Other common error codes:

HTTP | error.code            | Meaning
-----|-----------------------|-----------------------------------------------------------
401  | invalid_bearer        | Bearer token doesn't match a known issuer
401  | invalid_jwt_signature | Bearer was a JWT but signature doesn't verify
402  | subscription_required | No active subscription; upgrade or use anonymous tier
402  | quota_exceeded        | Subscription monthly cap exceeded
429  | rate_limited          | Per-minute rate limit hit; retry after retry_after_seconds
503  | provider_error        | Backing model degraded; retry or try a different model
503  | internal_error        | Gateway error; report request_id to support

Every error response includes a request_id — include it in support tickets for fast triage.
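
One way to act on these codes is sketched below: retry only `rate_limited` and `provider_error`, honor `Retry-After` (or `retry_after_seconds` from the body) when present, and surface everything else immediately. The transport is abstracted as a caller-supplied `send` function returning `(status, headers, body)`, so all names here are hypothetical, and the capped exponential fallback is an assumption rather than documented gateway behavior.

```python
import time
from typing import Callable, Mapping, Optional, Tuple

# (status_code, headers, parsed_json_body) as returned by a caller-supplied transport.
Response = Tuple[int, Mapping[str, str], dict]

# Error codes the table above describes as worth retrying.
RETRYABLE_CODES = {"rate_limited", "provider_error"}


def retry_delay_seconds(status: int, headers: Mapping[str, str], body: dict,
                        attempt: int) -> Optional[float]:
    """Return how long to wait before retrying, or None if not retryable."""
    error = body.get("error", {})
    if error.get("code") not in RETRYABLE_CODES:
        return None  # e.g. 401 / 402 need a new key or subscription, not a retry
    lowered = {name.lower(): value for name, value in headers.items()}
    if "retry-after" in lowered:
        return float(lowered["retry-after"])
    if error.get("retry_after_seconds") is not None:
        return float(error["retry_after_seconds"])
    # Assumed fallback when the gateway gives no hint: capped exponential backoff.
    return float(min(2 ** attempt, 30))


def call_with_retry(send: Callable[[], Response], max_attempts: int = 3,
                    sleep: Callable[[float], None] = time.sleep) -> Response:
    """Call send() until it succeeds, hits a non-retryable error, or runs out of attempts."""
    for attempt in range(max_attempts):
        status, headers, body = send()
        if status < 400:
            return status, headers, body
        delay = retry_delay_seconds(status, headers, body, attempt)
        if delay is None or attempt == max_attempts - 1:
            return status, headers, body
        sleep(delay)
    raise ValueError("max_attempts must be at least 1")
```

Passing `sleep` as a parameter keeps the helper testable without real waiting; in production the default `time.sleep` applies. The last response is returned rather than raised so the caller can still log `request_id` for support.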

Audit on your own records

Every successful call emits an audit record on your hosted instance, under the th_audit_api_usage thread. Each record carries:

  • cost_usd — what the call cost you
  • tokens_input, tokens_output — what was billed
  • model — what was served
  • tier — paid / anonymous / self-host
  • latency_ms — wall-clock time
  • endpoint, request_id — for tracing

Query the thread directly:

spl thread records th_audit_api_usage -o json \
  | jq '.[] | .body | {cost_usd, model, tokens_input, tokens_output}'

Monthly cost reconciliation is a record query — no external billing dashboard needed.
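
As a sketch of what that reconciliation could look like in a script, the function below totals cost and tokens per model. It assumes the JSON output is an array of records whose `body` object carries the fields listed above, which matches the shape the jq filter relies on; `summarize_costs` and the sample values are hypothetical.

```python
import json
from collections import defaultdict
from decimal import Decimal


def _empty_totals() -> dict:
    return {"cost_usd": Decimal("0"), "tokens_input": 0, "tokens_output": 0, "calls": 0}


def summarize_costs(records: list) -> dict:
    """Total cost_usd, token counts, and call counts per model."""
    totals = defaultdict(_empty_totals)
    for record in records:
        body = record["body"]
        entry = totals[body["model"]]
        # Go through str() so float inputs do not drag binary noise into Decimal.
        entry["cost_usd"] += Decimal(str(body["cost_usd"]))
        entry["tokens_input"] += body["tokens_input"]
        entry["tokens_output"] += body["tokens_output"]
        entry["calls"] += 1
    return dict(totals)


# Made-up sample in the assumed shape of `spl thread records ... -o json` output.
sample = json.loads("""
[
  {"body": {"model": "claude-haiku-4-5", "cost_usd": 0.0021, "tokens_input": 900, "tokens_output": 140}},
  {"body": {"model": "claude-haiku-4-5", "cost_usd": 0.0012, "tokens_input": 500, "tokens_output": 85}},
  {"body": {"model": "gpt-4o", "cost_usd": 0.01, "tokens_input": 1200, "tokens_output": 450}}
]
""")
summary = summarize_costs(sample)
for model, totals in summary.items():
    print(model, totals["calls"], totals["cost_usd"])
```

In a real script the records would come from the command's output instead of the inline sample, for example by piping it in and calling `json.load(sys.stdin)`.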

Why use api.syncropic.com vs. a model provider directly

You don't have to. If you already have an API key for a specific model provider, you can call them directly. The gateway's value is in three properties:

  1. One endpoint, many models. Switch from one model to another by changing one model string. No new API key, no new SDK.
  2. Subscription-aligned billing. Your monthly invoice rolls inference into the same bill as your hosted instance plus included credit, instead of N separate provider invoices.
  3. Audit on your own records. Cost and usage land on your instance as records, queryable like any other data.

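On pure cost, direct provider access can be cheaper if you only ever use one provider, because the listed gateway prices include markup. For multi-model workflows, or when you want unified billing and audit, the gateway saves you from managing separate keys, invoices, and usage tracking per provider.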
