API Reference

Relai provides an OpenAI-compatible API. If you're familiar with the OpenAI API, you already know how to use Relai.

Base URLs

Region  Base URL
EU      https://api.eu.llmrelai.com/v1
US      https://api.us.llmrelai.com/v1

Authentication

All API requests require an API key in the Authorization header:

Authorization: Bearer relai_sk_eu_live_your-api-key

Key prefixes encode the home region and scope:

  • relai_sk_eu_live_… / relai_sk_us_live_…: regional keys, which must be used against their own region.
  • relai_sk_gbl_eu_live_… / relai_sk_gbl_us_live_…: global keys, usable against any region. The embedded region is the home region, where billing settles. See Regions & Key Scopes.
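
As a rough illustration of the prefix scheme above, the following sketch derives the home region and scope from a key string and picks a base URL. The helpers parse_key and base_url_for are hypothetical names, not part of Relai, and the exact prefix layout is assumed only from the examples shown here.

```python
from typing import NamedTuple, Optional

# Base URLs as listed in the Base URLs section.
BASE_URLS = {
    "eu": "https://api.eu.llmrelai.com/v1",
    "us": "https://api.us.llmrelai.com/v1",
}


class KeyInfo(NamedTuple):
    home_region: str  # "eu" or "us"
    scope: str        # "regional" or "global"


def parse_key(api_key: str) -> KeyInfo:
    """Derive home region and scope from the key prefix (hypothetical helper)."""
    for region in BASE_URLS:
        if api_key.startswith(f"relai_sk_gbl_{region}_live_"):
            return KeyInfo(region, "global")
        if api_key.startswith(f"relai_sk_{region}_live_"):
            return KeyInfo(region, "regional")
    raise ValueError("unrecognized key prefix")


def base_url_for(api_key: str, preferred_region: Optional[str] = None) -> str:
    """Regional keys are pinned to their home region; global keys may choose."""
    info = parse_key(api_key)
    if info.scope == "regional" or preferred_region is None:
        return BASE_URLS[info.home_region]
    return BASE_URLS[preferred_region]
```

Routing a regional key to its own region on the client side avoids sending it to a gateway that, per the rule above, it must not be used against.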

Endpoints

POST /v1/chat/completions

Create a chat completion.

Request Body:

{
  "model": "smart",
  "messages": [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"}
  ],
  "temperature": 0.7,
  "max_tokens": 1000,
  "stream": false
}

Parameters:

Parameter    Type     Required  Description
model        string   Yes       Model ID or alias (smart, fast, cheap)
messages     array    Yes       Array of message objects
temperature  number   No        Sampling temperature (0-2, default: 1)
max_tokens   integer  No        Maximum tokens to generate
stream       boolean  No        Enable streaming (default: false)
tools        array    No        Function calling tools
user         string   No        End-user ID for per-user metering and quotas

Response:

{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1715000000,
  "model": "claude-sonnet-4.6",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello! How can I help you today?"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 10,
    "completion_tokens": 8,
    "total_tokens": 18
  }
}
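
For readers calling this endpoint without an SDK, a minimal sketch using only the Python standard library might look like the following. The helper names build_chat_request, send, and first_reply are illustrative and not part of Relai; the sketch assumes the EU base URL and a non-streaming request.

```python
import json
import urllib.request

BASE_URL = "https://api.eu.llmrelai.com/v1"  # EU region assumed for this sketch


def build_chat_request(api_key: str, user_text: str, model: str = "smart") -> urllib.request.Request:
    """Build (but do not send) a POST /v1/chat/completions request."""
    body = {
        "model": model,
        "messages": [{"role": "user", "content": user_text}],
        "max_tokens": 1000,
        "stream": False,
    }
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request: urllib.request.Request) -> dict:
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request) as resp:
        return json.load(resp)


def first_reply(response: dict) -> str:
    """Extract the assistant text from the first choice."""
    return response["choices"][0]["message"]["content"]
```

Calling first_reply(send(build_chat_request(key, "Hello!"))) would perform the actual round trip; keeping request construction separate from sending makes the payload easy to inspect or test offline.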

POST /v1/embeddings

Create embeddings for the given input text(s).

Request Body:

{
  "model": "embeddings/cheapest",
  "input": "The quick brown fox jumps over the lazy dog",
  "encoding_format": "float"
}

Parameters:

Parameter        Type             Required  Description
model            string           Yes       Model ID or alias (e.g., embeddings/cheapest)
input            string or array  Yes       Text(s) to embed. Can be a single string or array of strings.
encoding_format  string           No        Format of embedding output: "float" or "base64"
dimensions       integer          No        The number of dimensions for the output embeddings (model-dependent)
user             string           No        End-user ID for per-user metering and quotas

Response:

{
  "object": "list",
  "data": [
    {
      "object": "embedding",
      "embedding": [0.0023, -0.0091, 0.0152, ...],
      "index": 0
    }
  ],
  "model": "openai/text-embedding-3-small",
  "usage": {
    "prompt_tokens": 9,
    "total_tokens": 9
  }
}
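
When input is an array, each entry in data carries an index. The sketch below assumes, as in the OpenAI API that Relai mirrors, that index is the position of the corresponding input string; it reorders the vectors accordingly and compares two of them. Both function names are illustrative.

```python
import math


def embeddings_in_input_order(response: dict) -> list:
    """Return embedding vectors sorted by their index field."""
    items = sorted(response["data"], key=lambda item: item["index"])
    return [item["embedding"] for item in items]


def cosine_similarity(a: list, b: list) -> float:
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)
```

This applies to encoding_format "float"; a "base64" response would need decoding first, which is not covered here.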

GET /v1/models

List available models.

Response:

{
  "object": "list",
  "data": [
    {
      "id": "claude-opus-4.7",
      "object": "model",
      "owned_by": "anthropic"
    },
    {
      "id": "gpt-5.5",
      "object": "model",
      "owned_by": "openai"
    }
  ]
}

GET /v1/balance

Get current credit balance.

Response:

{
  "credits_micro": 4500000,
  "credits_dollars": 4.50
}
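
The two fields in the example are consistent with 1 dollar being 1,000,000 micro-credits (4,500,000 / 1,000,000 = 4.50). That ratio is inferred from the sample response rather than stated explicitly, so treat the following conversion as a sketch under that assumption. Decimal is used to avoid floating-point rounding when summing balances.

```python
from decimal import Decimal

# Assumption: inferred from the example (4500000 micro == 4.50 dollars).
MICRO_PER_DOLLAR = 1_000_000


def micro_to_dollars(credits_micro: int) -> Decimal:
    """Convert an integer micro-credit balance to dollars."""
    return Decimal(credits_micro) / MICRO_PER_DOLLAR


def dollars_to_micro(dollars: str) -> int:
    """Convert a dollar amount given as a string, e.g. "4.50", to micro-credits."""
    return int(Decimal(dollars) * MICRO_PER_DOLLAR)
```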

GET /v1/keys

List API keys.

Response:

{
  "keys": [
    {
      "id": "key_abc123",
      "name": "Production",
      "prefix": "relai_sk_eu_live_A",
      "region": "eu",
      "scope": "regional",
      "rpm_limit": 60,
      "created_at": "2026-05-01T10:00:00Z",
      "last_used_at": "2026-05-08T12:30:00Z",
      "revoked": false
    },
    {
      "id": "key_def456",
      "name": "Global Production",
      "prefix": "relai_sk_gbl_eu_live_O",
      "region": "eu",
      "scope": "global",
      "rpm_limit": 60,
      "created_at": "2026-05-12T10:00:00Z",
      "last_used_at": "2026-05-13T01:30:00Z",
      "revoked": false
    }
  ]
}
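
One possible use of this listing is a periodic audit for keys that are still active but have not been used recently. The sketch below works on the response shape shown above; the function names, the 30-day default, and the fallback to created_at when last_used_at is missing are all assumptions made for illustration.

```python
from datetime import datetime, timezone


def parse_timestamp(value: str) -> datetime:
    """Parse timestamps such as 2026-05-08T12:30:00Z into aware datetimes."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def stale_key_ids(response: dict, now: datetime, max_idle_days: int = 30) -> list:
    """IDs of non-revoked keys that have been idle longer than max_idle_days."""
    stale = []
    for key in response["keys"]:
        if key["revoked"]:
            continue
        # Assumption: a never-used key may lack last_used_at; fall back to created_at.
        last_seen = key.get("last_used_at") or key["created_at"]
        if (now - parse_timestamp(last_seen)).days > max_idle_days:
            stale.append(key["id"])
    return stale
```

The returned IDs could then be passed to DELETE /v1/keys/{id} after review.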

POST /v1/keys

Create a new API key.

Request Body:

{
  "name": "Production Key",
  "rpm_limit": 60,
  "scope": "regional"
}

scope is optional and defaults to "regional". Set it to "global" to issue a cross-region key. Note that for global keys, the rpm_limit is enforced per region.
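
A small sketch of assembling this request body follows. It also works out what per-region enforcement may imply: if each of the two regions counts requests independently, a global key with rpm_limit 60 could reach up to 2 × 60 = 120 requests per minute in aggregate. That reading, the lower bound of 1, and the helper names are assumptions; the upper bound of 1000 comes from the Rate Limits section.

```python
ALLOWED_SCOPES = {"regional", "global"}
MAX_RPM = 1000          # upper bound stated under Rate Limits
REGIONS = ("eu", "us")  # regions listed under Base URLs


def build_key_payload(name: str, rpm_limit: int = 60, scope: str = "regional") -> dict:
    """Validate inputs and build the POST /v1/keys request body."""
    if scope not in ALLOWED_SCOPES:
        raise ValueError(f"scope must be one of {sorted(ALLOWED_SCOPES)}")
    if not 1 <= rpm_limit <= MAX_RPM:  # lower bound of 1 is an assumption
        raise ValueError(f"rpm_limit must be between 1 and {MAX_RPM}")
    return {"name": name, "rpm_limit": rpm_limit, "scope": scope}


def worst_case_total_rpm(payload: dict) -> int:
    """Upper bound on combined throughput if the limit is counted per region."""
    regions = len(REGIONS) if payload["scope"] == "global" else 1
    return payload["rpm_limit"] * regions
```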

DELETE /v1/keys/{id}

Revoke an API key. For global keys, the home gateway also fans out a cache invalidation to all peer regions so the key stops working everywhere within a few seconds.

End-User Management

POST /v1/users

Create or update an end-user with spending quotas.

Request Body:

{
  "external_id": "user_abc123",
  "monthly_quota_dollars": 10.00,
  "daily_quota_dollars": 1.00,
  "alert_threshold": 0.8
}

Response:

{
  "id": "550e8400-e29b-41d4-a716-446655440000",
  "external_id": "user_abc123",
  "monthly_quota_dollars": 10.00,
  "daily_quota_dollars": 1.00,
  "alert_threshold": 0.8,
  "monthly_spend_dollars": 0.00,
  "daily_spend_dollars": 0.00,
  "created_at": "2026-05-12T10:00:00Z"
}
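
The response carries both quotas and current spend, so a client can classify a user's status locally. The sketch below assumes alert_threshold is a fraction of the quota, so that with a 10.00 monthly quota and a threshold of 0.8 an alert corresponds to roughly 8.00 of spend. That interpretation and the function name are assumptions, and the labels say nothing about how the gateway itself reacts.

```python
def quota_status(user: dict) -> dict:
    """Classify monthly and daily spend as "ok", "alert", or "exceeded"."""
    threshold = user["alert_threshold"]
    status = {}
    for period in ("monthly", "daily"):
        quota = user[f"{period}_quota_dollars"]
        spend = user[f"{period}_spend_dollars"]
        if spend >= quota:
            status[period] = "exceeded"
        elif spend >= threshold * quota:  # assumed meaning of alert_threshold
            status[period] = "alert"
        else:
            status[period] = "ok"
    return status
```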

GET /v1/users

List all end-users with current spend data.

Query Parameters:

Parameter  Type     Description
limit      integer  Max results (default: 50, max: 100)
offset     integer  Pagination offset (default: 0)
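
Walking the full list with limit and offset can be sketched as below. Because the response shape of this endpoint is not shown here, the sketch takes a caller-supplied fetch_page(limit, offset) function that returns one page as a list; that callable and the name iter_users are hypothetical. It stops at the first page shorter than limit.

```python
from typing import Callable, Iterator


def iter_users(fetch_page: Callable[[int, int], list], limit: int = 100) -> Iterator[dict]:
    """Yield every user by requesting successive pages of size limit."""
    offset = 0
    while True:
        page = fetch_page(limit, offset)
        yield from page
        if len(page) < limit:  # a short (or empty) page means no more results
            break
        offset += limit
```

Using the maximum limit of 100 keeps the number of round trips low.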

GET /v1/users/{external_id}

Get a specific end-user's details and current spend.

PATCH /v1/users/{external_id}

Update an end-user's quotas.

Request Body:

{
  "monthly_quota_dollars": 25.00,
  "alert_threshold": 0.9
}

DELETE /v1/users/{external_id}

Delete an end-user record.

Organizations

GET /v1/orgs/{org_id}

Get organization details. Requires membership in the organization.

Response:

{
  "id": "550e8400-e29b-41d4-a716-446655440000",
  "name": "My Company",
  "slug": "my-company",
  "region": "eu",
  "is_personal": false,
  "created_at": "2026-05-01T10:00:00Z"
}

PATCH /v1/orgs/{org_id}

Update organization name or slug. Requires owner role.

Request Body:

{
  "name": "New Company Name",
  "slug": "new-company-slug"
}

Both name and slug are optional. The slug must be 3-32 characters long and may contain only lowercase letters, numbers, and hyphens.
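
The slug rule can be checked on the client before sending the request. This is a minimal sketch of that rule as a regular expression; the function name is illustrative, and the server may apply further checks (for example uniqueness) that are not described here.

```python
import re

# 3-32 characters: lowercase letters, digits, and hyphens only.
SLUG_PATTERN = re.compile(r"[a-z0-9-]{3,32}")


def is_valid_slug(slug: str) -> bool:
    """Return True if slug satisfies the documented length and character rules."""
    return SLUG_PATTERN.fullmatch(slug) is not None
```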

Organization Settings

GET /v1/orgs/{org_id}/settings

Get organization spending defaults. Requires owner, admin, or billing role.

Response:

{
  "default_user_monthly_quota_dollars": 10.00,
  "default_user_daily_quota_dollars": 1.00,
  "default_alert_threshold": 0.8,
  "low_balance_threshold_dollars": 5.00
}

PATCH /v1/orgs/{org_id}/settings

Update organization spending defaults. Requires owner, admin, or billing role.

Request Body:

{
  "default_user_monthly_quota_dollars": 20.00,
  "default_user_daily_quota_dollars": 2.00,
  "default_alert_threshold": 0.75,
  "low_balance_threshold_dollars": 10.00
}

Streaming

Enable streaming by setting "stream": true in the request body. The response is then delivered as Server-Sent Events, one JSON chunk per data: line, ending with data: [DONE]:

data: {"id":"chatcmpl-abc","choices":[{"delta":{"content":"Hello"}}]}

data: {"id":"chatcmpl-abc","choices":[{"delta":{"content":" there"}}]}

data: {"id":"chatcmpl-abc","choices":[{"delta":{},"finish_reason":"stop"}]}

data: [DONE]
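
A minimal sketch of consuming such a stream: it reads lines, ignores anything that is not a data: line, stops at [DONE], and yields each content fragment from delta. The function name iter_content is illustrative, and the sketch assumes each event fits on a single data: line, as in the example above.

```python
import json
from typing import Iterable, Iterator


def iter_content(lines: Iterable[str]) -> Iterator[str]:
    """Yield text fragments from a chat completion event stream."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # blank separator lines and other fields are skipped
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            text = choice.get("delta", {}).get("content")
            if text:
                yield text
```

Joining the fragments from the three example events gives "Hello there"; the final chunk has an empty delta and contributes only its finish_reason.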

Rate Limits

Rate limits are per API key:

  • Default: 60 requests per minute
  • Configurable up to 1000 RPM per key
  • Organization-wide limits available on request
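
To stay under a per-key limit, a client can pace itself. The sketch below is a simple sliding-window limiter: it remembers send times from the last 60 seconds and reports how long to wait before the next request. How the server measures its window is not specified here, so this is a conservative client-side approximation, and the class name is illustrative.

```python
import time
from collections import deque


class RpmLimiter:
    """Client-side sliding-window limiter for requests per minute."""

    def __init__(self, rpm: int = 60, clock=time.monotonic):
        self.rpm = rpm
        self.clock = clock      # injectable for testing
        self.sent = deque()     # timestamps of recent requests

    def wait_time(self) -> float:
        """Seconds to wait before another request fits in the window."""
        now = self.clock()
        while self.sent and now - self.sent[0] >= 60.0:
            self.sent.popleft()  # drop entries older than one minute
        if len(self.sent) < self.rpm:
            return 0.0
        return 60.0 - (now - self.sent[0])

    def record(self) -> None:
        """Note that a request was just sent."""
        self.sent.append(self.clock())
```

A caller would sleep for wait_time() seconds, send the request, and then call record().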

SDKs

Use any OpenAI-compatible SDK by setting the base URL:

TypeScript:

import OpenAI from "openai";

const client = new OpenAI({
  apiKey: process.env.RELAI_API_KEY,
  baseURL: "https://api.eu.llmrelai.com/v1",
});

Python:

import os

from openai import OpenAI

client = OpenAI(
    api_key=os.environ["RELAI_API_KEY"],
    base_url="https://api.eu.llmrelai.com/v1",
)