API Reference
Relai provides an OpenAI-compatible API. If you're familiar with the OpenAI API, you already know how to use Relai.
Base URLs
| Region | Base URL |
|---|---|
| EU | https://api.eu.llmrelai.com/v1 |
| US | https://api.us.llmrelai.com/v1 |
Authentication
All API requests require an API key in the Authorization header:
Authorization: Bearer relai_sk_eu_live_your-api-key
Key prefixes encode the home region and scope:
- relai_sk_eu_live_… / relai_sk_us_live_…: regional keys, which must be used against their own region.
- relai_sk_gbl_eu_live_… / relai_sk_gbl_us_live_…: global keys, usable against any region. The embedded region is the home region where billing settles.
See Regions & Key Scopes.
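Because scope and home region are readable from the prefix, a client can check a key against a target region before sending anything. The following is a minimal sketch, not Relai's own validation logic: the regular expression is inferred from the four prefixes listed above, and `parse_key` and `usable_in` are hypothetical helper names.

```python
import re
from typing import NamedTuple

# Pattern inferred from the documented prefixes; the full key grammar is an assumption.
KEY_PATTERN = re.compile(r"^relai_sk_(gbl_)?(eu|us)_live_.+$")


class KeyInfo(NamedTuple):
    scope: str        # "regional" or "global"
    home_region: str  # "eu" or "us"


def parse_key(api_key: str) -> KeyInfo:
    """Read the scope and home region from a key's prefix."""
    match = KEY_PATTERN.match(api_key)
    if match is None:
        raise ValueError("unrecognized Relai key prefix")
    scope = "global" if match.group(1) else "regional"
    return KeyInfo(scope=scope, home_region=match.group(2))


def usable_in(api_key: str, region: str) -> bool:
    """Global keys work in any region; regional keys only in their own."""
    info = parse_key(api_key)
    return info.scope == "global" or info.home_region == region
```

A check like `usable_in(key, "us")` before choosing a base URL avoids sending a regional EU key to the US endpoint.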
Endpoints
POST /v1/chat/completions
Create a chat completion.
Request Body:
{
"model": "smart",
"messages": [
{"role": "system", "content": "You are a helpful assistant."},
{"role": "user", "content": "Hello!"}
],
"temperature": 0.7,
"max_tokens": 1000,
"stream": false
}
Parameters:
| Parameter | Type | Required | Description |
|---|---|---|---|
| model | string | Yes | Model ID or alias (smart, fast, cheap) |
| messages | array | Yes | Array of message objects |
| temperature | number | No | Sampling temperature (0-2, default: 1) |
| max_tokens | integer | No | Maximum tokens to generate |
| stream | boolean | No | Enable streaming (default: false) |
| tools | array | No | Function calling tools |
| user | string | No | End-user ID for per-user metering and quotas |
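As an illustration of how these parameters combine, the sketch below builds a chat completion request with the Python standard library without sending it. `build_chat_request` is a hypothetical helper, the EU base URL is taken from the table above, and the `Content-Type` header is an assumption based on common JSON API practice rather than something this page states.

```python
import json
import urllib.request

BASE_URL = "https://api.eu.llmrelai.com/v1"  # EU region from the Base URLs table


def build_chat_request(api_key, messages, model="smart", user=None, **options):
    """Build (but do not send) a POST /v1/chat/completions request."""
    payload = {"model": model, "messages": messages, **options}
    if user is not None:
        payload["user"] = user  # end-user ID for per-user metering and quotas
    temperature = payload.get("temperature")
    if temperature is not None and not 0 <= temperature <= 2:
        raise ValueError("temperature must be between 0 and 2")
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",  # assumed, not stated on this page
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send it. Keeping construction separate from sending makes the payload easy to inspect in tests.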
Response:
{
"id": "chatcmpl-abc123",
"object": "chat.completion",
"created": 1715000000,
"model": "claude-sonnet-4.6",
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "Hello! How can I help you today?"
},
"finish_reason": "stop"
}
],
"usage": {
"prompt_tokens": 10,
"completion_tokens": 8,
"total_tokens": 18
}
}
POST /v1/embeddings
Create embeddings for the given input text(s).
Request Body:
{
"model": "embeddings/cheapest",
"input": "The quick brown fox jumps over the lazy dog",
"encoding_format": "float"
}
Parameters:
| Parameter | Type | Required | Description |
|---|---|---|---|
| model | string | Yes | Model ID or alias (e.g., embeddings/cheapest) |
| input | string or array | Yes | Text(s) to embed. Can be a single string or array of strings. |
| encoding_format | string | No | Format of embedding output: "float" or "base64" |
| dimensions | integer | No | The number of dimensions for the output embeddings (model-dependent) |
| user | string | No | End-user ID for per-user metering and quotas |
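When `input` is an array, the response carries one entry per text, each with an `index` field. The sketch below builds a batch payload and reads the vectors back in input order. Both helper names are hypothetical, and it assumes `index` refers to the position in the `input` array, as it does in the OpenAI API.

```python
def build_embeddings_payload(texts, model="embeddings/cheapest", dimensions=None):
    """Build a POST /v1/embeddings body for one string or a list of strings."""
    payload = {"model": model, "input": texts, "encoding_format": "float"}
    if dimensions is not None:
        payload["dimensions"] = dimensions  # support is model-dependent
    return payload


def vectors_in_input_order(response):
    """Return the embedding vectors sorted by their "index" field."""
    items = sorted(response["data"], key=lambda item: item["index"])
    return [item["embedding"] for item in items]
```

Sorting by `index` rather than trusting array order keeps the pairing with the input texts correct even if entries arrive out of order.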
Response:
{
"object": "list",
"data": [
{
"object": "embedding",
"embedding": [0.0023, -0.0091, 0.0152, ...],
"index": 0
}
],
"model": "openai/text-embedding-3-small",
"usage": {
"prompt_tokens": 9,
"total_tokens": 9
}
}
GET /v1/models
List available models.
Response:
{
"object": "list",
"data": [
{
"id": "claude-opus-4.7",
"object": "model",
"owned_by": "anthropic"
},
{
"id": "gpt-5.5",
"object": "model",
"owned_by": "openai"
}
]
}
GET /v1/balance
Get current credit balance.
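The sample response for this endpoint reports one balance in two units, 4500000 micro-credits and 4.50 dollars, which implies 1,000,000 micro-credits per dollar. The sketch below applies that ratio. Treat the constant as an assumption inferred from the sample, and the helper names as hypothetical.

```python
from decimal import Decimal

# Assumed ratio, inferred from the sample response (4500000 micro = 4.50 dollars).
MICRO_PER_DOLLAR = 1_000_000


def micro_to_dollars(credits_micro: int) -> Decimal:
    """Convert an integer micro-credit balance to dollars without float rounding."""
    return Decimal(credits_micro) / MICRO_PER_DOLLAR


def can_afford(credits_micro: int, cost_dollars: str) -> bool:
    """Compare a balance against a dollar cost given as a decimal string."""
    return micro_to_dollars(credits_micro) >= Decimal(cost_dollars)
```

Working from the integer `credits_micro` field with `Decimal` avoids the rounding surprises of comparing floating-point dollar amounts.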
Response:
{
"credits_micro": 4500000,
"credits_dollars": 4.50
}
GET /v1/keys
List API keys.
Response:
{
"keys": [
{
"id": "key_abc123",
"name": "Production",
"prefix": "relai_sk_eu_live_A",
"region": "eu",
"scope": "regional",
"rpm_limit": 60,
"created_at": "2026-05-01T10:00:00Z",
"last_used_at": "2026-05-08T12:30:00Z",
"revoked": false
},
{
"id": "key_def456",
"name": "Global Production",
"prefix": "relai_sk_gbl_eu_live_O",
"region": "eu",
"scope": "global",
"rpm_limit": 60,
"created_at": "2026-05-12T10:00:00Z",
"last_used_at": "2026-05-13T01:30:00Z",
"revoked": false
}
]
}
POST /v1/keys
Create a new API key.
Request Body:
{
"name": "Production Key",
"rpm_limit": 60,
"scope": "regional"
}
scope is optional and defaults to "regional". Set it to "global" to issue a cross-region key. For global keys, rpm_limit is enforced per region.
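Per-region enforcement means a global key's combined throughput can exceed its `rpm_limit`. The sketch below validates a key-creation body and estimates that combined ceiling. It assumes each of the two regions listed under Base URLs counts requests independently, uses the 1000 RPM per-key maximum from the Rate Limits section, and assumes a lower bound of 1. The helper names are hypothetical.

```python
REGIONS = ("eu", "us")  # the two regions listed under Base URLs
MAX_RPM = 1000          # per-key ceiling from the Rate Limits section


def build_key_payload(name, rpm_limit=60, scope="regional"):
    """Build a POST /v1/keys body, rejecting values outside the documented ranges."""
    if scope not in ("regional", "global"):
        raise ValueError('scope must be "regional" or "global"')
    if not 1 <= rpm_limit <= MAX_RPM:  # lower bound of 1 is an assumption
        raise ValueError(f"rpm_limit must be between 1 and {MAX_RPM}")
    return {"name": name, "rpm_limit": rpm_limit, "scope": scope}


def aggregate_rpm_ceiling(rpm_limit, scope):
    """Upper bound on requests per minute summed across regions."""
    # Assumes regions keep independent counters, so a global key gets
    # its full rpm_limit in each region.
    return rpm_limit * len(REGIONS) if scope == "global" else rpm_limit
```

Under these assumptions, a global key created with `rpm_limit` 60 could serve up to 120 requests per minute when traffic is split across both regions.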
DELETE /v1/keys/{id}
Revoke an API key. For global keys, the home gateway also fans out a cache invalidation to all peer regions so the key stops working everywhere within a few seconds.
End-User Management
POST /v1/users
Create or update an end-user with spending quotas.
Request Body:
{
"external_id": "user_abc123",
"monthly_quota_dollars": 10.00,
"daily_quota_dollars": 1.00,
"alert_threshold": 0.8
}
Response:
{
"id": "550e8400-e29b-41d4-a716-446655440000",
"external_id": "user_abc123",
"monthly_quota_dollars": 10.00,
"daily_quota_dollars": 1.00,
"alert_threshold": 0.8,
"monthly_spend_dollars": 0.00,
"daily_spend_dollars": 0.00,
"created_at": "2026-05-12T10:00:00Z"
}
GET /v1/users
List all end-users with current spend data.
Query Parameters:
| Parameter | Type | Description |
|---|---|---|
| limit | integer | Max results (default: 50, max: 100) |
| offset | integer | Pagination offset (default: 0) |
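Walking the full user list means advancing `offset` by `limit` until a short page comes back. This page does not show the response shape for GET /v1/users, so the sketch below leaves fetching to a caller-supplied function. `fetch_page` is hypothetical and is assumed to return one page as a list of user objects.

```python
from typing import Callable, Iterator


def iter_users(
    fetch_page: Callable[[int, int], list], limit: int = 50
) -> Iterator[dict]:
    """Yield every end-user by paging with limit and offset."""
    if not 1 <= limit <= 100:
        raise ValueError("limit must be between 1 and 100")
    offset = 0
    while True:
        page = fetch_page(limit, offset)  # one GET /v1/users call
        yield from page
        if len(page) < limit:
            return  # a short (or empty) page means there is nothing left
        offset += limit
```

When the total is an exact multiple of `limit`, the loop makes one extra call that returns an empty page before stopping.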
GET /v1/users/{external_id}
Get a specific end-user's details and current spend.
PATCH /v1/users/{external_id}
Update an end-user's quotas.
Request Body:
{
"monthly_quota_dollars": 25.00,
"alert_threshold": 0.9
}
DELETE /v1/users/{external_id}
Delete an end-user record.
Organizations
GET /v1/orgs/{org_id}
Get organization details. Requires membership in the organization.
Response:
{
"id": "550e8400-e29b-41d4-a716-446655440000",
"name": "My Company",
"slug": "my-company",
"region": "eu",
"is_personal": false,
"created_at": "2026-05-01T10:00:00Z"
}
PATCH /v1/orgs/{org_id}
Update organization name or slug. Requires owner role.
Request Body:
{
"name": "New Company Name",
"slug": "new-company-slug"
}
Both name and slug are optional. The slug must be 3-32 characters long and may contain only lowercase letters, numbers, and hyphens.
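The slug rule can be checked client-side before the request is sent. The sketch below encodes only the rule as stated. Whether the server also rejects leading, trailing, or repeated hyphens is not specified here, so this check permits them. `is_valid_slug` is a hypothetical helper.

```python
import re

# 3-32 characters drawn from lowercase letters, digits, and hyphens.
SLUG_PATTERN = re.compile(r"[a-z0-9-]{3,32}")


def is_valid_slug(slug: str) -> bool:
    """Return True if the slug satisfies the documented length and character rule."""
    return SLUG_PATTERN.fullmatch(slug) is not None
```

`fullmatch` is used so that the whole string must satisfy the pattern, not just a prefix of it.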
Organization Settings
GET /v1/orgs/{org_id}/settings
Get organization spending defaults. Requires owner, admin, or billing role.
Response:
{
"default_user_monthly_quota_dollars": 10.00,
"default_user_daily_quota_dollars": 1.00,
"default_alert_threshold": 0.8,
"low_balance_threshold_dollars": 5.00
}
PATCH /v1/orgs/{org_id}/settings
Update organization spending defaults. Requires owner, admin, or billing role.
Request Body:
{
"default_user_monthly_quota_dollars": 20.00,
"default_user_daily_quota_dollars": 2.00,
"default_alert_threshold": 0.75,
"low_balance_threshold_dollars": 10.00
}
Streaming
Enable streaming by setting stream: true. Responses use Server-Sent Events:
data: {"id":"chatcmpl-abc","choices":[{"delta":{"content":"Hello"}}]}
data: {"id":"chatcmpl-abc","choices":[{"delta":{"content":" there"}}]}
data: {"id":"chatcmpl-abc","choices":[{"delta":{},"finish_reason":"stop"}]}
data: [DONE]
Rate Limits
Rate limits are per API key:
- Default: 60 requests per minute
- Configurable up to 1000 RPM per key
- Organization-wide limits available on request
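A client can stay under its per-key limit by pacing requests locally. The sketch below is one possible approach, a sliding 60-second window. How the gateway itself counts requests is not specified here, so the window model is an assumption, and `RpmLimiter` is a hypothetical class.

```python
import time
from collections import deque


class RpmLimiter:
    """Client-side pacing for a requests-per-minute budget (sliding window)."""

    def __init__(self, rpm=60, clock=time.monotonic):
        self.rpm = rpm
        self.clock = clock   # injectable so the limiter can be tested without sleeping
        self.sent = deque()  # timestamps of requests made in the last 60 seconds

    def wait_time(self) -> float:
        """Seconds to wait before the next request fits the window (0.0 if none)."""
        now = self.clock()
        while self.sent and now - self.sent[0] >= 60:
            self.sent.popleft()  # drop requests that have left the window
        if len(self.sent) < self.rpm:
            return 0.0
        return 60 - (now - self.sent[0])

    def record(self) -> None:
        """Note that a request was just sent."""
        self.sent.append(self.clock())
```

Typical use is to call `time.sleep(limiter.wait_time())` before each request and `limiter.record()` right after sending it.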
SDKs
Use any OpenAI-compatible SDK by setting the base URL:
TypeScript:
import OpenAI from "openai";
const client = new OpenAI({
apiKey: process.env.RELAI_API_KEY,
baseURL: "https://api.eu.llmrelai.com/v1",
});
Python:
import os

from openai import OpenAI
client = OpenAI(
api_key=os.environ["RELAI_API_KEY"],
base_url="https://api.eu.llmrelai.com/v1",
)
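Where an SDK is not an option, the Server-Sent Events format from the Streaming section can be decoded by hand. The sketch below joins the `delta.content` pieces from an iterable of raw lines and stops at the `[DONE]` sentinel. `collect_stream_text` is a hypothetical helper, and it handles only the `data:` lines shown in that section.

```python
import json


def collect_stream_text(lines):
    """Join the delta.content pieces from an iterable of SSE lines."""
    parts = []
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and any other SSE fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            content = choice.get("delta", {}).get("content")
            if content:
                parts.append(content)
    return "".join(parts)
```

The function accepts any iterable of strings, so it works equally on a decoded HTTP response body read line by line or on a recorded transcript.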