AI Gateway
Base URL: https://app.neureus.ai
Auth: Authorization: Bearer <api_key>
Endpoints
| Method | Path | Description |
|---|---|---|
| POST | /ai/chat | Chat completion (sync or streaming SSE) |
| GET | /ai/models | List available models with pricing |
| POST | /ai/embeddings | Generate embeddings |
| GET | /ai/providers | List configured BYOK providers |
| PUT | /ai/providers/:provider/key | Set BYOK key for a provider |
| POST | /ai/providers/:provider/rotate | Rotate BYOK key |
| DELETE | /ai/providers/:provider/key | Remove BYOK key |
| POST | /ai/widget/key | Create a publishable widget key |
| DELETE | /ai/widget/key | Revoke a widget key |
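
Every route in the table shares the same base URL and Bearer authentication, so a small request builder can cover all of them. The following is a minimal sketch, not part of the Neureus SDK: `buildGatewayRequest` and `GatewayRequest` are hypothetical names, and sending JSON bodies with `Content-Type: application/json` is an assumption the reference does not state.

```typescript
// Hypothetical helper for illustration; not part of @neureus/sdk.
const BASE_URL = "https://app.neureus.ai";

interface GatewayRequest {
  url: string;
  method: string;
  headers: Record<string, string>;
  body?: string;
}

function buildGatewayRequest(
  apiKey: string,
  method: "GET" | "POST" | "PUT" | "DELETE",
  path: string,
  body?: unknown,
): GatewayRequest {
  // Auth scheme from the reference: Authorization: Bearer <api_key>
  const headers: Record<string, string> = {
    Authorization: `Bearer ${apiKey}`,
  };
  const request: GatewayRequest = { url: `${BASE_URL}${path}`, method, headers };
  if (body !== undefined) {
    // Assumption: request bodies are sent as JSON.
    headers["Content-Type"] = "application/json";
    request.body = JSON.stringify(body);
  }
  return request;
}

// Example: describe a call to GET /ai/models (placeholder key).
const listModelsRequest = buildGatewayRequest("YOUR_API_KEY", "GET", "/ai/models");
```

The returned object can be handed to any HTTP client, for example `fetch(request.url, { method: request.method, headers: request.headers, body: request.body })`.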
POST /ai/chat
Chat with any supported model. Returns JSON by default, or an SSE stream when stream is true.
Request body:

    {
      messages: Array<{ role: 'system' | 'user' | 'assistant'; content: string }>;
      model?: string;          // default: "meta-llm" (Llama 3.1 8B); use "auto" for task routing
      stream?: boolean;        // default: false
      temperature?: number;    // 0–2
      maxTokens?: number;
      max_tokens?: number;     // OpenAI-compatible alias
      systemPrompt?: string;   // prepended as system message
      prefer_async?: boolean;  // route to batch API if eligible (50% provider discount)
    }

Response (non-streaming):

    {
      text: string;
      choices: Array<{ message: { role: string; content: string }; finish_reason: string }>; // OpenAI-compat
      toolCalls?: any[];
      reasoning?: string;
      logId: string;
      inputTokens: number;
      outputTokens: number;
      usage: { prompt_tokens: number; completion_tokens: number; total_tokens: number }; // OpenAI-compat
      cached: boolean;
      costUsd: number;
    }

Response (streaming, stream: true):
Returns Content-Type: text/event-stream. Each data: event:

    { "delta": { "content": "token text" }, "done": false }

Final event:

    { "done": true, "logId": "...", "inputTokens": 100, "outputTokens": 42 }
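
The event shapes above suggest a simple way to reassemble a streamed reply: append each `delta.content` until an event with `done: true` arrives, then read the token counts from that final event. The sketch below assumes the whole stream has already been buffered into one string; a real client reading chunks would also need to handle events split across chunk boundaries. `collectSseText` and `StreamResult` are hypothetical names.

```typescript
// Hypothetical helper for illustration; not part of @neureus/sdk.
interface StreamResult {
  text: string;
  done: boolean;
  logId?: string;
  inputTokens?: number;
  outputTokens?: number;
}

function collectSseText(raw: string): StreamResult {
  const result: StreamResult = { text: "", done: false };
  for (const line of raw.split(/\r?\n/)) {
    // Only "data:" lines carry payloads; blank lines separate events.
    if (!line.startsWith("data:")) continue;
    const payload = line.slice("data:".length).trim();
    if (payload === "") continue;
    const event = JSON.parse(payload);
    if (event.done === true) {
      // Final event: carries the log ID and token counts, no delta.
      result.done = true;
      result.logId = event.logId;
      result.inputTokens = event.inputTokens;
      result.outputTokens = event.outputTokens;
      break;
    }
    result.text += event.delta?.content ?? "";
  }
  return result;
}

// Sample stream built from the event shapes documented above.
const sampleStream = [
  'data: { "delta": { "content": "Hello" }, "done": false }',
  "",
  'data: { "delta": { "content": " world" }, "done": false }',
  "",
  'data: { "done": true, "logId": "log_1", "inputTokens": 100, "outputTokens": 42 }',
  "",
].join("\n");

const streamed = collectSseText(sampleStream);
```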
Headers:
- x-neureus-options: compress=true (enable prompt compression)
- x-neureus-debug: true (include _preprocessing stats in response)
- x-neureus-surface: <surface> (tag request source for analytics)
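
As a rough illustration of how the optional headers combine, the sketch below maps a small options object to the three header names listed above. `neureusHeaders` and `NeureusRequestOptions` are hypothetical names, and the surface value `docs-widget` is a made-up example.

```typescript
// Hypothetical helper for illustration; not part of @neureus/sdk.
interface NeureusRequestOptions {
  compress?: boolean; // enable prompt compression
  debug?: boolean;    // ask for _preprocessing stats in the response
  surface?: string;   // tag the request source for analytics
}

function neureusHeaders(options: NeureusRequestOptions): Record<string, string> {
  const headers: Record<string, string> = {};
  if (options.compress) headers["x-neureus-options"] = "compress=true";
  if (options.debug) headers["x-neureus-debug"] = "true";
  if (options.surface) headers["x-neureus-surface"] = options.surface;
  return headers;
}

// Example: compression on, debug off, tagged with a made-up surface name.
const chatHeaders = neureusHeaders({ compress: true, surface: "docs-widget" });
```

Headers that are not requested are omitted entirely, so the result can be spread into the headers of any authenticated request.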
Example:

    import { Neureus } from '@neureus/sdk';

    const client = new Neureus({ apiKey: process.env.NEUREUS_API_KEY! });

    const res = await client.ai.chat({
      messages: [{ role: 'user', content: 'Explain edge computing.' }],
      model: 'claude-sonnet-4-6',
    });
    // The non-streaming response exposes the reply as `text`.
    console.log(res.text);

GET /ai/models
List all available models with pricing, task type, and context window.
Query params: ?provider=openai|anthropic|workers-ai and/or ?type=coding|reasoning|chat|... (optional)
Response:

    {
      "models": [
        {
          "id": "claude-sonnet-4-6",
          "name": "Claude Sonnet 4.6",
          "provider": "anthropic",
          "type": "reasoning",
          "contextWindow": 200000,
          "inputCostPer1k": 0.003,
          "outputCostPer1k": 0.015,
          "supportedFeatures": ["streaming", "tool_use"]
        }
      ]
    }

Responses are cached in KV for 5 minutes.
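
The pricing fields can be used to estimate a request's cost before sending it. The sketch below assumes `inputCostPer1k` and `outputCostPer1k` are USD prices per 1,000 tokens, which the field names suggest but the reference does not state outright. `modelsPath`, `estimateCostUsd`, and `ModelPricing` are hypothetical names.

```typescript
// Hypothetical helpers for illustration; not part of @neureus/sdk.
interface ModelPricing {
  id: string;
  inputCostPer1k: number;  // assumed: USD per 1,000 input tokens
  outputCostPer1k: number; // assumed: USD per 1,000 output tokens
}

// Build the request path with the optional provider and type filters.
function modelsPath(filter: { provider?: string; type?: string } = {}): string {
  const parts: string[] = [];
  if (filter.provider) parts.push(`provider=${encodeURIComponent(filter.provider)}`);
  if (filter.type) parts.push(`type=${encodeURIComponent(filter.type)}`);
  return parts.length > 0 ? `/ai/models?${parts.join("&")}` : "/ai/models";
}

// Estimate cost as tokens / 1000 * price-per-1k for each direction.
function estimateCostUsd(model: ModelPricing, inputTokens: number, outputTokens: number): number {
  return (
    (inputTokens / 1000) * model.inputCostPer1k +
    (outputTokens / 1000) * model.outputCostPer1k
  );
}

// Pricing values copied from the sample response above.
const sonnet: ModelPricing = {
  id: "claude-sonnet-4-6",
  inputCostPer1k: 0.003,
  outputCostPer1k: 0.015,
};

const reasoningModelsPath = modelsPath({ provider: "anthropic", type: "reasoning" });
const estimatedCost = estimateCostUsd(sonnet, 2000, 1000); // roughly 0.021 under the stated assumption
```

The `costUsd` field of the chat response remains the authoritative figure; an estimate like this ignores caching and any batch discount.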
PUT /ai/providers/:provider/key
Register a BYOK API key. The key is encrypted with AES-GCM per tenant.
Providers: openai, anthropic
Request body: { "key": "sk-your-provider-key" }
Response: { "provider": "openai", "configured": true }
POST /ai/providers/:provider/rotate
Replace the stored BYOK key with a new one, encrypted under the existing data encryption key (DEK). Use this to rotate a compromised or expired provider key without changing the DEK.
Request body: { "key": "sk-your-new-key" }
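
The three BYOK routes (set, rotate, remove) differ only in method, path suffix, and whether they carry a key. A minimal sketch of that mapping follows. `byokRequest` and its types are hypothetical names, and it assumes that rotation accepts the same providers as PUT and that DELETE takes no request body, neither of which the reference states.

```typescript
// Hypothetical helper for illustration; not part of @neureus/sdk.
type ByokProvider = "openai" | "anthropic";
type ByokAction = "set" | "rotate" | "remove";

interface ByokRequest {
  method: "PUT" | "POST" | "DELETE";
  path: string;
  body?: { key: string };
}

function byokRequest(provider: ByokProvider, action: ByokAction, key?: string): ByokRequest {
  if (action === "remove") {
    // Assumption: removing a key needs no request body.
    return { method: "DELETE", path: `/ai/providers/${provider}/key` };
  }
  if (!key) {
    throw new Error(`A provider key is required to ${action} a BYOK key`);
  }
  return action === "set"
    ? { method: "PUT", path: `/ai/providers/${provider}/key`, body: { key } }
    : { method: "POST", path: `/ai/providers/${provider}/rotate`, body: { key } };
}

// Example: rotate the OpenAI key (placeholder value from the reference).
const rotateOpenAi = byokRequest("openai", "rotate", "sk-your-new-key");
```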
POST /ai/widget/key
Create a publishable widget key (pk_) scoped to a tenant and optionally to a specific agent.
Request body: { "agent_id": "agent_abc123" } (optional)
Response: { "key": "pk_...", "keyPreview": "pk_xxxx" }
Widget keys are chat-only and safe to embed in browser/mobile code.
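
Because only `pk_` keys are meant for client-side code, a build step or widget loader could check the prefix before embedding a key. The sketch below shows that guard along with the optional `agent_id` request body. `widgetKeyBody` and `assertPublishableKey` are hypothetical names, and sending an empty JSON object when no agent is given is an assumption.

```typescript
// Hypothetical helpers for illustration; not part of @neureus/sdk.
interface WidgetKeyResponse {
  key: string;        // full publishable key, starts with "pk_"
  keyPreview: string; // shortened form for display
}

// Request body for POST /ai/widget/key; agent scoping is optional.
function widgetKeyBody(agentId?: string): { agent_id?: string } {
  // Assumption: an empty object is acceptable when no agent is specified.
  return agentId ? { agent_id: agentId } : {};
}

// Refuse to embed anything that is not a publishable widget key.
function assertPublishableKey(key: string): string {
  if (!key.startsWith("pk_")) {
    throw new Error("Only publishable widget keys (pk_) may be embedded in client code");
  }
  return key;
}

const scopedBody = widgetKeyBody("agent_abc123");
```

A guard like this catches the common mistake of pasting a secret API key into a browser bundle, though it is a convenience check rather than a security boundary.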