Architecture

Neureus runs as a single Hono-based Worker on Cloudflare’s global edge. The whole HTTP surface (routing, auth, business logic) lives in that one Worker. Stateful workloads (workflows, rate limiting, monitoring) are delegated to Durable Objects, and AI inference is sent to model providers (Workers AI, OpenAI, Anthropic). There is no internal service mesh, but Durable Object round trips and provider calls still add latency.

Four product layers

┌─────────────────────────────────────────────────────┐
│  Layer 1: AI Gateway                                │
│  Multi-provider inference · Streaming · BYOK · Batch│
├─────────────────────────────────────────────────────┤
│  Layer 2: Knowledge                                 │
│  RAG · Document pipeline · Vectorize                │
├─────────────────────────────────────────────────────┤
│  Layer 3: Automation                                │
│  Agents (ReAct) · Workflows (DO) · HITL · MCP       │
├─────────────────────────────────────────────────────┤
│  Layer 4: Intelligence                              │
│  Composite patterns · Industry profiles             │
└─────────────────────────────────────────────────────┘

Storage

| Store | Purpose |
| --- | --- |
| D1 (galactic-neureus) | Relational state: 30+ nr_* tables, all tenant-scoped |
| KV (CACHE_KV) | Challenge tokens, model catalog cache, BYOK metadata |
| R2 (galactic-platform-storage) | Document storage (prefix: neureus/{tenantId}/) |
| Vectorize (neureus-index) | Embedding vectors for RAG |
| Analytics Engine | Per-request metrics, LLM events, auth events |
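
The R2 prefix convention above (neureus/{tenantId}/) can be illustrated with a small key builder. This is only a sketch: the function names, the document ID parameter, and the allowed character set are assumptions for illustration, not part of the Neureus codebase.

```typescript
// Hypothetical helpers: build R2 object keys under a tenant's prefix.
const R2_PREFIX = "neureus";

// Assumed rule: IDs are limited to letters, digits, "_" and "-", so a crafted
// ID cannot contain "/" or ".." and point outside the tenant's prefix.
const SAFE_ID = /^[A-Za-z0-9_-]+$/;

function tenantPrefix(tenantId: string): string {
  if (!SAFE_ID.test(tenantId)) {
    throw new Error("invalid tenantId");
  }
  return `${R2_PREFIX}/${tenantId}/`;
}

function documentKey(tenantId: string, documentId: string): string {
  if (!SAFE_ID.test(documentId)) {
    throw new Error("invalid documentId");
  }
  return tenantPrefix(tenantId) + documentId;
}
```

Building every key through one function like this keeps the tenant boundary in a single place, which is the same idea as the tenant-scoped nr_* tables in D1.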

Request path

Client → Cloudflare Edge → Worker
cors → timingMiddleware → tenantMiddleware → route handler

tenantMiddleware verifies the Bearer token and sets tenantId on the Hono context. Every route handler reads tenantId from that context, never from raw headers, so a client cannot select a tenant by setting a header.
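
The chain can be sketched without any dependencies. The types below are minimal stand-ins that mimic the shape of middleware, not Hono's real API (in Hono the equivalent calls would be c.set and c.get on the context). The token table, the x-tenant-id header, and the synchronous style are assumptions made to keep the example short; the cors step is omitted.

```typescript
// Minimal stand-ins for a request context and middleware (not Hono's types).
type Ctx = {
  headers: Record<string, string>;
  vars: Map<string, string>;
  status: number;
  body: string;
};
type Middleware = (c: Ctx, next: () => void) => void;

// Hypothetical token table; the source does not say how tokens are verified.
const TOKENS = new Map<string, string>([["secret-token-a", "tenant-a"]]);

const timingMiddleware: Middleware = (c, next) => {
  const start = Date.now();
  next();
  c.vars.set("elapsedMs", String(Date.now() - start));
};

const tenantMiddleware: Middleware = (c, next) => {
  const match = /^Bearer (.+)$/.exec(c.headers["authorization"] ?? "");
  const token = match?.[1];
  const tenantId = token ? TOKENS.get(token) : undefined;
  if (!tenantId) {
    c.status = 401;
    c.body = "unauthorized";
    return; // stop the chain: the route handler never runs
  }
  c.vars.set("tenantId", tenantId);
  next();
};

// The handler trusts only the context value; request headers are never read.
const routeHandler: Middleware = (c) => {
  c.status = 200;
  c.body = `hello ${c.vars.get("tenantId")}`;
};

// Runs each middleware in order; a middleware continues by calling next().
function run(chain: Middleware[], c: Ctx): Ctx {
  const dispatch = (i: number): void => {
    const mw = chain[i];
    if (mw) mw(c, () => dispatch(i + 1));
  };
  dispatch(0);
  return c;
}

function request(headers: Record<string, string>): Ctx {
  const ctx: Ctx = { headers, vars: new Map(), status: 0, body: "" };
  return run([timingMiddleware, tenantMiddleware, routeHandler], ctx);
}
```

A request that carries a valid token plus a spoofed x-tenant-id header still resolves to the token's tenant, because nothing downstream of tenantMiddleware looks at headers.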

Durable Objects

| DO | Binding | Purpose |
| --- | --- | --- |
| NeurWorkflowEngineDO | WORKFLOW_ENGINE | Durable workflow execution, alarm-based HITL timeout |
| NeurRateLimiterDO | RATE_LIMITER | Per-tenant rate limiting |
| NeurMonitoringRoomDO | MONITORING_ROOM | Real-time monitoring room state |
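
To make per-tenant rate limiting concrete, here is a sketch of the kind of state one limiter instance might hold. The fixed-window algorithm, the limit of 2 requests per 1000 ms, and all names below are assumptions; the source does not say which algorithm NeurRateLimiterDO uses. A plain Map stands in for the Durable Object namespace so the example runs anywhere.

```typescript
// Fixed-window counter: allows up to `limit` requests per `windowMs`.
class FixedWindowLimiter {
  private readonly limit: number;
  private readonly windowMs: number;
  private windowStart = Number.NEGATIVE_INFINITY;
  private count = 0;

  constructor(limit: number, windowMs: number) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // Returns true if a request arriving at `now` (in ms) is allowed.
  allow(now: number): boolean {
    if (now - this.windowStart >= this.windowMs) {
      // The previous window has expired: start a new one at `now`.
      this.windowStart = now;
      this.count = 0;
    }
    if (this.count >= this.limit) return false;
    this.count += 1;
    return true;
  }
}

// One limiter per tenant, mirroring one Durable Object instance per tenant.
const limiters = new Map<string, FixedWindowLimiter>();

function allowRequest(tenantId: string, now: number): boolean {
  let limiter = limiters.get(tenantId);
  if (!limiter) {
    limiter = new FixedWindowLimiter(2, 1000); // assumed limit and window
    limiters.set(tenantId, limiter);
  }
  return limiter.allow(now);
}
```

In a Worker, the Map lookup would typically be replaced by addressing an instance through the RATE_LIMITER binding by tenant (for example with idFromName on the namespace), so that each tenant's counter lives in exactly one object. That mapping is an assumption about how the binding is used.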