Architecture
Neureus runs as a Hono Worker on Cloudflare’s global edge. The HTTP surface — routing, auth, business logic — lives in one worker. Stateful workloads (workflows, rate limiting, monitoring) delegate to Durable Objects, and AI inference goes to external providers (Workers AI, OpenAI, Anthropic). There’s no internal service mesh, but DO roundtrips and provider calls do add latency.
Four product layers
┌─────────────────────────────────────────────────────┐
│  Layer 1 — AI Gateway                               │
│  Multi-provider inference · Streaming · BYOK · Batch│
├─────────────────────────────────────────────────────┤
│  Layer 2 — Knowledge                                │
│  RAG · Document pipeline · Vectorize                │
├─────────────────────────────────────────────────────┤
│  Layer 3 — Automation                               │
│  Agents (ReAct) · Workflows (DO) · HITL · MCP       │
├─────────────────────────────────────────────────────┤
│  Layer 4 — Intelligence                             │
│  Composite patterns · Industry profiles             │
└─────────────────────────────────────────────────────┘
Storage
| Store | Purpose |
|---|---|
| D1 (galactic-neureus) | Relational state — 30+ nr_* tables, all tenant-scoped |
| KV (CACHE_KV) | Challenge tokens, model catalog cache, BYOK metadata |
| R2 (galactic-platform-storage) | Document storage (prefix: neureus/{tenantId}/) |
| Vectorize (neureus-index) | Embedding vectors for RAG |
| Analytics Engine | Per-request metrics, LLM events, auth events |
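
As a minimal sketch of the tenant-scoped R2 layout listed in the table, the helpers below build object keys under the neureus/{tenantId}/ prefix. The function names (documentKey, tenantPrefix) and the validation rules are hypothetical; the source only states the prefix, not how keys are constructed or checked.

```typescript
// Prefix from the storage table: neureus/{tenantId}/
const R2_PREFIX = "neureus";

// Hypothetical helper: the prefix under which all of a tenant's documents live.
function tenantPrefix(tenantId: string): string {
  // Assumption: tenant IDs never contain "/", otherwise "a/b" could
  // resolve inside tenant "a"'s prefix.
  if (tenantId.length === 0 || tenantId.includes("/")) {
    throw new Error(`invalid tenantId: ${JSON.stringify(tenantId)}`);
  }
  return `${R2_PREFIX}/${tenantId}/`;
}

// Hypothetical helper: full object key for one document.
function documentKey(tenantId: string, documentId: string): string {
  if (documentId.length === 0) {
    throw new Error("documentId must not be empty");
  }
  return tenantPrefix(tenantId) + documentId;
}
```

For example, documentKey("t_123", "contract.pdf") yields "neureus/t_123/contract.pdf". The same prefix string could be passed as the prefix option of an R2 list call to enumerate only one tenant's objects.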
Request path
Client → Cloudflare Edge → Worker
  cors → timingMiddleware → tenantMiddleware → route handler

tenantMiddleware verifies the Bearer token and sets tenantId on the Hono context. Every route handler reads tenantId from context, never from raw headers.
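
The ordering of the request path can be illustrated with a small, self-contained model. This is a simplified synchronous sketch, not Hono's API: real Hono middleware is async and uses c.set / c.get on its own context type. The Ctx shape, the TOKENS table (standing in for whatever token verification Neureus actually performs), and the 401 response on failure are all assumptions made for illustration.

```typescript
// Simplified model of a request context (hypothetical shape).
type Ctx = {
  headers: Record<string, string>;
  vars: Map<string, string>; // stands in for the Hono context variables
  status?: number;
  trace: string[];           // records which stages ran, for illustration
};
type Middleware = (c: Ctx, next: () => void) => void;

// Hypothetical token table; the real verification mechanism is not described.
const TOKENS = new Map<string, string>([["tok_abc", "tenant_1"]]);

const cors: Middleware = (c, next) => { c.trace.push("cors"); next(); };
const timingMiddleware: Middleware = (c, next) => { c.trace.push("timing"); next(); };

const tenantMiddleware: Middleware = (c, next) => {
  c.trace.push("tenant");
  const match = /^Bearer (.+)$/.exec(c.headers["authorization"] ?? "");
  const token = match?.[1];
  const tenantId = token !== undefined ? TOKENS.get(token) : undefined;
  if (tenantId === undefined) {
    c.status = 401; // assumed failure response; the chain stops here
    return;
  }
  c.vars.set("tenantId", tenantId);
  next();
};

// The handler reads tenantId from context only, never from c.headers.
const routeHandler: Middleware = (c) => {
  c.trace.push(`handler:${c.vars.get("tenantId")}`);
  c.status = 200;
};

const requestPath: Middleware[] = [cors, timingMiddleware, tenantMiddleware, routeHandler];

// Runs each middleware in order; a stage that does not call next() ends the chain.
function run(chain: Middleware[], c: Ctx): Ctx {
  const dispatch = (i: number): void => {
    const mw = chain[i];
    if (mw) mw(c, () => dispatch(i + 1));
  };
  dispatch(0);
  return c;
}
```

Running requestPath with an authorization header of "Bearer tok_abc" reaches the handler with tenantId set to "tenant_1". Running it with no header stops at tenantMiddleware, so the handler never executes without a tenant in context.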
Durable Objects
| DO | Binding | Purpose |
|---|---|---|
| NeurWorkflowEngineDO | WORKFLOW_ENGINE | Durable workflow execution, alarm-based HITL timeout |
| NeurRateLimiterDO | RATE_LIMITER | Per-tenant rate limiting |
| NeurMonitoringRoomDO | MONITORING_ROOM | Real-time monitoring room state |
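
The sketch below shows one plausible way the Worker could reach a per-tenant rate limiter through the RATE_LIMITER binding, using the idFromName and get methods that Durable Object namespaces expose. Keying the object by tenantId is an assumption based on the "per-tenant" description, and the helper name rateLimiterFor is hypothetical. The local interfaces are trimmed stand-ins for the Workers runtime types so the example type-checks on its own.

```typescript
// Trimmed stand-ins for the Workers runtime types (normally provided by
// @cloudflare/workers-types). Real stubs also expose fetch() and RPC methods.
interface DurableObjectIdLike { toString(): string }
interface DurableObjectStubLike { readonly id: DurableObjectIdLike }
interface DurableObjectNamespaceLike {
  idFromName(name: string): DurableObjectIdLike;
  get(id: DurableObjectIdLike): DurableObjectStubLike;
}

// Binding names as listed in the Durable Objects table.
interface Env {
  WORKFLOW_ENGINE: DurableObjectNamespaceLike;
  RATE_LIMITER: DurableObjectNamespaceLike;
  MONITORING_ROOM: DurableObjectNamespaceLike;
}

// Assumption: one NeurRateLimiterDO instance per tenant, addressed by tenantId.
// idFromName is deterministic, so the same tenant always maps to the same object.
function rateLimiterFor(env: Env, tenantId: string): DurableObjectStubLike {
  const id = env.RATE_LIMITER.idFromName(tenantId);
  return env.RATE_LIMITER.get(id);
}
```

Because the tenantId would come from the Hono context rather than from request headers, a caller could not steer its traffic to another tenant's limiter under this scheme.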