# Norman Engine
LLM orchestration layer for the Norman ecosystem. Provides a unified API across LLM providers, with streaming support, token counting, and rate limiting.
## What It Does
Norman Engine serves as the AI backbone for all Norman apps, abstracting LLM providers behind a consistent interface:
- **Provider Abstraction** — Switch between OpenAI and AWS Bedrock via configuration
- **Streaming Support** — Real-time SSE response streaming for chat interfaces
- **Token Management** — Accurate token counting with tiktoken and usage tracking
- **Model Selection** — Dynamic model routing based on request requirements
- **Usage Analytics** — Per-user token usage tracking and aggregation
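The provider abstraction can be sketched as a small interface with one implementation per backend, selected from configuration. All names below (`LLMProvider`, `makeProvider`, the stub classes) are illustrative assumptions, not Norman Engine's actual types:

```typescript
// Illustrative sketch of config-driven provider selection.
// The interface and class names here are hypothetical.
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface LLMProvider {
  readonly name: string;
  complete(messages: ChatMessage[], model: string): Promise<string>;
}

// Stub standing in for the real OpenAI SDK client.
class OpenAIProvider implements LLMProvider {
  readonly name = "openai";
  async complete(messages: ChatMessage[], model: string): Promise<string> {
    // Real code would call the OpenAI API here.
    return `[openai:${model}] ${messages.length} messages`;
  }
}

// Stub standing in for a future AWS Bedrock runtime client.
class BedrockProvider implements LLMProvider {
  readonly name = "bedrock";
  async complete(messages: ChatMessage[], model: string): Promise<string> {
    // Real code would call the Bedrock runtime here.
    return `[bedrock:${model}] ${messages.length} messages`;
  }
}

// Switching providers is a configuration change, not a code change.
function makeProvider(provider: string): LLMProvider {
  switch (provider) {
    case "openai":  return new OpenAIProvider();
    case "bedrock": return new BedrockProvider();
    default: throw new Error(`unknown provider: ${provider}`);
  }
}
```

Because callers depend only on `LLMProvider`, adding Bedrock later means adding one class and one `switch` case.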
## API Surface
| Endpoint | Method | Description |
|---|---|---|
| `/api/chat` | POST | Streaming and non-streaming chat completions |
| `/api/complete` | POST | Non-streaming chat completions |
| `/api/models` | GET | List available models for the current provider |
| `/api/usage` | GET | Per-user token usage statistics |
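A streaming response from `/api/chat` arrives as SSE text that the client must split into events. The sketch below assumes the common OpenAI-style convention (`data: {...}` lines with a `[DONE]` sentinel and a `delta` field); the actual wire format of Norman Engine is not confirmed here:

```typescript
// Minimal SSE chunk parser. The `data: ...` line format, the [DONE]
// sentinel, and the { delta: string } payload shape are all assumptions
// following common SSE-over-LLM conventions.
function parseSseChunk(chunk: string): string[] {
  const tokens: string[] = [];
  for (const line of chunk.split("\n")) {
    if (!line.startsWith("data: ")) continue; // skip blank/comment lines
    const payload = line.slice("data: ".length).trim();
    if (payload === "[DONE]") break;          // end-of-stream sentinel
    tokens.push(JSON.parse(payload).delta);   // assumed payload shape
  }
  return tokens;
}
```

A chat UI would call this on each chunk read from the response body and append the returned tokens to the visible message.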
## Dependencies
- **OpenAI** — Current LLM provider
- **AWS Bedrock** — Future provider (abstracted behind the same interface)
- **Express** — API framework
- **tiktoken** — Token counting
- **Winston** — Structured logging
- **MongoDB** — Usage and chat log storage
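The per-user usage tracking behind `/api/usage` can be sketched as a simple aggregator. This in-memory version is illustrative only; the real service persists records to MongoDB, and the field names (`promptTokens`, `completionTokens`) are assumptions:

```typescript
// Illustrative in-memory sketch of per-user token usage aggregation.
// Record shape and field names are hypothetical; production storage is MongoDB.
interface UsageRecord {
  userId: string;
  promptTokens: number;
  completionTokens: number;
}

class UsageTracker {
  private totals = new Map<string, { prompt: number; completion: number }>();

  // Called once per completed request with the token counts for that call.
  record({ userId, promptTokens, completionTokens }: UsageRecord): void {
    const t = this.totals.get(userId) ?? { prompt: 0, completion: 0 };
    t.prompt += promptTokens;
    t.completion += completionTokens;
    this.totals.set(userId, t);
  }

  // The kind of aggregate a GET /api/usage handler would return.
  usageFor(userId: string): { prompt: number; completion: number; total: number } {
    const t = this.totals.get(userId) ?? { prompt: 0, completion: 0 };
    return { prompt: t.prompt, completion: t.completion, total: t.prompt + t.completion };
  }
}
```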