Skip to content

System Overview

CORTEX is a secure, OpenAI-compatible gateway that fronts one or more vLLM engines and provides health-aware routing, access control, metering, and administration.

graph TD
  Client[Client / SDK] -->|HTTP: /v1/*| Gateway[FastAPI Gateway]
  subgraph Gateway
    Router[OpenAI Routes] --> Auth[API Key Auth]
    Router --> RL[Rate Limit / Concurrency]
    Router --> Choose[URL Selection]
    Choose --> Upstreams[(vLLM Engines)]
    Gateway --> DB[(Postgres)]
    Gateway --> Redis[(Redis)]
    Gateway --> Prom[Prometheus]
  end
  Health[Health Poller] --> Upstreams
  Health --> Gateway

Key concepts: - OpenAI-compatible API: /v1/chat/completions, /v1/completions, /v1/embeddings - URL selection prefers healthy engines and rotates via round-robin - Metrics exposed via Prometheus; optional OpenTelemetry traces - Admin APIs and UI manage users, organizations, keys, models, and usage data

See Backend and Frontend pages for details.