# Frontend Architecture

Next.js App Router + TypeScript admin UI (`frontend/`).
## Structure

- `app/(admin)/*`: pages for Health, Keys, Usage, Models, Orgs, Users, Chat, Guide
- `src/components/*`: UI primitives, charts, monitoring widgets, model tools, chat UI
- `src/lib/api-clients.ts`: fetch helper that adds `x-request-id` and normalizes error envelopes
- `src/lib/chat-client.ts`: streaming chat client and model constraint fetching
- `src/lib/chat-api.ts`: server-side chat session persistence
- `src/hooks/useChat.ts`: chat state management hook
- `providers/*`: App/Toast/User providers
- Styling: Tailwind CSS with custom utility classes in `styles/globals.css`
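As a hedged sketch of the fetch-helper pattern in `src/lib/api-clients.ts`: attach an `x-request-id` header to each request and normalize backend errors into one shape. The envelope shape (`{ error: { code, message } }`) and all function names here are assumptions for illustration, not the actual implementation.

```typescript
// Illustrative only: the error-envelope shape and helper names are assumptions.
type ApiError = { code: string; message: string; requestId?: string };

function normalizeError(body: unknown, requestId?: string): ApiError {
  if (typeof body === "object" && body !== null && "error" in body) {
    const err = (body as { error: { code?: string; message?: string } }).error;
    return {
      code: err.code ?? "unknown_error",
      message: err.message ?? "Unknown error",
      requestId,
    };
  }
  // Fallback for non-envelope bodies (plain text, status line, etc.).
  return { code: "unknown_error", message: String(body), requestId };
}

async function apiFetch(
  baseUrl: string, // in the app this would come from NEXT_PUBLIC_GATEWAY_URL
  path: string,
  init: RequestInit = {},
  headers: Record<string, string> = {}
): Promise<Response> {
  const requestId = crypto.randomUUID();
  const res = await fetch(`${baseUrl}${path}`, {
    ...init,
    headers: { ...headers, "x-request-id": requestId },
  });
  if (!res.ok) {
    const body = await res.json().catch(() => res.statusText);
    throw Object.assign(new Error("API request failed"), {
      detail: normalizeError(body, requestId),
    });
  }
  return res;
}
```

Normalizing at the fetch boundary means toasts and error views can consume one error shape regardless of which backend endpoint failed.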
## Chat Playground Components

The Chat Playground (`src/components/chat/`) provides an interactive chat interface.
### Core Components

| Component | Purpose |
|---|---|
| `ChatInput.tsx` | Message input with context window tracking |
| `ChatSidebar.tsx` | Session list, new chat, delete operations |
| `MessageList.tsx` | Conversation display with auto-scroll |
| `MessageContent.tsx` | Markdown and code syntax highlighting |
| `ModelSelector.tsx` | Running model dropdown with health awareness |
| `PerformanceMetrics.tsx` | Real-time tok/s, TTFT, token count |
### Supporting Libraries

| File | Purpose |
|---|---|
| `lib/chat-client.ts` | SSE streaming, model constraints, token estimation |
| `lib/chat-api.ts` | Session CRUD operations via REST API |
| `hooks/useChat.ts` | State management, streaming control, metrics |
### Feature Highlights

**Streaming Chat:**

- Native Fetch + `ReadableStream` for SSE parsing
- No external dependencies (no LangChain, no AI SDK)
- Real-time token counting and TTFT measurement
- Abort support via `AbortController`
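The streaming approach can be sketched as follows. This is a minimal illustration of SSE parsing over a native `ReadableStream` with abort support, not the actual `lib/chat-client.ts` code; the function names and the OpenAI-style `data:` payload shape are assumptions.

```typescript
// Split a text buffer into complete SSE data payloads, keeping any
// incomplete trailing line for the next chunk. Illustrative sketch.
function parseSSELines(buffer: string): { events: string[]; rest: string } {
  const events: string[] = [];
  const parts = buffer.split("\n");
  const rest = parts.pop() ?? ""; // incomplete trailing line, if any
  for (const line of parts) {
    if (line.startsWith("data: ")) {
      const payload = line.slice("data: ".length);
      if (payload !== "[DONE]") events.push(payload);
    }
  }
  return { events, rest };
}

// Stream a chat completion with native fetch; cancel via AbortController.
async function streamChat(
  url: string,
  body: unknown,
  signal: AbortSignal,
  onToken: (token: string) => void
): Promise<void> {
  const res = await fetch(url, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(body),
    signal, // AbortController.signal — aborting rejects the pending read
  });
  const reader = res.body!.getReader();
  const decoder = new TextDecoder();
  let buf = "";
  for (;;) {
    const { done, value } = await reader.read();
    if (done) break;
    buf += decoder.decode(value, { stream: true });
    const { events, rest } = parseSSELines(buf);
    buf = rest;
    for (const e of events) {
      // Assumed OpenAI-style delta payload shape.
      onToken(JSON.parse(e).choices?.[0]?.delta?.content ?? "");
    }
  }
}
```

Keeping the incomplete trailing line in `rest` is the key detail: network chunks do not align with SSE event boundaries, so a naive per-chunk split would drop or corrupt tokens.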
**Chat Persistence:**

- Server-side storage (database, not localStorage)
- User-scoped sessions (isolation between users)
- Cross-device access to chat history
- Auto-generated titles from the first message
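Title auto-generation from the first message could look something like this. The exact rule the backend uses is not documented here; the truncation length and helper name are assumptions.

```typescript
// Illustrative: derive a session title from the first user message.
// The 40-character cap and ellipsis behavior are assumptions.
function autoTitle(firstMessage: string, maxLen = 40): string {
  const t = firstMessage.trim().replace(/\s+/g, " "); // collapse whitespace
  return t.length <= maxLen ? t : t.slice(0, maxLen - 1) + "…";
}
```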
**Model Selection:**

- Fetches running models via `/v1/models/running`
- Health-aware filtering (only shows healthy models)
- Model locking once a conversation starts
- Constraint fetching for context tracking
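The health-aware filtering above amounts to fetching the running-model list and dropping anything unhealthy before populating the dropdown. The `/v1/models/running` endpoint is from this document; the response shape and status values below are assumptions.

```typescript
// Assumed response item shape for /v1/models/running.
type RunningModel = { id: string; status: "healthy" | "degraded" | "unhealthy" };

// Only healthy models should appear in the selector dropdown.
function selectableModels(models: RunningModel[]): RunningModel[] {
  return models.filter((m) => m.status === "healthy");
}

async function fetchSelectableModels(baseUrl: string): Promise<RunningModel[]> {
  const res = await fetch(`${baseUrl}/v1/models/running`);
  const body = await res.json();
  // `.data` assumes an OpenAI-style list envelope — an assumption here.
  return selectableModels(body.data ?? []);
}
```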
See also: Chat Playground Guide
## Model Form Components

The model management UI (`src/components/models/`) provides comprehensive configuration.
### Core Components

| Component | Purpose |
|---|---|
| `ModelForm.tsx` | Main form container, state management |
| `ModelWorkflowForm.tsx` | Multi-step wizard variant |
| `EngineSelection.tsx` | vLLM vs llama.cpp selection |
| `ModeSelection.tsx` | Online vs offline mode |
### Engine-Specific Configuration

| Component | Purpose |
|---|---|
| `VLLMConfiguration.tsx` | vLLM-specific settings (TP, memory, attention) |
| `LlamaCppConfiguration.tsx` | llama.cpp settings (ngl, tensor split, speculative decoding) |
### GGUF-Specific Components

| Component | Purpose |
|---|---|
| `GGUFGroupSelector.tsx` | Quantization selection with quality indicators |
| `EngineGuidance.tsx` | Smart engine/format recommendations |
| `SafeTensorDisplay.tsx` | SafeTensor model metadata display |
| `ArchitectureCompatibility.tsx` | vLLM/llama.cpp compatibility badges |
| `SpeculativeDecodingExplainer.tsx` | Modal explaining speculative decoding |
| `MergeInstructionsModal.tsx` | GGUF merge instructions |
### Feature Highlights

**GGUF Group Selector:**

- Visual quantization selection with radio buttons
- Quality/speed bar indicators (1-5 scale)
- Bits-per-weight and description tooltips
- Multi-part status badges
- Metadata badges (architecture, context, layers)

**Engine Guidance:**

- Contextual warnings (multi-part GGUF + vLLM)
- One-click engine/format switching
- Recommendation badges on the engine selector
- SafeTensors availability tips

**Speculative Decoding:**

- Collapsible advanced section
- Draft model path input with example
- Draft tokens and acceptance probability sliders
- "What is this?" explainer modal (React Portal for proper layering)

**Architecture Compatibility:**

- Inline badges showing vLLM/llama.cpp support
- Color-coded: green (full), yellow (partial), orange (experimental), red (none)
- Tooltips with detailed compatibility notes
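The color coding above is a simple mapping from support level to badge color. A minimal sketch (the type and function names are illustrative, not the component's actual API):

```typescript
// Support levels and badge colors as described in the docs.
type Support = "full" | "partial" | "experimental" | "none";

function badgeColor(level: Support): string {
  const colors: Record<Support, string> = {
    full: "green",
    partial: "yellow",
    experimental: "orange",
    none: "red",
  };
  return colors[level];
}
```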
### Conditional UI Logic

The form dynamically shows and hides fields based on context:

| Condition | Visible Fields |
|---|---|
| `engineType === 'vllm'` | vLLM configuration section |
| `engineType === 'llamacpp'` | llama.cpp configuration section |
| `useGguf === true` | GGUF weight format dropdown (vLLM) |
| `useGguf === true` | Hide SafeTensor-specific options |
| `useGguf === false` | Show quantization dropdown (vLLM) |
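The table above can be expressed as a pure function over the form state, which also makes the rules unit-testable. The state shape and section keys here are illustrative assumptions, not the actual component props.

```typescript
// Illustrative form state — field names are assumptions.
type FormState = { engineType: "vllm" | "llamacpp"; useGguf: boolean };

// Returns the section keys that should render for a given state,
// mirroring the conditional-UI table in the docs.
function visibleSections(s: FormState): string[] {
  const out: string[] = [];
  if (s.engineType === "vllm") out.push("vllm-config");
  if (s.engineType === "llamacpp") out.push("llamacpp-config");
  // SafeTensor-specific options are hidden when useGguf === true.
  if (!s.useGguf) out.push("safetensor-options");
  // The GGUF weight format and quantization dropdowns are vLLM-scoped.
  if (s.engineType === "vllm") {
    out.push(s.useGguf ? "gguf-weight-format" : "quantization-dropdown");
  }
  return out;
}
```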
## Data Fetching

- TanStack Query for caching and retries; error toasts map the backend error structure.
- The `NEXT_PUBLIC_GATEWAY_URL` environment variable controls the gateway base URL.
## Accessibility & UX

- Keyboard-friendly components, focus management, skeletons, and loading states
- Loading indicators during folder inspection ("Scanning model folder...")
- React Portals for modals to prevent clipping
## Authentication

- Dev cookie session (`cortex_session`) is expected by admin pages; replace with production auth later.