Skip to content

Frontend Architecture

Next.js App Router + TypeScript admin UI (frontend/).

Structure

  • app/(admin)/*: Pages for Health, Keys, Usage, Models, Orgs, Users, Chat, Guide
  • src/components/*: UI primitives, charts, monitoring widgets, models tools, chat UI
  • src/lib/api-clients.ts: fetch helper adds x-request-id and normalizes error envelope
  • src/lib/chat-client.ts: streaming chat client and model constraint fetching
  • src/lib/chat-api.ts: server-side chat session persistence
  • src/hooks/useChat.ts: chat state management hook
  • providers/*: App/Toast/User providers
  • Styling: Tailwind CSS with custom utility classes in styles/globals.css

Chat Playground Components

The Chat Playground (src/components/chat/) provides an interactive chat interface:

Core Components

Component Purpose
ChatInput.tsx Message input with context window tracking
ChatSidebar.tsx Session list, new chat, delete operations
MessageList.tsx Conversation display with auto-scroll
MessageContent.tsx Markdown and code syntax highlighting
ModelSelector.tsx Running model dropdown with health awareness
PerformanceMetrics.tsx Real-time tok/s, TTFT, token count

Supporting Libraries

File Purpose
lib/chat-client.ts SSE streaming, model constraints, token estimation
lib/chat-api.ts Session CRUD operations via REST API
hooks/useChat.ts State management, streaming control, metrics

Feature Highlights

Streaming Chat: - Native Fetch + ReadableStream for SSE parsing - No external dependencies (no LangChain, no AI SDK) - Real-time token counting and TTFT measurement - Abort support via AbortController

Chat Persistence: - Server-side storage (database, not localStorage) - User-scoped sessions (isolation between users) - Cross-device access to chat history - Auto-generated titles from first message

Model Selection: - Fetches running models via /v1/models/running - Health-aware filtering (only shows healthy models) - Model locking once conversation starts - Constraint fetching for context tracking

See also: Chat Playground Guide


Model Form Components

The model management UI (src/components/models/) provides comprehensive configuration:

Core Components

Component Purpose
ModelForm.tsx Main form container, state management
ModelWorkflowForm.tsx Multi-step wizard variant
EngineSelection.tsx vLLM vs llama.cpp selection
ModeSelection.tsx Online vs Offline mode

Engine-Specific Configuration

Component Purpose
VLLMConfiguration.tsx vLLM-specific settings (TP, memory, attention)
LlamaCppConfiguration.tsx llama.cpp settings (ngl, tensor split, speculative decoding)

GGUF-Specific Components

Component Purpose
GGUFGroupSelector.tsx Quantization selection with quality indicators
EngineGuidance.tsx Smart engine/format recommendations
SafeTensorDisplay.tsx SafeTensor model metadata display
ArchitectureCompatibility.tsx vLLM/llama.cpp compatibility badges
SpeculativeDecodingExplainer.tsx Modal explaining speculative decoding
MergeInstructionsModal.tsx GGUF merge instructions

Feature Highlights

GGUF Group Selector: - Visual quantization selection with radio buttons - Quality/speed bar indicators (1-5 scale) - Bits-per-weight and description tooltips - Multi-part status badges - Metadata badges (architecture, context, layers)

Engine Guidance: - Contextual warnings (multi-part GGUF + vLLM) - One-click engine/format switching - Recommendation badges on engine selector - SafeTensors availability tips

Speculative Decoding: - Collapsible advanced section - Draft model path input with example - Draft tokens and acceptance probability sliders - "What is this?" explainer modal (React Portal for proper layering)

Architecture Compatibility: - Inline badges showing vLLM/llama.cpp support - Color-coded: green (full), yellow (partial), orange (experimental), red (none) - Tooltips with detailed compatibility notes

Conditional UI Logic

The form dynamically shows/hides fields based on context:

Condition Visible Fields
engineType === 'vllm' vLLM configuration section
engineType === 'llamacpp' llama.cpp configuration section
useGguf === true GGUF weight format dropdown (vLLM)
useGguf === true Hide SafeTensor-specific options
useGguf === false Show quantization dropdown (vLLM)

Data fetching

  • TanStack Query for caching and retries; error toasts map backend error structure.
  • Env NEXT_PUBLIC_GATEWAY_URL controls gateway base URL.

Accessibility & UX

  • Keyboard-friendly components, focus management, skeletons and loading states.
  • Loading indicators during folder inspection ("Scanning model folder...")
  • React Portals for modals to prevent clipping

Authentication

  • Dev cookie session (cortex_session) expected by admin pages; replace with production auth later.