Deployments¶
This guide covers deployment options for Cortex, from development to production environments.
Quick Start (Recommended)¶
make quick-start
This single command:
- Auto-detects your host IP
- Configures CORS for network access
- Creates the default admin user
- Enables monitoring on Linux
- Starts all services
Deployment Options¶
Development Mode¶
Best for local development and testing:
# Using Makefile (recommended)
make up
# Or direct Docker Compose
docker compose -f docker.compose.dev.yaml up --build
Features:
- Hot-reload for frontend
- Debug logging enabled
- Dev auth bypass available
- SQLite-compatible dev settings
Production Mode¶
For production deployments:
make up ENV=prod
# Or direct Docker Compose
docker compose -f docker.compose.prod.yaml up -d
Required configuration:
- External PostgreSQL with backups
- External Redis for rate limiting
- Reverse proxy with TLS (nginx/traefik)
- Strong authentication keys
- Restricted CORS origins
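One way to satisfy the "strong authentication keys" requirement is to draw the key from the kernel's random source. The following is a minimal sketch using common Unix tools; the helper name gen_key and the 32-byte length are assumptions, not documented Cortex requirements, and how the value reaches your configuration depends on your setup.

```shell
#!/bin/sh
# Hypothetical helper: print N random bytes (default 32) as lowercase hex.
gen_key() {
    bytes="${1:-32}"
    # od -v prints every byte; tr strips the spaces and newlines od adds.
    head -c "$bytes" /dev/urandom | od -An -v -tx1 | tr -d ' \n'
    echo
}

# Example: emit a line that could be pasted into an env file.
echo "INTERNAL_VLLM_API_KEY=$(gen_key 32)"
```

Each byte becomes two hex characters, so gen_key 32 yields a 64-character key.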
Offline/Air-Gapped¶
For restricted networks without internet access:
# On internet-connected machine
make prepare-offline
# Transfer cortex-offline-images/ to target
# On air-gapped machine
make load-offline
make verify-offline
make quick-start
See Offline Deployment Guide for complete instructions.
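The transfer step is manual, so it can help to bundle cortex-offline-images/ with a checksum and verify it on the target before loading. This is a sketch only: pack_dir and unpack_dir are hypothetical helpers, not Cortex make targets, and sha256sum (GNU coreutils) is assumed to be installed on both machines.

```shell
#!/bin/sh
# Hypothetical transfer helpers: bundle a directory with a SHA-256 manifest
# on the source machine, then verify and unpack it on the target.
pack_dir() {
    src="$1"
    archive="$2"
    tar -czf "$archive" -C "$(dirname "$src")" "$(basename "$src")" || return 1
    # Write the manifest next to the archive, using a relative file name.
    ( cd "$(dirname "$archive")" &&
      sha256sum "$(basename "$archive")" > "$(basename "$archive").sha256" )
}

unpack_dir() {
    archive="$1"
    dest="$2"
    # Refuse to unpack if the archive does not match its manifest.
    ( cd "$(dirname "$archive")" &&
      sha256sum -c "$(basename "$archive").sha256" >/dev/null ) || return 1
    mkdir -p "$dest" && tar -xzf "$archive" -C "$dest"
}

# Usage:
#   pack_dir ./cortex-offline-images ./cortex-offline.tar.gz    # connected machine
#   unpack_dir ./cortex-offline.tar.gz .                        # air-gapped machine
```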
Environment Profiles¶
Linux Profile¶
Enables host metrics collection:
make up PROFILES=linux
GPU Profile¶
Enables GPU metrics (requires NVIDIA drivers):
make up PROFILES=gpu
Combined¶
make up PROFILES=linux,gpu
On Linux with NVIDIA GPUs, make up automatically enables both profiles.
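To preview which profiles a host would likely get, a check along these lines can be used. It is an assumption about the detection logic, not Cortex's actual Makefile code: detect_profiles is a hypothetical name, and treating "nvidia-smi is on PATH" as the GPU signal is a guess.

```shell
#!/bin/sh
# Hypothetical sketch: print a PROFILES value for the current host.
# Assumes a Linux kernel implies the linux profile, and that nvidia-smi
# on PATH implies working NVIDIA drivers (gpu profile).
detect_profiles() {
    profiles=""
    if [ "$(uname -s)" = "Linux" ]; then
        profiles="linux"
        if command -v nvidia-smi >/dev/null 2>&1; then
            profiles="linux,gpu"
        fi
    fi
    printf '%s\n' "$profiles"
}

# Example: show the equivalent explicit make invocation for this host.
p="$(detect_profiles)"
if [ -n "$p" ]; then
    echo "make up PROFILES=$p"
else
    echo "make up"
fi
```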
Production Checklist¶
Security¶
- Set GATEWAY_DEV_ALLOW_ALL_KEYS=false
- Configure strong INTERNAL_VLLM_API_KEY
- Set specific CORS_ALLOW_ORIGINS (not *)
- Change default admin password
- Enable TLS via reverse proxy
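The three environment settings in this list can be checked mechanically before starting production. The sketch below is independent of make prod-check and does not describe what that target does; check_security_env is a hypothetical helper, and the 32-character minimum key length is an assumed threshold.

```shell
#!/bin/sh
# Hypothetical pre-flight check for the security-related variables.
# Prints one line per problem and returns non-zero if any check fails.
check_security_env() {
    rc=0
    # Require an explicit "false"; an unset variable is treated as a failure.
    if [ "${GATEWAY_DEV_ALLOW_ALL_KEYS:-}" != "false" ]; then
        echo "GATEWAY_DEV_ALLOW_ALL_KEYS must be set to false"
        rc=1
    fi
    # 32 characters is an assumed minimum, not a documented Cortex limit.
    key="${INTERNAL_VLLM_API_KEY:-}"
    if [ "${#key}" -lt 32 ]; then
        echo "INTERNAL_VLLM_API_KEY is missing or shorter than 32 characters"
        rc=1
    fi
    # Reject an empty value or any value containing a literal wildcard.
    case "${CORS_ALLOW_ORIGINS:-}" in
        ""|*"*"*)
            echo "CORS_ALLOW_ORIGINS must list specific origins (no wildcard)"
            rc=1
            ;;
    esac
    return "$rc"
}

# Usage:
#   check_security_env && make up ENV=prod
```

The password and TLS items still need a manual check; they are not visible from environment variables.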
Infrastructure¶
- External PostgreSQL with backups
- External Redis for rate limiting
- Persistent volumes for models
- Log aggregation configured
Monitoring¶
- Prometheus scraping enabled
- GPU metrics configured (DCGM exporter)
- Alerting rules defined
- Dashboard imported
Backup¶
- Database backups automated
- Model files backed up
- Recovery procedure tested
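Automated database backups can be as simple as a script run from cron. The sketch below assumes pg_dump is on PATH and that connection details are supplied through the standard libpq variables (PGHOST, PGUSER, PGPASSWORD, PGDATABASE); the function name, file naming scheme, and 7-day retention are hypothetical choices, not Cortex defaults.

```shell
#!/bin/sh
# Hypothetical backup helper: dump the database in pg_dump's custom format
# and delete dumps older than the retention window. Prints the new file path.
backup_db() {
    backup_dir="$1"
    retention_days="${2:-7}"
    stamp="$(date -u +%Y%m%dT%H%M%SZ)"
    out="$backup_dir/cortex-$stamp.dump"

    mkdir -p "$backup_dir" || return 1
    # Connection settings come from PGHOST, PGUSER, PGPASSWORD, PGDATABASE.
    pg_dump --format=custom --file="$out" || return 1
    # Prune old dumps only after the new one has been written.
    find "$backup_dir" -name 'cortex-*.dump' -mtime +"$retention_days" \
        -exec rm -f {} +
    printf '%s\n' "$out"
}

# Usage (the directory is an example path):
#   backup_db /var/backups/cortex 7
```

Dumps in the custom format are restored with pg_restore, which is the step to rehearse when testing the recovery procedure.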
Run make prod-check to verify production readiness.
Health Endpoints¶
| Service | Endpoint | Purpose |
|---|---|---|
| Gateway | GET /health | Service health |
| Gateway | GET /admin/system/summary | System metrics |
| Prometheus | GET /api/v1/query | Metrics queries |
| PostgreSQL | Container healthcheck | Database health |
| Redis | Container healthcheck | Cache health |
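After a deploy, the gateway's GET /health endpoint can be polled until the service is ready. The following sketch uses only standard curl options; wait_for_health is a hypothetical helper, and the base URL http://localhost:8084 is assumed to be the gateway address because the export example in this guide uses it.

```shell
#!/bin/sh
# Poll an HTTP health endpoint until it answers with a non-error status or
# the attempt budget runs out. Returns 0 on success, 1 on timeout.
wait_for_health() {
    url="$1"
    attempts="${2:-30}"
    delay="${3:-2}"
    i=1
    while [ "$i" -le "$attempts" ]; do
        # --fail makes curl exit non-zero on HTTP 4xx/5xx responses.
        if curl --silent --fail --output /dev/null --max-time 5 "$url"; then
            return 0
        fi
        # Skip the pause after the final attempt.
        if [ "$i" -lt "$attempts" ]; then
            sleep "$delay"
        fi
        i=$((i + 1))
    done
    return 1
}

# Usage:
#   wait_for_health http://localhost:8084/health 30 2 || echo "gateway not healthy"
```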
Migration Workflows¶
Export System Configuration¶
# Via Admin UI: Deployment → Export
# Or via API:
curl -X POST http://localhost:8084/admin/deployment/export \
-H "Content-Type: application/json" \
-b cookies.txt \
-d '{
"output_dir": "/var/cortex/exports",
"include_images": true,
"include_db": true,
"include_configs": true
}'
Import on New System¶
- Load Docker images: make load-offline
- Restore database via Admin UI or API
- Import models via Admin UI
- Verify with make health
See Backup & Restore for detailed procedures.
Scaling¶
Horizontal Gateway Scaling¶
- Deploy multiple gateway replicas behind a load balancer
- Share PostgreSQL and Redis instances
- Configure sticky sessions for streaming
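With nginx as the load balancer, one common way to approximate sticky sessions is the ip_hash directive, which routes each client IP to the same backend. The sketch below prints such an upstream block; gen_upstream, the upstream name cortex_gateway, and the replica addresses are made up for illustration, and other load balancers offer their own affinity options.

```shell
#!/bin/sh
# Hypothetical helper: print an nginx upstream block with client-IP affinity
# (ip_hash) for the gateway replicas passed as host:port arguments.
gen_upstream() {
    echo "upstream cortex_gateway {"
    echo "    ip_hash;"
    for backend in "$@"; do
        echo "    server $backend;"
    done
    echo "}"
}

# Example with two made-up replica addresses.
gen_upstream gateway-1:8084 gateway-2:8084
```

The output would be included in the nginx http context and referenced from a proxy_pass directive. For streamed responses it is usually also worth disabling proxy_buffering on that location.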
Model Scaling¶
- Add models to the registry via Admin UI
- Gateway routes requests to available models
- Use health-aware routing for reliability
See Scaling & Reliability for advanced patterns.