Cortex Makefile Guide for Administrators
This guide explains how to use the Makefile for simplified administration of Cortex-vLLM.
Prerequisites
Before using the Makefile commands, ensure you have:
- Docker installed (version 20.10 or later)
- Docker Compose installed (v2.0 or later)
- Make utility (usually pre-installed on Linux/macOS)
- Bash shell (for IP detection script)
To verify prerequisites:
make install-deps
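If make install-deps cannot be run yet (for example, because make itself is missing), the same prerequisites can be checked by hand. The sketch below is an assumption about what such a check could look like, not the Makefile's actual implementation; version_ge and report are hypothetical helper names, and the minimum versions are the ones listed above.

```shell
#!/usr/bin/env bash
# Manual prerequisite check (illustrative; not the Makefile's own logic).

# True if version $1 is greater than or equal to version $2 (relies on `sort -V`).
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n1)" = "$2" ]
}

# Print one status line for a tool: name, detected version, minimum version.
report() {
  name="$1"; found="$2"; min="$3"
  if [ -z "$found" ]; then
    echo "MISSING  $name"
  elif version_ge "$found" "$min"; then
    echo "OK       $name $found (>= $min)"
  else
    echo "TOO OLD  $name $found (need >= $min)"
  fi
}

# Client-side version queries; both print nothing if the tool is absent.
docker_v="$(docker version --format '{{.Client.Version}}' 2>/dev/null)"
compose_v="$(docker compose version --short 2>/dev/null)"

report "docker" "$docker_v" "20.10"
report "docker compose" "${compose_v#v}" "2.0"
command -v make >/dev/null 2>&1 && echo "OK       make" || echo "MISSING  make"
command -v bash >/dev/null 2>&1 && echo "OK       bash" || echo "MISSING  bash"
```

The script only reports; it never installs anything, so it is safe to run repeatedly.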
🌐 Automatic Configuration
Cortex automatically detects and configures:
IP Detection
- ✅ Detects your host machine's IP address (e.g., 192.168.1.181)
- ✅ CORS is automatically configured for your IP
- ✅ Works with make commands or standalone docker compose
- ✅ Fallback detection in gateway container if needed
Monitoring (Linux Systems)
- ✅ Auto-detects Linux OS and NVIDIA GPU
- ✅ Enables linux profile → node-exporter (host metrics)
- ✅ Enables gpu profile → dcgm-exporter + cadvisor (GPU metrics)
- ✅ No manual profile configuration needed
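The auto-detection described above could be approximated as follows. This is a minimal sketch under stated assumptions: detect_profiles is a hypothetical helper, and using uname -s plus the presence of nvidia-smi as detection signals is a guess, not a description of what Cortex actually runs.

```shell
#!/usr/bin/env bash
# Illustrative profile detection (assumed logic, not taken from the Cortex Makefile).

# $1: kernel name as printed by `uname -s`
# $2: "yes" if an NVIDIA GPU tool was found, anything else otherwise
detect_profiles() {
  profiles=""
  if [ "$1" = "Linux" ]; then
    profiles="linux"
    if [ "$2" = "yes" ]; then
      profiles="linux,gpu"
    fi
  fi
  printf '%s\n' "$profiles"
}

has_gpu="no"
if command -v nvidia-smi >/dev/null 2>&1; then
  has_gpu="yes"
fi

echo "PROFILES=$(detect_profiles "$(uname -s)" "$has_gpu")"
```

On a non-Linux host the sketch prints an empty PROFILES value, which matches the guide's statement that the monitoring profiles apply to Linux systems.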
Check your detected IP:
make info
# Output:
# Detected Host IP: 192.168.1.181
# Endpoints:
# Gateway: http://192.168.1.181:8084
# Admin UI: http://192.168.1.181:3001
📌 Always use the IP shown in the output, NOT localhost!
For more details on how IP detection works, see docs/architecture/ip-detection.md.
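For orientation only, one common way to find a host's LAN address on Linux is to take the first non-loopback IPv4 address reported by hostname -I. Cortex's own detection script may work differently (the architecture document referenced above is the authority); pick_host_ip is a hypothetical helper.

```shell
#!/usr/bin/env bash
# Illustrative host IP detection (an assumption, not Cortex's actual script).

# Read whitespace-separated addresses on stdin and print the first
# IPv4 address that is not in the 127.0.0.0/8 loopback range.
pick_host_ip() {
  tr ' ' '\n' \
    | grep -E '^[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+$' \
    | grep -v '^127\.' \
    | head -n1
}

# `hostname -I` exists on most Linux distributions; it prints nothing useful elsewhere.
HOST_IP="$(hostname -I 2>/dev/null | pick_host_ip)"
echo "Gateway:  http://${HOST_IP:-<unknown>}:8084"
echo "Admin UI: http://${HOST_IP:-<unknown>}:3001"
```

If the printed address differs from the one make info shows, trust make info.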
Getting Started
First Time Setup
The simplest way to get started:
# 1. Clone the repository
git clone https://github.com/your-org/Cortex-vLLM.git
cd Cortex-vLLM
# 2. Start everything with one command
make quick-start
This will:
- Build all Docker images
- Start all services (gateway, database, Redis, Prometheus)
- Create a default admin user (username: admin, password: admin)
- Show you the URLs to access the services
Your First API Call
After quick-start completes:
# 1. Login to save session cookie
make login
# Enter username: admin
# Enter password: admin
# 2. Create an API key
make create-key
# Copy the token from the output
# 3. Test the API
curl -H "Authorization: Bearer YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  http://localhost:8084/v1/chat/completions \
  -d '{"model":"meta-llama/Llama-3-8B-Instruct","messages":[{"role":"user","content":"Hello!"}]}'
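For repeated calls it can help to wrap the request in small shell functions. The following is a sketch, not part of Cortex: chat_payload, cortex_chat, HOST_IP, and CORTEX_TOKEN are hypothetical names chosen for illustration, and the payload builder does no JSON escaping.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; HOST_IP and CORTEX_TOKEN are placeholder names, not Cortex settings.

# Build the request body. $1 = model name, $2 = user message.
# No JSON escaping is done, so keep both free of quotes and backslashes.
chat_payload() {
  printf '{"model":"%s","messages":[{"role":"user","content":"%s"}]}' "$1" "$2"
}

# Send the request. Aborts with a message if CORTEX_TOKEN is unset.
cortex_chat() {
  curl -sS \
    -H "Authorization: Bearer ${CORTEX_TOKEN:?set CORTEX_TOKEN to the key from make create-key}" \
    -H "Content-Type: application/json" \
    "http://${HOST_IP:-localhost}:8084/v1/chat/completions" \
    -d "$(chat_payload "$1" "$2")"
}

# Preview the request body without sending anything:
chat_payload "meta-llama/Llama-3-8B-Instruct" "Hello!"
echo
```

After exporting CORTEX_TOKEN (and optionally HOST_IP), a call would look like: cortex_chat "meta-llama/Llama-3-8B-Instruct" "Hello!"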
Common Tasks
Starting and Stopping Services
# Start all services (detached mode - runs in background)
make up
# Stop all services
make down
# Restart all services
make restart
# View what's running
make status
Viewing Logs
# View logs from all services
make logs
# View logs from specific service
make logs SERVICE=gateway
make logs SERVICE=postgres
make logs SERVICE=prometheus
# Quick shortcuts
make logs-gateway
make logs-postgres
Checking Health
# Check health of all services
make health
# Shows: Gateway, containers, Prometheus, exporters (if enabled)
# Check monitoring stack specifically
make monitoring-status
# Shows: node-exporter, dcgm-exporter, cadvisor, GPU count
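Right after make up, services may need a few seconds before they report healthy. A generic retry loop can wait for that. This sketch assumes make health exits with a non-zero status while something is unhealthy, which this guide does not confirm; wait_for and WAIT_INTERVAL are hypothetical names.

```shell
#!/usr/bin/env bash
# Generic retry helper (illustrative).

# Usage: wait_for ATTEMPTS COMMAND [ARGS...]
# Runs COMMAND up to ATTEMPTS times, sleeping WAIT_INTERVAL seconds
# (default 2) after each failure. Returns 0 on the first success, 1 otherwise.
wait_for() {
  attempts="$1"
  shift
  i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    i=$((i + 1))
    sleep "${WAIT_INTERVAL:-2}"
  done
  return 1
}

# Harmless demonstration: `true` succeeds on the first attempt.
wait_for 3 true && echo "ready"
```

In practice the call would be something like: wait_for 30 make health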
Managing the Database
# Backup the database
make db-backup
# Creates backup in backups/cortex_backup_YYYYMMDD_HHMMSS.sql
# Restore from backup
make db-restore BACKUP_FILE=backups/cortex_backup_20240104_120000.sql
# Open PostgreSQL shell
make db-shell
# Reset database (⚠️ DANGER: deletes all data)
make db-reset
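Because backup file names embed a sortable timestamp, the newest backup can be found with a plain lexical sort. The sketch below only reproduces the naming pattern stated above; backup_name and latest_backup are hypothetical helpers, and whether the Makefile uses local time for the timestamp is an assumption.

```shell
#!/usr/bin/env bash
# Helpers around the cortex_backup_YYYYMMDD_HHMMSS.sql naming pattern (illustrative).

# Print a file name following the documented pattern, using the current time.
backup_name() {
  printf 'cortex_backup_%s.sql\n' "$(date +%Y%m%d_%H%M%S)"
}

# Print the newest backup in directory $1 (empty output if there is none).
# Lexical order equals chronological order for this timestamp format.
latest_backup() {
  ls -1 "$1"/cortex_backup_*.sql 2>/dev/null | sort | tail -n1
}

echo "A backup taken now would be named: $(backup_name)"
echo "Newest existing backup: $(latest_backup backups)"
```

Restoring the most recent backup then becomes: make db-restore BACKUP_FILE="$(latest_backup backups)"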
Cleaning Up
# Stop services and remove containers/volumes
make clean
# Also remove managed model containers
make clean-all
# Remove unused Docker resources (free up disk space)
make prune
Advanced Usage
Environment Selection
Run in production mode:
make up ENV=prod
make down ENV=prod
Using Profiles for Monitoring
If you have a Linux host with NVIDIA GPUs:
# Start with Linux host monitoring and GPU metrics
make up PROFILES=linux,gpu
# Verify exporters are running
make health
Available profiles:
- linux - Enables node-exporter for host CPU/memory/disk metrics
- gpu - Enables DCGM exporter for NVIDIA GPU metrics
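A comma-separated PROFILES value maps naturally onto Docker Compose's repeatable --profile flag. The sketch below shows one way that translation could be done; whether the Cortex Makefile does exactly this is an assumption, and profile_flags is a hypothetical helper.

```shell
#!/usr/bin/env bash
# Turn "linux,gpu" into "--profile linux --profile gpu" (illustrative).

profile_flags() {
  flags=""
  old_ifs="$IFS"
  IFS=','
  # Unquoted on purpose: the shell splits $1 on commas here.
  for p in $1; do
    flags="${flags:+$flags }--profile $p"
  done
  IFS="$old_ifs"
  printf '%s\n' "$flags"
}

echo "docker compose $(profile_flags "linux,gpu") up -d"
```

With that mapping, make up PROFILES=linux,gpu would roughly correspond to the printed docker compose command, plus whatever compose file options the Makefile adds.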
Monitoring Commands
# Check monitoring stack status
make monitoring-status
# View monitoring logs
make logs-prometheus # Prometheus scraper
make logs-node-exporter # Host metrics (CPU, mem, disk, net)
make logs-dcgm # GPU metrics (utilization, memory, temp)
make logs-cadvisor # Container metrics
What gets monitored automatically:
On Linux systems:
- ✅ Host metrics (node-exporter): CPU usage, memory, disk I/O, network traffic
- ✅ GPU metrics (dcgm-exporter): GPU utilization, VRAM, temperature (if NVIDIA detected)
- ✅ Container metrics (cadvisor): Per-container CPU, memory, network
All metrics are visualized with real-time charts in the Admin UI → System Monitor page.
Running Tests
# Run smoke tests
make test
# Test API endpoints
make test-api
# Validate complete configuration
make validate
Production Deployment Check
Before deploying to production:
make prod-check
# This will verify:
# - Dev auth is disabled
# - Security settings are configured
# - Required environment variables are set
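To make the kind of verification listed above concrete, here is a minimal sketch of two such checks written against environment variables. It is not the implementation of make prod-check; prod_env_problems is a hypothetical helper, and only the two variable names that appear in this guide's security notes are used.

```shell
#!/usr/bin/env bash
# Illustrative pre-deployment checks (not the real `make prod-check`).

# Prints one line per problem and returns the number of problems found.
prod_env_problems() {
  problems=0
  if [ "${GATEWAY_DEV_ALLOW_ALL_KEYS:-}" != "false" ]; then
    echo "GATEWAY_DEV_ALLOW_ALL_KEYS should be set to false"
    problems=$((problems + 1))
  fi
  if [ -z "${INTERNAL_VLLM_API_KEY:-}" ]; then
    echo "INTERNAL_VLLM_API_KEY is not set"
    problems=$((problems + 1))
  fi
  return "$problems"
}

if prod_env_problems; then
  echo "Environment looks ready for production."
else
  echo "Fix the items above before deploying."
fi
```

The sketch reads the current shell environment only; if the real settings live in an env file, they would need to be loaded first.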
Complete Command Reference
Run make help to see all available commands:
make help
Service Management
- make build - Build Docker images
- make up - Start services (background)
- make up-fg - Start services (foreground, shows logs)
- make down - Stop and remove containers
- make restart - Restart all services
- make stop - Stop containers
- make start - Start stopped containers
Monitoring & Debugging
- make logs - View logs (all services)
- make logs SERVICE=name - View specific service
- make logs-gateway - Gateway logs
- make logs-postgres - Database logs
- make ps / make status - List containers
- make health - Health check all services
Setup & Configuration
- make quick-start - Complete setup in one command
- make bootstrap - Create admin user (interactive)
- make bootstrap-default - Create default admin
- make login - Login and save session
- make create-key - Generate API key
Database Operations
- make db-backup - Backup database
- make db-restore BACKUP_FILE=path - Restore backup
- make db-shell - Open PostgreSQL shell
- make db-reset - Reset database
Cleanup
- make clean - Stop and remove volumes
- make clean-all - Also remove model containers
- make prune - Clean unused Docker resources
Testing
- make test - Run smoke tests
- make test-api - Test endpoints
Development
- make shell-gateway - Open shell in gateway
- make shell-postgres - Open shell in Postgres
- make watch - Watch container status
Information
- make help - Show all commands
- make info - Show current configuration
- make version - Show version info
- make install-deps - Verify dependencies
Troubleshooting
"make: command not found"
Solution: Install the make utility
# Ubuntu/Debian
sudo apt-get install make
# macOS (usually pre-installed)
xcode-select --install
# Windows WSL
sudo apt-get install make
"Docker daemon is not running"
Solution: Start Docker
# Linux
sudo systemctl start docker
# macOS/Windows
# Start Docker Desktop application
Services won't start
# Clean everything and start fresh
# (⚠️ make clean also removes volumes; run make db-backup first if you need the data)
make clean
make up
# If that doesn't work, check logs
make logs
Can't connect to services
- Check services are running:
  make status
- Check health:
  make health
- View logs for errors:
  make logs-gateway
Database connection errors
# Check if Postgres is running
make status
# View Postgres logs
make logs-postgres
# If needed, reset database (⚠️ DANGER: deletes all data; back up first with make db-backup)
make db-reset
Port conflicts
If ports 8084, 9090, or 5432 are already in use:
- Edit docker.compose.dev.yaml to change port mappings
- Restart services: make restart
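Before editing port mappings, it helps to confirm which of the ports are actually taken. The sketch below parses the output of ss -ltn (Linux) for that purpose; busy_ports is a hypothetical helper, and the port list is simply the one named above.

```shell
#!/usr/bin/env bash
# Report which of the given ports already have a TCP listener (illustrative, Linux).

# stdin: output of `ss -ltn`; arguments: port numbers to check.
# Prints each requested port that appears as a listening local port.
busy_ports() {
  # Column 4 is "Local Address:Port"; the port is the text after the last colon.
  listening="$(awk 'NR > 1 { n = split($4, a, ":"); print a[n] }')"
  for port in "$@"; do
    if printf '%s\n' "$listening" | grep -qx "$port"; then
      echo "$port"
    fi
  done
}

# Prints nothing if no listed port is in use (or if ss is unavailable).
ss -ltn 2>/dev/null | busy_ports 8084 9090 5432
```

Any port it prints is already bound by another process and is a candidate for remapping.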
Need to completely reset
# Nuclear option: remove everything
# (⚠️ docker system prune affects all Docker projects on this host, not only Cortex)
make clean-all
docker system prune -af --volumes
# Start fresh
make quick-start
Best Practices
Regular Backups
Set up a cron job for regular backups:
# Add to crontab (run daily at 2 AM)
0 2 * * * cd /path/to/Cortex-vLLM && make db-backup
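Daily backups accumulate, so a retention step is usually paired with the cron job. The following sketch deletes backups older than a given number of days; prune_backups is a hypothetical helper, not a Makefile target, and it relies only on the file naming pattern described earlier in this guide.

```shell
#!/usr/bin/env bash
# Delete old database backups (illustrative retention helper).

# $1 = backup directory, $2 = age threshold in days.
# Removes cortex_backup_*.sql files last modified more than $2 days ago
# and prints the path of each file it deletes.
prune_backups() {
  find "$1" -maxdepth 1 -type f -name 'cortex_backup_*.sql' \
    -mtime +"$2" -print -delete
}

# Example (commented out so that running this file deletes nothing):
# prune_backups /path/to/Cortex-vLLM/backups 14
```

Saved as a script that calls prune_backups, it could be chained after make db-backup in the same crontab entry.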
Monitor Health
Regularly check service health:
make health
View Logs Regularly
Keep an eye on logs for errors:
make logs-gateway | grep ERROR
Before Updates
- Backup database: make db-backup
- Stop services: make down
- Pull updates: git pull
- Rebuild and start: make up
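The update steps above can be chained so that a failure at any step stops the sequence. This is a sketch rather than a shipped script: run, update_cortex, and DRY_RUN are hypothetical names, and it assumes it is executed from the repository root.

```shell
#!/usr/bin/env bash
# Chain the update steps; stop at the first failure (illustrative).

# Execute a command, or only print it when DRY_RUN is set.
run() {
  if [ -n "${DRY_RUN:-}" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

update_cortex() {
  run make db-backup \
    && run make down \
    && run git pull \
    && run make up
}

# Preview the sequence without executing anything:
DRY_RUN=1
update_cortex
```

Unsetting DRY_RUN and calling update_cortex performs the real update; because of the && chain, services are not stopped if the backup fails.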
Quick Reference Card
Print this and keep it handy:
┌─────────────────────────────────────────────────┐
│             CORTEX QUICK REFERENCE              │
├─────────────────────────────────────────────────┤
│ Start:       make up                            │
│ Stop:        make down                          │
│ Restart:     make restart                       │
│ Status:      make status                        │
│ Logs:        make logs                          │
│ Health:      make health                        │
│ Backup DB:   make db-backup                     │
│ Clean:       make clean                         │
│ Help:        make help                          │
├─────────────────────────────────────────────────┤
│ URLs:                                           │
│   Gateway:    http://localhost:8084             │
│   Admin UI:   http://localhost:3001             │
│   Prometheus: http://localhost:9090             │
│   PgAdmin:    http://localhost:5050             │
└─────────────────────────────────────────────────┘
Support
For more detailed documentation:
- Full docs: https://aulendurforge.github.io/Cortex-vLLM/
- GitHub issues: Report bugs or request features
- README.md: Quick start guide
Security Notes
For Production Deployments:
- Change the default admin password immediately after quick-start
- Set ENV=prod when running in production
- Configure proper CORS origins (not *)
- Enable TLS via reverse proxy (nginx/traefik)
- Set a strong INTERNAL_VLLM_API_KEY
- Disable dev auth: GATEWAY_DEV_ALLOW_ALL_KEYS=false
- Set up regular automated backups
- Review security checklist: make prod-check
Need help? Run make help for a complete list of commands.