Cortex Makefile Guide for Administrators¶

This guide explains how to use the Makefile for simplified administration of Cortex-vLLM.

Prerequisites¶

Before using the Makefile commands, ensure you have:

Docker installed (version 20.10 or later)
Docker Compose installed (v2.0 or later)
Make utility (usually pre-installed on Linux/macOS)
Bash shell (for the IP detection script)

To verify prerequisites:

make install-deps

🌐 Automatic Configuration¶

Cortex automatically detects and configures:

IP Detection¶

✅ Detects your host machine's IP address (e.g., 192.168.1.181)
✅ CORS is automatically configured for your IP
✅ Works with make commands or docker compose standalone
✅ Fallback detection in gateway container if needed

Monitoring (Linux Systems)¶

✅ Auto-detects Linux OS and NVIDIA GPU
✅ Enables linux profile → node-exporter (host metrics)
✅ Enables gpu profile → dcgm-exporter (GPU metrics) + cadvisor (container metrics)
✅ No manual profile configuration needed

Check your detected IP:

make info

# Output:
# Detected Host IP: 192.168.1.181
# Endpoints:
# Gateway:         http://192.168.1.181:8084
# Admin UI:        http://192.168.1.181:3001

📌 Always use the IP shown in the output, NOT localhost!

For more details on how IP detection works, see docs/architecture/ip-detection.md.
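As a rough illustration of what host IP detection can look like in Bash, here is a minimal sketch. It is not the project's actual script: the function names `parse_src_ip` and `detect_host_ip` are hypothetical, and the approach (asking the Linux routing table which source address it would use, then falling back to `hostname -I`) is an assumption about one reasonable strategy.

```shell
# Hypothetical sketch; NOT the detection script shipped with Cortex.

# Extract the address that follows "src" in `ip route get` output (stdin).
parse_src_ip() {
  awk '{ for (i = 1; i < NF; i++) if ($i == "src") { print $(i + 1); exit } }'
}

# Ask the routing table first, fall back to hostname -I, then to loopback.
detect_host_ip() {
  local ip=""
  if command -v ip >/dev/null 2>&1; then
    ip="$(ip route get 1.1.1.1 2>/dev/null | parse_src_ip || true)"
  fi
  if [ -z "$ip" ] && hostname -I >/dev/null 2>&1; then
    ip="$(hostname -I | awk '{print $1}')"
  fi
  printf '%s\n' "${ip:-127.0.0.1}"
}
```

Calling `detect_host_ip` prints a single address. If it disagrees with what `make info` reports, trust `make info`.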

Getting Started¶

First Time Setup¶

The simplest way to get started:

# 1. Clone the repository
git clone https://github.com/your-org/Cortex-vLLM.git
cd Cortex-vLLM

# 2. Start everything with one command
make quick-start

This will:

Build all Docker images
Start all services (gateway, database, Redis, Prometheus)
Create a default admin user (username: admin, password: admin)
Show you the URLs to access the services

Your First API Call¶

After quick-start completes:

# 1. Login to save session cookie
make login
# Enter username: admin
# Enter password: admin

# 2. Create an API key
make create-key
# Copy the token from the output

# 3. Test the API (replace YOUR_HOST_IP with the IP shown by make info)
curl -H "Authorization: Bearer YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  http://YOUR_HOST_IP:8084/v1/chat/completions \
  -d '{"model":"meta-llama/Llama-3-8B-Instruct","messages":[{"role":"user","content":"Hello!"}]}'
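For repeated testing, the request above can be wrapped in small shell helpers. This is a sketch under stated assumptions: `chat_url` and `chat_status` are hypothetical names, while the port (8084) and path come from this guide. The model name is passed straight into the JSON body, so it must not contain quotes.

```shell
# Hypothetical helpers; only the port and path are taken from the guide.

# Build the chat completions URL for a given host IP.
chat_url() {
  printf 'http://%s:8084/v1/chat/completions\n' "$1"
}

# Send one chat message and print only the HTTP status code.
# Usage: chat_status HOST_IP TOKEN MODEL
chat_status() {
  curl -sS -o /dev/null -w '%{http_code}\n' \
    -H "Authorization: Bearer $2" \
    -H "Content-Type: application/json" \
    -d "{\"model\":\"$3\",\"messages\":[{\"role\":\"user\",\"content\":\"Hello!\"}]}" \
    "$(chat_url "$1")"
}
```

For example, `chat_status 192.168.1.181 YOUR_TOKEN meta-llama/Llama-3-8B-Instruct` prints a status code. A 200 suggests the key and model were accepted, and a 401 typically points to a bad or missing token.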

Common Tasks¶

Starting and Stopping Services¶

# Start all services (detached mode - runs in background)
make up

# Stop all services
make down

# Restart all services
make restart

# View what's running
make status

Viewing Logs¶

# View logs from all services
make logs

# View logs from specific service
make logs SERVICE=gateway
make logs SERVICE=postgres
make logs SERVICE=prometheus

# Quick shortcuts
make logs-gateway
make logs-postgres

Checking Health¶

# Check health of all services
make health
# Shows: Gateway, containers, Prometheus, exporters (if enabled)

# Check monitoring stack specifically
make monitoring-status
# Shows: node-exporter, dcgm-exporter, cadvisor, GPU count

Managing the Database¶

# Back up the database
make db-backup
# Creates backup in backups/cortex_backup_YYYYMMDD_HHMMSS.sql

# Restore from backup
make db-restore BACKUP_FILE=backups/cortex_backup_20240104_120000.sql

# Open PostgreSQL shell
make db-shell

# Reset database (⚠️ DANGER: deletes all data)
make db-reset
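Because backup names embed a YYYYMMDD_HHMMSS timestamp, the newest file is also the last one in sorted order. The sketch below uses that to pick a restore candidate. `latest_backup` is a hypothetical helper, not a Makefile target, and it assumes all backups live in one directory.

```shell
# Hypothetical helper: print the newest backup in a directory.
# Usage: latest_backup [DIR]   (DIR defaults to "backups")
latest_backup() {
  local dir="${1:-backups}" newest="" f
  for f in "$dir"/cortex_backup_*.sql; do
    # Globs expand in sorted order, so the last existing match is the newest.
    if [ -e "$f" ]; then
      newest="$f"
    fi
  done
  if [ -z "$newest" ]; then
    echo "no backups found in $dir" >&2
    return 1
  fi
  printf '%s\n' "$newest"
}
```

It could then feed the restore target, for example `make db-restore BACKUP_FILE="$(latest_backup)"`.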

Cleaning Up¶

# Stop services and remove containers/volumes
make clean

# Also remove managed model containers
make clean-all

# Remove unused Docker resources (free up disk space)
make prune

Advanced Usage¶

Environment Selection¶

Run in production mode:

make up ENV=prod
make down ENV=prod

Using Profiles for Monitoring¶

Profiles are normally enabled automatically (see Automatic Configuration above). To select them explicitly on a Linux host with NVIDIA GPUs:

# Start with Linux host monitoring and GPU metrics
make up PROFILES=linux,gpu

# Verify exporters are running
make health

Available profiles:

linux - Enables node-exporter for host CPU/memory/disk metrics
gpu - Enables DCGM exporter for NVIDIA GPU metrics
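To make the profile selection concrete, here is a sketch of how a host could be mapped to a `PROFILES` value. This is not the project's detection logic: `select_profiles` and `detect_profiles` are hypothetical names, and the rule that gpu is only enabled on Linux is an assumption based on the "Monitoring (Linux Systems)" section.

```shell
# Hypothetical sketch of profile selection; not the project's actual logic.
# Usage: select_profiles OS_NAME HAS_NVIDIA   (HAS_NVIDIA is "yes" or "no")
select_profiles() {
  local os="$1" has_nvidia="$2" profiles=""
  if [ "$os" = "Linux" ]; then
    profiles="linux"
    if [ "$has_nvidia" = "yes" ]; then
      profiles="$profiles,gpu"
    fi
  fi
  printf '%s\n' "$profiles"
}

# Probe the current host: uname for the OS, nvidia-smi for a working driver.
detect_profiles() {
  local has_nvidia="no"
  if command -v nvidia-smi >/dev/null 2>&1 && nvidia-smi -L >/dev/null 2>&1; then
    has_nvidia="yes"
  fi
  select_profiles "$(uname -s)" "$has_nvidia"
}
```

On a Linux host with a working NVIDIA driver, `detect_profiles` prints `linux,gpu`, which matches the value passed to `make up PROFILES=linux,gpu`.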

Monitoring Commands¶

# Check monitoring stack status
make monitoring-status

# View monitoring logs
make logs-prometheus        # Prometheus scraper
make logs-node-exporter     # Host metrics (CPU, mem, disk, net)
make logs-dcgm              # GPU metrics (utilization, memory, temp)
make logs-cadvisor          # Container metrics

What gets monitored automatically:

On Linux systems:

✅ Host metrics (node-exporter): CPU usage, memory, disk I/O, network traffic
✅ GPU metrics (dcgm-exporter): GPU utilization, VRAM, temperature (if NVIDIA detected)
✅ Container metrics (cadvisor): Per-container CPU, memory, network

All metrics are visualized with real-time charts on the Admin UI → System Monitor page.

Running Tests¶

# Run smoke tests
make test

# Test API endpoints
make test-api

# Validate complete configuration
make validate

Production Deployment Check¶

Before deploying to production:

make prod-check

# This will verify:
# - Dev auth is disabled
# - Security settings are configured
# - Required environment variables are set
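The kind of check listed above can be sketched in a few lines of shell. This is an illustration only: `check_prod_env` is a hypothetical function, and `make prod-check` may verify more or do it differently. The two variable names are the ones given in the Security Notes of this guide.

```shell
# Hypothetical pre-flight check; `make prod-check` may differ.
# Prints one line per problem and returns non-zero if any were found.
check_prod_env() {
  local problems=0
  if [ "${GATEWAY_DEV_ALLOW_ALL_KEYS:-}" != "false" ]; then
    echo "GATEWAY_DEV_ALLOW_ALL_KEYS must be set to false"
    problems=$((problems + 1))
  fi
  if [ -z "${INTERNAL_VLLM_API_KEY:-}" ]; then
    echo "INTERNAL_VLLM_API_KEY is not set"
    problems=$((problems + 1))
  fi
  [ "$problems" -eq 0 ]
}
```

Running `check_prod_env && make up ENV=prod` would only start the production stack when both settings look sane.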

Complete Command Reference¶

Run make help to see all available commands:

make help

Service Management¶

make build - Build Docker images
make up - Start services (background)
make up-fg - Start services (foreground, shows logs)
make down - Stop and remove containers
make restart - Restart all services
make stop - Stop containers
make start - Start stopped containers

Monitoring & Debugging¶

make logs - View logs (all services)
make logs SERVICE=name - View specific service
make logs-gateway - Gateway logs
make logs-postgres - Database logs
make ps / make status - List containers
make health - Health check all services
make monitoring-status - Check monitoring stack (exporters, GPU count)

Setup & Configuration¶

make quick-start - Complete setup in one command
make bootstrap - Create admin user (interactive)
make bootstrap-default - Create default admin
make login - Login and save session
make create-key - Generate API key

Database Operations¶

make db-backup - Backup database
make db-restore BACKUP_FILE=path - Restore backup
make db-shell - Open PostgreSQL shell
make db-reset - Reset database

Cleanup¶

make clean - Stop and remove volumes
make clean-all - Also remove model containers
make prune - Clean unused Docker resources

Testing¶

make test - Run smoke tests
make test-api - Test endpoints
make validate - Validate complete configuration
make prod-check - Verify production readiness

Development¶

make shell-gateway - Open shell in gateway
make shell-postgres - Open shell in Postgres
make watch - Watch container status

Information¶

make help - Show all commands
make info - Show current configuration
make version - Show version info
make install-deps - Verify dependencies

Troubleshooting¶

"make: command not found"¶

Solution: Install the make utility

# Ubuntu/Debian
sudo apt-get install make

# macOS (usually pre-installed)
xcode-select --install

# Windows WSL
sudo apt-get install make

"Docker daemon is not running"¶

Solution: Start Docker

# Linux
sudo systemctl start docker

# macOS/Windows
# Start Docker Desktop application

Services won't start¶

# Clean everything and start fresh (make clean also removes volumes; run make db-backup first to keep data)
make clean
make up

# If that doesn't work, check logs
make logs

Can't connect to services¶

Check services are running:
```
make status
```
Check health:
```
make health
```
View logs for errors:
```
make logs-gateway
```

Database connection errors¶

# Check if Postgres is running
make status

# View Postgres logs
make logs-postgres

# If needed, reset database (⚠️ deletes all data; back up first)
make db-reset

Port conflicts¶

If ports 8084, 9090, or 5432 are already in use:

Edit docker.compose.dev.yaml to change port mappings
Restart services:
```
make restart
```
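Before editing port mappings, it helps to know which ports are actually taken. The sketch below filters the output of `ss -ltn` (Linux, from iproute2) for the ports named above. `find_conflicts` is a hypothetical helper, not part of the Makefile.

```shell
# Hypothetical helper: read `ss -ltn` output on stdin and print which of the
# given ports already have a listener.
# Usage: ss -ltn | find_conflicts 8084 9090 5432
find_conflicts() {
  awk -v wanted="$*" '
    BEGIN { n = split(wanted, w, " "); for (i = 1; i <= n; i++) want[w[i]] = 1 }
    # Skip the header; column 4 is "address:port", so take the last piece.
    NR > 1 { m = split($4, a, ":"); if (a[m] in want) print a[m] }
  ' | sort -un
}
```

Empty output means none of the listed ports are bound. On macOS, where `ss` is unavailable, `lsof -iTCP -sTCP:LISTEN -n -P` gives similar information in a different layout.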

Need to completely reset¶

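# Nuclear option: remove everything
# (docker system prune -af --volumes removes ALL unused Docker images, containers,
# networks, and volumes on this host, not only those belonging to Cortex)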
make clean-all
docker system prune -af --volumes

# Start fresh
make quick-start

Best Practices¶

Regular Backups¶

Set up a cron job for regular backups:

# Add to crontab (run daily at 2 AM)
0 2 * * * cd /path/to/Cortex-vLLM && make db-backup
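Daily backups accumulate, so a retention step is often paired with the cron job. The following is a sketch only: this guide does not define a retention policy, `prune_backups` is a hypothetical helper, and the 14-day default is an arbitrary example.

```shell
# Hypothetical retention helper: delete backups older than N days.
# Usage: prune_backups [DIR] [DAYS]   (defaults: backups, 14)
prune_backups() {
  local dir="${1:-backups}" days="${2:-14}"
  # -mtime +N matches files last modified more than N full days ago.
  find "$dir" -maxdepth 1 -type f -name 'cortex_backup_*.sql' -mtime "+$days" -delete
}
```

Replacing `-delete` with `-print` first is a safe way to preview what would be removed.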

Monitor Health¶

Regularly check service health:

make health

View Logs Regularly¶

Keep an eye on logs for errors:

make logs-gateway | grep ERROR

Before Updates¶

Back up the database: make db-backup
Stop services: make down
Pull updates: git pull
Rebuild and start: make up
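The four steps above can be chained so that a failed backup stops the update before anything is taken down. This is a minimal sketch: `run` and `update_cortex` are hypothetical names, and the `DRY_RUN` switch is added here purely for previewing.

```shell
# Run a command, or just print it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# The update steps in order; && stops the chain at the first failure.
update_cortex() {
  run make db-backup &&
  run make down &&
  run git pull &&
  run make up
}
```

`DRY_RUN=1 update_cortex` prints the four commands without executing them, and plain `update_cortex` runs them from the repository root.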

Quick Reference Card¶

Print this and keep it handy:

┌─────────────────────────────────────────────────┐
│         CORTEX QUICK REFERENCE                  │
├─────────────────────────────────────────────────┤
│ Start:         make up                          │
│ Stop:          make down                        │
│ Restart:       make restart                     │
│ Status:        make status                      │
│ Logs:          make logs                        │
│ Health:        make health                      │
│ Backup DB:     make db-backup                   │
│ Clean:         make clean                       │
│ Help:          make help                        │
├─────────────────────────────────────────────────┤
│ URLs (<HOST_IP> = IP from make info):           │
│   Gateway:     http://<HOST_IP>:8084            │
│   Admin UI:    http://<HOST_IP>:3001            │
│   Prometheus:  http://<HOST_IP>:9090            │
│   PgAdmin:     http://<HOST_IP>:5050            │
└─────────────────────────────────────────────────┘

Support¶

For more detailed documentation:

Full docs: https://aulendurforge.github.io/Cortex-vLLM/
GitHub issues: Report bugs or request features
README.md: Quick start guide

Security Notes¶

For Production Deployments:

Change default admin password immediately after quick-start
Set ENV=prod when running in production
Configure proper CORS origins (not *)
Enable TLS via reverse proxy (nginx/traefik)
Set strong INTERNAL_VLLM_API_KEY
Disable dev auth: GATEWAY_DEV_ALLOW_ALL_KEYS=false
Set up regular automated backups
Review security checklist: make prod-check

Need help? Run make help for a complete list of commands.