Administrator Setup Guide for Cortex-vLLM

🎯 Quick Start (5 Minutes)

For new administrators - no configuration needed!

# 1. Clone the repository
git clone <your-repo-url>
cd Cortex-vLLM

# 2. Start everything
make quick-start

# 3. Access at the IP shown in output
# Example: http://192.168.1.181:3001/login
# Username: admin
# Password: admin

That's it! The system automatically:

- ✅ Detects your host machine's IP address
- ✅ Configures CORS for network access
- ✅ Enables monitoring on Linux (host + GPU metrics)
- ✅ Sets up the database
- ✅ Creates admin user
- ✅ Starts all services

⚙️ What Gets Configured Automatically

1. IP Address Detection (No Manual Config Needed!)

The system automatically detects your LAN IP and configures:

# When you run: make up
# System detects: 192.168.1.181 (example)
# Automatically sets:
# - CORS_ALLOW_ORIGINS=http://192.168.1.181:3001,http://localhost:3001,http://127.0.0.1:3001
# - All URLs in output use your real IP
# - Frontend connects to correct gateway IP

To verify your detected IP:

make ip      # Shows IP prominently
make info    # Shows full configuration

2. CORS Configuration (Automatic!)

How it works:

1. Makefile runs scripts/detect-ip.sh
2. Detected IP is passed to Docker Compose: HOST_IP=192.168.1.181
3. Docker Compose injects into gateway environment: CORS_ALLOW_ORIGINS=http://${HOST_IP}:3001,...
4. Gateway allows requests from your network!

No manual CORS configuration required!
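
As a rough illustration of the substitution step, the origin list can be assembled from the detected IP as shown here. This is a sketch only: `build_cors_origins` is a hypothetical helper, not part of the repository, and the real substitution happens inside Docker Compose.

```shell
# Sketch: rebuild the origin list this guide shows for CORS_ALLOW_ORIGINS.
# build_cors_origins HOST_IP [FRONTEND_PORT]   (hypothetical helper)
build_cors_origins() {
  local host_ip=$1
  local port=${2:-3001}   # frontend port used throughout this guide
  printf 'http://%s:%s,http://localhost:%s,http://127.0.0.1:%s\n' \
    "$host_ip" "$port" "$port" "$port"
}
```

Calling `build_cors_origins 192.168.1.181` prints the same comma-separated list shown in the IP detection example, which makes it easy to compare against what the gateway container reports.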

3. Network Access (Works Out of the Box!)

After running make quick-start, anyone on your network can access:

- Admin UI: http://YOUR_HOST_IP:3001
- API Gateway: http://YOUR_HOST_IP:8084

The frontend automatically detects the gateway URL based on which IP the user accesses it from.

📋 Pre-Installation Checklist

Required (Must Have)

Docker installed (v20.10+)
Docker Compose installed (v2.0+)
At least 8GB RAM available
At least 20GB free disk space

Optional (For Enhanced Features)

NVIDIA GPU + drivers (for GPU model serving)
NVIDIA Container Toolkit (for GPU access)
Static IP on host machine (recommended for production)

Verify Prerequisites

make install-deps
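
To check the version and disk requirements from the checklist by hand, independently of what make install-deps reports, something like the following could work. `version_ge` and `free_gb` are hypothetical helper names, and `sort -V` is assumed to be available (GNU coreutils and recent BSD/macOS).

```shell
# version_ge ACTUAL MINIMUM: succeeds if ACTUAL >= MINIMUM (relies on sort -V).
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# free_gb PATH: prints whole gigabytes available on the filesystem holding PATH.
free_gb() {
  df -Pk "$1" | awk 'NR == 2 { print int($4 / 1024 / 1024) }'
}

# Example use (commented out so the helpers can be sourced without Docker):
# version_ge "$(docker version --format '{{.Server.Version}}')" 20.10 || echo "Docker is older than 20.10"
# version_ge "$(docker compose version --short)" 2.0 || echo "Docker Compose is older than 2.0"
# [ "$(free_gb .)" -ge 20 ] || echo "Less than 20GB free"
```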

🔧 Configuration Files (Optional Overrides Only)

Directory Structure

Cortex-vLLM/
├── docker.compose.dev.yaml   ← Main config (edit if needed)
├── docker.compose.prod.yaml  ← Production config
├── backend/
│   └── .env.dev              ← Optional overrides (create if needed)
├── Makefile                  ← Admin commands
└── scripts/
    └── detect-ip.sh          ← IP detection logic

When You DON'T Need to Edit Config Files

You never need to edit config files for:

- ✅ IP addresses (auto-detected)
- ✅ CORS settings (auto-configured)
- ✅ Basic usage (defaults work)
- ✅ Development mode (pre-configured)

When You MIGHT Edit Config Files

Only edit docker.compose.dev.yaml if you need to:

- Change port mappings (if ports conflict)
- Modify storage paths for models
- Change database credentials
- Add environment-specific settings

Example - Change Ports:

# docker.compose.dev.yaml
gateway:
  ports: ["8085:8084"]  # Changed from 8084 to 8085

frontend:
  ports: ["3002:3001"]  # Changed from 3001 to 3002

Then restart:

make restart
make info  # See new URLs
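
Before remapping, it can help to confirm which host ports are actually taken. A minimal sketch, assuming a Linux host with `ss` (from iproute2); `port_in_use` is a hypothetical helper that reads `ss -ltn` output on stdin rather than calling `ss` itself:

```shell
# port_in_use PORT: succeeds if a LISTEN line on stdin has a local address ending in :PORT.
port_in_use() {
  awk -v suffix=":$1" '
    $1 == "LISTEN" && substr($4, length($4) - length(suffix) + 1) == suffix { found = 1 }
    END { exit found ? 0 : 1 }
  '
}

# Example use (commented out; requires ss on the host):
# ss -ltn | port_in_use 8084 && echo "8084 is taken; map the gateway to another host port"
```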

🚀 Step-by-Step First Time Setup

Step 1: Install Prerequisites

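# Ubuntu/Debian
# Note: docker-compose-plugin is published in Docker's official apt repository;
# if apt cannot find the package, add that repository first
sudo apt-get update
sudo apt-get install -y docker.io docker-compose-plugin make git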

# Start Docker
sudo systemctl start docker
sudo systemctl enable docker

# Add your user to the docker group (optional; lets you run docker without sudo)
sudo usermod -aG docker "$USER"
# Log out and back in for this to take effect

Step 2: Clone Repository

git clone <your-repo-url>
cd Cortex-vLLM

Step 3: Verify System Can Detect IP

# Test IP detection
bash scripts/detect-ip.sh

# Should show your LAN IP, e.g., 192.168.1.181
# NOT localhost
# NOT a Docker bridge IP (172.17-31.x.x)

If it shows "localhost":

- Your network interface might not be configured
- Set manually: export HOST_IP=192.168.1.181
- See troubleshooting below
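
The three expectations in this step (a LAN address, not localhost, not a Docker bridge) can be expressed as a small check. This is a sketch under this guide's own definitions; `is_usable_lan_ip` is a hypothetical helper and is not taken from scripts/detect-ip.sh.

```shell
# is_usable_lan_ip ADDRESS
# Succeeds only for a dotted-quad IPv4 address that is neither loopback
# nor in the 172.17-31.x.x range this guide treats as Docker bridges.
is_usable_lan_ip() {
  local ip=$1 o1 o2 o3 o4 rest octet
  # Reject empty input and anything other than digits and dots (e.g. "localhost").
  case $ip in
    '' | *[!0-9.]*) return 1 ;;
  esac
  IFS=. read -r o1 o2 o3 o4 rest <<EOF
$ip
EOF
  # Require four non-empty octets, each no larger than 255.
  [ -z "$rest" ] || return 1
  for octet in "$o1" "$o2" "$o3" "$o4"; do
    [ -n "$octet" ] && [ "$octet" -le 255 ] || return 1
  done
  # Loopback (127.x.x.x).
  if [ "$o1" -eq 127 ]; then return 1; fi
  # Docker bridge range as given in this guide.
  if [ "$o1" -eq 172 ] && [ "$o2" -ge 17 ] && [ "$o2" -le 31 ]; then
    return 1
  fi
  return 0
}

# Example use (commented out; needs the repository checkout):
# is_usable_lan_ip "$(bash scripts/detect-ip.sh)" || echo "Set HOST_IP manually"
```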

Step 4: Start Cortex

make quick-start

What happens:

1. Detects IP: 192.168.1.181
2. Passes to Docker Compose: HOST_IP=192.168.1.181
3. Builds containers
4. Starts all services
5. Configures CORS automatically
6. Creates admin user
7. Shows you the URLs

Step 5: Access Admin UI

# The output will show something like:
# ✓ Cortex is ready!
# Login at: http://192.168.1.181:3001/login (admin/admin)

# Open that URL in your browser
# Use the IP shown, NOT localhost

Step 6: Create API Key

# Option 1: Via Makefile
make login      # Enter admin/admin
make create-key # Copy the token

# Option 2: Via Admin UI
# Login → API Keys → Create New Key

Step 7: Test API

# Get your detected IP
make ip

# Test the API (replace YOUR_TOKEN and YOUR_IP)
curl -H "Authorization: Bearer YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  http://YOUR_IP:8084/v1/chat/completions \
  -d '{"model":"test","messages":[{"role":"user","content":"Hello!"}]}'

🔍 Verifying Everything is Configured Correctly

Check 1: Services Running

make status

# Should show:
# - gateway (Up)
# - frontend (Up)
# - postgres (Up, healthy)
# - redis (Up)
# - prometheus (Up)

Check 2: Detected IP is Correct

make ip

# Should show your LAN IP, not localhost
# Example: 192.168.1.181

Check 3: CORS is Configured

# Check CORS in gateway container
docker exec cortex-gateway-1 printenv CORS_ALLOW_ORIGINS

# Should show:
# http://192.168.1.181:3001,http://localhost:3001,http://127.0.0.1:3001
# (the first entry should contain YOUR actual IP, not localhost)
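
To avoid comparing the list by eye, the check can be scripted. A minimal sketch: `cors_allows` is a hypothetical helper that looks for an exact origin in a comma-separated list; it does not reproduce how the gateway itself evaluates origins.

```shell
# cors_allows ORIGIN_LIST ORIGIN: succeeds if ORIGIN is one of the comma-separated entries.
cors_allows() {
  local list=$1 origin=$2
  # Wrap both sides in commas so only whole entries match.
  case ",$list," in
    *",$origin,"*) return 0 ;;
    *) return 1 ;;
  esac
}

# Example use (commented out; needs the running gateway container):
# cors_allows "$(docker exec cortex-gateway-1 printenv CORS_ALLOW_ORIGINS)" \
#   "http://192.168.1.181:3001" || echo "Detected IP is missing from CORS_ALLOW_ORIGINS"
```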

Check 4: Health Checks Pass

make health

# Should show:
# - Gateway Health: {"status":"ok"}
# - Containers: All running
# - Prometheus: Ready
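
Right after startup, services may need a moment before health checks pass. A small polling helper can wait for that instead of failing on the first attempt. This is a sketch: `retry` is a hypothetical shell function, not a Makefile target.

```shell
# retry MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it succeeds or MAX_ATTEMPTS is reached.
retry() {
  local max_attempts=$1 delay=$2 attempt=1
  shift 2
  while [ "$attempt" -le "$max_attempts" ]; do
    if "$@"; then
      return 0
    fi
    attempt=$((attempt + 1))
    # Sleep only if another attempt follows.
    if [ "$attempt" -le "$max_attempts" ]; then
      sleep "$delay"
    fi
  done
  return 1
}
```

For example, `retry 30 2 curl -fsS -o /dev/null http://192.168.1.181:8084/health` polls the gateway health endpoint up to 30 times, two seconds apart, using the example IP from this guide.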

Check 5: Network Access Works

# From host machine
curl http://192.168.1.181:8084/health

# Should return: {"status":"ok"}

# From another device on your network (same command)
curl http://192.168.1.181:8084/health

# Should also work!
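
For scripted checks, the response body can be tested rather than read. A minimal sketch, assuming the health endpoint returns the JSON shown above; `health_ok` is a hypothetical helper and uses a pattern match instead of a full JSON parser.

```shell
# health_ok: reads a response body on stdin, succeeds if it contains "status":"ok".
health_ok() {
  grep -Eq '"status"[[:space:]]*:[[:space:]]*"ok"'
}

# Example use (commented out; needs a running gateway):
# curl -fsS http://192.168.1.181:8084/health | health_ok && echo "gateway healthy"
```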

🛠 Advanced Configuration (Optional)

Override Detected IP (If Needed)

If IP detection picks the wrong interface:

# Method 1: Environment variable (applies to a single command)
HOST_IP=10.1.10.241 make up
HOST_IP=10.1.10.241 make info

# Method 2: Export (lasts for the current shell session)
export HOST_IP=10.1.10.241
make up
make info

# Method 3: Keep the value in a file (survives across sessions)
echo "HOST_IP=10.1.10.241" > .env.local
# Load and export it in each new shell before running make
set -a; . ./.env.local; set +a
make up

Change Storage Paths

Edit docker.compose.dev.yaml:

environment:
  CORTEX_MODELS_DIR_HOST: /path/to/your/models  # Host path
  HF_CACHE_DIR_HOST: /path/to/hf/cache         # Host cache path

Then:

make restart

Enable Rate Limiting

Edit docker.compose.dev.yaml:

gateway:
  environment:
    RATE_LIMIT_ENABLED: "true"
    RATE_LIMIT_RPS: 10
    RATE_LIMIT_BURST: 20

Then:

make restart

Production Deployment

# 1. Edit docker.compose.prod.yaml
# - Set GATEWAY_DEV_ALLOW_ALL_KEYS=false
# - Set strong INTERNAL_VLLM_API_KEY
# - Configure TLS reverse proxy

# 2. Pre-flight check
make prod-check

# 3. Start in production mode
make up ENV=prod

# 4. Change default admin password immediately!
# Login → Users → Edit admin user → Change password

🐛 Troubleshooting

"Can't access from another device"¶

Check IP detection:

make ip

# Use the IP shown here in your browser
# NOT "localhost"

Check CORS configuration:

docker exec cortex-gateway-1 printenv CORS_ALLOW_ORIGINS

# Should include your detected IP

Check firewall:

# Ubuntu/Debian - allow ports
sudo ufw status
sudo ufw allow 3001/tcp
sudo ufw allow 8084/tcp

# CentOS/RHEL
sudo firewall-cmd --list-all
sudo firewall-cmd --add-port=3001/tcp --permanent
sudo firewall-cmd --add-port=8084/tcp --permanent
sudo firewall-cmd --reload

"Wrong IP detected"¶

If the script detects a VPN or wrong interface:

# See all available IPs
ip addr show | grep 'inet '

# Identify the correct one (usually 192.168.x.x or 10.x.x.x)

# Override detection
export HOST_IP=192.168.1.181
make restart
make ip  # Verify

"Services won't start"¶

# Full diagnostic
make status       # What's running?
make logs        # Any errors?
make health      # Services healthy?

# Nuclear option - reset everything
make clean-all
make quick-start

"Frontend shows but API calls fail"¶

Check gateway is running:

make status
make logs-gateway

Test gateway directly:

curl http://192.168.1.181:8084/health

Check CORS:

# From browser console (F12), check if you see CORS errors
# If yes, verify CORS config:
docker exec cortex-gateway-1 printenv CORS_ALLOW_ORIGINS

🔒 Security Checklist (Production)

Before Going to Production

# Run pre-flight check
make prod-check

Must configure:

Disable Dev Mode

# docker.compose.prod.yaml
GATEWAY_DEV_ALLOW_ALL_KEYS: "false"

Set Strong API Key

INTERNAL_VLLM_API_KEY: "your-strong-random-key-here"
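
One way to produce such a key, as a sketch: read random bytes from `/dev/urandom` and hex-encode them. `generate_api_key` is a hypothetical helper, and the 32-byte default is an assumption rather than a documented requirement of the gateway.

```shell
# generate_api_key [BYTES]: prints BYTES random bytes (default 32) as lowercase hex.
generate_api_key() {
  local bytes=${1:-32}
  head -c "$bytes" /dev/urandom | od -An -tx1 | tr -d ' \n'
  echo
}
```

Running `generate_api_key` prints a 64-character hex string that can be pasted in as the INTERNAL_VLLM_API_KEY value.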

Restrict CORS (if needed)

# If you want to restrict to specific IPs only:
CORS_ALLOW_ORIGINS: http://192.168.1.181:3001

Change Default Admin Password

- Login → Users → Edit admin → Set new password

Enable TLS

- Set up nginx or traefik reverse proxy
- Get SSL certificate (Let's Encrypt)

Set Up Backups

# Add to crontab
crontab -e
# Add: 0 2 * * * cd /path/to/Cortex-vLLM && make db-backup

Configure Firewall

- Only allow ports 3001 and 8084 from trusted networks
- Block all other incoming traffic

📊 Monitoring Your Deployment

Daily Checks

# Morning routine
make status   # Services up?
make health   # Everything healthy?

# If issues found
make logs     # Check for errors

Weekly Maintenance

# Backup database
make db-backup

# Review logs for errors
make logs-gateway | grep ERROR

# Check disk space
df -h

Monthly Tasks

# Clean unused Docker resources
make prune

# Review security settings
make prod-check

# Update containers (if new versions available)
make down
git pull
make up

🌐 Network Configuration Details

How Dynamic IP Detection Works

graph LR
    A[make up] --> B[detect-ip.sh]
    B --> C[Scan all interfaces]
    C --> D[Filter out Docker/loopback]
    D --> E[Score remaining IPs]
    E --> F[Select best: 192.168.1.181]
    F --> G[Export HOST_IP]
    G --> H[Docker Compose]
    H --> I[Gateway CORS Config]

IP Scoring Algorithm

The system prefers IPs in this order:

1. 192.168.x.x → Score: 100 (home/small office)
2. 10.x.x.x → Score: 95 (corporate network)
3. 172.16-31.x.x → Score: 85 (private, non-Docker; the Docker bridge range listed below is filtered out before scoring)
4. Public IPs → Score: 50
5. Link-local → Score: 10

Rejected IPs:

- 127.0.0.1 (loopback)
- 172.17-31.x.x (Docker bridges)
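
The scoring and rejection rules above can be sketched as follows. This is an illustration built only from the numbers in this section, not the contents of scripts/detect-ip.sh; `score_ip` and `pick_best_ip` are hypothetical names, and a score of 0 is used here to mean "rejected". Inputs are assumed to be valid dotted-quad IPv4 addresses.

```shell
# score_ip ADDRESS: prints the score for ADDRESS; 0 means rejected.
score_ip() {
  local o1 o2 rest
  IFS=. read -r o1 o2 rest <<EOF
$1
EOF
  if [ "$o1" = 127 ]; then
    echo 0      # loopback: rejected
  elif [ "$o1" = 172 ] && [ "$o2" -ge 17 ] && [ "$o2" -le 31 ]; then
    echo 0      # Docker bridge range: rejected before scoring
  elif [ "$o1" = 192 ] && [ "$o2" = 168 ]; then
    echo 100    # home/small office
  elif [ "$o1" = 10 ]; then
    echo 95     # corporate network
  elif [ "$o1" = 172 ] && [ "$o2" = 16 ]; then
    echo 85     # the part of 172.16-31 that survives the filter
  elif [ "$o1" = 169 ] && [ "$o2" = 254 ]; then
    echo 10     # link-local
  else
    echo 50     # everything else is treated as public
  fi
}

# pick_best_ip ADDRESS...: prints the highest-scoring address; fails if all are rejected.
pick_best_ip() {
  local best= best_score=0 ip score
  for ip in "$@"; do
    score=$(score_ip "$ip")
    if [ "$score" -gt "$best_score" ]; then
      best=$ip
      best_score=$score
    fi
  done
  [ -n "$best" ] && echo "$best"
}
```

With these rules, `pick_best_ip 172.17.0.1 10.1.10.241 192.168.1.181` selects 192.168.1.181, because the Docker bridge address is rejected and 192.168.x.x outranks 10.x.x.x.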

Frontend Auto-Detection

The Next.js frontend automatically detects which IP the user accessed it from:

// User accesses: http://192.168.1.181:3001
// Frontend detects: window.location.hostname = "192.168.1.181"
// Calls gateway at: http://192.168.1.181:8084

Result: Works seamlessly from any device!
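
The same derivation can be sketched in shell, which is handy when checking from a terminal which gateway URL a given frontend URL maps to. The frontend's actual Next.js code is not shown in this guide; `gateway_url_for` is a hypothetical helper and handles IPv4 addresses and hostnames only.

```shell
# gateway_url_for FRONTEND_URL [GATEWAY_PORT]
# Keeps the scheme and host of FRONTEND_URL and swaps in the gateway port.
gateway_url_for() {
  local url=$1 port=${2:-8084} scheme rest host
  scheme=${url%%://*}     # e.g. "http"
  rest=${url#*://}        # e.g. "192.168.1.181:3001/login"
  host=${rest%%[:/]*}     # e.g. "192.168.1.181"
  printf '%s://%s:%s\n' "$scheme" "$host" "$port"
}
```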

🎓 Common Administrative Tasks

Adding a New User

# Option 1: Via Admin UI
# Login → Users → Create User

# Option 2: Via API
curl -X POST http://192.168.1.181:8084/admin/users \
  -b cookies.txt \
  -H 'Content-Type: application/json' \
  -d '{"username":"john","password":"secret","role":"User"}'

Creating API Keys for Users

# Via Admin UI
# Login → API Keys → Create Key → Assign to User

# Save the token immediately (shown only once!)

Deploying a Model

# Via Admin UI
# Login → Models → Create Model
# - Choose engine (vLLM or llama.cpp)
# - Configure parameters
# - Click "Start"

# Monitor startup
make logs | grep model

Backing Up Data

# Manual backup
make db-backup

# Automated backups (daily at 2 AM)
crontab -e
# Add:
0 2 * * * cd /path/to/Cortex-vLLM && make db-backup

Restoring from Backup

# List available backups
ls -lh backups/

# Restore
make db-restore BACKUP_FILE=backups/cortex_backup_20251004_143000.sql

# Restart services
make restart
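
To pick the most recent backup without reading the listing, a helper like this could be used. It is a sketch that assumes every backup follows the cortex_backup_YYYYMMDD_HHMMSS.sql pattern seen in the example filename above; `latest_backup` is a hypothetical function.

```shell
# latest_backup [DIR]: prints the newest cortex_backup_*.sql in DIR (default: backups).
latest_backup() {
  local dir=${1:-backups} newest= file
  for file in "$dir"/cortex_backup_*.sql; do
    [ -e "$file" ] || continue   # the glob matched nothing
    newest=$file                 # globs expand in sorted order, so the last one is newest
  done
  [ -n "$newest" ] && echo "$newest"
}

# Example use (commented out; needs the repository checkout):
# make db-restore BACKUP_FILE="$(latest_backup)"
```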

📱 Multi-Device Access

Scenario 1: Access from Host Machine

Browser on host: http://192.168.1.181:3001 ✅
Browser on host: http://localhost:3001 ✅ (also works)

Scenario 2: Access from Other Devices

Laptop on network: http://192.168.1.181:3001 ✅
Tablet on network: http://192.168.1.181:3001 ✅
Phone on network:  http://192.168.1.181:3001 ✅

Using localhost:  ❌ Won't work (on another device, localhost points to that device, not the Cortex host)

Scenario 3: API Calls from Applications

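# Python application on the network
import requests

# Use the host IP, not localhost
response = requests.post(
    "http://192.168.1.181:8084/v1/chat/completions",
    headers={"Authorization": "Bearer YOUR_TOKEN"},
    json={
        "model": "your-model",
        "messages": [{"role": "user", "content": "Hello"}],
    },
    timeout=60,  # fail instead of hanging if the gateway is unreachable
)
response.raise_for_status()  # surface HTTP errors such as 401 or 404
print(response.json())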

🎯 Configuration Validation Commands

Complete Health Check

# 1. Check IP detection
make ip
# Expected: Shows your LAN IP (192.168.x.x or 10.x.x.x)

# 2. Check services
make status
# Expected: All containers "Up"

# 3. Check health endpoints
make health
# Expected: Gateway returns {"status":"ok"}

# 4. Verify CORS
docker exec cortex-gateway-1 printenv CORS_ALLOW_ORIGINS
# Expected: Includes your detected IP

# 5. Test from browser
# Open: http://YOUR_IP:3001
# Expected: Login page loads

# 6. Test API
curl http://YOUR_IP:8084/health
# Expected: {"status":"ok"}

📝 Important Notes for Administrators

✅ DO This

Always use the detected IP shown by make ip
Run make db-backup before making changes
Check make health regularly
Review logs with make logs if issues occur
Use make help to see all available commands

❌ DON'T Do This

Don't edit CORS_ALLOW_ORIGINS manually in development - it's auto-configured (for production, restrict it only as described in the security checklist)
Don't use "localhost" for network access - use the detected IP
Don't skip backups before database changes
Don't run make db-reset without backing up first
Don't expose to internet without TLS and proper security

🚨 If Something Goes Wrong

Quick Recovery

# Stop everything
make down

# Clean everything (this resets the deployment; run make db-backup first if you need to keep your data)
make clean-all

# Start fresh
make quick-start

# Should work now!

Get Help

# Check what IP was detected
make ip

# Check logs for errors  
make logs-gateway | tail -100

# Check CORS configuration
docker exec cortex-gateway-1 printenv CORS_ALLOW_ORIGINS

# Test connectivity
curl -v http://192.168.1.181:8084/health

Contact Support

If issues persist:

1. Run diagnostic: make health > diagnostic.txt
2. Save logs: make logs > logs.txt 2>&1
3. Note your IP: make ip
4. Share diagnostic.txt and logs.txt with support

📖 Additional Documentation

In This Repository:

- README.md - Overview and quick start
- START_HERE.md - 5-minute quick start guide
- docs/operations/makefile-guide.md - Complete command reference
- docs/architecture/ip-detection.md - Technical details on IP detection
- docs/architecture/configuration-flow.md - How automatic configuration works
- docs/getting-started/configuration-checklist.md - Validation checklist

Online Docs:

- Full documentation: https://aulendurforge.github.io/Cortex-vLLM/

Quick Help:

make help  # See all commands
make info  # See current configuration

✨ Summary

For most deployments:

# This is all you need:
make quick-start

The system automatically:

- ✅ Detects your IP
- ✅ Configures CORS
- ✅ Sets up networking
- ✅ Creates admin user
- ✅ Starts all services

Access at the IP shown in the output. That's it! 🎉

🔐 Security Best Practices

Change default password immediately after quick-start
Use strong API keys for production
Enable firewall rules
Set up TLS for production (nginx/traefik)
Schedule regular backups (automated via cron)
Monitor logs for suspicious activity
Review make prod-check output before production deployment

Questions? Run make help or check docs/operations/makefile-guide.md