Administrator Setup Guide for Cortex-vLLM¶
🎯 Quick Start (5 Minutes)¶
For new administrators - no configuration needed!
# 1. Clone the repository
git clone <your-repo-url>
cd Cortex-vLLM
# 2. Start everything
make quick-start
# 3. Access at the IP shown in output
# Example: http://192.168.1.181:3001/login
# Username: admin
# Password: admin
That's it! The system automatically:
- ✅ Detects your host machine's IP address
- ✅ Configures CORS for network access
- ✅ Enables monitoring on Linux (host + GPU metrics)
- ✅ Sets up the database
- ✅ Creates admin user
- ✅ Starts all services
⚙️ What Gets Configured Automatically¶
1. IP Address Detection (No Manual Config Needed!)¶
The system automatically detects your LAN IP and configures:
# When you run: make up
# System detects: 192.168.1.181 (example)
# Automatically sets:
# - CORS_ALLOW_ORIGINS=http://192.168.1.181:3001,http://localhost:3001,http://127.0.0.1:3001
# - All URLs in output use your real IP
# - Frontend connects to correct gateway IP
To verify your detected IP:
make ip # Shows IP prominently
make info # Shows full configuration
2. CORS Configuration (Automatic!)¶
How it works:
1. Makefile runs scripts/detect-ip.sh
2. Detected IP is passed to Docker Compose: HOST_IP=192.168.1.181
3. Docker Compose injects into gateway environment: CORS_ALLOW_ORIGINS=http://${HOST_IP}:3001,...
4. Gateway allows requests from your network!
No manual CORS configuration required!
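The steps above boil down to turning one detected IP into a comma-separated origin list. The following is a minimal Python sketch of that composition, not the project's actual Makefile or Compose logic; the function name `build_cors_origins` is hypothetical, and the origin order simply mirrors the example value shown earlier.

```python
def build_cors_origins(host_ip: str, port: int = 3001) -> str:
    """Return a comma-separated origin list: detected IP first, then loopback names."""
    hosts = [host_ip, "localhost", "127.0.0.1"]
    # Drop duplicates while keeping order (e.g. if detection fell back to "localhost").
    unique_hosts = list(dict.fromkeys(hosts))
    return ",".join(f"http://{host}:{port}" for host in unique_hosts)


print(build_cors_origins("192.168.1.181"))
# http://192.168.1.181:3001,http://localhost:3001,http://127.0.0.1:3001
```

Keeping the loopback origins in the list is what lets the UI keep working when opened as localhost on the host machine itself.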
3. Network Access (Works Out of the Box!)¶
After running make quick-start, anyone on your network can access:
- Admin UI: http://YOUR_HOST_IP:3001
- API Gateway: http://YOUR_HOST_IP:8084
The frontend automatically detects the gateway URL based on which IP the user accesses it from.
📋 Pre-Installation Checklist¶
Required (Must Have)¶
- Docker installed (v20.10+)
- Docker Compose installed (v2.0+)
- At least 8GB RAM available
- At least 20GB free disk space
Optional (For Enhanced Features)¶
- NVIDIA GPU + drivers (for GPU model serving)
- NVIDIA Container Toolkit (for GPU access)
- Static IP on host machine (recommended for production)
Verify Prerequisites¶
make install-deps
🔧 Configuration Files (Optional Overrides Only)¶
Directory Structure¶
Cortex-vLLM/
├── docker.compose.dev.yaml ← Main config (edit if needed)
├── docker.compose.prod.yaml ← Production config
├── backend/
│ └── .env.dev ← Optional overrides (create if needed)
├── Makefile ← Admin commands
└── scripts/
└── detect-ip.sh ← IP detection logic
When You DON'T Need to Edit Config Files¶
Never edit for:
- ✅ IP addresses (auto-detected)
- ✅ CORS settings (auto-configured)
- ✅ Basic usage (defaults work)
- ✅ Development mode (pre-configured)
When You MIGHT Edit Config Files¶
Only edit docker.compose.dev.yaml if you need to:
- Change port mappings (if ports conflict)
- Modify storage paths for models
- Change database credentials
- Add environment-specific settings
Example - Change Ports:
# docker.compose.dev.yaml
gateway:
  ports: ["8085:8084"]  # Host port changed from 8084 to 8085
frontend:
  ports: ["3002:3001"]  # Host port changed from 3001 to 3002
Then restart:
make restart
make info # See new URLs
🚀 Step-by-Step First Time Setup¶
Step 1: Install Prerequisites¶
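# Ubuntu/Debian
# Note: docker-compose-plugin is published in Docker's official apt repository.
# If apt cannot find the package, add that repository first.
sudo apt-get update
sudo apt-get install -y docker.io docker-compose-plugin make git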
# Start Docker
sudo systemctl start docker
sudo systemctl enable docker
# Add your user to docker group (optional - avoid sudo)
sudo usermod -aG docker $USER
# Log out and back in for this to take effect
Step 2: Clone Repository¶
git clone <your-repo-url>
cd Cortex-vLLM
Step 3: Verify System Can Detect IP¶
# Test IP detection
bash scripts/detect-ip.sh
# Should show your LAN IP, e.g., 192.168.1.181
# NOT localhost
# NOT a Docker bridge IP (172.17-31.x.x)
If it shows "localhost":
- Your network interface might not be configured
- Set manually: export HOST_IP=192.168.1.181
- See troubleshooting below
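The fallback behaviour described above can be sketched as a simple precedence rule: an explicitly set HOST_IP wins, otherwise the detected address is used, otherwise "localhost". This is an illustrative Python sketch under that assumption; `resolve_host_ip` and the `detect` callback are hypothetical names, and the real logic lives in the Makefile and scripts/detect-ip.sh.

```python
import ipaddress
from typing import Callable, Mapping, Optional


def resolve_host_ip(
    env: Mapping[str, str],
    detect: Callable[[], Optional[str]],
) -> str:
    """Pick the IP to advertise: explicit HOST_IP, then detection, then localhost."""
    override = env.get("HOST_IP", "").strip()
    if override:
        # Raises ValueError early if the override is not a valid IP address.
        ipaddress.ip_address(override)
        return override
    detected = detect()
    return detected if detected else "localhost"


# Example: an override takes precedence over whatever detection returns.
print(resolve_host_ip({"HOST_IP": "192.168.1.181"}, lambda: "172.17.0.1"))
```

Validating the override up front means a typo in HOST_IP fails immediately rather than producing a CORS origin that no browser will ever match.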
Step 4: Start Cortex¶
make quick-start
What happens:
1. Detects IP: 192.168.1.181
2. Passes to Docker Compose: HOST_IP=192.168.1.181
3. Builds containers
4. Starts all services
5. Configures CORS automatically
6. Creates admin user
7. Shows you the URLs
Step 5: Access Admin UI¶
# The output will show something like:
# ✓ Cortex is ready!
# Login at: http://192.168.1.181:3001/login (admin/admin)
# Open that URL in your browser
# Use the IP shown, NOT localhost
Step 6: Create API Key¶
# Option 1: Via Makefile
make login # Enter admin/admin
make create-key # Copy the token
# Option 2: Via Admin UI
# Login → API Keys → Create New Key
Step 7: Test API¶
# Get your detected IP
make ip
# Test the API (replace YOUR_TOKEN and YOUR_IP)
curl -H "Authorization: Bearer YOUR_TOKEN" \
-H "Content-Type: application/json" \
http://YOUR_IP:8084/v1/chat/completions \
-d '{"model":"test","messages":[{"role":"user","content":"Hello!"}]}'
🔍 Verifying Everything is Configured Correctly¶
Check 1: Services Running¶
make status
# Should show:
# - gateway (Up)
# - frontend (Up)
# - postgres (Up, healthy)
# - redis (Up)
# - prometheus (Up)
Check 2: Detected IP is Correct¶
make ip
# Should show your LAN IP, not localhost
# Example: 192.168.1.181
Check 3: CORS is Configured¶
# Check CORS in gateway container
docker exec cortex-gateway-1 printenv CORS_ALLOW_ORIGINS
# Should show:
# http://192.168.1.181:3001,http://localhost:3001,http://127.0.0.1:3001
# (with YOUR actual IP, not localhost at the start)
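To check a specific browser origin against that value, a small helper can split the list and compare exact origins. This is a sketch only: `origin_allowed` is a hypothetical name, and it assumes the gateway matches origins exactly (scheme, host, and port), which the source does not spell out.

```python
def origin_allowed(origin: str, cors_allow_origins: str) -> bool:
    """Return True if `origin` appears in the comma-separated allow list."""
    allowed = {
        item.strip().rstrip("/")
        for item in cors_allow_origins.split(",")
        if item.strip()
    }
    return origin.strip().rstrip("/") in allowed


value = "http://192.168.1.181:3001,http://localhost:3001,http://127.0.0.1:3001"
print(origin_allowed("http://192.168.1.181:3001", value))  # True
print(origin_allowed("http://192.168.1.50:3001", value))   # False
```

Paste the output of the printenv command in place of `value` to test the origin a failing browser is actually using.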
Check 4: Health Checks Pass¶
make health
# Should show:
# - Gateway Health: {"status":"ok"}
# - Containers: All running
# - Prometheus: Ready
Check 5: Network Access Works¶
# From host machine
curl http://192.168.1.181:8084/health
# Should return: {"status":"ok"}
# From another device on your network (same command)
curl http://192.168.1.181:8084/health
# Should also work!
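The same check can be scripted, for example from a monitoring job on another machine. The sketch below uses only the Python standard library and assumes the /health endpoint returns the JSON body shown above; `gateway_is_healthy` is a hypothetical helper name.

```python
import json
import urllib.error
import urllib.request


def gateway_is_healthy(base_url: str, timeout: float = 5.0) -> bool:
    """Return True if GET <base_url>/health answers with {"status": "ok"}."""
    try:
        with urllib.request.urlopen(f"{base_url}/health", timeout=timeout) as resp:
            payload = json.loads(resp.read().decode("utf-8"))
    except (urllib.error.URLError, OSError, ValueError):
        # Covers refused connections, timeouts, HTTP errors, and non-JSON bodies.
        return False
    return isinstance(payload, dict) and payload.get("status") == "ok"


# Usage (replace with the IP reported by `make ip`):
#   gateway_is_healthy("http://192.168.1.181:8084")
```

Returning False on every failure mode keeps the helper easy to use in a loop or cron job; inspect the gateway logs when it reports False.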
🛠 Advanced Configuration (Optional)¶
Override Detected IP (If Needed)¶
If IP detection picks the wrong interface:
# Method 1: Environment variable (applies to a single command)
HOST_IP=10.1.10.241 make up
HOST_IP=10.1.10.241 make info
# Method 2: Export (lasts for the current shell session)
export HOST_IP=10.1.10.241
make up
make info
# Method 3: Save to a file (survives new shells, but must be re-exported in each one)
echo "HOST_IP=10.1.10.241" > .env.local
export $(cat .env.local | xargs)
make up
Change Storage Paths¶
Edit docker.compose.dev.yaml:
environment:
  CORTEX_MODELS_DIR_HOST: /path/to/your/models  # Host path
  HF_CACHE_DIR_HOST: /path/to/hf/cache  # Host cache path
Then:
make restart
Enable Rate Limiting¶
Edit docker.compose.dev.yaml:
gateway:
  environment:
    RATE_LIMIT_ENABLED: "true"
    RATE_LIMIT_RPS: 10
    RATE_LIMIT_BURST: 20
Then:
make restart
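To build intuition for how a requests-per-second value and a burst value usually interact, here is a token-bucket sketch in Python. This is an assumption for illustration: the source does not say which algorithm the gateway uses, and `TokenBucket` is a hypothetical class, not part of Cortex.

```python
import time
from typing import Callable


class TokenBucket:
    """Allow `rate` requests per second on average, with bursts up to `burst`."""

    def __init__(self, rate: float, burst: int,
                 clock: Callable[[], float] = time.monotonic):
        self.rate = rate
        self.capacity = float(burst)
        self.tokens = float(burst)  # start full so an initial burst is allowed
        self.clock = clock
        self.last = clock()

    def allow(self) -> bool:
        now = self.clock()
        # Refill in proportion to elapsed time, capped at the burst size.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


bucket = TokenBucket(rate=10, burst=20)
accepted = sum(bucket.allow() for _ in range(25))
print(accepted)  # at least 20: the burst is served immediately, the rest is throttled
```

Under this model, RATE_LIMIT_BURST bounds how many requests can arrive back to back, while RATE_LIMIT_RPS sets the sustained rate once the burst allowance is spent.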
Production Deployment¶
# 1. Edit docker.compose.prod.yaml
# - Set GATEWAY_DEV_ALLOW_ALL_KEYS=false
# - Set strong INTERNAL_VLLM_API_KEY
# - Configure TLS reverse proxy
# 2. Pre-flight check
make prod-check
# 3. Start in production mode
make up ENV=prod
# 4. Change default admin password immediately!
# Login → Users → Edit admin user → Change password
🐛 Troubleshooting¶
"Can't access from another device"¶
Check IP detection:
make ip
# Use the IP shown here in your browser
# NOT "localhost"
Check CORS configuration:
docker exec cortex-gateway-1 printenv CORS_ALLOW_ORIGINS
# Should include your detected IP
Check firewall:
# Ubuntu/Debian - allow ports
sudo ufw status
sudo ufw allow 3001/tcp
sudo ufw allow 8084/tcp
# CentOS/RHEL
sudo firewall-cmd --list-all
sudo firewall-cmd --add-port=3001/tcp --permanent
sudo firewall-cmd --add-port=8084/tcp --permanent
sudo firewall-cmd --reload
"Wrong IP detected"¶
If the script detects a VPN or wrong interface:
# See all available IPs
ip addr show | grep 'inet '
# Identify the correct one (usually 192.168.x.x or 10.x.x.x)
# Override detection
export HOST_IP=192.168.1.181
make restart
make ip # Verify
"Services won't start"¶
# Full diagnostic
make status # What's running?
make logs # Any errors?
make health # Services healthy?
# Nuclear option - reset everything
make clean-all
make quick-start
"Frontend shows but API calls fail"¶
Check gateway is running:
make status
make logs-gateway
Test gateway directly:
curl http://192.168.1.181:8084/health
Check CORS:
# From browser console (F12), check if you see CORS errors
# If yes, verify CORS config:
docker exec cortex-gateway-1 printenv CORS_ALLOW_ORIGINS
🔒 Security Checklist (Production)¶
Before Going to Production¶
# Run pre-flight check
make prod-check
Must configure:
- Disable Dev Mode
  # docker.compose.prod.yaml
  GATEWAY_DEV_ALLOW_ALL_KEYS: "false"
- Set Strong API Key
  INTERNAL_VLLM_API_KEY: "your-strong-random-key-here"
- Restrict CORS (if needed)
  # If you want to restrict to specific IPs only:
  CORS_ALLOW_ORIGINS: http://192.168.1.181:3001
- Change Default Admin Password
  - Login → Users → Edit admin → Set new password
- Enable TLS
  - Set up nginx or traefik reverse proxy
  - Get SSL certificate (Let's Encrypt)
- Set Up Backups
  # Add to crontab
  crontab -e
  # Add:
  0 2 * * * cd /path/to/Cortex-vLLM && make db-backup
- Configure Firewall
  - Only allow ports 3001 and 8084 from trusted networks
  - Block all other incoming traffic
📊 Monitoring Your Deployment¶
Daily Checks¶
# Morning routine
make status # Services up?
make health # Everything healthy?
# If issues found
make logs # Check for errors
Weekly Maintenance¶
# Backup database
make db-backup
# Review logs for errors
make logs-gateway | grep ERROR
# Check disk space
df -h
Monthly Tasks¶
# Clean unused Docker resources
make prune
# Review security settings
make prod-check
# Update containers (if new versions available)
make down
git pull
make up
🌐 Network Configuration Details¶
How Dynamic IP Detection Works¶
graph LR
A[make up] --> B[detect-ip.sh]
B --> C[Scan all interfaces]
C --> D[Filter out Docker/loopback]
D --> E[Score remaining IPs]
E --> F[Select best: 192.168.1.181]
F --> G[Export HOST_IP]
G --> H[Docker Compose]
H --> I[Gateway CORS Config]
IP Scoring Algorithm¶
The system prefers IPs in this order:
- 192.168.x.x → Score: 100 (home/small office)
- 10.x.x.x → Score: 95 (corporate network)
- 172.16-31.x.x → Score: 85 (private, non-Docker)
- Public IPs → Score: 50
- Link-local → Score: 10
Rejected IPs:
- 127.0.0.1 (loopback)
- 172.17-31.x.x (Docker bridges)
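The scoring table can be expressed compactly with Python's `ipaddress` module. This is a sketch of the rules as listed, not a port of scripts/detect-ip.sh. Note one assumption: the preference list and the rejection list overlap for 172.17-31.x.x, and this sketch lets rejection take precedence, so only 172.16.x.x receives the score of 85. The names `score_ip` and `pick_best_ip` are hypothetical.

```python
import ipaddress
from typing import Iterable, Optional

# Assumption: 172.17.0.0/16 through 172.31.0.0/16 are treated as Docker bridges.
DOCKER_BRIDGE_RANGES = [ipaddress.ip_network(f"172.{n}.0.0/16") for n in range(17, 32)]


def score_ip(addr: str) -> Optional[int]:
    """Return the preference score for an IPv4 address, or None if it is rejected."""
    ip = ipaddress.IPv4Address(addr)
    if ip.is_loopback or any(ip in net for net in DOCKER_BRIDGE_RANGES):
        return None
    if ip in ipaddress.ip_network("192.168.0.0/16"):
        return 100
    if ip in ipaddress.ip_network("10.0.0.0/8"):
        return 95
    if ip in ipaddress.ip_network("172.16.0.0/12"):
        return 85
    if ip.is_link_local:
        return 10
    return 50  # everything else is treated as a public address


def pick_best_ip(candidates: Iterable[str]) -> Optional[str]:
    """Return the highest-scoring candidate, or None if all are rejected."""
    scored = [(score_ip(c), c) for c in candidates]
    usable = [(s, c) for s, c in scored if s is not None]
    return max(usable, key=lambda pair: pair[0])[1] if usable else None


print(pick_best_ip(["127.0.0.1", "172.17.0.1", "10.1.10.241", "192.168.1.181"]))
# 192.168.1.181
```

If the wrong interface wins on a multi-homed host (for example a VPN in 10.x.x.x), overriding HOST_IP is simpler than changing the scores.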
Frontend Auto-Detection¶
The Next.js frontend automatically detects which IP the user accessed it from:
// User accesses: http://192.168.1.181:3001
// Frontend detects: window.location.hostname = "192.168.1.181"
// Calls gateway at: http://192.168.1.181:8084
Result: Works seamlessly from any device!
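The derivation is simple enough to mirror outside the browser, which can help when scripting against the same deployment. The Python sketch below reproduces the hostname-to-gateway mapping described above; it is not the frontend's actual code, and `gateway_url_for` is a hypothetical name. The default port 8084 comes from this guide.

```python
from urllib.parse import urlsplit


def gateway_url_for(page_url: str, gateway_port: int = 8084) -> str:
    """Derive the gateway base URL from the URL the UI was opened with."""
    parts = urlsplit(page_url)
    if not parts.hostname:
        raise ValueError(f"no hostname in {page_url!r}")
    # Reuse the scheme and host the user typed; only the port changes.
    return f"{parts.scheme}://{parts.hostname}:{gateway_port}"


print(gateway_url_for("http://192.168.1.181:3001/login"))
# http://192.168.1.181:8084
```

Because the host is taken from the page URL, a user who opens the UI through localhost gets a localhost gateway URL, which only works on the host machine itself.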
🎓 Common Administrative Tasks¶
Adding a New User¶
# Option 1: Via Admin UI
# Login → Users → Create User
# Option 2: Via API
curl -X POST http://192.168.1.181:8084/admin/users \
-b cookies.txt \
-H 'Content-Type: application/json' \
-d '{"username":"john","password":"secret","role":"User"}'
Creating API Keys for Users¶
# Via Admin UI
# Login → API Keys → Create Key → Assign to User
# Save the token immediately (shown only once!)
Deploying a Model¶
# Via Admin UI
# Login → Models → Create Model
# - Choose engine (vLLM or llama.cpp)
# - Configure parameters
# - Click "Start"
# Monitor startup
make logs | grep model
Backing Up Data¶
# Manual backup
make db-backup
# Automated backups (daily at 2 AM)
crontab -e
# Add:
0 2 * * * cd /path/to/Cortex-vLLM && make db-backup
Restoring from Backup¶
# List available backups
ls -lh backups/
# Restore
make db-restore BACKUP_FILE=backups/cortex_backup_20251004_143000.sql
# Restart services
make restart
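When several backups exist, picking the newest one by its filename timestamp avoids relying on file modification times. The sketch below assumes every backup follows the `cortex_backup_YYYYMMDD_HHMMSS.sql` pattern inferred from the single example above; `backup_timestamp` and `latest_backup` are hypothetical helpers.

```python
import re
from datetime import datetime
from typing import Iterable, Optional

# Assumed naming pattern, inferred from cortex_backup_20251004_143000.sql.
BACKUP_NAME = re.compile(r"cortex_backup_(\d{8}_\d{6})\.sql$")


def backup_timestamp(name: str) -> Optional[datetime]:
    """Parse the timestamp embedded in a backup filename, or return None."""
    match = BACKUP_NAME.search(name)
    if not match:
        return None
    try:
        return datetime.strptime(match.group(1), "%Y%m%d_%H%M%S")
    except ValueError:
        return None  # digits present but not a real date/time


def latest_backup(names: Iterable[str]) -> Optional[str]:
    """Return the name with the most recent embedded timestamp, if any."""
    dated = [(backup_timestamp(n), n) for n in names]
    dated = [(ts, n) for ts, n in dated if ts is not None]
    return max(dated, key=lambda pair: pair[0])[1] if dated else None


print(latest_backup([
    "backups/cortex_backup_20251003_020000.sql",
    "backups/cortex_backup_20251004_143000.sql",
]))
# backups/cortex_backup_20251004_143000.sql
```

The returned path can be passed directly as BACKUP_FILE to make db-restore.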
📱 Multi-Device Access¶
Scenario 1: Access from Host Machine¶
Browser on host: http://192.168.1.181:3001 ✅
Browser on host: http://localhost:3001 ✅ (also works)
Scenario 2: Access from Other Devices¶
Laptop on network: http://192.168.1.181:3001 ✅
Tablet on network: http://192.168.1.181:3001 ✅
Phone on network: http://192.168.1.181:3001 ✅
Using localhost: ❌ Won't work (localhost is device-local)
Scenario 3: API Calls from Applications¶
# Python application on the network
import requests
# Use the host IP
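response = requests.post(
    "http://192.168.1.181:8084/v1/chat/completions",
    headers={"Authorization": "Bearer YOUR_TOKEN"},
    json={
        "model": "your-model",
        "messages": [{"role": "user", "content": "Hello"}],
    },
    timeout=60,  # avoid hanging indefinitely if the gateway is unreachable
)
response.raise_for_status()  # surface HTTP error responses instead of ignoring them
print(response.json())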
🎯 Configuration Validation Commands¶
Complete Health Check¶
# 1. Check IP detection
make ip
# Expected: Shows your LAN IP (192.168.x.x or 10.x.x.x)
# 2. Check services
make status
# Expected: All containers "Up"
# 3. Check health endpoints
make health
# Expected: Gateway returns {"status":"ok"}
# 4. Verify CORS
docker exec cortex-gateway-1 printenv CORS_ALLOW_ORIGINS
# Expected: Includes your detected IP
# 5. Test from browser
# Open: http://YOUR_IP:3001
# Expected: Login page loads
# 6. Test API
curl http://YOUR_IP:8084/health
# Expected: {"status":"ok"}
📝 Important Notes for Administrators¶
✅ DO This¶
- Always use the detected IP shown by make ip
- Run make db-backup before making changes
- Check make health regularly
- Review logs with make logs if issues occur
- Use make help to see all available commands
❌ DON'T Do This¶
- Don't edit CORS_ALLOW_ORIGINS manually - it's auto-configured
- Don't use "localhost" for network access - use the detected IP
- Don't skip backups before database changes
- Don't run make db-reset without backing up first
- Don't expose to the internet without TLS and proper security
🚨 If Something Goes Wrong¶
Quick Recovery¶
# Stop everything
make down
# Clean everything
make clean-all
# Start fresh
make quick-start
# Should work now!
Get Help¶
# Check what IP was detected
make ip
# Check logs for errors
make logs-gateway | tail -100
# Check CORS configuration
docker exec cortex-gateway-1 printenv CORS_ALLOW_ORIGINS
# Test connectivity
curl -v http://192.168.1.181:8084/health
Contact Support¶
If issues persist:
1. Run diagnostic: make health > diagnostic.txt
2. Save logs: make logs > logs.txt 2>&1
3. Note your IP: make ip
4. Share diagnostic.txt and logs.txt with support
📖 Additional Documentation¶
In This Repository:
- README.md - Overview and quick start
- START_HERE.md - 5-minute quick start guide
- docs/operations/makefile-guide.md - Complete command reference
- docs/architecture/ip-detection.md - Technical details on IP detection
- docs/architecture/configuration-flow.md - How automatic configuration works
- docs/getting-started/configuration-checklist.md - Validation checklist
Online Docs:
- Full documentation: https://aulendurforge.github.io/Cortex-vLLM/
Quick Help:
make help # See all commands
make info # See current configuration
✨ Summary¶
For 99% of deployments:
# This is all you need:
make quick-start
The system automatically:
- ✅ Detects your IP
- ✅ Configures CORS
- ✅ Sets up networking
- ✅ Creates admin user
- ✅ Starts all services
Access at the IP shown in the output. That's it! 🎉
🔐 Security Best Practices¶
- Change default password immediately after quick-start
- Use strong API keys for production
- Enable firewall rules
- Set up TLS for production (nginx/traefik)
- Regular backups (automated via cron)
- Monitor logs for suspicious activity
- Review make prod-check output before production deployment
Questions? Run make help or check MAKEFILE_GUIDE.md