guru-connect/INSTALLATION_GUIDE.md
Mike Swanson e3e95f8fa7
chore: sync repository to current working state
Brings azcomputerguru/guru-connect up to the authoritative working copy that
had been maintained in the claudetools monorepo: Phase 1 security and
infrastructure (middleware, metrics, utils, token blacklist, deployment
scripts, security audits) plus the native-remote-control integration spec.
Preserves the repo .gitignore, .cargo, and server/static/downloads.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-29 06:15:29 -07:00


# GuruConnect Production Infrastructure Installation Guide
**Date:** 2026-01-18
**Server:** 172.16.3.30
**Status:** Core system operational, infrastructure ready for installation

---
## Current Status
- Server Process: Running (PID 3847752)
- Health Check: OK
- Metrics Endpoint: Operational
- Database: Connected (2 users)
- Dashboard: https://connect.azcomputerguru.com/dashboard
**Login:** username=`howard`, password=`AdminGuruConnect2026`

---
## Installation Options
### Option 1: One-Command Installation (Recommended)
Run the master installation script that installs everything:
```bash
ssh guru@172.16.3.30
cd ~/guru-connect
sudo bash install-production-infrastructure.sh
```
This will install:
1. Systemd service for auto-start and management
2. Prometheus & Grafana monitoring stack
3. Automated PostgreSQL backups (daily at 2:00 AM)
4. Log rotation configuration
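
Before running the installer, it can help to confirm that the files the manual steps rely on are present in the checkout. The following is a minimal sketch of such a pre-flight check, not part of the shipped scripts: the function name `preflight` is hypothetical, and whether the master installer reads exactly these files is an assumption based on the paths used in Option 2.

```bash
#!/usr/bin/env bash
# Hypothetical pre-flight check: report which installer inputs are missing.
# The file list is taken from the manual steps in this guide; it is an
# assumption that the master installer depends on the same files.
preflight() {
  local root="$1" missing=0 f
  for f in \
    install-production-infrastructure.sh \
    server/setup-systemd.sh \
    infrastructure/setup-monitoring.sh \
    server/guruconnect-backup.service \
    server/guruconnect-backup.timer \
    server/guruconnect.logrotate
  do
    if [ ! -f "$root/$f" ]; then
      echo "missing: $f" >&2
      missing=1
    fi
  done
  return "$missing"
}

if preflight "$HOME/guru-connect"; then
  echo "all installer inputs present"
else
  echo "some installer inputs are missing (see above)"
fi
```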
**Time:** ~10-15 minutes (Grafana installation takes longest)

---
### Option 2: Step-by-Step Manual Installation
If you prefer to install components individually:
#### Step 1: Install Systemd Service
```bash
ssh guru@172.16.3.30
cd ~/guru-connect/server
sudo ./setup-systemd.sh
```
**What this does:**
- Installs GuruConnect as a systemd service
- Enables auto-start on boot
- Configures auto-restart on failure
- Sets resource limits and security hardening
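
For orientation, a unit with the properties listed above might look roughly like the output of the sketch below. This is not the contents of `server/guruconnect.service`: the `ExecStart` target, the user, and every limit and hardening value are assumptions chosen for illustration. A real unit file can be checked with `systemd-analyze verify`.

```bash
#!/usr/bin/env bash
# Print an illustrative systemd unit to stdout. Nothing is installed.
# ASSUMPTIONS: ExecStart, User, and all limit/hardening values are
# hypothetical; the real unit ships as server/guruconnect.service.
render_guruconnect_unit() {
  cat <<'EOF'
[Unit]
Description=GuruConnect Server (illustrative)
After=network-online.target postgresql.service
Wants=network-online.target

[Service]
Type=simple
User=guru
WorkingDirectory=/home/guru/guru-connect/server
EnvironmentFile=/home/guru/guru-connect/server/.env
ExecStart=/home/guru/guru-connect/server/start-secure.sh
# Auto-restart on failure
Restart=on-failure
RestartSec=5
# Resource limits
LimitNOFILE=65536
MemoryMax=1G
# Security hardening
NoNewPrivileges=true
PrivateTmp=true
ProtectSystem=full

[Install]
# Auto-start on boot once the unit is enabled
WantedBy=multi-user.target
EOF
}

render_guruconnect_unit
```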
**Verify:**
```bash
sudo systemctl status guruconnect
sudo journalctl -u guruconnect -n 20
```
---
#### Step 2: Install Prometheus & Grafana
```bash
ssh guru@172.16.3.30
cd ~/guru-connect/infrastructure
sudo ./setup-monitoring.sh
```
**What this does:**
- Installs Prometheus for metrics collection
- Installs Grafana for visualization
- Configures Prometheus to scrape GuruConnect metrics
- Sets up Prometheus data source in Grafana
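
The scrape configuration mentioned above can be sketched as follows. The job name `guruconnect` and the 15-second interval are assumptions, not values read from `infrastructure/prometheus.yml`; only the port (3002) and the `/metrics` path come from this guide. A real configuration can be validated with `promtool check config`.

```bash
#!/usr/bin/env bash
# Print an illustrative Prometheus scrape job for the GuruConnect metrics
# endpoint. ASSUMPTIONS: job_name and scrape_interval are hypothetical.
render_scrape_config() {
  local target="${1:-localhost:3002}"
  cat <<EOF
scrape_configs:
  - job_name: guruconnect
    metrics_path: /metrics
    scrape_interval: 15s
    static_configs:
      - targets: ['${target}']
EOF
}

render_scrape_config "localhost:3002"
```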
**Access:**
- Prometheus: http://172.16.3.30:9090
- Grafana: http://172.16.3.30:3000 (admin/admin)
**Post-installation:**
1. Access Grafana at http://172.16.3.30:3000
2. Login with admin/admin
3. Change the default password
4. Import dashboard:
   - Go to Dashboards > Import
   - Upload `~/guru-connect/infrastructure/grafana-dashboard.json`
---
#### Step 3: Install Automated Backups
```bash
ssh guru@172.16.3.30
# Create backup directory
sudo mkdir -p /home/guru/backups/guruconnect
sudo chown guru:guru /home/guru/backups/guruconnect
# Install systemd timer
sudo cp ~/guru-connect/server/guruconnect-backup.service /etc/systemd/system/
sudo cp ~/guru-connect/server/guruconnect-backup.timer /etc/systemd/system/
sudo systemctl daemon-reload
sudo systemctl enable guruconnect-backup.timer
sudo systemctl start guruconnect-backup.timer
```
**Verify:**
```bash
sudo systemctl status guruconnect-backup.timer
sudo systemctl list-timers
```
**Test manual backup:**
```bash
cd ~/guru-connect/server
./backup-postgres.sh
ls -lh /home/guru/backups/guruconnect/
```
**Backup Schedule:** Daily at 2:00 AM
**Retention:** 30 daily, 4 weekly, 6 monthly backups
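
To illustrate how the daily tier of this retention policy could be applied, the sketch below lists the backups that fall outside the newest N files. It is an assumption about the approach, not the logic in `backup-postgres.sh`: the function name `list_prune_candidates` is hypothetical, the weekly and monthly tiers are not modelled, and nothing is deleted.

```bash
#!/usr/bin/env bash
# List backups that fall outside the newest KEEP files (daily tier only).
# Relies on the guruconnect-YYYY-MM-DD-HHMMSS.sql.gz naming, which sorts
# chronologically as plain text. Prints candidates only; deletes nothing.
list_prune_candidates() {
  local dir="$1" keep="${2:-30}"
  find "$dir" -maxdepth 1 -type f -name 'guruconnect-*.sql.gz' \
    | sort -r \
    | tail -n +"$((keep + 1))"
}

backup_dir=/home/guru/backups/guruconnect
if [ -d "$backup_dir" ]; then
  # Show what a 30-file daily retention would consider expired.
  list_prune_candidates "$backup_dir" 30
fi
```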
---
#### Step 4: Install Log Rotation
```bash
ssh guru@172.16.3.30
sudo cp ~/guru-connect/server/guruconnect.logrotate /etc/logrotate.d/guruconnect
sudo chmod 644 /etc/logrotate.d/guruconnect
```
**Verify:**
```bash
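# Show the installed configuration
sudo cat /etc/logrotate.d/guruconnect
# Dry run: -d prints what would be rotated without changing any files
sudo logrotate -d /etc/logrotate.d/guruconnect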
```
**Log Rotation:** Daily, 30 days retention, compressed

---
## Verification
After installation, verify everything is working:
```bash
ssh guru@172.16.3.30
bash ~/guru-connect/verify-installation.sh
```
Expected output (all green):
- Server process: Running
- Health endpoint: OK
- Metrics endpoint: OK
- Systemd service: Active
- Prometheus: Active
- Grafana: Active
- Backup timer: Active
- Log rotation: Configured
- Database: Connected
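
Several of the checks above can be reproduced by hand. The sketch below is not the contents of `verify-installation.sh`; it is an assumed equivalent built from the unit names and paths used elsewhere in this guide, and it assumes it runs on the server itself so that the metrics endpoint is reachable on `localhost:3002`. The function names `check` and `run_checks` are hypothetical.

```bash
#!/usr/bin/env bash
# Illustrative only: an assumed approximation of the verification checks.

# Run a command quietly and print "<name> OK" or "<name> FAIL".
check() {
  local name="$1"; shift
  if "$@" >/dev/null 2>&1; then
    printf '%-18s OK\n' "$name"
  else
    printf '%-18s FAIL\n' "$name"
    return 1
  fi
}

run_checks() {
  local failed=0
  check "Metrics endpoint" curl -fsS --max-time 5 http://localhost:3002/metrics || failed=1
  check "Systemd service"  systemctl is-active --quiet guruconnect || failed=1
  check "Prometheus"       systemctl is-active --quiet prometheus || failed=1
  check "Grafana"          systemctl is-active --quiet grafana-server || failed=1
  check "Backup timer"     systemctl is-active --quiet guruconnect-backup.timer || failed=1
  check "Log rotation"     test -f /etc/logrotate.d/guruconnect || failed=1
  return "$failed"
}

# Only touch the live host when asked to, e.g.: bash checks.sh --run
if [ "${1:-}" = "--run" ]; then
  run_checks
fi
```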
---
## Post-Installation Tasks
### 1. Configure Grafana
1. Access http://172.16.3.30:3000
2. Login with admin/admin
3. Change password when prompted
4. Import dashboard:
   ```
   Dashboards > Import > Upload JSON file
   Select: ~/guru-connect/infrastructure/grafana-dashboard.json
   ```
### 2. Test Backup & Restore
**Test backup:**
```bash
ssh guru@172.16.3.30
cd ~/guru-connect/server
./backup-postgres.sh
```
**Verify backup created:**
```bash
ls -lh /home/guru/backups/guruconnect/
```
**Test restore (CAUTION: the restore script drops and recreates the database, so run it only against a test database):**
```bash
cd ~/guru-connect/server
./restore-postgres.sh /home/guru/backups/guruconnect/guruconnect-YYYY-MM-DD-HHMMSS.sql.gz
```
### 3. Configure NPM (Nginx Proxy Manager)
If Prometheus/Grafana need external access:
1. Add proxy hosts in NPM:
   - prometheus.azcomputerguru.com -> http://172.16.3.30:9090
   - grafana.azcomputerguru.com -> http://172.16.3.30:3000
2. Enable SSL/TLS via Let's Encrypt
3. Restrict access (firewall or NPM access lists)
### 4. Test Health Monitoring
```bash
ssh guru@172.16.3.30
cd ~/guru-connect/server
./health-monitor.sh
```
Expected output: All checks passed

---
## Service Management
### GuruConnect Server
```bash
# Start server
sudo systemctl start guruconnect
# Stop server
sudo systemctl stop guruconnect
# Restart server
sudo systemctl restart guruconnect
# Check status
sudo systemctl status guruconnect
# View logs
sudo journalctl -u guruconnect -f
# View recent logs
sudo journalctl -u guruconnect -n 100
```
### Prometheus
```bash
# Status
sudo systemctl status prometheus
# Restart
sudo systemctl restart prometheus
# Logs
sudo journalctl -u prometheus -n 50
```
### Grafana
```bash
# Status
sudo systemctl status grafana-server
# Restart
sudo systemctl restart grafana-server
# Logs
sudo journalctl -u grafana-server -n 50
```
### Backups
```bash
# Check timer status
sudo systemctl status guruconnect-backup.timer
# Check when next backup runs
sudo systemctl list-timers
# Manually trigger backup
sudo systemctl start guruconnect-backup.service
# View backup logs
sudo journalctl -u guruconnect-backup -n 20
```
---
## Troubleshooting
### Server Won't Start
```bash
# Check logs
sudo journalctl -u guruconnect -n 50
# Check if port 3002 is in use (ss replaces netstat on systems without net-tools)
sudo ss -tulpn | grep ':3002'
# Verify .env file
cat ~/guru-connect/server/.env
# Test manual start
cd ~/guru-connect/server
./start-secure.sh
```
### Database Connection Issues
```bash
# Test PostgreSQL
PGPASSWORD=gc_a7f82d1e4b9c3f60 psql -h localhost -U guruconnect -d guruconnect -c 'SELECT 1'
# Check PostgreSQL service
sudo systemctl status postgresql
# Verify DATABASE_URL in .env (the output contains the database password)
grep DATABASE_URL ~/guru-connect/server/.env
```
### Prometheus Not Scraping Metrics
```bash
# Check Prometheus targets
# Access: http://172.16.3.30:9090/targets
# Verify GuruConnect metrics endpoint
curl http://172.16.3.30:3002/metrics
# Check Prometheus config
sudo cat /etc/prometheus/prometheus.yml
# Restart Prometheus
sudo systemctl restart prometheus
```
### Grafana Dashboard Not Loading
```bash
# Check Grafana logs
sudo journalctl -u grafana-server -n 50
# Verify data source
# Access: http://172.16.3.30:3000/datasources
# Test Prometheus connection (quote the URL so the shell does not interpret "?")
curl 'http://localhost:9090/api/v1/query?query=up'
```
---
## Monitoring & Alerts
### Prometheus Alerts
Configured alerts (from `infrastructure/alerts.yml`):
1. **GuruConnectDown** - Server unreachable for 1 minute
2. **HighErrorRate** - >10 errors/second for 5 minutes
3. **TooManyActiveSessions** - >100 active sessions
4. **HighRequestLatency** - p95 >1s for 5 minutes
5. **DatabaseOperationsFailure** - DB errors >1/second
6. **ServerRestarted** - Uptime <5 minutes (informational)
**View alerts:** http://172.16.3.30:9090/alerts
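
As an illustration of the rule format, the first alert in the list could be expressed roughly as below. This is a sketch, not an excerpt from `infrastructure/alerts.yml`: the group name, the `job="guruconnect"` label, and the `severity` label are assumptions. The other alerts depend on application metric names that this guide does not list, so they are not shown. A rules file can be checked with `promtool check rules`.

```bash
#!/usr/bin/env bash
# Print an illustrative Prometheus rule for the GuruConnectDown alert.
# ASSUMPTIONS: group name, job label, and severity label are hypothetical.
render_down_alert() {
  cat <<'EOF'
groups:
  - name: guruconnect
    rules:
      - alert: GuruConnectDown
        expr: up{job="guruconnect"} == 0
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "GuruConnect server unreachable for 1 minute"
EOF
}

render_down_alert
```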
### Grafana Dashboard
Pre-configured panels:
1. Active Sessions (gauge)
2. Requests per Second (graph)
3. Error Rate (graph with alerting)
4. Request Latency p50/p95/p99 (graph)
5. Active Connections by Type (stacked graph)
6. Database Query Duration (graph)
7. Server Uptime (singlestat)
8. Total Sessions Created (singlestat)
9. Total Requests (singlestat)
10. Total Errors (singlestat with thresholds)
---
## Backup & Recovery
### Manual Backup
```bash
cd ~/guru-connect/server
./backup-postgres.sh
```
Backup location: `/home/guru/backups/guruconnect/guruconnect-YYYY-MM-DD-HHMMSS.sql.gz`
### Restore from Backup
**WARNING:** This will drop and recreate the database!
```bash
cd ~/guru-connect/server
./restore-postgres.sh /path/to/backup.sql.gz
```
The script will:
1. Stop GuruConnect service
2. Drop existing database
3. Recreate database
4. Restore from backup
5. Restart service
### Backup Verification
```bash
# List backups
ls -lh /home/guru/backups/guruconnect/
# Check backup size
du -sh /home/guru/backups/guruconnect/*
# Verify backup contents (without restoring)
zcat /path/to/backup.sql.gz | head -50
```
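
The checks above can be combined into a small integrity test that runs without restoring anything. This is a minimal sketch under the assumption that backups are plain gzip-compressed SQL dumps, as the `.sql.gz` suffix suggests; the function name `verify_backup` is hypothetical.

```bash
#!/usr/bin/env bash
# Return 0 if FILE is a non-empty, intact gzip archive; non-zero otherwise.
# Nothing is restored and the file is not modified.
verify_backup() {
  local file="$1"
  [ -s "$file" ] || { echo "missing or empty: $file" >&2; return 1; }
  gzip -t "$file" 2>/dev/null || { echo "corrupt gzip: $file" >&2; return 1; }
  echo "ok: $file"
}

for f in /home/guru/backups/guruconnect/guruconnect-*.sql.gz; do
  [ -e "$f" ] || continue   # the glob matched nothing
  verify_backup "$f"
done
```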
---
## Security Checklist
- [x] JWT secret configured (96-char base64)
- [x] Database password changed from default
- [x] Admin password changed from default
- [x] Security headers enabled (CSP, X-Frame-Options, etc.)
- [x] Database credentials in .env (not committed to git)
- [ ] Grafana default password changed (admin/admin)
- [ ] Firewall rules configured (limit access to monitoring ports)
- [ ] SSL/TLS enabled for public endpoints
- [ ] Backup encryption (optional - consider encrypting backups)
- [ ] Regular security updates (OS, PostgreSQL, Prometheus, Grafana)
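
For reference, a 96-character base64 secret corresponds to 72 random bytes, since base64 emits 4 characters for every 3 bytes and 72 is a multiple of 3 (so no padding is added). The sketch below shows one way to generate a value of that shape; it is an assumption about a suitable method, not a record of how the existing JWT secret was created, and the function name `generate_secret` is hypothetical.

```bash
#!/usr/bin/env bash
# Generate a 96-character base64 string from 72 random bytes.
# The newline removal handles base64 implementations that wrap long output.
generate_secret() {
  head -c 72 /dev/urandom | base64 | tr -d '\n'
}

secret="$(generate_secret)"
# Report only the length so the secret itself never reaches the terminal.
echo "generated ${#secret} characters"
```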
---
## Files Reference
### Configuration Files
- `server/.env` - Environment variables and secrets
- `server/guruconnect.service` - Systemd service unit
- `infrastructure/prometheus.yml` - Prometheus scrape config
- `infrastructure/alerts.yml` - Alert rules
- `infrastructure/grafana-dashboard.json` - Pre-built dashboard
### Scripts
- `server/start-secure.sh` - Manual server start
- `server/backup-postgres.sh` - Manual backup
- `server/restore-postgres.sh` - Restore from backup
- `server/health-monitor.sh` - Health checks
- `server/setup-systemd.sh` - Install systemd service
- `infrastructure/setup-monitoring.sh` - Install Prometheus/Grafana
- `install-production-infrastructure.sh` - Master installer
- `verify-installation.sh` - Verify installation status
---
## Support & Documentation
**Main Documentation:**
- `PHASE1_WEEK2_INFRASTRUCTURE.md` - Week 2 planning
- `DEPLOYMENT_WEEK2_INFRASTRUCTURE.md` - Week 2 deployment log
- `CLAUDE.md` - Project coding guidelines
**Gitea Repository:**
- https://git.azcomputerguru.com/azcomputerguru/guru-connect
**Dashboard:**
- https://connect.azcomputerguru.com/dashboard
**API Docs:**
- http://172.16.3.30:3002/api/docs (if OpenAPI enabled)
---
## Next Steps (Phase 1 Week 3)
After infrastructure is fully installed:
1. **CI/CD Automation**
   - Gitea CI pipeline configuration
   - Automated builds on commit
   - Automated tests in CI
   - Deployment automation
   - Build artifact storage
   - Version tagging
2. **Advanced Monitoring**
   - Alertmanager configuration for email/Slack alerts
   - Custom Grafana dashboards
   - Log aggregation (optional - Loki)
   - Distributed tracing (optional - Jaeger)
3. **Production Hardening**
   - Firewall configuration
   - Fail2ban for brute-force protection
   - Rate limiting
   - DDoS protection
   - Regular security audits
---
**Last Updated:** 2026-01-18 04:00 UTC
**Version:** Phase 1 Week 2 Complete