# GuruConnect Production Infrastructure Installation Guide
**Date:** 2026-01-18
**Server:** 172.16.3.30
**Status:** Core system operational, infrastructure ready for installation

---
## Current Status
- Server Process: Running (PID 3847752)
- Health Check: OK
- Metrics Endpoint: Operational
- Database: Connected (2 users)
- Dashboard: https://connect.azcomputerguru.com/dashboard
**Login:** username=`howard`, password=`AdminGuruConnect2026`
---
## Installation Options
### Option 1: One-Command Installation (Recommended)
Run the master installation script, which installs every component listed below:
```bash
ssh guru@172.16.3.30
cd ~/guru-connect
sudo bash install-production-infrastructure.sh
```
This will install:
1. Systemd service for auto-start and management
2. Prometheus & Grafana monitoring stack
3. Automated PostgreSQL backups (daily at 2:00 AM)
4. Log rotation configuration
**Time:** ~10-15 minutes (Grafana installation takes longest)
---
### Option 2: Step-by-Step Manual Installation
If you prefer to install components individually:
#### Step 1: Install Systemd Service
```bash
ssh guru@172.16.3.30
cd ~/guru-connect/server
sudo ./setup-systemd.sh
```
**What this does:**
- Installs GuruConnect as a systemd service
- Enables auto-start on boot
- Configures auto-restart on failure
- Sets resource limits and security hardening
**Verify:**
```bash
sudo systemctl status guruconnect
sudo journalctl -u guruconnect -n 20
```
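For orientation, the behaviours listed under "What this does" map onto a handful of standard systemd directives. The sketch below prints an illustrative unit to stdout; it is not the unit that `setup-systemd.sh` installs. The function name `render_guruconnect_unit`, the `ExecStart` path, and every numeric value are assumptions made for the example.

```bash
# Hypothetical sketch: print an example unit to stdout. Nothing is installed.
render_guruconnect_unit() {
  cat <<'EOF'
[Unit]
Description=GuruConnect server (example)
After=network-online.target postgresql.service
Wants=network-online.target

[Service]
User=guru
WorkingDirectory=/home/guru/guru-connect/server
# Assumed entry point; the real unit may start the binary directly.
ExecStart=/home/guru/guru-connect/server/start-secure.sh
# Auto-restart on failure
Restart=on-failure
RestartSec=5
# Resource limit (placeholder value)
LimitNOFILE=65536
# Security hardening (a small, common subset)
NoNewPrivileges=true
PrivateTmp=true
ProtectSystem=full

[Install]
# Auto-start on boot once the unit is enabled
WantedBy=multi-user.target
EOF
}

# Compare the example with what is actually installed:
#   render_guruconnect_unit
#   systemctl cat guruconnect
```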
---
#### Step 2: Install Prometheus & Grafana
```bash
ssh guru@172.16.3.30
cd ~/guru-connect/infrastructure
sudo ./setup-monitoring.sh
```
**What this does:**
- Installs Prometheus for metrics collection
- Installs Grafana for visualization
- Configures Prometheus to scrape GuruConnect metrics
- Sets up Prometheus data source in Grafana
**Access:**
- Prometheus: http://172.16.3.30:9090
- Grafana: http://172.16.3.30:3000 (admin/admin)
**Post-installation:**
1. Access Grafana at http://172.16.3.30:3000
2. Login with admin/admin
3. Change the default password
4. Import dashboard:
   - Go to Dashboards > Import
   - Upload `~/guru-connect/infrastructure/grafana-dashboard.json`
---
#### Step 3: Install Automated Backups
```bash
ssh guru@172.16.3.30
# Create backup directory
sudo mkdir -p /home/guru/backups/guruconnect
sudo chown guru:guru /home/guru/backups/guruconnect
# Install systemd timer
sudo cp ~/guru-connect/server/guruconnect-backup.service /etc/systemd/system/
sudo cp ~/guru-connect/server/guruconnect-backup.timer /etc/systemd/system/
sudo systemctl daemon-reload
sudo systemctl enable guruconnect-backup.timer
sudo systemctl start guruconnect-backup.timer
```
**Verify:**
```bash
sudo systemctl status guruconnect-backup.timer
sudo systemctl list-timers
```
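A timer that fires daily at 2:00 AM, as this step schedules, is usually expressed with `OnCalendar`. The sketch below prints an illustrative timer unit; the contents of the shipped `guruconnect-backup.timer` may differ, and `render_backup_timer` is a hypothetical helper name. By default a timer activates the service with the same base name, here `guruconnect-backup.service`.

```bash
# Hypothetical sketch: print an example timer unit to stdout.
render_backup_timer() {
  cat <<'EOF'
[Unit]
Description=Daily GuruConnect PostgreSQL backup (example)

[Timer]
# Every day at 02:00 local time
OnCalendar=*-*-* 02:00:00
# If the machine was off at 02:00, run the missed backup after boot
Persistent=true

[Install]
WantedBy=timers.target
EOF
}

# Check how systemd interprets the schedule expression:
#   systemd-analyze calendar '*-*-* 02:00:00'
```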
**Test manual backup:**
```bash
cd ~/guru-connect/server
./backup-postgres.sh
ls -lh /home/guru/backups/guruconnect/
```
**Backup Schedule:** Daily at 2:00 AM
**Retention:** 30 daily, 4 weekly, 6 monthly backups

---
#### Step 4: Install Log Rotation
```bash
ssh guru@172.16.3.30
sudo cp ~/guru-connect/server/guruconnect.logrotate /etc/logrotate.d/guruconnect
sudo chmod 644 /etc/logrotate.d/guruconnect
```
**Verify:**
```bash
sudo cat /etc/logrotate.d/guruconnect
sudo logrotate -d /etc/logrotate.d/guruconnect
```
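The rotation policy for this step (daily, 30 days retention, compressed) corresponds to three logrotate directives. The sketch below prints an illustrative config; the log path `/var/log/guruconnect/*.log` and the helper name `render_logrotate_config` are assumptions, and the shipped `guruconnect.logrotate` may contain other options.

```bash
# Hypothetical sketch: print an example logrotate config to stdout.
render_logrotate_config() {
  cat <<'EOF'
# Log path is an assumption for this example
/var/log/guruconnect/*.log {
    # rotate once per day
    daily
    # keep 30 rotated files, roughly 30 days
    rotate 30
    # gzip rotated files
    compress
    # do not fail if no log file exists yet
    missingok
    # skip rotation when the log is empty
    notifempty
}
EOF
}

# Dry-run the example without touching any logs:
#   render_logrotate_config > /tmp/guruconnect.logrotate.example
#   logrotate -d /tmp/guruconnect.logrotate.example
```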
**Log Rotation:** Daily, 30 days retention, compressed

---
## Verification
After installation, verify everything is working:
```bash
ssh guru@172.16.3.30
bash ~/guru-connect/verify-installation.sh
```
Expected output (all green):
- Server process: Running
- Health endpoint: OK
- Metrics endpoint: OK
- Systemd service: Active
- Prometheus: Active
- Grafana: Active
- Backup timer: Active
- Log rotation: Configured
- Database: Connected
---
## Post-Installation Tasks
### 1. Configure Grafana
1. Access http://172.16.3.30:3000
2. Login with admin/admin
3. Change password when prompted
4. Import dashboard:
```
Dashboards > Import > Upload JSON file
Select: ~/guru-connect/infrastructure/grafana-dashboard.json
```
### 2. Test Backup & Restore
**Test backup:**
```bash
ssh guru@172.16.3.30
cd ~/guru-connect/server
./backup-postgres.sh
```
**Verify backup created:**
```bash
ls -lh /home/guru/backups/guruconnect/
```
**Test restore (CAUTION: the restore script drops and recreates the database; run it against a test database):**
```bash
cd ~/guru-connect/server
./restore-postgres.sh /home/guru/backups/guruconnect/guruconnect-YYYY-MM-DD-HHMMSS.sql.gz
```
### 3. Configure NPM (Nginx Proxy Manager)
If Prometheus/Grafana need external access:
1. Add proxy hosts in NPM:
   - prometheus.azcomputerguru.com -> http://172.16.3.30:9090
   - grafana.azcomputerguru.com -> http://172.16.3.30:3000
2. Enable SSL/TLS via Let's Encrypt
3. Restrict access (firewall or NPM access lists)
### 4. Test Health Monitoring
```bash
ssh guru@172.16.3.30
cd ~/guru-connect/server
./health-monitor.sh
```
Expected output: All checks passed

---
## Service Management
### GuruConnect Server
```bash
# Start server
sudo systemctl start guruconnect
# Stop server
sudo systemctl stop guruconnect
# Restart server
sudo systemctl restart guruconnect
# Check status
sudo systemctl status guruconnect
# View logs
sudo journalctl -u guruconnect -f
# View recent logs
sudo journalctl -u guruconnect -n 100
```
### Prometheus
```bash
# Status
sudo systemctl status prometheus
# Restart
sudo systemctl restart prometheus
# Logs
sudo journalctl -u prometheus -n 50
```
### Grafana
```bash
# Status
sudo systemctl status grafana-server
# Restart
sudo systemctl restart grafana-server
# Logs
sudo journalctl -u grafana-server -n 50
```
### Backups
```bash
# Check timer status
sudo systemctl status guruconnect-backup.timer
# Check when next backup runs
sudo systemctl list-timers
# Manually trigger backup
sudo systemctl start guruconnect-backup.service
# View backup logs
sudo journalctl -u guruconnect-backup -n 20
```
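The commands above do not show how old archives are pruned. As a rough sketch of only the daily tier of the stated retention policy (30 daily backups), the hypothetical helper below deletes archives older than a given number of days. Whether `backup-postgres.sh` already does this is not stated here, and the weekly and monthly tiers are deliberately left out.

```bash
# prune_daily_backups DIR [DAYS]
# Delete guruconnect-*.sql.gz files directly inside DIR whose age in whole
# days exceeds DAYS (default 30), printing each path as it is removed.
# Weekly and monthly tiers are NOT handled by this sketch.
prune_daily_backups() (
  dir=$1
  days=${2:-30}
  find "$dir" -maxdepth 1 -type f -name 'guruconnect-*.sql.gz' \
    -mtime +"$days" -print -delete
)

# Preview what would be removed, without deleting anything:
#   find /home/guru/backups/guruconnect -maxdepth 1 -name 'guruconnect-*.sql.gz' -mtime +30
# Prune:
#   prune_daily_backups /home/guru/backups/guruconnect 30
```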
---
## Troubleshooting
### Server Won't Start
```bash
# Check logs
sudo journalctl -u guruconnect -n 50
# Check if port 3002 is in use (ss replaces netstat, which is often not installed)
sudo ss -tulpn | grep ':3002'
# Verify .env file
cat ~/guru-connect/server/.env
# Test manual start
cd ~/guru-connect/server
./start-secure.sh
```
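Running `cat` on `.env` prints every secret to the terminal. A gentler first check is to confirm that the expected keys are present and non-empty without showing their values. The helper below is a minimal sketch; `check_env_keys` is a hypothetical name, and of the keys in the usage line only `DATABASE_URL` appears in this guide (`JWT_SECRET` is an assumed variable name).

```bash
# check_env_keys FILE KEY...
# Report whether each KEY is set to a non-empty value in FILE.
# Values are never printed. Exit status is 1 if any key is missing.
check_env_keys() (
  file=$1
  shift
  status=0
  for key in "$@"; do
    if grep -Eq "^(export[[:space:]]+)?${key}=.+" "$file"; then
      echo "ok: $key"
    else
      echo "missing: $key"
      status=1
    fi
  done
  exit "$status"
)
```

Example: `check_env_keys ~/guru-connect/server/.env DATABASE_URL JWT_SECRET`.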
### Database Connection Issues
```bash
# Test PostgreSQL
PGPASSWORD=gc_a7f82d1e4b9c3f60 psql -h localhost -U guruconnect -d guruconnect -c 'SELECT 1'
# Check PostgreSQL service
sudo systemctl status postgresql
# Verify DATABASE_URL in .env
grep DATABASE_URL ~/guru-connect/server/.env
```
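Printing `DATABASE_URL` also prints the database password. When the output may end up in a ticket or a shared terminal, the password can be masked first. The filter below is a sketch that assumes the common `scheme://user:password@host:port/db` layout and a password without an `@` character; `redact_db_url` is a hypothetical helper name.

```bash
# redact_db_url: read text on stdin and mask the password portion of any
# scheme://user:password@host URL, leaving all other text unchanged.
redact_db_url() {
  sed -E 's#(://[^:/@]+):[^@]*@#\1:***@#'
}
```

Example: `grep DATABASE_URL ~/guru-connect/server/.env | redact_db_url` shows the user, host, port, and database name with the password replaced by `***`.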
### Prometheus Not Scraping Metrics
```bash
# Check Prometheus targets
# Access: http://172.16.3.30:9090/targets
# Verify GuruConnect metrics endpoint
curl http://172.16.3.30:3002/metrics
# Check Prometheus config
sudo cat /etc/prometheus/prometheus.yml
# Restart Prometheus
sudo systemctl restart prometheus
```
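When the endpoint responds but a dashboard panel stays empty, it can help to confirm that a specific metric is actually exported. The sketch below pulls one value out of the Prometheus text exposition format. `metric_value` is a hypothetical helper and `guruconnect_active_sessions` an assumed metric name; the sketch also assumes samples carry no trailing timestamp, so the last field is the value.

```bash
# metric_value NAME
# Read Prometheus text exposition from stdin and print the value of the
# first sample whose metric name is NAME. Exit status 1 if none is found.
metric_value() {
  awk -v name="$1" '
    /^#/ { next }                    # skip HELP and TYPE comment lines
    {
      metric = $1
      sub(/[{].*/, "", metric)       # drop the label set, if any
      if (metric == name) { print $NF; found = 1; exit }
    }
    END { exit found ? 0 : 1 }
  '
}
```

Example: `curl -s http://172.16.3.30:3002/metrics | metric_value guruconnect_active_sessions`. A non-zero exit status means the metric is not being exported under that name.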
### Grafana Dashboard Not Loading
```bash
# Check Grafana logs
sudo journalctl -u grafana-server -n 50
# Verify data source
# Access: http://172.16.3.30:3000/datasources
# Test Prometheus connection
curl 'http://localhost:9090/api/v1/query?query=up'
```
---
## Monitoring & Alerts
### Prometheus Alerts
Configured alerts (from `infrastructure/alerts.yml`):
1. **GuruConnectDown** - Server unreachable for 1 minute
2. **HighErrorRate** - >10 errors/second for 5 minutes
3. **TooManyActiveSessions** - >100 active sessions
4. **HighRequestLatency** - p95 >1s for 5 minutes
5. **DatabaseOperationsFailure** - DB errors >1/second
6. **ServerRestarted** - Uptime <5 minutes (informational)
**View alerts:** http://172.16.3.30:9090/alerts
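To illustrate the shape of these rules, the first alert in the list (server unreachable for 1 minute) could be written roughly as below. This is a sketch, not the contents of `infrastructure/alerts.yml`: the scrape job name `guruconnect`, the group name, the severity label, and the helper name `render_example_alert_rule` are all assumptions.

```bash
# Hypothetical sketch: print an example Prometheus rule file to stdout.
render_example_alert_rule() {
  cat <<'EOF'
groups:
  - name: guruconnect-example
    rules:
      - alert: GuruConnectDown
        # "guruconnect" is an assumed scrape job name
        expr: up{job="guruconnect"} == 0
        # condition must hold for 1 minute before the alert fires
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "GuruConnect server unreachable for 1 minute"
EOF
}
```

A rendered file can be syntax-checked with `promtool check rules <file>` before it is loaded.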
### Grafana Dashboard
Pre-configured panels:
1. Active Sessions (gauge)
2. Requests per Second (graph)
3. Error Rate (graph with alerting)
4. Request Latency p50/p95/p99 (graph)
5. Active Connections by Type (stacked graph)
6. Database Query Duration (graph)
7. Server Uptime (singlestat)
8. Total Sessions Created (singlestat)
9. Total Requests (singlestat)
10. Total Errors (singlestat with thresholds)
---
## Backup & Recovery
### Manual Backup
```bash
cd ~/guru-connect/server
./backup-postgres.sh
```
Backup location: `/home/guru/backups/guruconnect/guruconnect-YYYY-MM-DD-HHMMSS.sql.gz`
### Restore from Backup
**WARNING:** This will drop and recreate the database!
```bash
cd ~/guru-connect/server
./restore-postgres.sh /path/to/backup.sql.gz
```
The script will:
1. Stop GuruConnect service
2. Drop existing database
3. Recreate database
4. Restore from backup
5. Restart service
### Backup Verification
```bash
# List backups
ls -lh /home/guru/backups/guruconnect/
# Check backup size
du -sh /home/guru/backups/guruconnect/*
# Verify backup contents (without restoring)
zcat /path/to/backup.sql.gz | head -50
```
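Reading the first lines with `zcat` only shows that the start of the archive is readable. `gzip -t` checks the whole compressed stream. The helper below is a minimal sketch with a hypothetical name, `verify_backup`; it confirms that the file is a non-empty, intact gzip archive but says nothing about whether the SQL inside restores cleanly.

```bash
# verify_backup FILE
# Succeed only if FILE exists, is non-empty, and passes gzip's integrity test.
verify_backup() (
  file=$1
  [ -s "$file" ] || { echo "empty or missing: $file" >&2; exit 1; }
  gzip -t "$file" 2>/dev/null || { echo "corrupt gzip: $file" >&2; exit 1; }
  echo "ok: $file"
)

# Check every archive in the backup directory:
#   for f in /home/guru/backups/guruconnect/*.sql.gz; do verify_backup "$f"; done
```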
---
## Security Checklist
- [x] JWT secret configured (96-char base64)
- [x] Database password changed from default
- [x] Admin password changed from default
- [x] Security headers enabled (CSP, X-Frame-Options, etc.)
- [x] Database credentials in .env (not committed to git)
- [ ] Grafana default password changed (admin/admin)
- [ ] Firewall rules configured (limit access to monitoring ports)
- [ ] SSL/TLS enabled for public endpoints
- [ ] Backup encryption (optional - consider encrypting backups)
- [ ] Regular security updates (OS, PostgreSQL, Prometheus, Grafana)
---
## Files Reference
### Configuration Files
- `server/.env` - Environment variables and secrets
- `server/guruconnect.service` - Systemd service unit
- `infrastructure/prometheus.yml` - Prometheus scrape config
- `infrastructure/alerts.yml` - Alert rules
- `infrastructure/grafana-dashboard.json` - Pre-built dashboard
### Scripts
- `server/start-secure.sh` - Manual server start
- `server/backup-postgres.sh` - Manual backup
- `server/restore-postgres.sh` - Restore from backup
- `server/health-monitor.sh` - Health checks
- `server/setup-systemd.sh` - Install systemd service
- `infrastructure/setup-monitoring.sh` - Install Prometheus/Grafana
- `install-production-infrastructure.sh` - Master installer
- `verify-installation.sh` - Verify installation status
---
## Support & Documentation
**Main Documentation:**
- `PHASE1_WEEK2_INFRASTRUCTURE.md` - Week 2 planning
- `DEPLOYMENT_WEEK2_INFRASTRUCTURE.md` - Week 2 deployment log
- `CLAUDE.md` - Project coding guidelines
**Gitea Repository:**
- https://git.azcomputerguru.com/azcomputerguru/guru-connect
**Dashboard:**
- https://connect.azcomputerguru.com/dashboard
**API Docs:**
- http://172.16.3.30:3002/api/docs (if OpenAPI enabled)
---
## Next Steps (Phase 1 Week 3)
After infrastructure is fully installed:
1. **CI/CD Automation**
   - Gitea CI pipeline configuration
   - Automated builds on commit
   - Automated tests in CI
   - Deployment automation
   - Build artifact storage
   - Version tagging
2. **Advanced Monitoring**
   - Alertmanager configuration for email/Slack alerts
   - Custom Grafana dashboards
   - Log aggregation (optional - Loki)
   - Distributed tracing (optional - Jaeger)
3. **Production Hardening**
   - Firewall configuration
   - Fail2ban for brute-force protection
   - Rate limiting
   - DDoS protection
   - Regular security audits
---
**Last Updated:** 2026-01-18 04:00 UTC
**Version:** Phase 1 Week 2 Complete