diff --git a/PHASE_6_TEST_PLAN.md b/PHASE_6_TEST_PLAN.md
new file mode 100644
index 0000000..e4133bd
--- /dev/null
+++ b/PHASE_6_TEST_PLAN.md
@@ -0,0 +1,571 @@
+# Phase 6: End-to-End Testing Plan
+## Safe Agent Rollout System
+
+**Date:** 2026-05-25
+**Version:** GuruRMM v0.6.41+
+**Tester:** Mike Swanson
+
+---
+
+## Prerequisites
+
+### Environment Setup
+- [ ] SSH access to Saturn (172.16.3.30)
+- [ ] Access to GuruRMM dashboard (https://rmm.azcomputerguru.com)
+- [ ] JWT token for API testing
+- [ ] At least 2 test agents (GURU-KALI, GURU-5070 recommended)
+
+### Pre-Test Verification
+```bash
+# On Saturn
+ssh azcomputerguru@172.16.3.30
+
+# 1. Verify migration 046 is applied
+sudo -u postgres psql gururmm_production -c "\d update_rollouts"
+sudo -u postgres psql gururmm_production -c "\d update_health_metrics"
+sudo -u postgres psql gururmm_production -c "\d agent_update_events"
+
+# 2. Verify server build is current
+cd /opt/gururmm/server
+git status  # Should show Phase 4 code
+cargo build --release --features production
+
+# 3. Verify dashboard build is current
+cd /opt/gururmm/dashboard
+git status  # Should show Phase 5 code
+npm run build
+
+# 4. Verify health monitor is running
+sudo systemctl status gururmm-server
+sudo journalctl -u gururmm-server -n 50 | grep "Health monitoring task spawned"
+```
+
+---
+
+## Test 1: Beta-First Build Workflow
+
+**Objective:** Verify new builds default to beta channel and stable agents don't receive them.
+
+### Steps
+
+1. **Trigger a test build**
+```bash
+# On Saturn
+cd /opt/gururmm
+sudo ./build-linux.sh  # Will auto-increment to next version
+sudo ./build-windows.sh
+```
+
+2. **Verify .channel files created**
+```bash
+cd /var/www/gururmm/downloads
+ls -la *.channel | tail -10
+
+# Expected: New version should have .channel files containing "beta"
+VERSION=$(ls -t gururmm-agent-linux-amd64-*.tar.gz | head -1 | grep -oP '\d+\.\d+\.\d+')
+cat gururmm-agent-linux-amd64-${VERSION}.tar.gz.channel
+# Should output: beta
+```
+
+3. **Mark test agents as beta**
+```bash
+# Via API or SQL
+curl -X PATCH https://rmm.azcomputerguru.com/api/agents/GURU-KALI-UUID/channel \
+  -H "Authorization: Bearer $TOKEN" -H "Content-Type: application/json" \
+  -d '{"update_channel": "beta"}'
+```
+
+4. **Verify beta agent receives update**
+- Open dashboard → Agents → GURU-KALI
+- Wait for agent connection (heartbeat every 60s)
+- Check agent state for pending update
+- Expected: Should see update_available = true
+
+5. **Verify stable agent does NOT receive update**
+- Ensure GURU-5070 is on "stable" channel
+- Check agent state
+- Expected: update_available = false (version not in stable channel)
+
+### Success Criteria
+- ✅ .channel files exist for new version
+- ✅ .channel files contain "beta"
+- ✅ Beta agents offered the update
+- ✅ Stable agents NOT offered the update
+- ✅ Scanner logs show beta/stable filtering
+
+---
+
+## Test 2: Health Monitoring & Crash Detection
+
+**Objective:** Verify health monitor detects crashes and updates metrics.
+
+### Steps
+
+1. **Clear existing health data (optional)**
+```bash
+sudo -u postgres psql gururmm_production -c "DELETE FROM update_health_metrics WHERE version = '$VERSION';"
+sudo -u postgres psql gururmm_production -c "DELETE FROM agent_update_events WHERE version_to = '$VERSION';"
+```
+
+2. **Simulate successful update**
+```bash
+# On test agent (GURU-KALI)
+# Let update complete normally
+# Wait 5 minutes
+```
+
+3. **Check event logging**
+```sql
+SELECT event_type, version_to, created_at
+FROM agent_update_events
+WHERE agent_id = 'GURU-KALI-UUID'
+ORDER BY created_at DESC
+LIMIT 5;
+
+-- Expected events:
+-- - update_dispatched
+-- - download_started (if implemented)
+-- - download_complete (if implemented)
+-- - update_applied
+```
+
+4. **Check health metrics incremented**
+```sql
+SELECT version, total_attempts, successful_updates, failed_updates, crash_count, health_status
+FROM update_health_metrics
+WHERE version = '$VERSION';
+
+-- Expected:
+-- total_attempts = 1
+-- successful_updates = 1
+-- health_status = 'unknown' (< 5 attempts)
+```
+
+5. **Simulate crash**
+```bash
+# On test agent
+# 1. Trigger update dispatch
+# 2. Immediately after "update_applied" event, stop agent service
+sudo systemctl stop gururmm-agent
+# 3. Wait 60-90 seconds for health monitor scan
+```
+
+6. **Verify crash detection**
+```sql
+SELECT event_type, created_at
+FROM agent_update_events
+WHERE agent_id = 'GURU-KALI-UUID'
+AND event_type = 'crash_detected'
+ORDER BY created_at DESC;
+
+-- Expected: Should see crash_detected event
+
+SELECT crash_count, health_status
+FROM update_health_metrics
+WHERE version = '$VERSION';
+
+-- Expected: crash_count incremented, health_status may change
+```
+
+7. **Check server logs**
+```bash
+sudo journalctl -u gururmm-server -n 100 | grep -E "crash|health"
+# Expected: "Detected crash: agent X went offline after updating to Y"
+```
+
+### Success Criteria
+- ✅ Events logged correctly (update_dispatched, update_applied)
+- ✅ Health metrics incremented on success
+- ✅ Crash detected within 90 seconds
+- ✅ crash_detected event logged
+- ✅ Crash counter incremented
+- ✅ Health status updated based on thresholds
+
+---
+
+## Test 3: Promotion Workflow
+
+**Objective:** Verify promotion from beta to stable with health gates.
+
+### Steps
+
+1. **Attempt promotion with insufficient data**
+```bash
+curl -X POST https://rmm.azcomputerguru.com/api/updates/rollouts/$VERSION/promote \
+  -H "Authorization: Bearer $TOKEN" \
+  -H "Content-Type: application/json" \
+  -d '{"os": "linux", "arch": "amd64", "force": false}'
+
+# Expected: May succeed (unknown status allows promotion) or fail if health check implemented
+```
+
+2. **Generate healthy metrics**
+```bash
+# Simulate 5+ successful updates
+# Option A: Manually insert via SQL (for testing)
+# Option B: Trigger real updates on multiple beta agents
+
+# SQL approach for testing:
+sudo -u postgres psql gururmm_production << EOF
+UPDATE update_health_metrics
+SET total_attempts = 5,
+    successful_updates = 5,
+    failed_updates = 0,
+    crash_count = 0,
+    health_status = 'healthy'
+WHERE version = '$VERSION' AND os = 'linux' AND arch = 'amd64';
+EOF
+```
+
+3. **Verify health status**
+```bash
+curl https://rmm.azcomputerguru.com/api/updates/rollouts \
+  -H "Authorization: Bearer $TOKEN" | jq '.[] | select(.version == "'$VERSION'")'
+
+# Expected: health.status = "healthy"
+```
+
+4. **Promote to stable**
+```bash
+curl -X POST https://rmm.azcomputerguru.com/api/updates/rollouts/$VERSION/promote \
+  -H "Authorization: Bearer $TOKEN" \
+  -H "Content-Type: application/json" \
+  -d '{"os": "linux", "arch": "amd64", "force": false}'
+
+# Expected: {"success": true, "message": "Promoted...", "files_updated": 2}
+```
+
+5. **Verify .channel files updated**
+```bash
+cat /var/www/gururmm/downloads/gururmm-agent-linux-amd64-${VERSION}.tar.gz.channel
+# Expected: stable
+```
+
+6. **Verify database updated**
+```sql
+SELECT channel, promoted_at, promoted_by
+FROM update_rollouts
+WHERE version = '$VERSION' AND os = 'linux' AND arch = 'amd64';
+
+-- Expected: channel = 'stable', promoted_at = time of promotion, promoted_by = user_id
+```
+
+7. **Verify stable agents receive update**
+- Ensure test agent is on "stable" channel
+- Wait for scanner rescan (happens immediately after promotion)
+- Check agent state
+- Expected: update_available = true
+
+8. **Test force promotion**
+```bash
+# Set health to warning
+sudo -u postgres psql gururmm_production << EOF
+UPDATE update_health_metrics
+SET health_status = 'warning'
+WHERE version = '$VERSION' AND os = 'windows' AND arch = 'amd64';
+EOF
+
+# Try promotion without force
+curl -X POST https://rmm.azcomputerguru.com/api/updates/rollouts/$VERSION/promote \
+  -H "Authorization: Bearer $TOKEN" -H "Content-Type: application/json" \
+  -d '{"os": "windows", "arch": "amd64", "force": false}'
+
+# Expected: 403 error with message about health status
+
+# Try with force flag
+curl -X POST https://rmm.azcomputerguru.com/api/updates/rollouts/$VERSION/promote \
+  -H "Authorization: Bearer $TOKEN" -H "Content-Type: application/json" \
+  -d '{"os": "windows", "arch": "amd64", "force": true}'
+
+# Expected: 200 success (overridden health check)
+```
+
+### Success Criteria
+- ✅ Promotion blocked for unhealthy versions (unless forced)
+- ✅ Promotion succeeds for healthy versions
+- ✅ .channel files updated from "beta" to "stable"
+- ✅ Database rollouts table updated
+- ✅ Scanner rescans immediately
+- ✅ Stable agents receive update after promotion
+- ✅ Force flag overrides health checks
+- ✅ Dashboard shows updated channel
+
+---
+
+## Test 4: Rollback Workflow
+
+**Objective:** Verify rollback blocks version and force-downgrades agents.
+
+### Steps
+
+1. **Prepare for rollback**
+```bash
+# Ensure test agent is running the version to be rolled back ($VERSION)
+# Verify previous stable version exists
+curl https://rmm.azcomputerguru.com/api/updates/rollouts \
+  -H "Authorization: Bearer $TOKEN" | jq '.[] | select(.channel == "stable") | .version'
+```
+
+2. **Execute rollback**
+```bash
+curl -X POST https://rmm.azcomputerguru.com/api/updates/rollouts/$VERSION/rollback \
+  -H "Authorization: Bearer $TOKEN" \
+  -H "Content-Type: application/json" \
+  -d '{
+    "os": "linux",
+    "arch": "amd64",
+    "reason": "Test rollback: simulating critical bug in version '$VERSION'"
+  }'
+
+# Expected: {"success": true, "agents_affected": 1, "downgrade_version": "0.6.40"}
+```
+
+3. **Verify .channel files removed**
+```bash
+ls /var/www/gururmm/downloads/gururmm-agent-linux-amd64-${VERSION}.tar.gz.channel
+# Expected: File not found (removed)
+```
+
+4. **Verify health status blocked**
+```sql
+SELECT health_status, last_incident
+FROM update_health_metrics
+WHERE version = '$VERSION' AND os = 'linux' AND arch = 'amd64';
+
+-- Expected: health_status = 'blocked', last_incident = reason text
+```
+
+5. **Verify forced downgrade dispatched**
+```bash
+# Check server logs for WebSocket dispatch
+sudo journalctl -u gururmm-server -n 100 | grep -i "downgrade\|rollback"
+
+# Check agent receives forced update
+# Monitor agent logs for update trigger
+```
+
+6. **Verify agent downgrades**
+- Agent should receive UpdateAvailable message with previous version
+- Agent should download and install previous version
+- Check agent version after completion
+- Expected: agent_version = previous stable version
+
+7. **Verify blocked version not offered again**
+```bash
+# Scanner should skip files without .channel files
+# Verify version is not in available updates list
+curl https://rmm.azcomputerguru.com/api/updates/rollouts \
+  -H "Authorization: Bearer $TOKEN" | jq '.[] | select(.version == "'$VERSION'")'
+
+# If present, should show channel = null or health.status = "blocked"
+```
+
+### Success Criteria
+- ✅ .channel files removed
+- ✅ Health status set to "blocked"
+- ✅ Last incident reason recorded
+- ✅ Connected agents receive forced downgrade
+- ✅ Agents successfully downgrade to previous stable
+- ✅ Blocked version not offered to new agents
+- ✅ Dashboard shows blocked status
+
+---
+
+## Test 5: Dashboard UI Testing
+
+**Objective:** Verify Updates page displays correctly and actions work.
+
+### Steps
+
+1. **Access Updates page**
+- Navigate to https://rmm.azcomputerguru.com/updates
+- Log in if needed
+
+2. **Verify data display**
+- [ ] Table shows all rollout versions
+- [ ] Columns: Version, OS/Arch, Channel, Health, Success Rate, Agent Counts, Actions
+- [ ] Health badges color-coded (green/yellow/red/gray)
+- [ ] Success rate calculated correctly
+- [ ] Agent counts accurate
+
+3. **Test promote button**
+- [ ] Enabled for beta + healthy versions only
+- [ ] Disabled with tooltip for unhealthy versions
+- [ ] Click opens confirmation dialog
+- [ ] Confirm triggers API call
+- [ ] Success toast appears
+- [ ] Table refreshes with updated data
+
+4. **Test rollback button**
+- [ ] Always enabled
+- [ ] Click opens dialog with reason input
+- [ ] Reason field is required
+- [ ] Confirm triggers API call
+- [ ] Success toast shows agent count
+- [ ] Table refreshes with updated data
+
+5. **Test error handling**
+- [ ] Shows loading state during fetch
+- [ ] Shows error message if API fails
+- [ ] Retry button works
+- [ ] Shows empty state if no rollouts
+
+6. **Test auto-refresh**
+- [ ] Data refreshes every 30 seconds
+- [ ] Refresh doesn't disrupt UI interactions
+- [ ] Manual refresh button works
+
+### Success Criteria
+- ✅ All table columns display correct data
+- ✅ Health badges use correct colors
+- ✅ Promote button only enabled for healthy beta versions
+- ✅ Rollback button always enabled
+- ✅ Confirmation dialogs work
+- ✅ API calls succeed
+- ✅ Toasts display success/error
+- ✅ Auto-refresh works
+- ✅ Responsive on mobile
+
+---
+
+## Test 6: Integration Testing
+
+**Objective:** Test complete workflows end-to-end.
+
+### Workflow 1: New Build → Beta Testing → Promotion → Stable Deployment
+
+1. Trigger new build (auto-bumps version)
+2. Verify .channel files = "beta"
+3. Mark GURU-KALI as beta agent
+4. Wait for update dispatch
+5. Monitor update installation
+6. Verify success event logged
+7. Repeat 4 more times for healthy status
+8. Promote via dashboard
+9. Verify GURU-5070 (stable) receives update
+10. Monitor stable deployment
+11. Verify all agents updated
+
+**Expected:** Beta testing prevents bad updates from reaching production.
+
+### Workflow 2: Critical Bug → Rollback → Fleet Downgrade
+
+1. Simulate critical bug discovered post-promotion
+2. Execute rollback via dashboard
+3. Verify all agents receive forced downgrade
+4. Verify agents revert to previous stable
+5. Verify new agents don't receive blocked version
+6. Verify health metrics show blocked status
+
+**Expected:** Rollback protects fleet from bad updates.
+
+### Workflow 3: Crash Detection → Auto-Block (Future Enhancement)
+
+1. Deploy update to beta agents
+2. Simulate crash (stop service after update)
+3. Wait for health monitor (60s)
+4. Verify crash detected and logged
+5. Check if crash rate >25%
+6. Verify health status = "critical"
+7. Attempt promotion
+8. Verify promotion blocked
+
+**Expected:** High crash rates prevent automatic promotion.
+
+---
+
+## Performance Testing
+
+### Load Testing
+- [ ] 100+ agents checking for updates simultaneously
+- [ ] Scanner performance with 50+ versions
+- [ ] Health monitor with 1000+ update events
+- [ ] Dashboard with 20+ rollouts displayed
+
+### Stress Testing
+- [ ] Rapid version releases (5 builds in 10 minutes)
+- [ ] Mass rollback (100+ agents)
+- [ ] Concurrent API calls (multiple users promoting/rolling back)
+
+---
+
+## Security Testing
+
+### Authentication
+- [ ] All API endpoints require valid JWT
+- [ ] Expired tokens rejected
+- [ ] Invalid tokens rejected
+
+### Authorization
+- [ ] Admin role can promote/rollback
+- [ ] Non-admin role blocked (if RBAC implemented)
+
+### Input Validation
+- [ ] SQL injection attempts blocked
+- [ ] XSS attempts in reason field sanitized
+- [ ] Invalid version strings rejected
+- [ ] Invalid OS/arch values rejected
+
+### File System Security
+- [ ] .channel files have correct permissions
+- [ ] Path traversal attempts blocked
+- [ ] Only authorized processes can modify .channel files
+
+---
+
+## Regression Testing
+
+### Existing Functionality
+- [ ] Agent registration still works
+- [ ] Heartbeat processing unaffected
+- [ ] Command execution unaffected
+- [ ] Metrics collection unaffected
+- [ ] Alert generation unaffected
+- [ ] Policy enforcement unaffected
+
+### Database Performance
+- [ ] No slow queries introduced
+- [ ] Indexes used efficiently
+- [ ] No lock contention
+
+---
+
+## Documentation Verification
+
+- [ ] API endpoints documented
+- [ ] Database schema documented
+- [ ] Dashboard user guide accurate
+- [ ] Admin procedures documented
+- [ ] Troubleshooting guide created
+
+---
+
+## Sign-Off
+
+### Phase 6 Test Results
+
+**Tester:** ___________________________
+**Date:** ___________________________
+
+**Test 1 - Beta-First Workflow:** ⬜ PASS ⬜ FAIL
+**Test 2 - Health Monitoring:** ⬜ PASS ⬜ FAIL
+**Test 3 - Promotion:** ⬜ PASS ⬜ FAIL
+**Test 4 - Rollback:** ⬜ PASS ⬜ FAIL
+**Test 5 - Dashboard UI:** ⬜ PASS ⬜ FAIL
+**Test 6 - Integration:** ⬜ PASS ⬜ FAIL
+
+**Overall Status:** ⬜ APPROVED FOR PRODUCTION ⬜ NEEDS FIXES
+
+**Notes:**
+```
+
+
+```
+
+**Blockers/Issues:**
+```
+
+
+```
+
+**Deployment Date:** ___________________________
diff --git a/verify-rollout-system.sh b/verify-rollout-system.sh
new file mode 100755
index 0000000..314accb
--- /dev/null
+++ b/verify-rollout-system.sh
@@ -0,0 +1,282 @@
+#!/usr/bin/env bash
+# Verification script for Safe Agent Rollout System
+# Run on Saturn (172.16.3.30) to verify Phase 1-5 implementation
+
+# No 'set -e' here: check() returns 1 on failure and ((FAIL_COUNT++)) returns 1 when the count is 0; either would abort the run at the first failed check.
+
+GREEN='\033[0;32m'
+RED='\033[0;31m'
+YELLOW='\033[1;33m'
+NC='\033[0m'
+
+echo "=========================================="
+echo "GuruRMM Safe Rollout System Verification"
+echo "=========================================="
+echo ""
+
+# Function to check status (reads $? of the command that ran just before the call)
+check() {
+    if [ $? -eq 0 ]; then
+        echo -e "${GREEN}[OK]${NC} $1"
+        return 0
+    else
+        echo -e "${RED}[FAIL]${NC} $1"
+        return 1
+    fi
+}
+
+info() {
+    echo -e "${YELLOW}[INFO]${NC} $1"
+}
+
+FAIL_COUNT=0
+
+# ===== Phase 1: Build Scripts =====
+echo "Phase 1: Build Scripts"
+echo "----------------------"
+
+if grep -q "Mark all new builds as beta" /opt/gururmm/build-linux.sh; then
+    check "build-linux.sh has beta marking code"
+else
+    check "build-linux.sh missing beta marking code"
+    ((FAIL_COUNT++))
+fi
+
+if grep -q "Mark all new builds as beta" /opt/gururmm/build-windows.sh; then
+    check "build-windows.sh has beta marking code"
+else
+    check "build-windows.sh missing beta marking code"
+    ((FAIL_COUNT++))
+fi
+
+# Check for actual .channel files
+CHANNEL_COUNT=$(find /var/www/gururmm/downloads -name "*.channel" 2>/dev/null | wc -l)
+if [ "$CHANNEL_COUNT" -gt 0 ]; then
+    check ".channel files exist in downloads directory ($CHANNEL_COUNT found)"
+    info "Sample: $(find /var/www/gururmm/downloads -name "*.channel" | head -1)"
+    SAMPLE_FILE=$(find /var/www/gururmm/downloads -name "*.channel" | head -1)
+    if [ -f "$SAMPLE_FILE" ]; then
+        SAMPLE_CONTENT=$(cat "$SAMPLE_FILE")
+        info "Content: $SAMPLE_CONTENT"
+    fi
+else
+    check ".channel files in downloads directory"
+    ((FAIL_COUNT++))
+    info "No .channel files found - may need to trigger a build"
+fi
+
+echo ""
+
+# ===== Phase 2: Database Migration =====
+echo "Phase 2: Database Migration"
+echo "---------------------------"
+
+# Check tables exist
+if sudo -u postgres psql gururmm_production -t -c "\d update_rollouts" &>/dev/null; then
+    check "update_rollouts table exists"
+else
+    check "update_rollouts table exists"
+    ((FAIL_COUNT++))
+fi
+
+if sudo -u postgres psql gururmm_production -t -c "\d update_health_metrics" &>/dev/null; then
+    check "update_health_metrics table exists"
+else
+    check "update_health_metrics table exists"
+    ((FAIL_COUNT++))
+fi
+
+if sudo -u postgres psql gururmm_production -t -c "\d agent_update_events" &>/dev/null; then
+    check "agent_update_events table exists"
+else
+    check "agent_update_events table exists"
+    ((FAIL_COUNT++))
+fi
+
+# Check for data
+ROLLOUT_COUNT=$(sudo -u postgres psql gururmm_production -t -c "SELECT COUNT(*) FROM update_rollouts" 2>/dev/null | xargs)
+info "Rollouts tracked: $ROLLOUT_COUNT"
+
+EVENT_COUNT=$(sudo -u postgres psql gururmm_production -t -c "SELECT COUNT(*) FROM agent_update_events" 2>/dev/null | xargs)
+info "Update events logged: $EVENT_COUNT"
+
+METRIC_COUNT=$(sudo -u postgres psql gururmm_production -t -c "SELECT COUNT(*) FROM update_health_metrics" 2>/dev/null | xargs)
+info "Health metrics tracked: $METRIC_COUNT"
+
+echo ""
+
+# ===== Phase 3: Health Monitoring =====
+echo "Phase 3: Health Monitoring"
+echo "--------------------------"
+
+# Check source files exist
+if [ -f "/opt/gururmm/server/src/updates/health.rs" ]; then
+    check "health.rs source file exists"
+else
+    check "health.rs source file exists"
+    ((FAIL_COUNT++))
+fi
+
+# Check if server is running
+if systemctl is-active --quiet gururmm-server; then
+    check "GuruRMM server is running"
+
+    # Check for health monitor in logs
+    if sudo journalctl -u gururmm-server --since "1 hour ago" | grep -q "Health monitoring task spawned"; then
+        check "Health monitor task spawned (found in logs)"
+    else
+        echo -e "${YELLOW}[WARN]${NC} Health monitor spawn message not found in recent logs"
+        info "May need to restart service if code just deployed"
+    fi
+else
+    check "GuruRMM server is running"
+    ((FAIL_COUNT++))
+fi
+
+echo ""
+
+# ===== Phase 4: API Endpoints =====
+echo "Phase 4: API Endpoints"
+echo "----------------------"
+
+if [ -f "/opt/gururmm/server/src/api/updates.rs" ]; then
+    check "updates.rs API file exists"
+
+    # Check for key functions
+    if grep -q "pub async fn list_rollouts" /opt/gururmm/server/src/api/updates.rs; then
+        check "list_rollouts endpoint defined"
+    else
+        check "list_rollouts endpoint defined"
+        ((FAIL_COUNT++))
+    fi
+
+    if grep -q "pub async fn promote_version" /opt/gururmm/server/src/api/updates.rs; then
+        check "promote_version endpoint defined"
+    else
+        check "promote_version endpoint defined"
+        ((FAIL_COUNT++))
+    fi
+
+    if grep -q "pub async fn rollback_version" /opt/gururmm/server/src/api/updates.rs; then
+        check "rollback_version endpoint defined"
+    else
+        check "rollback_version endpoint defined"
+        ((FAIL_COUNT++))
+    fi
+else
+    check "updates.rs API file exists"
+    ((FAIL_COUNT++))
+fi
+
+# Check routes registered
+if grep -q "api::updates::list_rollouts" /opt/gururmm/server/src/api/mod.rs; then
+    check "API routes registered in mod.rs"
+else
+    check "API routes registered in mod.rs"
+    ((FAIL_COUNT++))
+fi
+
+echo ""
+
+# ===== Phase 5: Dashboard UI =====
+echo "Phase 5: Dashboard UI"
+echo "---------------------"
+
+if [ -f "/opt/gururmm/dashboard/src/pages/Updates.tsx" ]; then
+    check "Updates.tsx page exists"
+
+    # Check for key components
+    if grep -q "RolloutInfo" /opt/gururmm/dashboard/src/pages/Updates.tsx; then
+        check "RolloutInfo interface defined"
+    else
+        check "RolloutInfo interface defined"
+        ((FAIL_COUNT++))
+    fi
+
+    if grep -q "handlePromote" /opt/gururmm/dashboard/src/pages/Updates.tsx; then
+        check "Promote functionality implemented"
+    else
+        check "Promote functionality implemented"
+        ((FAIL_COUNT++))
+    fi
+
+    if grep -q "handleRollback" /opt/gururmm/dashboard/src/pages/Updates.tsx; then
+        check "Rollback functionality implemented"
+    else
+        check "Rollback functionality implemented"
+        ((FAIL_COUNT++))
+    fi
+else
+    check "Updates.tsx page exists"
+    ((FAIL_COUNT++))
+fi
+
+# Check navigation
+if grep -q "/updates" /opt/gururmm/dashboard/src/App.tsx; then
+    check "Updates route registered in App.tsx"
+else
+    check "Updates route registered in App.tsx"
+    ((FAIL_COUNT++))
+fi
+
+if grep -q "updates" /opt/gururmm/dashboard/src/components/Layout.tsx; then
+    check "Updates navigation link added"
+else
+    check "Updates navigation link added"
+    ((FAIL_COUNT++))
+fi
+
+echo ""
+
+# ===== Build Status =====
+echo "Build Status"
+echo "------------"
+
+# Check server binary
+if [ -f "/opt/gururmm/gururmm-server" ]; then
+    SERVER_SIZE=$(stat -f%z "/opt/gururmm/gururmm-server" 2>/dev/null || stat -c%s "/opt/gururmm/gururmm-server" 2>/dev/null)
+    SERVER_DATE=$(stat -f%Sm "/opt/gururmm/gururmm-server" 2>/dev/null || stat -c%y "/opt/gururmm/gururmm-server" 2>/dev/null | cut -d' ' -f1)
+    check "Server binary exists (${SERVER_SIZE} bytes, ${SERVER_DATE})"
+else
+    check "Server binary exists"
+    ((FAIL_COUNT++))
+fi
+
+# Check dashboard build
+if [ -d "/opt/gururmm/dashboard/dist" ]; then
+    check "Dashboard build exists"
+else
+    echo -e "${YELLOW}[WARN]${NC} Dashboard dist/ directory not found - may need to run 'npm run build'"
+fi
+
+echo ""
+
+# ===== Summary =====
+echo "=========================================="
+echo "Verification Summary"
+echo "=========================================="
+
+if [ $FAIL_COUNT -eq 0 ]; then
+    echo -e "${GREEN}✓ All checks passed!${NC}"
+    echo ""
+    echo "Safe Agent Rollout System is ready for Phase 6 testing."
+    echo ""
+    echo "Next steps:"
+    echo "  1. Review PHASE_6_TEST_PLAN.md"
+    echo "  2. Execute Test 1: Beta-first build workflow"
+    echo "  3. Execute Test 2-4: Health monitoring, promotion, rollback"
+    echo "  4. Execute Test 5: Dashboard UI testing"
+    echo "  5. Execute Test 6: Integration testing"
+    exit 0
+else
+    echo -e "${RED}✗ ${FAIL_COUNT} check(s) failed${NC}"
+    echo ""
+    echo "Review failures above and fix before proceeding to Phase 6."
+    echo ""
+    echo "Common issues:"
+    echo "  - Code not deployed (git pull + rebuild needed)"
+    echo "  - Migration not applied (run migration 046)"
+    echo "  - Service not restarted (systemctl restart gururmm-server)"
+    echo "  - Build not triggered (no .channel files yet)"
+    exit 1
+fi
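Review note on verify-rollout-system.sh: every check in the script repeats the same if/else shape, calls `check` so that it can read `$?` from the preceding test, and bumps `FAIL_COUNT` by hand. One possible way to collapse that pattern is sketched below. `run_check` is a hypothetical helper name, not something this diff defines, and the three demo calls are placeholders rather than the script's real checks.

```shell
#!/usr/bin/env bash
# Hypothetical helper (an assumption, not part of the diff above).
# Runs the given command, prints [OK] or [FAIL] with a label, and counts
# failures itself, so callers need neither $? nor a manual counter bump.

FAIL_COUNT=0

run_check() {
    local label="$1"
    shift
    # The command's exit status is consumed by 'if', so this is also safe
    # when the caller has 'set -e' enabled.
    if "$@" >/dev/null 2>&1; then
        printf '[OK] %s\n' "$label"
    else
        printf '[FAIL] %s\n' "$label"
        # Plain assignment always returns 0, unlike ((FAIL_COUNT++)) at 0.
        FAIL_COUNT=$((FAIL_COUNT + 1))
    fi
    return 0
}

# Demo calls with stand-in commands.
run_check "true succeeds" true
run_check "false fails" false
run_check "missing file is reported" test -f /nonexistent/path/for/demo
echo "failures: $FAIL_COUNT"
```

With this shape a real check would read, for example, `run_check "health.rs source file exists" test -f /opt/gururmm/server/src/updates/health.rs`. The demo above ends by printing `failures: 2`.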