Author: Mike Swanson
Machine: Mikes-MacBook-Air.local
Timestamp: 2026-05-25 13:53:11
Updates Page - User Guide
Overview
The Updates page provides a centralized dashboard for managing agent version rollouts across your GuruRMM infrastructure. It shows real-time health metrics and enables safe promotion or emergency rollback of agent versions.
Accessing the Page
- Log in to the GuruRMM dashboard
- Navigate to Config > Updates in the sidebar
- Or visit: https://rmm.azcomputerguru.com/updates
Understanding the Table
Columns
Version
- Displays the agent version number (e.g., 0.6.27)
- Shown in monospace font for clarity
- Sorted newest to oldest
OS / Arch
- Operating system and architecture
- Examples: windows / x86_64, linux / aarch64
- Each OS/arch combination is tracked separately
Channel
- Beta (blue badge): Testing channel with limited agents
- Stable (purple badge): Production channel with all agents
Health Status
- Healthy (green): All metrics within safe thresholds
- Warning (yellow): Some metrics approaching thresholds
- Critical (red): Metrics exceed safety thresholds
- Blocked (dark red): Version blocked from promotion
- Unknown (gray): No health data yet
Success Rate
- Percentage of successful update attempts
- Color coded:
  - Green: >= 95% (excellent)
  - Yellow: >= 80% and < 95% (acceptable)
  - Red: < 80% (concerning)
- Shows fraction: e.g., "96% (48/50)"
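The color bands and the fraction format above can be sketched as a small helper. This is a minimal illustration, not the dashboard's actual code: the function name `format_success_rate`, the gray "n/a" result for zero attempts, and the choice to truncate rather than round the percentage are all assumptions.

```python
from __future__ import annotations


def format_success_rate(success_count: int, total_attempts: int) -> tuple[str, str]:
    """Return (label, color) for one Success Rate cell (hypothetical helper)."""
    if total_attempts <= 0:
        # Assumption: with no attempts there is nothing to compute.
        return ("n/a", "gray")
    # Integer math avoids float rounding at the 95% and 80% boundaries.
    scaled = success_count * 100
    if scaled >= 95 * total_attempts:
        color = "green"
    elif scaled >= 80 * total_attempts:
        color = "yellow"
    else:
        color = "red"
    # Assumption: truncate so a yellow 94.9% is never displayed as "95%".
    percent = scaled // total_attempts
    return (f"{percent}% ({success_count}/{total_attempts})", color)


print(format_success_rate(48, 50))  # ('96% (48/50)', 'green')
```

Truncating the displayed percentage keeps the number and the color in agreement at the band edges. A real implementation may round differently.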
Beta Agents
- Number of agents currently on this version in the beta channel
- Updates in real time
Stable Agents
- Number of agents currently on this version in the stable channel
- Updates in real time
Actions
- Promote (up arrow): Move beta version to stable
- Rollback (rotate arrow): Force-downgrade all agents on this version
Promoting a Version to Stable
When to Promote
Promote a beta version when:
- Health status is "Healthy" (green)
- Success rate is >= 95%
- Beta agents have been running it long enough to surface problems (the recommended monitoring window is 24-48 hours)
- No critical issues reported
How to Promote
- Find the beta version you want to promote
- Click the Promote button (up arrow)
- Review the confirmation dialog
- Click Promote to confirm
If Health Check Fails
If the automatic health check fails (e.g., crash rate too high):
- You'll see a warning dialog explaining the issue
- A Force Promote option appears in the dialog
- Review the warning carefully
- Only force promote if you understand the risks
- Consider investigating the health issues first
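The promotion flow described above (normal promotion when healthy, a warning with a Force Promote option otherwise) can be sketched as a decision function. This is a hedged sketch: `evaluate_promotion` and `PromotionDecision` are hypothetical names, and the rule that a "blocked" version cannot be force-promoted is an assumption that this guide does not confirm.

```python
from dataclasses import dataclass


@dataclass
class PromotionDecision:
    allowed: bool
    reason: str


def evaluate_promotion(channel: str, health: str, force: bool = False) -> PromotionDecision:
    """Decide whether a promote request should go through (illustrative only)."""
    if channel != "beta":
        # Only beta versions can be promoted to stable.
        return PromotionDecision(False, "only beta versions can be promoted")
    if health == "healthy":
        return PromotionDecision(True, "health check passed")
    if health == "blocked":
        # Assumption: a blocked version cannot be forced through.
        return PromotionDecision(False, "version is blocked from promotion")
    if force:
        # The operator accepted the warning dialog and chose Force Promote.
        return PromotionDecision(True, f"force-promoted despite {health} health")
    return PromotionDecision(False, f"health check failed: status is {health}")


print(evaluate_promotion("beta", "warning"))
print(evaluate_promotion("beta", "warning", force=True))
```

Returning the reason alongside the decision mirrors the warning dialog, which explains why the health check failed before offering the force option.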
After Promotion
- Success toast shows: "Version X.Y.Z promoted to stable"
- Table refreshes automatically
- All agents on "stable" channel will update to this version
- Beta agents remain on beta (they'll get the next beta)
Rolling Back a Version
When to Roll Back
Roll back a version when:
- Critical bug discovered after promotion
- Unexpected behavior in production
- Security vulnerability found
- Performance issues causing problems
How to Roll Back
- Find the version causing issues
- Click the Rollback button (rotate arrow)
- Enter a required reason in the dialog
  - Example: "Critical memory leak causing crashes"
  - This reason is logged for audit purposes
- Review the warning: "This will force-downgrade all agents"
- Click Rollback to confirm
After Rollback
- Success toast shows: "Version X.Y.Z rolled back. N agents downgraded"
- All agents on this version will downgrade immediately
- Previous stable version becomes active again
- Rollback is logged in the database
Understanding Health Metrics
What Gets Tracked
- Total Attempts: Number of agents that attempted to update
- Success Count: Updates that completed successfully
- Failure Count: Updates that failed (network, download, etc.)
- Crash Count: Agents that crashed after updating
Health Status Calculation
The system automatically calculates health based on:
- Success rate (success_count / total_attempts)
- Crash rate (crash_count / total_attempts)
- Failure patterns over time
Thresholds (from Phase 4)
- Healthy:
  - Success rate >= 95%
  - Crash rate < 1%
  - Failure rate < 5%
- Warning:
  - Success rate >= 90% and < 95%
  - Crash rate >= 1% and < 2%
  - Failure rate 5-10%
- Critical:
  - Success rate < 90%
  - Crash rate >= 2%
  - Failure rate > 10%
- Blocked:
  - Crash rate >= 5% (auto-blocked from promotion)
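The thresholds above can be expressed as a short classifier. This is a sketch under stated assumptions, not the server's actual logic: `RolloutMetrics` and `classify_health` are hypothetical names, the failure rate is assumed to be failure_count / total_attempts, the status is assumed to be the worst level triggered by any single metric, and "failure patterns over time" are not modeled.

```python
from dataclasses import dataclass


@dataclass
class RolloutMetrics:
    total_attempts: int
    success_count: int
    failure_count: int
    crash_count: int


def classify_health(m: RolloutMetrics) -> str:
    """Map raw counters to a health status using the thresholds listed above."""
    if m.total_attempts == 0:
        return "unknown"  # no health data yet
    success = 100 * m.success_count / m.total_attempts
    crash = 100 * m.crash_count / m.total_attempts
    # Assumption: failure rate uses the same denominator as the other rates.
    failure = 100 * m.failure_count / m.total_attempts
    if crash >= 5:
        return "blocked"
    # Assumption: any single metric in a worse band sets the overall status.
    if success < 90 or crash >= 2 or failure > 10:
        return "critical"
    if success < 95 or crash >= 1 or failure >= 5:
        return "warning"
    return "healthy"


# 46 of 50 agents updated successfully; 4 failed, none crashed.
print(classify_health(RolloutMetrics(50, 46, 4, 0)))  # warning
```

With 46 successes out of 50 attempts the success rate is 92%, which falls in the Warning band even though the crash rate is zero.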
Auto-Refresh
Behavior
- Table refreshes every 30 seconds automatically
- Manual refresh is available with the Refresh button
- Auto-refresh doesn't interrupt dialogs or actions
- Loading indicator shows during refresh
Why Auto-Refresh?
- See real-time health status changes
- Monitor promotion rollout progress
- Catch issues as they emerge
- No need to manually reload
Best Practices
Testing Flow
- Deploy to Beta: Build and deploy the new version
- Monitor Health: Watch the Updates page for 24-48 hours
- Check Success Rate: Ensure >= 95% success
- Review Logs: Look for any errors or warnings
- Promote: Once healthy, promote to stable
- Monitor Rollout: Watch stable agents update
- Be Ready: Keep an eye on metrics for the first hour
Emergency Response
- Notice Issue: See critical health status or reports
- Assess Impact: Check how many agents are affected
- Rollback: Use the Rollback button immediately
- Document: Provide a clear reason in the rollback dialog
- Investigate: Review logs and crash reports
- Fix: Prepare a hotfix version
- Re-test: Deploy to beta first, never skip testing
Version Naming
- Use semantic versioning: MAJOR.MINOR.PATCH
- Example: 0.6.27 → 0.6.28 (patch), 0.7.0 (minor), 1.0.0 (major)
- Consistent versioning helps with sorting and tracking
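To illustrate why consistent MAJOR.MINOR.PATCH versions help with sorting, here is a minimal sketch that parses versions into integer tuples before comparing them. The helper names are hypothetical, and the parser assumes exactly three numeric parts (no pre-release or build suffixes).

```python
from __future__ import annotations


def parse_version(text: str) -> tuple[int, int, int]:
    """Split 'MAJOR.MINOR.PATCH' into integers; raises ValueError otherwise."""
    major, minor, patch = (int(part) for part in text.split("."))
    return (major, minor, patch)


def sort_newest_first(versions: list[str]) -> list[str]:
    """Order versions newest to oldest, as the Version column does."""
    return sorted(versions, key=parse_version, reverse=True)


def bump_kind(old: str, new: str) -> str:
    """Name the most significant component that changed between two versions."""
    o, n = parse_version(old), parse_version(new)
    if n[0] != o[0]:
        return "major"
    if n[1] != o[1]:
        return "minor"
    if n[2] != o[2]:
        return "patch"
    return "none"


print(sort_newest_first(["0.6.9", "0.6.27", "0.7.0", "0.6.28"]))
print(bump_kind("0.6.27", "0.6.28"))  # patch
```

Comparing the raw strings would rank 0.6.9 above 0.6.27, because "9" sorts after "2" as text. Comparing integer tuples avoids that.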
Common Scenarios
Scenario 1: Normal Promotion
1. New version 0.6.28 deployed to beta
2. 15 beta agents update successfully (100% success rate)
3. Health shows "Healthy" (green)
4. After 24 hours, promote to stable
5. 230 stable agents begin updating
6. Monitor for issues during rollout
Scenario 2: Warning Status
1. Beta version shows "Warning" (yellow)
2. Success rate is 92% (46/50 agents)
3. Investigate: 4 agents failed due to network timeout
4. Decision: Wait longer or fix issue before promoting
5. Do not promote until "Healthy"
Scenario 3: Emergency Rollback
1. Version 0.6.28 promoted to stable
2. After 30 minutes, users report crashes
3. Updates page shows "Critical" status
4. Crash count increasing
5. Immediately roll back with reason: "Crashes on startup"
6. Agents downgrade within minutes
7. Investigate crash dumps and fix issue
Scenario 4: Force Promotion
1. Beta version shows "Warning" but issue is minor
2. Attempt to promote
3. System blocks due to health check
4. Review the specific warning
5. If acceptable (e.g., known cosmetic issue), force promote
6. Monitor closely after force promotion
Troubleshooting
Table Shows "No rollouts yet"
- No versions have been deployed to beta or stable
- Build and deploy a version to see it appear
- Check server logs for build issues
Promote Button Disabled
- Version is not on beta channel (only beta can promote)
- Health status is not "Healthy"
- Hover over button to see tooltip explaining why
Auto-Refresh Stopped
- Check network connection
- Check browser console for errors
- Try manual refresh button
- Reload page if issues persist
Action Failed with Error
- Check network connectivity
- Verify you're still logged in
- Check server status
- Review error message in toast notification
Security & Permissions
Who Can Promote/Rollback?
- Currently: All authenticated users
- Future: May be restricted to admin role
- All actions are logged with user attribution
Audit Trail
- All promotions logged in database
- All rollbacks logged with reason
- Timestamps and user IDs recorded
- Review audit logs: rollout_events table
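As an illustration of what one audit entry carries, the sketch below models a single event with the fields this section mentions (action, user, timestamp, and a reason for rollbacks). The class, the function, and every field name are hypothetical. The actual schema of the rollout_events table is not documented here.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class RolloutEvent:
    action: str            # "promote" or "rollback"
    version: str
    user_id: str
    timestamp: datetime
    reason: Optional[str] = None


def make_event(action: str, version: str, user_id: str,
               reason: Optional[str] = None,
               now: Optional[datetime] = None) -> RolloutEvent:
    """Build an audit record, enforcing that rollbacks carry a reason."""
    if action not in ("promote", "rollback"):
        raise ValueError(f"unknown action: {action}")
    cleaned = reason.strip() if reason else None
    if action == "rollback" and not cleaned:
        # The rollback dialog requires a reason, so the record does too.
        raise ValueError("a rollback requires a reason")
    stamp = now or datetime.now(timezone.utc)
    return RolloutEvent(action, version, user_id, stamp, cleaned)


event = make_event("rollback", "0.6.28", "user-1", "Crashes on startup")
print(event.action, event.version, event.reason)
```

Rejecting a blank reason at record-creation time keeps the audit trail useful even if a client skips the dialog's own validation.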
Performance
Page Load Time
- Initial load: < 2 seconds
- Auto-refresh: < 500ms
- Action response: < 1 second
Mobile Support
- Fully responsive design
- Table scrolls horizontally on small screens
- Dialogs adapt to mobile viewport
- All actions work on touch devices
Related Pages
Agent Detail Page
- Shows current version for individual agent
- Links to Updates page for version info
- Displays update channel (beta/stable)
Policies Page
- Configure auto-update policies
- Set update channel per client/site/agent
- Control update timing and windows
Logs Page
- View update-related log entries
- Filter by "update" or "rollout" keywords
- See detailed failure messages
Support
Need Help?
- Check server logs: /api/logs endpoint
- Review agent logs on affected machines
- Contact support with version number and error details
- Include rollout health metrics in report
Report Issues
- Use the rollback feature to mitigate first
- Document reproduction steps
- Include success rate and crash count
- Provide sample agent IDs affected
Version: 1.0
Last Updated: 2026-05-25
Part of: Safe Agent Rollout System (Phase 5)