GuruRMM Session Log — 2026-04-15
Context
End-to-end test of the Tunnel Phase 1 lifecycle, triggered opportunistically
while troubleshooting SSH flakiness on AD2 (Dataforth project). No code
changes — exercised the production API from an off-LAN workstation via the
public Cloudflare endpoint (rmm-api.azcomputerguru.com).
What worked
| Step | Endpoint | Result |
|---|---|---|
| Login | POST /api/auth/login | 200, token returned |
| List agents | GET /api/agents | 6 agents, AD2 and DESKTOP-0O8A1RL online on v0.6.0 |
| Open tunnel | POST /api/v1/tunnel/open (agent_id=AD2 d28a1c90-47d7-448f-a287-197bc8892234) | 200, {session_id: 0682a80c-a899-403b-9473-aaaed50e4aba, status: active} |
| Status while active | GET /api/v1/tunnel/status/{id} | 200, full session record (opened_at, last_activity, agent_id) |
| Close tunnel | POST /api/v1/tunnel/close | 200, {status: closed} |
Findings (actionable)
1. Status endpoint returns 403 after close
GET /api/v1/tunnel/status/{id} against a just-closed session returns
403 Forbidden — "Session not found or not owned by user" instead of
{status: closed}. Root cause likely that the WHERE status = 'active'
filter (from idx_tech_sessions_active — see CONTEXT.md line 256) is applied
to the status lookup in addition to the ownership check, so closed sessions
fail ownership verification and fall through to the 403 branch.
Fix: separate the existence lookup from the ownership check. If the session exists and belongs to the requesting tech, return the closed record rather than masking it as a permission error.
Location to inspect: server/src/api/tunnel.rs (status handler) and/or
server/src/db/tunnel.rs (session fetch query).
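The two-step lookup proposed in the fix could look roughly like the sketch below. It is not the code in server/src/api/tunnel.rs: the in-memory HashMap stands in for the session table, and the names TunnelSession, StatusLookup, lookup_status and http_code are hypothetical. Returning 404 for a session that does not exist at all is also an assumption; the server may prefer to keep 403 there to avoid leaking which IDs exist.

```rust
use std::collections::HashMap;

/// Session state as seen in the API responses (active / closed).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SessionStatus {
    Active,
    Closed,
}

/// Hypothetical stand-in for one row of the tech sessions table.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct TunnelSession {
    pub session_id: String,
    pub tech_id: u32,
    pub status: SessionStatus,
}

/// Outcome of a status lookup; the handler maps it to an HTTP code.
#[derive(Debug, PartialEq, Eq)]
pub enum StatusLookup {
    Found(TunnelSession),
    NotFound,
    NotOwner,
}

/// Step 1 checks existence without any status filter, so closed sessions
/// are still found. Step 2 checks ownership on the row that was found.
pub fn lookup_status(
    sessions: &HashMap<String, TunnelSession>,
    session_id: &str,
    requesting_tech: u32,
) -> StatusLookup {
    let session = match sessions.get(session_id) {
        Some(s) => s,
        None => return StatusLookup::NotFound,
    };
    if session.tech_id != requesting_tech {
        return StatusLookup::NotOwner;
    }
    StatusLookup::Found(session.clone())
}

/// Assumed mapping to HTTP status codes (404 for "missing" is a guess).
pub fn http_code(result: &StatusLookup) -> u16 {
    match result {
        StatusLookup::Found(_) => 200,
        StatusLookup::NotFound => 404,
        StatusLookup::NotOwner => 403,
    }
}
```

With this shape, a closed session owned by the caller comes back as Found with status Closed, and the 403 branch is reached only for a genuine ownership mismatch.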
2. Agent writes no logs
gururmm-agent.exe 0.6.0 on AD2 produces no files in
C:\Program Files\GuruRMM\, C:\ProgramData\GuruRMM\, nor any Windows
Application Event Log entries under provider gururmm*. This made it
impossible to confirm the agent-side state transition
(Heartbeat → Tunnel) or receipt of TunnelReady during the test.
Fix: add a log target in agent/src/main.rs (env_logger or tracing
with a rolling file appender) writing to
C:\ProgramData\GuruRMM\agent.log. Optionally also emit critical events
(tunnel open/close, update success/failure) to the Windows Event Log via
the eventlog crate.
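As a rough illustration of the file target, here is a standard-library-only sketch of an append-mode log with simple size-based rotation. It is a stand-in, not the env_logger or tracing appender named in the fix: the RotatingLog type, the max_bytes limit, the single ".1" backup file and the line format are all assumptions made for the example.

```rust
use std::fs::{self, File, OpenOptions};
use std::io::{self, Write};
use std::path::{Path, PathBuf};
use std::time::{SystemTime, UNIX_EPOCH};

/// Hypothetical append-only log file that keeps one rotated backup.
pub struct RotatingLog {
    path: PathBuf,
    max_bytes: u64,
    file: File,
    written: u64,
}

impl RotatingLog {
    /// Opens (or creates) the log file, creating parent directories first.
    /// On the agent this would be C:\ProgramData\GuruRMM\agent.log.
    pub fn open(path: impl AsRef<Path>, max_bytes: u64) -> io::Result<Self> {
        let path = path.as_ref().to_path_buf();
        if let Some(dir) = path.parent() {
            fs::create_dir_all(dir)?;
        }
        let file = OpenOptions::new().create(true).append(true).open(&path)?;
        let written = file.metadata()?.len();
        Ok(Self { path, max_bytes, file, written })
    }

    /// Appends one line: "<unix seconds> [LEVEL] message".
    pub fn log(&mut self, level: &str, message: &str) -> io::Result<()> {
        let secs = SystemTime::now()
            .duration_since(UNIX_EPOCH)
            .map(|d| d.as_secs())
            .unwrap_or(0);
        let line = format!("{} [{}] {}\n", secs, level, message);
        // Rotate before writing if this line would push the file past the limit.
        if self.written > 0 && self.written + line.len() as u64 > self.max_bytes {
            self.rotate()?;
        }
        self.file.write_all(line.as_bytes())?;
        self.file.flush()?;
        self.written += line.len() as u64;
        Ok(())
    }

    /// Renames agent.log to agent.log.1 (replacing any older backup)
    /// and starts a fresh file at the original path.
    fn rotate(&mut self) -> io::Result<()> {
        let mut backup = self.path.clone().into_os_string();
        backup.push(".1");
        fs::rename(&self.path, &backup)?;
        self.file = OpenOptions::new().create(true).append(true).open(&self.path)?;
        self.written = 0;
        Ok(())
    }
}
```

A call such as log("INFO", "tunnel open") at each state transition would have been enough to confirm the Heartbeat → Tunnel change during this test. A production agent would more likely use one of the logging crates named above, with time-based rotation.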
3. Phase 2 gap confirmed against a real use case
Live need: run a couple of diagnostic commands on AD2 (sshd flapping sporadically on port 22, no process crash in Event Log; want to investigate firewall/Defender events from the server side). With no channels, the tunnel's only utility today is proving the session layer works. The actual remote-operate capability still depends on Phase 2.
Priority order for Phase 2 channels (based on what would have been useful here):
- Terminal channel first: unlocks 80% of field use cases (log tails, Get-Service, Restart-Service, Get-WinEvent).
- Service channel second: tight scope, high value for "restart sshd".
- File channel third: needed but rarely urgent; SFTP already exists.
- Registry channel last: niche, can defer.
What Else We Observed
- The public tunnel chain rmm-api.azcomputerguru.com → Cloudflare → nginx → API (3001) proxies /api/* correctly. The docs in CONTEXT.md implied nginx only served /downloads/; confirmed today that it also proxies API paths, which is why off-LAN admin usage works.
- AD2 agent start time 2026-04-11 22:09 corresponds to last reboot of AD2; the agent has not restarted since despite sshd port flaps (sshd PID 4012 also continuously running since same moment). Confirms the tunnel infrastructure and the RMM agent are stable; the sshd flap is a separate network-layer issue unrelated to GuruRMM.
Credentials Used
- Admin Email: admin@azcomputerguru.com
- Admin Password: GuruRMM2025
- Public API: https://rmm-api.azcomputerguru.com
Note: op read "op://Infrastructure/GuruRMM Server/Admin Password"
returned a stale value (ClaudeAPI2026!@#) that fails login. The
2026-04-14 session log documents the current password as GuruRMM2025.
1Password entry should be updated to match.
Next Steps
- Update 1Password Infrastructure/GuruRMM Server entry: set Admin Password field to GuruRMM2025 to match what server accepts.
- Fix /api/v1/tunnel/status/{id} for closed sessions (see Finding 1).
- Add file/event-log output to agent (see Finding 2).
- Begin Phase 2 — Terminal channel first.
Update (evening session): Roadmap evolution + Azure Trusted Signing setup
Substantial architectural planning session. Product direction shifted from "single-tenant RMM tool" to "multi-tenant SaaS for MSPs." ROADMAP.md was updated significantly to reflect the new direction.
Roadmap additions to ROADMAP.md
- Terminology (canonical): locked in the 5-tier hierarchy Platform → Partner (DB: tenant_id) → Client → Site → Agent. API/UI says "Partner"; DB column is tenant_id. API path convention /api/public/v1/partners/{pid}/clients/{cid}/sites/{sid}/agents/{aid}. Event topics like agent.online, partner.upgraded. Full table + rules at top of ROADMAP.md.
- Tunnel Channels (Phase 2): T1-T8 tracking Terminal/File/Registry/Service channels + tech-side subscriber (T5 is gating dep: browser currently has no way to receive tunnel data, server/src/ws/mod.rs:808-825 discards incoming AgentMessage::TunnelData).
- Logging, Audit & Observability: L1-L10 three-tier design:
  - Agent self-logging via OS-native sinks (Windows Event Log custom provider, Linux journald, macOS os_log)
  - Client machine health via OS event log pulls: default 15-min delta + force-pull on tunnel open/close; default levels Critical+Error+Warning for delta, 4h bulk for Info/Debug/Audit/Notification; all tenant-configurable
  - Tunnel audit direct to DB table tunnel_audit (already exists, unused): no scrubbing, sensitive input captured intentionally for tech-behavior audit; 90-day tenant-visible retention default; indefinite system archive to object storage
  - Agent config push via ServerMessage::Config on connect + real-time when tenant admin changes settings
- Multi-tenancy / MSP SaaS (M1-M7): tenant_id on every table from now forward, tenancy-aware auth middleware, tenant admin dashboard, per-agent/month billing meter, data residency options, tenant export API, onboarding wizard.
- Modular Architecture & Public APIs (X1-X12): core vs. module boundary, event bus (NATS JetStream or Redis Streams), module manifest, module-to-core + module-to-module versioned APIs, public REST API /api/public/v1/ with OpenAPI spec + scoped API keys, webhook subscriptions, WASM or OCI sandbox for third-party modules (deferred), per-module billing. Concrete module candidates documented: PSA/CRM, Remote Syslog, Backups, Patch Mgmt, IT-Glue-style Docs, Network Monitoring.
- Protocol Versioning & Stale-Agent Recovery (V1-V10): /api/v1/bootstrap/hello declared sacred (additive-only forever). Compat shim layer per old protocol version at server/src/compat/v{N}.rs. Server-initiated forced-upgrade instruction. Per-tenant update channels (stable/current/beta). Auto-sunset policy when old version fleet hits zero. Rollback path via action: downgrade_required. Concrete motivating example: Scileppi VP laptop offline for days; it must be able to reconnect, get accepted, auto-upgrade.
- Certificates & Trust (C1-C11): full cost + priority matrix. C1: Azure Trusted Signing for Windows (Public Trust). C2: Apple Developer Program. C3: GPG for Linux. C4-C11: TLS automation, mTLS, SBOM, FP submissions, DKIM.
- Decisions Log: appended rationale entries for every 2026-04-15 decision so future sessions don't re-litigate.
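The API path convention in the Terminology item can be illustrated with a small builder and parser. This is a sketch only: the AgentPath type and its method names are hypothetical, IDs are treated as opaque strings, and nothing here reflects how the server actually routes requests.

```rust
/// Prefix of the public API, as given in the roadmap.
const PUBLIC_V1: &str = "/api/public/v1/";

/// Hypothetical holder for one agent's position in the hierarchy
/// Partner → Client → Site → Agent (the Platform tier is implicit).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct AgentPath {
    pub partner: String, // stored as tenant_id in the DB
    pub client: String,
    pub site: String,
    pub agent: String,
}

impl AgentPath {
    /// Renders /api/public/v1/partners/{pid}/clients/{cid}/sites/{sid}/agents/{aid}.
    pub fn to_url_path(&self) -> String {
        format!(
            "{}partners/{}/clients/{}/sites/{}/agents/{}",
            PUBLIC_V1, self.partner, self.client, self.site, self.agent
        )
    }

    /// Parses a path of exactly that shape; returns None for anything else,
    /// including paths that skip a tier or contain an empty ID.
    pub fn parse(path: &str) -> Option<AgentPath> {
        let rest = path.strip_prefix(PUBLIC_V1)?;
        let parts: Vec<&str> = rest.split('/').collect();
        match parts.as_slice() {
            ["partners", p, "clients", c, "sites", s, "agents", a]
                if !(p.is_empty() || c.is_empty() || s.is_empty() || a.is_empty()) =>
            {
                Some(AgentPath {
                    partner: p.to_string(),
                    client: c.to_string(),
                    site: s.to_string(),
                    agent: a.to_string(),
                })
            }
            _ => None,
        }
    }
}
```

Requiring every tier in the path means a URL cannot address an agent without naming its Partner, which fits the tenancy-aware auth middleware listed under M1-M7.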
CONTEXT.md anti-patterns added
- "DO NOT make breaking changes to /api/v1/bootstrap/hello": additive-only forever
- "DO NOT cross module boundaries by importing another module's internals": event bus or exposed APIs only
- Hierarchy terminology table added to anti-patterns block (canonical reference)
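One way to read the additive-only rule: if every agent version can always complete the hello, the server can always reply with an instruction, even to a badly stale agent. The sketch below shows one possible decision function under that reading. Only the downgrade_required action name comes from the roadmap; ServerProtocol, HelloAction, the other action names and the integer version scheme are assumptions for illustration.

```rust
/// Hypothetical instruction returned to an agent after the bootstrap hello.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum HelloAction {
    /// Agent speaks the server's current protocol.
    Accept,
    /// Agent is old, but a compat shim (server/src/compat/v{N}.rs) exists:
    /// accept it through the shim and instruct it to upgrade.
    AcceptViaShim { shim_version: u32 },
    /// Agent predates every shim; it must upgrade before doing anything else.
    UpgradeRequired,
    /// Agent is newer than the server, for example after a server rollback.
    DowngradeRequired,
}

/// Assumed server-side view: the current protocol version and the oldest
/// version that still has a compat shim.
pub struct ServerProtocol {
    pub current: u32,
    pub oldest_shim: u32,
}

impl ServerProtocol {
    /// Picks the instruction for an agent announcing `agent_version`.
    pub fn negotiate(&self, agent_version: u32) -> HelloAction {
        if agent_version > self.current {
            HelloAction::DowngradeRequired
        } else if agent_version == self.current {
            HelloAction::Accept
        } else if agent_version >= self.oldest_shim {
            HelloAction::AcceptViaShim { shim_version: agent_version }
        } else {
            HelloAction::UpgradeRequired
        }
    }
}

impl HelloAction {
    /// Wire names; all but downgrade_required are made up for this sketch.
    pub fn wire_name(self) -> &'static str {
        match self {
            HelloAction::Accept => "accept",
            HelloAction::AcceptViaShim { .. } => "accept_then_upgrade",
            HelloAction::UpgradeRequired => "upgrade_required",
            HelloAction::DowngradeRequired => "downgrade_required",
        }
    }
}
```

In the offline-laptop scenario from the roadmap, an agent several versions behind would land in AcceptViaShim or UpgradeRequired and reach the current version without manual intervention. That only works if the hello itself never changes shape.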
Azure Trusted Signing — provisioned and IV submitted
Business identity confirmed via D&B profile lookup: Arizona Computer Guru LLC (D-U-N-S 00-566-1506 / 005661506), 7437 E 22ND St, Tucson AZ 85710, (520) 304-8300, mike@azcomputerguru.com. 25+ years operating history → Public Trust eligible (>3yr threshold).
Provisioned in subscription Basic (e507e953-2ce9-4887-ba96-9b654f7d3267):
- Resource group: gururmm-signing-rg (westus2)
- Trusted Signing Account: gururmm-signing
- Account URI: https://wus2.codesigning.azure.net/
- SKU: Basic (~$9.99/mo billing started 2026-04-16 00:16 UTC)
RBAC granted: mike@azcomputerguru.com → role Artifact Signing Identity Verifier at account scope
Identity Validation submitted:
- IV ID: 03028768-f611-4904-aa58-c755020f436a
- Status: In Progress (Microsoft review, 1-5 business days typical)
- Submitted name: Arizona Computer Guru LLC (state filing); D&B record has older COMPUTER GURU Corporation; may need to update D&B profile for consistency
- Primary email: mike@; Secondary: admin@azcomputerguru.com
- Microsoft may call 520-304-8300 — voicemail should identify Computer Guru
Pending (blocked on IV approval):
- Certificate Profile creation: az trustedsigning certificate-profile create --resource-group gururmm-signing-rg --account-name gururmm-signing --profile-name gururmm-public-trust --profile-type PublicTrust --identity-validation-id 03028768-f611-4904-aa58-c755020f436a
- Signing role assignment: Trusted Signing Certificate Profile Signer to CI build principal
- Local tooling install: Windows SDK (for signtool.exe), Microsoft.Trusted.Signing.Client NuGet package
All details persisted to vault: D:\vault\services\azure-trusted-signing.sops.yaml (encrypted).
Action items for next session
- Check IV status — portal → Trusted Signing Accounts → gururmm-signing → Identity Validation
- If approved → run the cert profile create command (already staged in vault)
- If Microsoft flags legal name mismatch: reply with AZ Corp Commission LLC Articles; update D&B record
- Start signtool.exe + dlib integration in a local scratch project
- Meanwhile, fix the two backlog items (tunnel status 403 bug, agent logging) — they're both independent of the Azure work and small PRs