# Session Log — 2026-04-21

## User
- **User:** Mike Swanson (mike)
- **Machine:** DESKTOP-0O8A1RL
- **Role:** admin

---

## Session Summary

Continuation from previous conversation (context compacted). This session covered three areas:

1. **BirthBiologic vault save** — fixed a broken vault stub and saved GuruRMM site credentials for the new BirthBiologic client
2. **MSI build fix** — diagnosed and fixed "MSI build on Pluto failed" error caused by a missing WiX extension flag in `install.rs`
3. **DESIGN.md created** — comprehensive per-component design guide for GuruRMM covering architectural decisions, rules, and constraints that were previously only in session logs and verbal decisions

---

## Key Work

### 1. BirthBiologic Vault Entry — Fixed and Saved

**Problem:** A broken unencrypted stub existed at `D:/vault/clients/birthbiologic/gururmm-site-main.sops.yaml`. `vault.sh add` failed ("file already exists"), `vault.sh create` doesn't exist, and `sops --encrypt` failed with "no matching creation rules found" because the input file name did not end in `.sops.yaml`.

**Root cause:** The creation rule in the SOPS `.sops.yaml` config uses `path_regex: '.*\.sops\.yaml$'`, so it only matches files whose names already end in `.sops.yaml`. A `.plain.yaml` input file does not match the rule.
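
The mismatch is easy to reproduce outside SOPS. The sketch below is illustrative only: it uses Python's `re` module as a stand-in for SOPS's own regex matching, and `matches_creation_rule` is a hypothetical helper name.

```python
import re

# Pattern copied from the creation rule quoted above.
PATH_REGEX = re.compile(r".*\.sops\.yaml$")


def matches_creation_rule(path: str) -> bool:
    """Return True if the path would be matched by the path_regex rule."""
    return PATH_REGEX.search(path) is not None


for candidate in ("gururmm-site-main.sops.yaml", "gururmm-site-main.plain.yaml"):
    print(candidate, "->", matches_creation_rule(candidate))
```

Only the first name matches, which is consistent with the "no matching creation rules found" error seen for the `.plain.yaml` input.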

**Fix:**
1. Deleted the broken stub
2. Wrote plaintext to `gururmm-site-main.plain.yaml`
3. Encrypted with explicit AGE key + `--encrypted-regex` flags: `sops --encrypt --age age1qz7ct84m50u06h97artqddkj3c8se2yu4nxu59clq8rhj945jc0s5excpr --encrypted-regex '^(credentials|...)$' input.plain.yaml > output.sops.yaml`
4. Deleted plaintext
5. Verified: `vault.sh get-field clients/birthbiologic/gururmm-site-main.sops.yaml credentials.api_key` returned correct value

**BirthBiologic GuruRMM credentials (also in vault):**
```
client_id: da526b38-e832-4159-ab13-a3d94e9897a2
site_id: 3b20ef97-c764-4ef8-9154-79c3d5b486f8
site_code: BRIGHT-PEAK-5980
api_key: grmm_1ZB1qV9Q61b9Noq8BIaZGwLNjZMfF49i
installer_url (landing): https://rmm.azcomputerguru.com/install/BRIGHT-PEAK-5980
msi_url (direct): https://rmm.azcomputerguru.com/sites/3b20ef97-c764-4ef8-9154-79c3d5b486f8/installer
```

Vault file: `D:/vault/clients/birthbiologic/gururmm-site-main.sops.yaml`

---

### 2. MSI Build Fix — "MSI build on Pluto failed"

**Symptom:** Clicking "Download MSI" in the GuruRMM dashboard for any site returned "MSI build on Pluto failed" in red.

**Diagnosis:** Server log showed:
```
stdout=C:\gururmm\installer\gururmm-agent.wxs(226) : error WIX0094:
The identifier 'Binary:Wix4UtilCA_X64' could not be found.
```

**Root cause:** The `build_site_msi_on_pluto` function in `server/src/api/install.rs` was calling `wix build` without `-ext WixToolset.Util.wixext`. The `InstallReportCA` custom action uses `BinaryRef="Wix4UtilCA_X64"` which lives in the Util extension. The base-MSI build in `build-agents.sh` had the flag; the on-demand per-site build did not.

**Fix:** Added `-ext WixToolset.Util.wixext` to the WiX command in `build_site_msi_on_pluto`:
```
"cd C:\\gururmm\\installer && wix.exe build gururmm-agent.wxs \
-arch x64 -d Version={version} -d SITEKEY={site_id} \
-o {remote_out} -ext WixToolset.Util.wixext"
```

Applied directly on Jupiter via `sed -i`, rebuilt server (`cargo build --release` in `server/`), restarted `gururmm-server`. Then committed and pushed the fix to Gitea.

**Fix commit:** `6106087` — "fix: add WixToolset.Util.wixext to site MSI build command"

**Note:** This was a discrepancy between `build-agents.sh` (had the flag) and `install.rs` (didn't). Added to DESIGN.md as a documented rule.
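
One way to keep the two build paths from drifting apart again is to define the required extensions once and build the argument list from that single definition. The sketch below is in Python purely for illustration (the real code lives in `install.rs` and `build-agents.sh`); `REQUIRED_WIX_EXTENSIONS` and `wix_build_args` are hypothetical names, and the site ID and output path in the demo are placeholders.

```python
# Hypothetical single source of truth for extensions every WiX build needs.
REQUIRED_WIX_EXTENSIONS = ("WixToolset.Util.wixext",)


def wix_build_args(source, version, site_id, out_path):
    """Build the wix.exe argument list, always appending the required extensions."""
    args = [
        "wix.exe", "build", source,
        "-arch", "x64",
        "-d", f"Version={version}",
        "-d", f"SITEKEY={site_id}",
        "-o", out_path,
    ]
    for extension in REQUIRED_WIX_EXTENSIONS:
        args += ["-ext", extension]
    return args


# Placeholder values, for demonstration only.
print(" ".join(wix_build_args("gururmm-agent.wxs", "0.6.2", "SITE-ID", "out.msi")))
```

With this shape, a build command cannot be assembled without the Util extension, whichever caller requests it.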

---

### 3. DESIGN.md — GuruRMM Design Guide Created

Created `docs/DESIGN.md` in the GuruRMM repo. This is a new document capturing per-component design decisions and hard constraints that were previously scattered across session logs and verbal decisions.

**Committed:** `6b76dd7` — "docs: add DESIGN.md — per-component architectural decisions and rules"

**Sections:**
- Project-Wide Rules (no TOML/config for endpoints, registry as source of truth)
- Agent (auto-install, per-agent enrollment keys, legacy OS support, .old cleanup, downgrade guard)
- Installer/MSI (WiX v4 only, Pluto-only, required extension, Wait="no" rationale, install-report CA as debug logger, no UI extension)
- Build Pipeline (webhook-only builds, parallelism, signing, toolchain self-bootstrapping, build lock)
- Server (PostgreSQL not MariaDB, INET sqlx pattern, ConnectInfo extractor, stop-before-replace, migration recording)
- Dashboard (useMemo pitfall, sidebar colors, modal key reset, theme support)
- Tray Application (separate crate, user session, policy-controlled, named pipe IPC)
- Protocol / Wire Format (WebSocket message types, heartbeat)

---

## Files Created / Modified

| File | Change |
|------|--------|
| `D:/vault/clients/birthbiologic/gururmm-site-main.sops.yaml` | Created (encrypted vault entry for BirthBiologic RMM site) |
| `/home/guru/gururmm/server/src/api/install.rs` | Added `-ext WixToolset.Util.wixext` to Pluto WiX build command |
| `docs/DESIGN.md` (in gururmm repo) | Created — comprehensive design guide |

---

## Commits (gururmm repo)

| SHA | Message |
|-----|---------|
| `6106087` | fix: add WixToolset.Util.wixext to site MSI build command |
| `6b76dd7` | docs: add DESIGN.md — per-component architectural decisions and rules |

---

## Update: 19:25 UTC — MSI Still Failing, Root Cause Found and Fixed

### Problem

After the earlier `install.rs` fix and server rebuild, MSI generation was still failing with the same `WIX0094` error.

### Root Cause

Two compounding issues:

**1. Wrong binary deployed.** The `gururmm-server` service runs from `/opt/gururmm/gururmm-server`, not `/usr/local/bin/gururmm-server`. The rebuild at 17:53 placed the new binary in `/home/guru/gururmm/server/target/release/gururmm-server` but it was never copied to `/opt/gururmm/`. The old binary (from 2026-04-20 18:32) kept running.

```
ExecStart=/opt/gururmm/gururmm-server                     ← service path
/usr/local/bin/gururmm-server                             ← wrong path (stale, Apr 20)
/home/guru/gururmm/server/target/release/gururmm-server   ← new binary (never deployed)
```

**2. Migration 013 not registered.** Once the correct binary was deployed and the service restarted, it crashed immediately on startup:
```
Error: while executing migration 13: error returned from database:
relation "install_reports" already exists
```
Migration 013 (`install_reports` table) had been applied to the DB in a prior session but never recorded in `_sqlx_migrations`. sqlx tried to re-run it, hit the conflict, and crashed.

### Fix

1. Deployed the correct binary:
```bash
sudo systemctl stop gururmm-server
sudo cp /home/guru/gururmm/server/target/release/gururmm-server /opt/gururmm/gururmm-server
```

2. Registered migration 013 in `_sqlx_migrations`:
```sql
INSERT INTO _sqlx_migrations (version, description, installed_on, success, checksum, execution_time)
VALUES (
    13,
    'install reports',
    NOW(),
    true,
    decode('76d53ea1c51f9ce70c01f5b8b545d17f63eab5b2c447e880cdb1f25807ed30c626df818aadea6db9d024cdf2e72d3062', 'hex'),
    0
);
```
Checksum was computed via `hashlib.sha384` of the migration file contents.

3. Restarted service — came up clean, agents reconnected.
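
The checksum step in item 2 can be sketched as below. This is a minimal illustration, assuming the checksum is the SHA-384 digest of the raw migration file bytes, hex-encoded for use with `decode(..., 'hex')`. The function names are hypothetical, and the demo hashes an in-memory stand-in rather than the real migration file.

```python
import hashlib
from pathlib import Path


def migration_checksum_hex(sql_bytes: bytes) -> str:
    """SHA-384 of the raw migration bytes, as a 96-character hex string."""
    return hashlib.sha384(sql_bytes).hexdigest()


def checksum_for_file(path: str) -> str:
    """Hash the file exactly as stored on disk (no newline normalisation)."""
    return migration_checksum_hex(Path(path).read_bytes())


# Stand-in content for demonstration; NOT the actual migration 013.
demo = migration_checksum_hex(b"CREATE TABLE example (id BIGINT PRIMARY KEY);\n")
print(len(demo))  # 96 hex characters = 48 bytes
```

Because the digest covers raw bytes, a CRLF/LF difference between the copy that was hashed and the copy compiled into the server would produce a different checksum, so the file should be hashed from the same checkout the server is built from.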

### Lesson

**Always deploy to `/opt/gururmm/gururmm-server`** — that is the path in the systemd `ExecStart`. `/usr/local/bin/gururmm-server` is a stale copy from early setup and is not used. This should be added to CONTEXT.md / DESIGN.md anti-patterns.
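
A deploy script could read the target path from the unit file instead of relying on memory. The following is a rough sketch, assuming the unit text has been captured beforehand (for example from `systemctl cat gururmm-server`); `exec_start_path` is a hypothetical helper.

```python
import shlex
from typing import Optional


def exec_start_path(unit_text: str) -> Optional[str]:
    """Return the executable path from the first ExecStart= line, if any."""
    for raw_line in unit_text.splitlines():
        line = raw_line.strip()
        if line.startswith("ExecStart="):
            # Drop systemd's optional prefix characters before the path.
            command = line[len("ExecStart="):].lstrip("@-:+!")
            parts = shlex.split(command)
            return parts[0] if parts else None
    return None


unit = "[Service]\nExecStart=/opt/gururmm/gururmm-server\nRestart=always\n"
print(exec_start_path(unit))  # /opt/gururmm/gururmm-server
```

Copying the freshly built binary to whatever this returns removes the guesswork between `/opt/gururmm/` and `/usr/local/bin/`.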

---

## Pending / Next Tasks

From previous session (still pending):
- [ ] Test MSI installer on BirthBiologic server — install via `https://rmm.azcomputerguru.com/install/BRIGHT-PEAK-5980` or MSI from dashboard
- [ ] Consent `tenant-admin` and `user-manager` apps in BirthBiologic tenant (only `investigator` consented so far)
- [ ] BirthBiologic Datto → SharePoint migration script (PowerShell, tenant-admin Graph API, app-only auth, reads Datto Workplace local file server, uploads to SharePoint via Sites.ReadWrite.All)
- [ ] mvaninc CA policy — create policy requiring MFA for all sign-ins (Mike to do in portal, not scriptable)
- [ ] Legacy build deployment — still needs first trigger via webhook push to produce legacy binaries

---

## Infrastructure

| Component | Location | Notes |
|-----------|----------|-------|
| GuruRMM server | guru@172.16.3.30 | `gururmm-server` service |
| Pluto build VM | Administrator@172.16.3.36 | Windows MSVC + WiX |
| Downloads dir | /var/www/gururmm/downloads/ | binaries, MSIs |
| Build log | /var/log/gururmm-build.log | |
| Vault | D:/vault/ | SOPS AGE-encrypted |

---

## Credentials

- **PostgreSQL (gururmm):** `gururmm` / `43617ebf7eb242e814ca9988cc4df5ad` @ 172.16.3.30:5432/gururmm
- **Build server SSH:** guru@172.16.3.30
- **Pluto SSH:** Administrator@172.16.3.36
- **Webhook secret:** `gururmm-build-secret`
- **Gitea internal API:** http://172.16.3.20:3000
- **BirthBiologic RMM site:** api_key `grmm_1ZB1qV9Q61b9Noq8BIaZGwLNjZMfF49i` (also in vault)

---

## Update: 21:30 UTC — Cleanup EXE, Debug Agent, BB-SERVER MSI Troubleshooting

### Context

Continuing from the previous compacted conversation. All work in this update is in the GuruRMM project (gururmm repo on Jupiter, local copy at D:\claudetools\projects\msp-tools\guru-rmm).

---

### 1. Cleanup EXE Deployment

Resumed the EXE deployment to Jupiter. The file actually transferred in this update was the debug agent build (`gururmm-agent-debug.exe`); `gururmm-cleanup.exe` itself has not been built yet. Method used: base64-encode the EXE on Pluto via an RMM agent command, capture the output, decode locally, SCP to Jupiter.

**Pluto agent ID:** `5316f56f-a1b3-4ac5-97ac-71ddf6a74d2e`

**JWT generation (Pluto admin user):**
```python
import json, base64, hmac, hashlib, time
# HS256 key: the secret string as UTF-8 bytes (not base64-decoded)
secret_bytes = 'ZNzGxghru2XUdBVlaf2G2L1YUBVcl5xH0lr/Gpf/QmE='.encode('utf-8')
# User sub: 490e2d0f-067d-4130-98fd-83f06ed0b932 (admin@azcomputerguru.com)
```

**Direct SCP from Pluto failed** (the SYSTEM account has no SSH private key at `C:\Windows\System32\config\systemprofile\.ssh\`). Fell back to the base64-through-agent approach.

**Base64 command sent to Pluto:**
```powershell
[Convert]::ToBase64String([IO.File]::ReadAllBytes('C:/gururmm/agent/target/debug-agent/release/gururmm-agent.exe'))
```
File size: 3.8 MB (3,948,544 bytes). B64 length: 5,264,728 chars.
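
The two sizes are consistent with each other: padded base64 emits 4 characters for every 3 input bytes, rounded up to a whole group. A quick check (the function name is illustrative):

```python
def base64_length(n_bytes: int) -> int:
    """Length of padded base64 output: 4 characters per 3-byte group, rounded up."""
    return 4 * ((n_bytes + 2) // 3)


print(base64_length(3_948_544))  # 5264728
```

If the captured output length differs from this figure, the transfer was truncated or gained extra characters (for example line breaks) along the way.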

**Decode locally and SCP to Jupiter:**
```bash
py -c "import base64; ..."  # decode to D:/tmp/gururmm-agent-debug.exe
scp D:/tmp/gururmm-agent-debug.exe guru@172.16.3.30:/tmp/gururmm-agent-debug.exe
ssh guru@172.16.3.30 'sudo cp /tmp/gururmm-agent-debug.exe /var/www/gururmm/downloads/gururmm-agent-debug.exe'
```
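
The decode one-liner is elided above. A standalone version of that step might look like the sketch below; it is not the command that was actually run. `decode_b64_file` and its parameters are hypothetical, and it assumes the captured agent output was saved to a text file first.

```python
import base64
from pathlib import Path


def decode_b64_file(b64_path: str, out_path: str, expected_size: int) -> int:
    """Decode a base64 text file to binary and verify the decoded size."""
    # Remove any whitespace or line breaks added while capturing the output.
    text = "".join(Path(b64_path).read_text(encoding="ascii").split())
    data = base64.b64decode(text, validate=True)
    if len(data) != expected_size:
        raise ValueError(f"decoded {len(data)} bytes, expected {expected_size}")
    Path(out_path).write_bytes(data)
    return len(data)
```

Calling it with `expected_size=3_948_544` would catch a truncated capture before the file is copied to Jupiter.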

**Result:** `/var/www/gururmm/downloads/gururmm-agent-debug.exe` deployed (3.8 MB).
`http://172.16.3.30:3001/install/debug/download` → HTTP 200 (3,948,544 bytes). ✓

**Note:** Cloudflare challenges `https://rmm.azcomputerguru.com/install/debug/download` for non-browser requests — this is expected/normal. Browser downloads work fine.

**Note on cleanup.exe:** Not yet built. The `gururmm-cleanup.exe` will be produced automatically by `build-agents.sh` on the next triggered build. The server route `/install/cleanup/download/exe` returns 503 until that first build completes.

---

### 2. Pluto's SSH Public Key (for future reference)

Pluto SYSTEM account does NOT have `id_ed25519`. The pubkey retrieved earlier (`system@PLUTO`) was incorrect or from a different context.

**Pluto's SYSTEM .ssh dir** contains only `known_hosts` (94 bytes).

**Jupiter's authorized_keys** was updated to add the Pluto pubkey:
```
ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIFWaMV0U3WZG3kuts7mqVaF9SN0TsKqPAC37GdVGbq0Y system@PLUTO
```
(Added to `/home/guru/.ssh/authorized_keys` — may be irrelevant since SYSTEM has no private key.)

---

### 3. Debug Agent Feature — build-agents.sh and Server Routes

**Already committed in prior session:**
- `build-agents.sh`: added `--features debug-agent --target-dir target\debug-agent` to Pluto SSH build command + SCP + deploy block
- `agent/Cargo.toml`: added `debug-agent = []` feature
- `agent/src/service.rs`: cfg-gated `SERVICE_NAME`, `SERVICE_DISPLAY_NAME`, `INSTALL_DIR`, `CONFIG_DIR` constants
- `agent/src/registry.rs`: `REGISTRY_KEY` = `SOFTWARE\GuruRMM-Debug` when feature enabled
- `agent/src/device_id.rs`: stores device ID in `C:\ProgramData\GuruRMM-Debug\.device-id`
- `agent/src/updater/mod.rs`: `detect_binary_path()` and `detect_config_dir()` use debug paths
- `server/src/main.rs` on Jupiter: routes for `/install/debug/download` and cleanup endpoints
- `server/src/api/install.rs` on Jupiter: `download_debug_exe()` handler

---

### 4. GuruRMM Debug Site Created

Created a new site for the debug agent to enroll into:

| Field | Value |
|-------|-------|
| Site ID | `d6b8233a-6cc1-4a44-888d-01ee49123fba` |
| Site name | GuruRMM Debug |
| Site code | `BOLD-HARBOR-1855` |
| API key | `grmm_mm2DnrF6kt9Ml8AyJCuHJJHnBTyXHX_4` |
| Client | AZ Computer Guru (`417420f4-c3f4-482a-acd4-d6f63c8cddde`) |

**Issue identified:** The debug agent currently prompts for a site code on first run because:
1. No config file exists
2. No site code is embedded in the binary

**Fix needed (not yet done):** Hardcode the debug site API key into the `debug-agent` feature using a `cfg`-gated constant, or embed it at build time. Either option would allow the debug EXE to auto-install silently without prompting.

**Current workaround:** User entered `BRIGHT-PEAK-5980` (BirthBiologic) when prompted.

---

### 5. BB-SERVER Connected

Debug agent installed on BB-SERVER (BirthBiologic's server) and is now online in the RMM.

| Field | Value |
|-------|-------|
| Agent ID | `6c02baa7-0f1c-4990-b466-c9ab9eaefd3b` |
| Hostname | BB-SERVER |
| OS | Windows Server 2016 (build 14393) |
| Agent version | 0.6.2 |
| Site | BirthBiologic Main Office (`3b20ef97-c764-4ef8-9154-79c3d5b486f8`) |
| Status | online |

---

### 6. MSI Installer Troubleshooting via BB-SERVER

Using BB-SERVER's debug agent to test the MSI installer and capture verbose logs.

**Problem 1 — Cloudflare blocks non-browser downloads:**
- `Invoke-WebRequest` without a browser UA gets Cloudflare's JS challenge page instead of the MSI
- Fix: pass `-UserAgent 'Mozilla/5.0 ...'` to Invoke-WebRequest

**Problem 2 — msiexec doesn't accept forward slashes:**
- Error 2203 "Cannot open database file" with C:/grmm.msi
- Fix: use `C:\\grmm.msi` (JSON-escaped backslash)

**Working command format:**
```
Invoke-WebRequest -Uri '...' -OutFile C:\\grmm.msi -UserAgent $ua -UseBasicParsing;
msiexec /i C:\\grmm.msi /quiet /l*v C:\\grmm.log;
Get-Content C:\\grmm.log -Tail 100
```

**Command in flight** (cmd ID `fa68659e-3395-48a2-adee-9624dfd40cd7`) — still running as of session save. Check with:
```bash
curl -s "http://172.16.3.30:3001/api/commands/fa68659e-3395-48a2-adee-9624dfd40cd7" \
  -H "Authorization: Bearer <JWT>"
```
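
For long-running commands, the same status check can be wrapped in a polling loop. This is only a sketch: the response schema is not documented in this log, so the caller supplies the `is_done` test, and the `status` field used in the offline demo is an assumption. `fetch_command` and `wait_for_command` are hypothetical names.

```python
import json
import time
import urllib.request

BASE_URL = "http://172.16.3.30:3001"  # internal API address used in this log


def fetch_command(command_id: str, jwt: str) -> dict:
    """GET /api/commands/:id and return the parsed JSON body."""
    request = urllib.request.Request(
        f"{BASE_URL}/api/commands/{command_id}",
        headers={"Authorization": f"Bearer {jwt}"},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def wait_for_command(command_id, jwt, is_done, fetch=fetch_command,
                     interval=5.0, attempts=60):
    """Poll until is_done(result) is true; raise TimeoutError after `attempts` polls."""
    for _ in range(attempts):
        result = fetch(command_id, jwt)
        if is_done(result):
            return result
        time.sleep(interval)
    raise TimeoutError(f"command {command_id} not finished after {attempts} polls")


# Offline demo with a stub in place of the real HTTP call.
# The "status" field and its values are assumptions, not the confirmed schema.
_responses = iter([{"status": "running"}, {"status": "completed"}])
final = wait_for_command(
    "example-id", "example-jwt",
    is_done=lambda r: r.get("status") != "running",
    fetch=lambda _id, _jwt: next(_responses),
    interval=0,
)
print(final)
```

Passing `fetch` as a parameter keeps the loop testable without network access; in real use the default `fetch_command` would be left in place.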

---

### 7. RMM API — Correct Endpoints

| Operation | Endpoint |
|-----------|----------|
| Send command | `POST http://172.16.3.30:3001/api/agents/:id/command` |
| Get command status | `GET http://172.16.3.30:3001/api/commands/:id` |
| List agents | `GET http://172.16.3.30:3001/api/agents` |
| Get site install info | `GET http://172.16.3.30:3001/api/sites/:id/install-info` |
| Download site MSI (auth) | `GET http://172.16.3.30:3001/api/sites/:id/installer` |
| Download site MSI (public) | `GET https://rmm.azcomputerguru.com/install/BRIGHT-PEAK-5980/download/msi` |

**JWT generation for API calls:**
- Secret (raw bytes): `ZNzGxghru2XUdBVlaf2G2L1YUBVcl5xH0lr/Gpf/QmE=`
- Admin user sub: `490e2d0f-067d-4130-98fd-83f06ed0b932` (admin@azcomputerguru.com)
- Claims: `sub`, `role: "admin"`, `orgs: []`, `exp: now+3600`, `iat: now`
- Algorithm: HS256, key = secret string encoded as UTF-8 bytes (NOT base64-decoded)
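
The rules above translate into a short standard-library implementation. This is a minimal sketch, not the server's own code: `make_jwt` and `b64url` are hypothetical helpers, and reading the secret from a `GURURMM_JWT_SECRET` environment variable is an assumption made here to keep the secret out of the script.

```python
import base64
import hashlib
import hmac
import json
import os
import time
from typing import Optional


def b64url(data: bytes) -> str:
    """Base64url without padding, as used in JWTs."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def make_jwt(secret: str, sub: str, now: Optional[int] = None) -> str:
    """Build an HS256 token with the claims listed above, valid for one hour."""
    now = int(time.time()) if now is None else now
    header = {"alg": "HS256", "typ": "JWT"}
    claims = {"sub": sub, "role": "admin", "orgs": [], "exp": now + 3600, "iat": now}
    signing_input = ".".join(
        b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, claims)
    )
    # Key is the secret string as UTF-8 bytes, NOT base64-decoded.
    signature = hmac.new(
        secret.encode("utf-8"), signing_input.encode("ascii"), hashlib.sha256
    ).digest()
    return f"{signing_input}.{b64url(signature)}"


# Hypothetical environment variable; nothing is printed if it is unset.
_secret = os.environ.get("GURURMM_JWT_SECRET")
if _secret:
    print(make_jwt(_secret, "490e2d0f-067d-4130-98fd-83f06ed0b932"))
```

The resulting string is what goes after `Bearer ` in the `Authorization` header.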

**Known user IDs:**
```
490e2d0f-067d-4130-98fd-83f06ed0b932  admin@azcomputerguru.com       (admin)
4d754f36-0763-4f35-9aa2-0b98bbcdb309  claude-api@azcomputerguru.com  (admin)
294c1242-68ac-42e7-85b0-564c8b155dba  howard@azcomputerguru.com      (admin)
```

---

### 8. JSON Escaping Issue with Agent Commands

The RMM server's serde_json is strict about JSON escape sequences. Commands containing embedded double quotes (`\"`) caused "invalid escape" errors in some cases when the payload was passed to curl via `--data-binary @file`.

**Working approach:** Wrap the `curl -d` argument in shell single quotes and use the `'"'"'` technique for embedded single-quoted PowerShell strings. This avoids file escaping entirely.

**Key rules:**
- Never use `\g`, `\L`, `\D`, etc. — only valid JSON escapes: `\\`, `\"`, `\/`, `\b`, `\f`, `\n`, `\r`, `\t`, `\uXXXX`
- Forward slashes are fine in JSON strings
- Backslashes in PowerShell paths need `\\` in JSON (gives `\` in the actual string)
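
An alternative to hand-escaping is to let a JSON library produce the body. The sketch below uses Python's `json.dumps`, which only emits valid escapes; the `command` field name is an assumption, since the request schema is not shown in this log.

```python
import json


def command_payload(command: str) -> str:
    """Serialize a command body; json.dumps handles backslashes and quotes."""
    return json.dumps({"command": command})  # "command" key is hypothetical


# Raw string: single backslashes, exactly as PowerShell should receive them.
ps_command = r'msiexec /i C:\grmm.msi /quiet /l*v C:\grmm.log'
body = command_payload(ps_command)
print(body)

# A hand-written body with a single backslash contains the invalid escape \g.
try:
    json.loads('{"command": "C:\\grmm.msi"}')
except json.JSONDecodeError as error:
    print("rejected:", error.msg)
```

The first `print` shows each backslash doubled in the JSON text, and the second shows a strict parser rejecting the single-backslash form, which matches the rules listed above.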

---

### Pending Tasks

| Task | Status | Notes |
|------|--------|-------|
| Cleanup EXE on Pluto | Pending | Needs first build trigger. Route ready, will 503 until built. |
| Debug agent auto-install | Not done | Needs hardcoded debug site key in `debug-agent` feature |
| MSI 2762 test on BB-SERVER | In progress | Command running, awaiting result |
| BirthBiologic — MSI verified working | Pending | Testing now |
| BirthBiologic — consent tenant-admin/user-manager | Pending | |
| BirthBiologic — Datto→SharePoint migration script | Pending | |
| mvaninc CA policy (MFA) | Pending | Mike to do manually in portal |
| Remote uninstall feature | Pending | New WS message + server DELETE endpoint + dashboard button |

---

### Infrastructure Additions This Update

| Item | Value |
|------|-------|
| Debug site | BOLD-HARBOR-1855, api_key `grmm_mm2DnrF6kt9Ml8AyJCuHJJHnBTyXHX_4` |
| BB-SERVER agent | ID `6c02baa7-...`, online, BirthBiologic Main Office |
| Debug EXE | `/var/www/gururmm/downloads/gururmm-agent-debug.exe` (3.8 MB) |