From 0d1085b145f86a80af45a1100f48233f49f57ed5 Mon Sep 17 00:00:00 2001
From: Mike Swanson
Date: Mon, 25 May 2026 13:49:31 -0700
Subject: [PATCH] sync: auto-sync from GURU-KALI at 2026-05-25 13:49:31

Author: Mike Swanson
Machine: GURU-KALI
Timestamp: 2026-05-25 13:49:31
---
 .claude/scheduled_tasks.lock       |   2 +-
 projects/msp-tools/guru-rmm        |   2 +-
 session-logs/2026-05-25-session.md | 308 ++++++++++++++++++-----------
 3 files changed, 198 insertions(+), 114 deletions(-)

diff --git a/.claude/scheduled_tasks.lock b/.claude/scheduled_tasks.lock
index ad0bfef..9166ac9 100644
--- a/.claude/scheduled_tasks.lock
+++ b/.claude/scheduled_tasks.lock
@@ -1 +1 @@
-{"sessionId":"2158f2e7-8168-4859-b2cf-e0b05d6517b2","pid":18624,"acquiredAt":1779727871169}
\ No newline at end of file
+{"sessionId":"eda9a628-252f-4dd7-b4cf-1d987ea11512","pid":16195,"procStart":"259600","acquiredAt":1779740400025}
\ No newline at end of file
diff --git a/projects/msp-tools/guru-rmm b/projects/msp-tools/guru-rmm
index a42bd60..3dcb30e 160000
--- a/projects/msp-tools/guru-rmm
+++ b/projects/msp-tools/guru-rmm
@@ -1 +1 @@
-Subproject commit a42bd60a12ab9987d0cc0752d788672324eba639
+Subproject commit 3dcb30ea305c152e6a184603e7f6632e77f50a9e
diff --git a/session-logs/2026-05-25-session.md b/session-logs/2026-05-25-session.md
index 112684b..bac6b54 100644
--- a/session-logs/2026-05-25-session.md
+++ b/session-logs/2026-05-25-session.md
@@ -1088,115 +1088,199 @@ if let (Some(version), Some(arch)) = (
 - 12:25 PT - Final compilation successful on Saturn
 - 12:40 PT - Session log written, ready to sync
-
----
-
-## Update: 12:55 PT — Dataforth ESXi License Recovery + Syncro Emergency Billing Skill
-
-### User
-- **User:** Mike Swanson (mike)
-- **Machine:** GURU-5070
-- **Role:** admin
-- **Session span:** ~2026-05-24 evening – 2026-05-25 afternoon
-
-### Session Summary
-
-Session began as an emergency response: John Lehman texted after hours reporting VPN was down.
Investigation via SSH (through D2TESTNAS at 192.168.0.9 as jump host) revealed AD1 and AD2 were offline because ESXi-122's 60-day evaluation license had expired, taking all VMs with it. ESXi-124 was also at risk. SSH was not running on ESXi-122, requiring DCUI physical console access to enable it first.
-
-License recovery on ESXi-122 was accomplished by copying the hidden backup license file (`/etc/vmware/.#license.cfg`) over the active `license.cfg`, then restarting hostd. This resets the 60-day evaluation timer. ESXi-124 was treated preemptively with the same procedure. After license restoration, all four VMs on ESXi-122 (AD1, AD2, FILES-D1, PBX) were powered on. Both ESXi hosts were configured with a persistent monthly cron job (first Sunday of each month at 02:00) to auto-reset the license and reboot, written directly to `/var/spool/cron/crontabs/root` via paramiko SFTP and persisted through `/etc/rc.local.d/local.sh` since ESXi's filesystem is RAM-based.
-
-A Syncro ticket was created (#32320) for the incident. The session then shifted to building out emergency/afterhours billing rules as a skill file (`syncro-emergency-billing.md`), researching Winter's historical tickets to establish the correct billing pattern. The key finding: block customers (Dataforth, VWP, Cascades) require two line items on the standard product (actual hours + 0.5x labeled "Afterhours rate") because block accounts track hours not dollars; non-block customers use a single dedicated emergency product (26184, $262.50/hr).
-
-Adding labor to the Dataforth ticket required discovering the correct Syncro API endpoint through trial and error — `/tickets/{id}/add_line_item` (not `/line_item`, `/line_items`, or top-level endpoints). Experimented on ACG internal test ticket #32321 to confirm payload format before touching the real ticket. Once confirmed, added 2.0hr main labor + 1.0hr afterhours premium to ticket #32320, then deleted the test ticket.
The skill was then audited: live product rate fetch revealed two rate errors in the original draft ($150/hr not $175 for Remote Business and In-Shop Business), residential rates were removed as legacy, and the confirmed API method was documented with all required fields.
-
-### Key Decisions
-
-- **ESXi crontab via SFTP, not shell**: ESXi has no `crontab` command. Wrote directly to `/var/spool/cron/crontabs/root` via paramiko SFTP; sent SIGHUP to crond after. Shell-based approaches (echo/heredoc) were tried first and failed.
-- **local.sh persistence in Python, not shell**: `grep -c` through a shell command produced "0\n0" (grep output + fallback), causing false-positive match detection. Rewrote local.sh update logic using SFTP read/write in Python to avoid shell quoting/output ambiguity.
-- **Test before touching real ticket**: Rather than guessing the Syncro line item payload format and hitting the real Dataforth ticket, opened a test ticket on ACG internal customer to confirm endpoint and required fields first.
-- **Both `name` and `description` required**: Syncro's `add_line_item` endpoint returns 422 if either field is missing — not obvious from the API name. Documented explicitly.
-- **Live rate fetch mandatory**: Memory note confirmed rates had been wrong before (2026-05-20 incident). Fetched all product rates live before finalizing the skill; found Remote Business ($150) and In-Shop Business ($150) were both documented as $175 in the original draft.
-- **$262.50 emergency product covers all business work**: Confirmed with Mike — no distinction between remote and onsite emergency. One product for all business emergency billing regardless of service delivery method.
-- **Residential rates are legacy**: Removed 42584 and 1190471 from all active sections of the skill; added to "Products NOT to Use."
-
-### Problems Encountered
-
-- **SSH not enabled on ESXi-122**: License expiration locks out management — had to enable SSH via DCUI physical console before remote work was possible. No automated fix; required hands-on at the host.
-- **`crontab` command missing on ESXi**: ESXi busybox environment does not include the `crontab` CLI. Fix: write the crontab file directly via SFTP.
-- **`grep -c` false positive in local.sh check**: Shell command `grep -c 'pattern' file 2>/dev/null || echo 0` emitted both the grep count and the fallback "0", causing the Python string comparison to see "0\n0" (truthy). Fixed by using SFTP to read and rewrite local.sh entirely in Python.
-- **Syncro line item endpoint discovery**: No working documentation for the correct path. Tried `/line_item`, `/line_items`, PUT with `line_items_attributes` — all 404. Eventually fetched the Syncro Swagger spec from `api-docs.syncromsp.com/swagger.json` and found `add_line_item`.
-- **422 on add_line_item with only `name` field**: Both `name` and `description` are required; omitting either returns 422.
-
-### Configuration Changes
-
-- **Created:** `D:\claudetools\.claude\commands\syncro-emergency-billing.md` — Emergency/afterhours billing skill for Syncro (rules, billing scenarios, confirmed API method)
-- **Modified:** `syncro-emergency-billing.md` — Rate corrections (Remote Business $150, In-Shop $150), residential removed as legacy, API section added
-- **ESXi-122** (`192.168.0.122`): license.cfg restored, cron job written, local.sh updated, all VMs powered on
-- **ESXi-124** (`192.168.0.124`): license.cfg restored preemptively, cron job written, local.sh updated
-
-### Credentials & Secrets
-
-- **D2TESTNAS (jump host):** `192.168.0.9` — root / `Paper123!@#`
-- **ESXi root password (both hosts):** `Gptf*77ttb!@#!@#`
-- **Syncro API key:** `T259810e5c9917386b-52c2aeea7cdb5ff41c6685a73cebbeb3` — vault: `msp-tools/syncro.sops.yaml` → `credentials.credential`
-
-### Infrastructure & Servers
-
-| Host | IP | Role | Notes |
-|---|---|---|---|
-| D2TESTNAS | 192.168.0.9 | Jump host / NAS | SSH root access; used as paramiko jump for ESXi |
-| ESXi-122 | 192.168.0.122 | Hypervisor | Datastore: `datastore1`; hosts AD1, AD2, FILES-D1, PBX |
-| ESXi-124 | 192.168.0.124 | Hypervisor | Datastore: `Backup`; treated preemptively |
-| AD1 | (on ESXi-122) | Domain Controller | Was offline due to license expiry; restored |
-| AD2 | (on ESXi-122) | Domain Controller | Was offline; restored |
-| FILES-D1 | (on ESXi-122) | File server | Was offline; restored |
-| PBX | (on ESXi-122) | Phone system | Was offline; restored |
-
-ESXi license reset script locations:
-- ESXi-122: `/vmfs/volumes/datastore1/license_reset.sh`
-- ESXi-124: `/vmfs/volumes/Backup/license_reset.sh`
-
-Cron schedule (both hosts): `0 2 * * 0 [ $(date +%d) -le 7 ] &&