From 159c7e16fffb2ece44c61b4a6750006b78c37ed4 Mon Sep 17 00:00:00 2001
From: Howard Enos
Date: Sat, 30 May 2026 07:29:53 -0700
Subject: [PATCH] sync: auto-sync from HOWARD-HOME at 2026-05-30 07:29:45

Author: Howard Enos
Machine: HOWARD-HOME
Timestamp: 2026-05-30 07:29:45
---
 projects/msp-tools/guru-connect               |   2 +-
 session-logs/2026-05-30-howard-bitdefender.md | 122 ++++++++++++++++++
 2 files changed, 123 insertions(+), 1 deletion(-)
 create mode 100644 session-logs/2026-05-30-howard-bitdefender.md

diff --git a/projects/msp-tools/guru-connect b/projects/msp-tools/guru-connect
index 2118942..8cb0b5b 160000
--- a/projects/msp-tools/guru-connect
+++ b/projects/msp-tools/guru-connect
@@ -1 +1 @@
-Subproject commit 21189423f26024da41d255b145fc39c6e1eceb76
+Subproject commit 8cb0b5b16beffc0ff9439537f110212f98c9bb1c
diff --git a/session-logs/2026-05-30-howard-bitdefender.md b/session-logs/2026-05-30-howard-bitdefender.md
new file mode 100644
index 0000000..e31ef63
--- /dev/null
+++ b/session-logs/2026-05-30-howard-bitdefender.md
@@ -0,0 +1,122 @@
+# Session Log — 2026-05-30 — Bitdefender GravityZone Skill (Howard)
+
+## User
+- **User:** Howard Enos (howard)
+- **Machine:** Howard-Home
+- **Role:** tech
+
+## Session Summary
+
+Built a new `/bitdefender` Claude Code skill that drives the Arizona Computer
+Guru GravityZone Cloud partner tenant through the JSON-RPC Public API. The work
+started from the observation that the web console cannot be accessed
+programmatically, but a partner-level API key already existed in the SOPS vault
+(`msp-tools/gravityzone.sops.yaml`) and a read-only client already lived in the
+server code at `api/services/gravityzone_service.py`. The new skill is a
+standalone CLI that reuses that proven JSON-RPC + HTTP Basic auth pattern and
+extends it with management operations, an identity-tier JSON cache, and
+`--confirm` gating on destructive actions.
+
+The build proceeded in phases.
+First, the core skill (client `gz_client.py`,
+CLI `gz.py`, `SKILL.md`, `references/api-reference.md`) covering read commands
+(status, companies, endpoints, sweep, policies, packages, quarantine,
+inventory) and management (create-package, install-links, scan, move,
+make-group, gated deletes). The API method specs were verified via web research
+plus live read-only probes against the tenant rather than guessing signatures.
+Next, security hardening from a code review: API key loaded from the vault at
+runtime only (never on disk/logs/argv/cache), error-body truncation, and `raw`
+destructive-method gating. Then EDR/incident-response commands from the
+`incidents` module (blocklist, isolate/unisolate, blocklist-add/remove).
+
+A live all-clients security sweep was run, which exposed a real bug: the
+all-clients path passed the companies *container* ID to `getEndpointsList`,
+which the API rejects. Fixed by iterating each company individually (mirroring
+the original server service). The corrected sweep covered 303 endpoints across
+40 active companies: 1 infected (BOOKKEEPER at Reliant Well Drilling), 201 with
+outdated signatures, 6 with outdated agents, 183 stale (>14 days). Most of the
+"stale/outdated" volume traced to Glaz-Tech Industries (145 endpoints, ~78%
+stale) which appears to be dead inventory never removed from GravityZone.
+
+A systematic bug-hunting pass followed, backed by a new 29-check read-only
+self-test harness (`selftest.py`). It found and fixed: the `detection_active`
+field was misleadingly named and inverted (renamed `threat_detected`, now
+correctly meaning "active threat, tracks with infected"); the `status` table
+rendered nested dicts as unreadable one-line blobs (added a sectioned
+renderer); and the `quarantine` command 404'd due to two bugs — wrong module
+path (needs service-scoped `quarantine/computers`) and wrong parameter
+(`companyId`, not `parentId`). All 29 self-tests pass. Work was committed and
+synced.
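The all-clients fix described in the summary (query each company by its own ID instead of passing the companies container to `getEndpointsList`) can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in `gz_client.py`: `list_companies` and `list_endpoints` are hypothetical callables standing in for the real client calls, the endpoint record shape (`name`, `infected`) is assumed, and treating "active" as "has at least one endpoint" is a guess.

```python
from typing import Callable, Dict, Iterable


def sweep_all_clients(
    list_companies: Callable[[], Iterable[Dict]],
    list_endpoints: Callable[[str], Iterable[Dict]],
) -> Dict:
    """Aggregate posture company by company instead of querying the container.

    Both callables are hypothetical stand-ins for the real client methods:
    list_companies() yields {"id", "name"} dicts and list_endpoints(company_id)
    yields endpoint dicts with an assumed boolean "infected" field.
    """
    report = {"companies": 0, "endpoints": 0, "infected": []}
    for company in list_companies():
        # One call per company ID; the container ID is never used as parentId.
        endpoints = list(list_endpoints(company["id"]))
        if not endpoints:
            # Assumption: a company with no endpoints is not counted as active.
            continue
        report["companies"] += 1
        report["endpoints"] += len(endpoints)
        for endpoint in endpoints:
            if endpoint.get("infected"):
                report["infected"].append(f'{endpoint["name"]} @ {company["name"]}')
    return report


if __name__ == "__main__":
    # Made-up sample data, purely to exercise the loop.
    companies = [{"id": "c1", "name": "Alpha"}, {"id": "c2", "name": "Beta"}]
    endpoints = {"c1": [{"name": "PC-1", "infected": True}, {"name": "PC-2"}], "c2": []}
    print(sweep_all_clients(lambda: companies, lambda cid: endpoints[cid]))
```

Passing the two lookups in as callables keeps the aggregation testable without touching the live tenant, which fits the read-only self-test approach the log describes.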
+
+## Key Decisions
+
+- Built the skill standalone rather than extending `api/services/gravityzone_service.py`, reusing its JSON-RPC/auth pattern but keeping the skill self-contained and CLI-driven.
+- Cache only the identity/structure tier (company/endpoint/policy id<->name maps, package list) with a 24h TTL; never cache volatile status (infected/last-seen/signature freshness) — those always pull live. A stale "all clean" is worse than a slow truth.
+- Gate all destructive actions (delete-*, isolate, blocklist-add/remove) behind `--confirm`; `raw` also refuses destructive method names without it. UNVERIFIED API methods are reachable only via `raw`, never exposed as convenience subcommands.
+- Verified API method signatures via live read-only probes before wiring them, rather than trusting docs alone — this is how the quarantine `companyId`/path issue and the `incidents` license-gating were discovered.
+- Stored `CLAUDETOOLS_ROOT` in `.claude/settings.local.json` (gitignored, per-machine) rather than the shared `settings.json`, so it doesn't break other fleet machines.
+- Scoped v1 to "full management" but deferred the GuruRMM push-deploy and the GravityZone Push webhook to phase 2.
+
+## Problems Encountered
+
+- All-clients sweep crashed (`Invalid value for 'parentId'`): the no-`--company` path passed the companies container to `getEndpointsList`. Fixed by adding `security_sweep_all_clients()` that iterates each company.
+- `detection_active` field was inverted/mislabeled, producing a false "302/303 detection off" reading. Root cause: `malwareStatus.detection` means "threat active now" (True=bad), not "engine on." Renamed to `threat_detected` with corrected semantics.
+- `status` table mode dumped nested `apiKey`/`license` dicts as giant single lines. Added `_print_status` sectioned renderer; `--json` output unchanged.
+- `quarantine` returned HTTP 404.
+  Two bugs: bare `quarantine` module path 404s (needs `quarantine/computers`), and the param is `companyId` not `parentId`. Both fixed; ACG returns 2 real items.
+- Process error: marked the quarantine task "completed" before verifying; the self-test caught it still failing. Reopened, probed the real signature, fixed, re-verified. Lesson: verify before closing.
+- `incidents.getIncidentsList` returns "Method not found" on this tenant — EDR/incidents license feature is off (consistent with `managePatchManagement`/`managePHASR` being off). Blocklist (same module) works; network isolation likely also needs EDR licensing enabled. Not a code bug.
+- Empty `CLAUDETOOLS_ROOT` in the Bash tool's non-interactive shell caused early vault.sh path failures. Resolved by adding the env var to `settings.local.json` and inline-setting it during the session.
+
+## Configuration Changes
+
+Created:
+- `.claude/skills/bitdefender/SKILL.md`
+- `.claude/skills/bitdefender/scripts/gz_client.py`
+- `.claude/skills/bitdefender/scripts/gz.py`
+- `.claude/skills/bitdefender/scripts/selftest.py`
+- `.claude/skills/bitdefender/references/api-reference.md`
+
+Modified:
+- `.gitignore` — added `.claude/skills/bitdefender/.cache/` and `.claude/scheduled_tasks.lock`
+- `.claude/settings.local.json` — added `env.CLAUDETOOLS_ROOT = "C:/claudetools"` (gitignored, per-machine)
+
+Runtime (gitignored, not committed):
+- `.claude/skills/bitdefender/.cache/inventory.json` — identity cache (55 companies, 532 endpoints, 9 policies, 100 packages)
+
+## Credentials & Secrets
+
+- GravityZone Public API key: vault entry `msp-tools/gravityzone.sops.yaml`, field `credentials.api_key`. Auth = HTTP Basic, key as username + empty password. Loaded at runtime only; never written to disk/logs/cache. Test override: `GRAVITYZONE_API_KEY` env var.
+- No new secrets created this session.
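The auth pattern noted above (HTTP Basic, API key as username, empty password, JSON-RPC body) might look something like the sketch below. It only builds the request and does not send it. The function name `build_request` and the `request_id` default are hypothetical. The base URL, the `network` module, the `getEndpointsList` method, and the `GRAVITYZONE_API_KEY` override are taken from this log. The JSON-RPC 2.0 envelope is assumed to match what the real client sends.

```python
import base64
import json
import os
import urllib.request

# API base URL as recorded in this session log.
BASE_URL = "https://cloud.gravityzone.bitdefender.com/api/v1.0/jsonrpc"


def build_request(api_key: str, module: str, method: str, params: dict,
                  request_id: str = "1") -> urllib.request.Request:
    """Build (but do not send) a JSON-RPC 2.0 POST with HTTP Basic auth."""
    # Basic auth with the key as username and an empty password means the
    # credential string is "<key>:" before base64 encoding.
    token = base64.b64encode(f"{api_key}:".encode("utf-8")).decode("ascii")
    body = {"jsonrpc": "2.0", "method": method, "params": params, "id": request_id}
    return urllib.request.Request(
        url=f"{BASE_URL}/{module.strip('/')}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Authorization": f"Basic {token}", "Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    # The key comes from the environment here; "example-key" is a placeholder.
    key = os.environ.get("GRAVITYZONE_API_KEY", "example-key")
    req = build_request(key, "network", "getEndpointsList", {"parentId": "<company-id>"})
    print(req.full_url)  # print only the URL, never the key or the auth header
```

Keeping request construction separate from sending makes it easy to check the header and body shape in a self-test without a network call and without logging the key.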
+
+## Infrastructure & Servers
+
+- GravityZone Cloud Public API base: `https://cloud.gravityzone.bitdefender.com/api/v1.0/jsonrpc`
+- Module path form: `/` (e.g. `/network`, `/packages`, `/incidents`). Quarantine requires service-scoped `/quarantine/computers`.
+- ACG tenant IDs: `ACG_ROOT_COMPANY_ID = 5c4280716c0318f3478b456a`, `ACG_COMPANIES_CONTAINER_ID = 5c4280716c0318f3478b456e`. In `getNetworkInventoryItems`, `type == 1` = company node.
+- Tenant scale: 55 client companies, ~532 managed endpoints, 128 used license slots, key expires 2030-02-17. Enabled API modules: companies, licensing, packages, network, integrations, policies, maintenancewindows, reports, accounts, incidents, push, quarantine, phasr, patchmanagement. License features OFF: managePatchManagement, managePHASR, EDR/incidents listing.
+
+## Commands & Outputs
+
+- `py .claude/skills/bitdefender/scripts/gz.py status` — license/slots/enabled modules.
+- `py .../gz.py sweep [--company ]` — live posture; no `--company` sweeps all clients.
+- `py .../gz.py quarantine --company ` — fixed; ACG returned 2 items (`Gen:Heur.Ransom.HiddenTears.1` on ACG-DC16 — assessed by Howard as a Datto program false positive, to review later).
+- `py .../gz.py inventory --refresh` — rebuild identity cache.
+- `py .../scripts/selftest.py` -> `29/29 passed, 0 failed`.
+- Sweep result: 303 endpoints / 40 active companies; infected=1 (BOOKKEEPER @ Reliant Well Drilling), signature_outdated=201, product_outdated=6, stale>14d=183. Glaz-Tech Industries dominates stale/outdated (145 eps, ~78% stale — likely dead inventory).
+- Live API errors observed (correct handling, exit 1): `Invalid value for 'parentId' parameter` (bad company), `Invalid value for 'endpointId' parameter` (bad endpoint), `Method not found` (incidents.getIncidentsList — license gated).
+
+## Pending / Incomplete Tasks
+
+- ACG-DC16 quarantine entry (`Gen:Heur.Ransom.HiddenTears.1`) — believed to be a Datto program false positive; Howard to review further.
+  Not remediated.
+- Glaz-Tech Industries dead-inventory cleanup — separate genuinely-offline-but-live machines from decommissioned records (inflating counts + burning license seats). Not started.
+- EDR network isolation (`isolate`/`unisolate`) is wired but untested live; likely needs EDR/XDR licensing enabled on the tenant (a Bitdefender toggle). Mike decision.
+- UNVERIFIED methods (`assignPolicy`, uninstall/reconfigure, quarantine restore/remove) remain `raw`-only until signatures confirmed live.
+- Phase 2 (deferred): GuruRMM push-deploy of installer links; GravityZone Push webhook to keep cache event-fresh instead of polling.
+
+## Reference Information
+
+- Skill: `.claude/skills/bitdefender/` — `SKILL.md`, `scripts/gz_client.py`, `scripts/gz.py`, `scripts/selftest.py`, `references/api-reference.md`
+- Existing server-side client (read-only, not modified): `api/services/gravityzone_service.py`, `api/routers/gravityzone.py`, `api/schemas/gravityzone.py`
+- GravityZone Public API docs: https://www.bitdefender.com/business/support/en/77209-125277-public-api.html
+- Push/SIEM (phase 2): https://www.bitdefender.com/business/support/en/77209-158570-sumo-logic.html ; methods `setPushEventSettings`, `getPushEventSettings`, `sendTestPushEvent`
+- Commits: `bc0fb89` (initial skill), `ba7e5ed` (raw gating + --json fix), `a8b5a56` (sweep/quarantine/EDR/self-test fixes)
+- Vault: `msp-tools/gravityzone.sops.yaml`
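The caching decision recorded under Key Decisions (identity tier only, 24h TTL, volatile status never cached) could be sketched along these lines. Everything here is illustrative rather than the skill's actual cache code: the `saved_at` and `identity` layout, the function names, and the volatile key names are assumptions, and only the 24h TTL and the identity-versus-volatile split come from the log.

```python
import json
import tempfile
import time
from pathlib import Path
from typing import Optional

TTL_SECONDS = 24 * 60 * 60  # 24h TTL, per the Key Decisions section
# Assumed names for volatile fields that must never reach the cache file.
VOLATILE_KEYS = {"infected", "last_seen", "signature_outdated"}


def save_identity(path: Path, identity: dict, now: Optional[float] = None) -> None:
    """Write identity data with a timestamp, dropping any volatile fields."""
    clean = {k: v for k, v in identity.items() if k not in VOLATILE_KEYS}
    payload = {"saved_at": time.time() if now is None else now, "identity": clean}
    path.write_text(json.dumps(payload), encoding="utf-8")


def load_identity(path: Path, now: Optional[float] = None) -> Optional[dict]:
    """Return cached identity data, or None if missing, unreadable, or expired."""
    try:
        payload = json.loads(path.read_text(encoding="utf-8"))
    except (OSError, ValueError):
        return None
    if not isinstance(payload, dict):
        return None
    age = (time.time() if now is None else now) - payload.get("saved_at", 0)
    if age < 0 or age > TTL_SECONDS:
        return None  # expired or clock-skewed entries force a live refresh
    return payload.get("identity")


if __name__ == "__main__":
    cache_file = Path(tempfile.mkdtemp()) / "inventory.json"
    save_identity(cache_file, {"companies": {"c1": "Alpha"}, "infected": 1})
    print(load_identity(cache_file))
```

Returning `None` on any doubt (missing file, bad JSON, expired entry) pushes the caller back to a live query, which matches the log's stated preference for a slow truth over a stale "all clean".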