sync: auto-sync from HOWARD-HOME at 2026-05-30 07:29:45
Author: Howard Enos Machine: HOWARD-HOME Timestamp: 2026-05-30 07:29:45
122
session-logs/2026-05-30-howard-bitdefender.md
Normal file
@@ -0,0 +1,122 @@
# Session Log — 2026-05-30 — Bitdefender GravityZone Skill (Howard)

## User
- **User:** Howard Enos (howard)
- **Machine:** Howard-Home
- **Role:** tech

## Session Summary

Built a new `/bitdefender` Claude Code skill that drives the Arizona Computer
Guru GravityZone Cloud partner tenant through the JSON-RPC Public API. The work
started from the observation that the web console cannot be accessed
programmatically, but a partner-level API key already existed in the SOPS vault
(`msp-tools/gravityzone.sops.yaml`) and a read-only client already lived in the
server code at `api/services/gravityzone_service.py`. The new skill is a
standalone CLI that reuses that proven JSON-RPC + HTTP Basic auth pattern and
extends it with management operations, an identity-tier JSON cache, and
`--confirm` gating on destructive actions.

The build proceeded in phases. First came the core skill (client `gz_client.py`,
CLI `gz.py`, `SKILL.md`, `references/api-reference.md`), covering read commands
(status, companies, endpoints, sweep, policies, packages, quarantine,
inventory) and management commands (create-package, install-links, scan, move,
make-group, gated deletes). The API method specs were verified via web research
plus live read-only probes against the tenant rather than by guessing
signatures. Next came security hardening from a code review: the API key is
loaded from the vault at runtime only (never on disk/logs/argv/cache), error
bodies are truncated, and `raw` gates destructive methods. Then came
EDR/incident-response commands from the `incidents` module (blocklist,
isolate/unisolate, blocklist-add/remove).

A live all-clients security sweep was run, which exposed a real bug: the
all-clients path passed the companies *container* ID to `getEndpointsList`,
which the API rejects. Fixed by iterating each company individually (mirroring
the original server service). The corrected sweep covered 303 endpoints across
40 active companies: 1 infected (BOOKKEEPER at Reliant Well Drilling), 201 with
outdated signatures, 6 with outdated agents, 183 stale (>14 days). Most of the
"stale/outdated" volume traced to Glaz-Tech Industries (145 endpoints, ~78%
stale), which appears to be dead inventory never removed from GravityZone.

A systematic bug-hunting pass followed, backed by a new 29-check read-only
self-test harness (`selftest.py`). It found and fixed three issues: the
`detection_active` field was misleadingly named and inverted (renamed
`threat_detected`, now correctly meaning "active threat, tracks with
infected"); the `status` table rendered nested dicts as unreadable one-line
blobs (added a sectioned renderer); and the `quarantine` command 404'd due to
two bugs: a wrong module path (needs service-scoped `quarantine/computers`) and
a wrong parameter (`companyId`, not `parentId`). All 29 self-tests pass. Work
was committed and synced.
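
The all-clients fix described above can be sketched roughly as follows. This is an illustration only, not the code in `gz_client.py`: `sweep_each_company`, the `fetch_endpoints` callable (a stand-in for a per-company `getEndpointsList` call), and the per-endpoint flag names are all hypothetical.

```python
from typing import Callable, Dict, Iterable, List

# Hypothetical per-endpoint flags; the counter names mirror the sweep summary.
FLAGS = ("infected", "signature_outdated", "product_outdated")


def sweep_each_company(
    company_ids: Iterable[str],
    fetch_endpoints: Callable[[str], List[dict]],
) -> Dict[str, int]:
    """Query one company at a time and aggregate posture counters.

    The container ID is never passed to the endpoint listing; only
    individual company IDs are.
    """
    totals = {"endpoints": 0, **{flag: 0 for flag in FLAGS}}
    for company_id in company_ids:
        for endpoint in fetch_endpoints(company_id):
            totals["endpoints"] += 1
            for flag in FLAGS:
                if endpoint.get(flag):
                    totals[flag] += 1
    return totals


if __name__ == "__main__":
    # Fake data standing in for live API responses.
    fake = {"company-a": [{"infected": True}], "company-b": [{}]}
    print(sweep_each_company(fake, lambda cid: fake[cid]))
```

Passing the fetcher in as a callable keeps the aggregation testable without touching the live tenant.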

## Key Decisions

- Built the skill standalone rather than extending `api/services/gravityzone_service.py`, reusing its JSON-RPC/auth pattern but keeping the skill self-contained and CLI-driven.
- Cache only the identity/structure tier (company/endpoint/policy id<->name maps, package list) with a 24h TTL; never cache volatile status (infected/last-seen/signature freshness), which is always pulled live. A stale "all clean" is worse than a slow truth.
- Gate all destructive actions (delete-*, isolate, blocklist-add/remove) behind `--confirm`; `raw` also refuses destructive method names without it. UNVERIFIED API methods are reachable only via `raw`, never exposed as convenience subcommands.
- Verified API method signatures via live read-only probes before wiring them, rather than trusting docs alone. This is how the quarantine `companyId`/path issue and the `incidents` license-gating were discovered.
- Stored `CLAUDETOOLS_ROOT` in `.claude/settings.local.json` (gitignored, per-machine) rather than the shared `settings.json`, so it doesn't break other fleet machines.
- Scoped v1 to "full management" but deferred the GuruRMM push-deploy and the GravityZone Push webhook to phase 2.
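
A minimal sketch of the `--confirm` gate from the decisions above. The function names and the `delete-package` action name are hypothetical, and the matching rule (a `delete-` prefix plus a fixed set) is an assumption; the real CLI may decide differently.

```python
from typing import Callable, TypeVar

T = TypeVar("T")

# Actions named in the decision list; anything starting with "delete-" is
# also treated as destructive (assumed matching rule).
GATED_ACTIONS = {"isolate", "blocklist-add", "blocklist-remove"}


def needs_confirm(action: str) -> bool:
    """Return True when the action must not run without --confirm."""
    return action.startswith("delete-") or action in GATED_ACTIONS


def run_action(action: str, confirmed: bool, handler: Callable[[], T]) -> T:
    """Run the handler only if the action is safe or explicitly confirmed."""
    if needs_confirm(action) and not confirmed:
        raise PermissionError(f"refusing '{action}' without --confirm")
    return handler()
```

Checking the gate before the handler is invoked means a forgotten flag fails before any request is built.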

## Problems Encountered

- All-clients sweep crashed (`Invalid value for 'parentId'`): the no-`--company` path passed the companies container to `getEndpointsList`. Fixed by adding `security_sweep_all_clients()` that iterates each company.
- `detection_active` field was inverted/mislabeled, producing a false "302/303 detection off" reading. Root cause: `malwareStatus.detection` means "threat active now" (True=bad), not "engine on." Renamed to `threat_detected` with corrected semantics.
- `status` table mode dumped nested `apiKey`/`license` dicts as giant single lines. Added `_print_status` sectioned renderer; `--json` output unchanged.
- `quarantine` returned HTTP 404. Two bugs: bare `quarantine` module path 404s (needs `quarantine/computers`), and the param is `companyId` not `parentId`. Both fixed; ACG returns 2 real items.
- Process error: marked the quarantine task "completed" before verifying; the self-test caught it still failing. Reopened, probed the real signature, fixed, re-verified. Lesson: verify before closing.
- `incidents.getIncidentsList` returns "Method not found" on this tenant: the EDR/incidents license feature is off (consistent with `managePatchManagement`/`managePHASR` being off). Blocklist (same module) works; network isolation likely also needs EDR licensing enabled. Not a code bug.
- Empty `CLAUDETOOLS_ROOT` in the Bash tool's non-interactive shell caused early vault.sh path failures. Resolved by adding the env var to `settings.local.json` and inline-setting it during the session.
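
The corrected `threat_detected` semantics can be illustrated with a short sketch. Only the `malwareStatus.detection` field and its meaning come from the notes above; the function itself and the shape of the endpoint dict are assumptions.

```python
def threat_detected(endpoint: dict) -> bool:
    """True means a threat is active right now (bad), not that the engine is on.

    A missing or null `malwareStatus` is treated as "no active threat" here,
    which is an assumption rather than documented API behaviour.
    """
    malware_status = endpoint.get("malwareStatus") or {}
    return bool(malware_status.get("detection"))
```

Read this way, a healthy fleet shows almost every endpoint as `False`, which is what the earlier "302/303 detection off" reading was actually reporting.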

## Configuration Changes

Created:
- `.claude/skills/bitdefender/SKILL.md`
- `.claude/skills/bitdefender/scripts/gz_client.py`
- `.claude/skills/bitdefender/scripts/gz.py`
- `.claude/skills/bitdefender/scripts/selftest.py`
- `.claude/skills/bitdefender/references/api-reference.md`

Modified:
- `.gitignore`: added `.claude/skills/bitdefender/.cache/` and `.claude/scheduled_tasks.lock`
- `.claude/settings.local.json`: added `env.CLAUDETOOLS_ROOT = "C:/claudetools"` (gitignored, per-machine)

Runtime (gitignored, not committed):
- `.claude/skills/bitdefender/.cache/inventory.json`: identity cache (55 companies, 532 endpoints, 9 policies, 100 packages)
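
One way a 24h-TTL identity cache like `inventory.json` could work is sketched below. The on-disk layout (`saved_at` plus `data`) and both function names are hypothetical; the actual cache format is not shown in this log.

```python
import json
import time
from pathlib import Path
from typing import Optional

TTL_SECONDS = 24 * 60 * 60  # 24h TTL for the identity tier


def save_identity_cache(path: Path, data: dict, now: Optional[float] = None) -> None:
    """Write identity data with a timestamp so readers can judge freshness."""
    path.parent.mkdir(parents=True, exist_ok=True)
    stamped = {"saved_at": time.time() if now is None else now, "data": data}
    path.write_text(json.dumps(stamped), encoding="utf-8")


def load_identity_cache(path: Path, now: Optional[float] = None) -> Optional[dict]:
    """Return cached data, or None if missing, unreadable, or older than the TTL."""
    try:
        stamped = json.loads(path.read_text(encoding="utf-8"))
    except (OSError, ValueError):
        return None
    if not isinstance(stamped, dict):
        return None
    age = (time.time() if now is None else now) - stamped.get("saved_at", 0)
    return stamped.get("data") if 0 <= age <= TTL_SECONDS else None
```

Returning `None` on any doubt pushes the caller back to a live rebuild, in line with the rule that volatile status is never served from cache.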

## Credentials & Secrets

- GravityZone Public API key: vault entry `msp-tools/gravityzone.sops.yaml`, field `credentials.api_key`. Auth = HTTP Basic, key as username + empty password. Loaded at runtime only; never written to disk/logs/cache. Test override: `GRAVITYZONE_API_KEY` env var.
- No new secrets created this session.
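
The auth scheme listed above (HTTP Basic with the key as username and an empty password) reduces to a single header. A minimal sketch, with `basic_auth_header` as a hypothetical helper name and a placeholder key:

```python
import base64


def basic_auth_header(api_key: str) -> str:
    """Build the Authorization header value for 'key as username, empty password'."""
    # Basic auth encodes "username:password"; the password is empty, so the
    # encoded credential is the key followed by a colon.
    token = base64.b64encode(f"{api_key}:".encode("utf-8")).decode("ascii")
    return f"Basic {token}"


if __name__ == "__main__":
    # Placeholder key for illustration only; never print a real key.
    print(basic_auth_header("abc"))  # Basic YWJjOg==
```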

## Infrastructure & Servers

- GravityZone Cloud Public API base: `https://cloud.gravityzone.bitdefender.com/api/v1.0/jsonrpc`
- Module path form: `<base>/<module>` (e.g. `/network`, `/packages`, `/incidents`). Quarantine requires service-scoped `/quarantine/computers`.
- ACG tenant IDs: `ACG_ROOT_COMPANY_ID = 5c4280716c0318f3478b456a`, `ACG_COMPANIES_CONTAINER_ID = 5c4280716c0318f3478b456e`. In `getNetworkInventoryItems`, `type == 1` = company node.
- Tenant scale: 55 client companies, ~532 managed endpoints, 128 used license slots, key expires 2030-02-17. Enabled API modules: companies, licensing, packages, network, integrations, policies, maintenancewindows, reports, accounts, incidents, push, quarantine, phasr, patchmanagement. License features OFF: managePatchManagement, managePHASR, EDR/incidents listing.
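
The module path form above pairs with a standard JSON-RPC 2.0 envelope. A minimal sketch, assuming hypothetical helper names `module_url` and `build_request`; it builds the URL and body only and sends nothing.

```python
import uuid
from typing import Optional

BASE_URL = "https://cloud.gravityzone.bitdefender.com/api/v1.0/jsonrpc"


def module_url(module: str) -> str:
    """Join the base with a module path such as 'network' or 'quarantine/computers'."""
    return f"{BASE_URL}/{module.strip('/')}"


def build_request(method: str, params: Optional[dict] = None,
                  request_id: Optional[str] = None) -> dict:
    """Build a JSON-RPC 2.0 request body; a random id is used when none is given."""
    return {
        "jsonrpc": "2.0",
        "method": method,
        "params": params or {},
        "id": request_id or str(uuid.uuid4()),
    }
```

For example, an endpoint listing for one company would be posted to `module_url("network")` with `build_request("getEndpointsList", {"parentId": company_id})`.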

## Commands & Outputs

- `py .claude/skills/bitdefender/scripts/gz.py status`: license/slots/enabled modules.
- `py .../gz.py sweep [--company <id>]`: live posture; no `--company` sweeps all clients.
- `py .../gz.py quarantine --company <id>`: fixed; ACG returned 2 items (`Gen:Heur.Ransom.HiddenTears.1` on ACG-DC16, assessed by Howard as a Datto program false positive, to review later).
- `py .../gz.py inventory --refresh`: rebuild identity cache.
- `py .../scripts/selftest.py` -> `29/29 passed, 0 failed`.
- Sweep result: 303 endpoints / 40 active companies; infected=1 (BOOKKEEPER @ Reliant Well Drilling), signature_outdated=201, product_outdated=6, stale>14d=183. Glaz-Tech Industries dominates stale/outdated (145 eps, ~78% stale; likely dead inventory).
- Live API errors observed (correct handling, exit 1): `Invalid value for 'parentId' parameter` (bad company), `Invalid value for 'endpointId' parameter` (bad endpoint), `Method not found` (incidents.getIncidentsList, license gated).
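
Turning errors like those above into a clean failure could look roughly like this. The `error`/`result` keys follow JSON-RPC 2.0; reading a longer explanation from `error.data.details` is an assumption about this API's response shape, and `rpc_result` is a hypothetical name.

```python
from typing import Any


def rpc_result(response: dict) -> Any:
    """Return the JSON-RPC result, or raise with the most specific error text."""
    error = response.get("error")
    if error:
        data = error.get("data")
        # Assumed shape: a human-readable explanation under data.details.
        detail = data.get("details") if isinstance(data, dict) else None
        raise RuntimeError(detail or error.get("message") or "unknown JSON-RPC error")
    return response.get("result")
```

A CLI wrapper can catch the `RuntimeError`, print the message, and exit with status 1.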

## Pending / Incomplete Tasks

- ACG-DC16 quarantine entry (`Gen:Heur.Ransom.HiddenTears.1`): believed to be a Datto program false positive; Howard to review further. Not remediated.
- Glaz-Tech Industries dead-inventory cleanup: separate genuinely-offline-but-live machines from decommissioned records (inflating counts + burning license seats). Not started.
- EDR network isolation (`isolate`/`unisolate`) is wired but untested live; likely needs EDR/XDR licensing enabled on the tenant (a Bitdefender toggle). Mike decision.
- UNVERIFIED methods (`assignPolicy`, uninstall/reconfigure, quarantine restore/remove) remain `raw`-only until signatures confirmed live.
- Phase 2 (deferred): GuruRMM push-deploy of installer links; GravityZone Push webhook to keep cache event-fresh instead of polling.

## Reference Information

- Skill: `.claude/skills/bitdefender/` (`SKILL.md`, `scripts/gz_client.py`, `scripts/gz.py`, `scripts/selftest.py`, `references/api-reference.md`)
- Existing server-side client (read-only, not modified): `api/services/gravityzone_service.py`, `api/routers/gravityzone.py`, `api/schemas/gravityzone.py`
- GravityZone Public API docs: https://www.bitdefender.com/business/support/en/77209-125277-public-api.html
- Push/SIEM (phase 2): https://www.bitdefender.com/business/support/en/77209-158570-sumo-logic.html ; methods `setPushEventSettings`, `getPushEventSettings`, `sendTestPushEvent`
- Commits: `bc0fb89` (initial skill), `ba7e5ed` (raw gating + --json fix), `a8b5a56` (sweep/quarantine/EDR/self-test fixes)
- Vault: `msp-tools/gravityzone.sops.yaml`