Commit Graph

707 Commits

Author SHA1 Message Date
8152476ee4 remediation-tool: document the 365 app suite + build consent-audit
Root-caused the recurring '365 suite isn't documented' pain: the apps are fine (tiered by
privilege) but per-tenant consent is NOT uniform and there was no way to see a tenant's
actual grant state. VWP had the Tenant Admin app but no SharePoint app-only role -> silent
401s until this session.

- references/app-suite.md: authoritative, live-verified map of every app, App ID, and
  actually-granted permission per tier; the consent-drift problem + both fix methods
  (adminconsent URL, direct appRoleAssignment grant).
- scripts/consent-audit.sh: audits a tenant (or --all) vs the baseline, grades
  GREEN/AMBER/RED, prints the exact fix per gap. Extends the assign-exchange-role --verify
  pattern to Graph scopes + SharePoint role + EXO role. Verified: BirthBio GREEN, VWP/Cascades
  AMBER (caught real drift - both missing grants).
- SKILL.md: run consent-audit FIRST on any tenant task. Memory + errorlog correction.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-07-02 15:15:08 -07:00
Winter Williams
ac23f17e23 sync: auto-sync from GURU-BEAST-ROG at 2026-07-02 10:55:41
Author: Mike Swanson
Machine: GURU-BEAST-ROG
Timestamp: 2026-07-02 10:55:41
2026-07-02 10:57:33 -07:00
26f47fdd10 sync: auto-sync from HOWARD-HOME at 2026-07-02 09:08:36
Author: Howard Enos
Machine: HOWARD-HOME
Timestamp: 2026-07-02 09:08:36
2026-07-02 09:10:02 -07:00
b07613c127 sync: auto-sync from GURU-5070 at 2026-07-01 20:07:01
Author: Mike Swanson
Machine: GURU-5070
Timestamp: 2026-07-01 20:07:01
2026-07-01 20:07:42 -07:00
2937b00ebf sync: auto-sync from GURU-5070 at 2026-07-01 15:49:56
Author: Mike Swanson
Machine: GURU-5070
Timestamp: 2026-07-01 15:49:56
2026-07-01 15:50:54 -07:00
c3aeef60fb sync: auto-sync from GURU-5070 at 2026-07-01 15:06:42
Author: Mike Swanson
Machine: GURU-5070
Timestamp: 2026-07-01 15:06:42
2026-07-01 15:07:39 -07:00
9e78a153f3 sync: auto-sync from HOWARD-HOME at 2026-07-01 13:22:23
Author: Howard Enos
Machine: HOWARD-HOME
Timestamp: 2026-07-01 13:22:23
2026-07-01 13:24:58 -07:00
af8a3de00e sync: auto-sync from GURU-5070 at 2026-07-01 13:06:10
Author: Mike Swanson
Machine: GURU-5070
Timestamp: 2026-07-01 13:06:10
2026-07-01 13:07:50 -07:00
6f7f939a62 sync: auto-sync from GURU-5070 at 2026-07-01 09:32:17
Author: Mike Swanson
Machine: GURU-5070
Timestamp: 2026-07-01 09:32:17
2026-07-01 09:33:09 -07:00
01613697c6 sync: auto-sync from GURU-5070 at 2026-06-30 17:21:06
Author: Mike Swanson
Machine: GURU-5070
Timestamp: 2026-06-30 17:21:06
2026-06-30 17:21:47 -07:00
51335db124 sync: auto-sync from HOWARD-HOME at 2026-06-30 11:27:16
Author: Howard Enos
Machine: HOWARD-HOME
Timestamp: 2026-06-30 11:27:16
2026-06-30 11:27:47 -07:00
5e92c33b73 sync: auto-sync from HOWARD-HOME at 2026-06-30 10:37:25
Author: Howard Enos
Machine: HOWARD-HOME
Timestamp: 2026-06-30 10:37:25
2026-06-30 10:37:58 -07:00
31f2bdb84f sync: auto-sync from HOWARD-HOME at 2026-06-29 16:55:22
Author: Howard Enos
Machine: HOWARD-HOME
Timestamp: 2026-06-29 16:55:22
2026-06-29 16:55:55 -07:00
9a6e1157a7 sync: auto-sync from GURU-5070 at 2026-06-29 15:30:34
Author: Mike Swanson
Machine: GURU-5070
Timestamp: 2026-06-29 15:30:34
2026-06-29 15:31:35 -07:00
ca37a606c6 harness: add Definition-of-Done skill routing (auto-gate work with matching check-skills)
Skill-first rule now has two halves: route the request to a doing-skill,
then gate the result with the matching check-skill before 'done' --
inferred from the request, not user-named. Adds .claude/SKILL_ROUTING.md
(on-demand request->doing-skill->check-skill map). Enforcement tier A+B
(CORE rule + map; Stop-hook backstop deferred). Calibrate to stakes,
Ollama Tier-0 for cheap passes.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-26 11:57:44 -07:00
5bace24371 sync: auto-sync from HOWARD-HOME at 2026-06-26 11:40:19
Author: Howard Enos
Machine: HOWARD-HOME
Timestamp: 2026-06-26 11:40:19
2026-06-26 11:40:52 -07:00
10a90bb213 sync: auto-sync from HOWARD-HOME at 2026-06-26 08:41:22
Author: Howard Enos
Machine: HOWARD-HOME
Timestamp: 2026-06-26 08:41:22
2026-06-26 08:42:30 -07:00
270e294938 sync: auto-sync from HOWARD-HOME at 2026-06-26 07:19:00
Author: Howard Enos
Machine: HOWARD-HOME
Timestamp: 2026-06-26 07:19:00
2026-06-26 07:20:04 -07:00
79789a8815 sync: auto-sync from GURU-5070 at 2026-06-26 04:15:16
Author: Mike Swanson
Machine: GURU-5070
Timestamp: 2026-06-26 04:15:16
2026-06-26 04:16:39 -07:00
1d99dc93ed sync: auto-sync from HOWARD-HOME at 2026-06-25 23:09:59
Author: Howard Enos
Machine: HOWARD-HOME
Timestamp: 2026-06-25 23:09:59
2026-06-25 23:10:26 -07:00
04b0d12150 sync: auto-sync from HOWARD-HOME at 2026-06-25 21:48:38
Author: Howard Enos
Machine: HOWARD-HOME
Timestamp: 2026-06-25 21:48:38
2026-06-25 21:49:05 -07:00
563ff9e8fa sync: auto-sync from HOWARD-HOME at 2026-06-25 21:21:56
Author: Howard Enos
Machine: HOWARD-HOME
Timestamp: 2026-06-25 21:21:56
2026-06-25 21:23:24 -07:00
730d26437b sync: auto-sync from GURU-5070 at 2026-06-25 21:13:47
Author: Mike Swanson
Machine: GURU-5070
Timestamp: 2026-06-25 21:13:47
2026-06-25 21:15:00 -07:00
cf960d1b2a sync: auto-sync from HOWARD-HOME at 2026-06-25 20:23:53
Author: Howard Enos
Machine: HOWARD-HOME
Timestamp: 2026-06-25 20:23:53
2026-06-25 20:24:23 -07:00
9f5fedda06 memory: RMM Set-Acl/icacls timeout drops stdout (lost password); generate secrets locally
2026-06-25 19:28:11 -07:00
93bd5379e3 sync: auto-sync from HOWARD-HOME at 2026-06-25 15:21:30
Author: Howard Enos
Machine: HOWARD-HOME
Timestamp: 2026-06-25 15:21:30
2026-06-25 15:22:06 -07:00
79bda6fab9 datto-edr: apply code-review fixes (gating + footgun hardening)
- deploy-cmd: require explicit --regkey or --group; never auto-pick an
  arbitrary cross-client registration key (would enroll into wrong org).
- raw: block POST to any */scan endpoint with no non-empty `where`
  (same tenant-wide footgun the scan command guards against).
- main(): catch-all for unexpected exceptions -> [ERROR] + errorlog,
  plus clean KeyboardInterrupt (130).
- isolate: forgiving extension-name match (exact, then substring),
  excludes the paired "Restore" ext; errors on ambiguous match.
- detections: --site -> --target-group; Alert.targetGroupId is a
  scan-target id, not a Location id (distinct from `agents --site`).
- status: relabel "Target groups (sites)" -> "Scan target groups".
- SKILL.md + docstrings updated to match.

Verified: py_compile clean, selftest green (216 agents), guards fire
on no-key/empty-where/no-agent, deploy-cmd --group picks the group's key.
2026-06-25 15:14:18 -07:00
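The "exact, then substring" extension match from 79bda6fab9 could look roughly like the sketch below. It is illustrative only: `resolve_extension`, the `name`/`id` dict keys, and the rule of skipping any extension whose name contains "restore" are assumptions, not the skill's actual code.

```python
from typing import Dict, List


def resolve_extension(name: str, extensions: List[Dict[str, str]]) -> Dict[str, str]:
    """Pick one extension by name: exact match first, then substring.

    Assumes each extension is a dict with "id" and "name" keys (hypothetical shape).
    """
    wanted = name.strip().lower()
    # Assumption: the paired "Restore" extension is recognisable by its name.
    candidates = [e for e in extensions if "restore" not in e["name"].lower()]

    exact = [e for e in candidates if e["name"].lower() == wanted]
    if len(exact) == 1:
        return exact[0]

    # Fall back to substring matching only when there is no exact hit.
    matches = exact or [e for e in candidates if wanted in e["name"].lower()]
    if not matches:
        raise LookupError(f"no extension matches {name!r}")
    if len(matches) > 1:
        names = ", ".join(e["name"] for e in matches)
        raise LookupError(f"{name!r} is ambiguous: {names}")
    return matches[0]
```

Raising on an ambiguous match, rather than taking the first hit, keeps a typo from selecting the wrong extension.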
befd2650c8 fix(bitdefender): fifth-pass - companies lists full fleet, drop unused import
Convergence-pass LOW/NIT cleanup:
- cmd_companies uses list_all_companies() so a >100-company tenant isn't truncated
  in the listing (was page-1 only); matches sweep/inventory.
- removed unused 'field' import from dataclasses.
Deliberately NOT changed: id validation on delete-package/report-delete/blocklist-
remove/quarantine-remove/restore - those ids are not pinned 24-hex format, so
validating could reject valid input; they are --confirm-gated and bad ids match
the expected-error markers (no mislog). 81/81 selftest.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-25 14:05:32 -07:00
3d6cb467bf fix(bitdefender): fourth-pass - urllib reset safety, Retry-After clamp, sweep/install-links id validation
From a third review pass (converging - all MEDIUM/LOW):
- urllib fallback: a post-send reset (RemoteDisconnected/ConnectionReset, which
  urllib wraps in URLError) was misclassified as always-safe 'connect' and could
  retry a non-idempotent write after a server commit. Now only ConnectionRefused/
  DNS (socket.gaierror) -> 'connect'; everything else -> 'timeout' (write-gated).
- _retry_delay clamps a negative numeric Retry-After to 0 (was -> time.sleep(-1) ValueError).
- cmd_sweep + cmd_install_links now validate --company; cmd_company_create validates
  --parent (finished _require_oid consistency - these mislogged as errorlog noise).
- cmd_push_test parses --extra-json before gating (validate->gate order, matches siblings).
- selftest: +sweep/install-links bad-company assertions. 81/81. Units: clamp + reset classification.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-25 13:59:19 -07:00
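A minimal sketch of the failure classification described in 3d6cb467bf, assuming the decision is made from `URLError.reason`. The function names and the `idempotent` flag are hypothetical; only the rule itself (refused connection or DNS failure is 'connect', everything else is 'timeout' and write-gated) comes from the commit message.

```python
import socket
import urllib.error


def classify_urlerror(exc: urllib.error.URLError) -> str:
    """Return 'connect' only when the request provably never reached the server."""
    reason = exc.reason
    if isinstance(reason, (ConnectionRefusedError, socket.gaierror)):
        return "connect"
    # Resets, read timeouts and anything unknown may have happened after the
    # server committed the write, so treat them as ambiguous.
    return "timeout"


def may_retry(kind: str, idempotent: bool) -> bool:
    """'connect' failures are always retryable; ambiguous ones only for idempotent calls."""
    return kind == "connect" or idempotent
```

Defaulting the unknown case to 'timeout' errs on the side of not replaying a write.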
1e80fb24db datto-edr: fix scan to verified Agents/scan endpoint + harden
- Scan now uses POST Agents/scan with AND-wrapped where {and:[{id:[...]}]}
  (the Infocyte targets/{id}/scan routes are dead/404; bare {id:{inq}} returns
  HTTP 412 ambiguous-column). Verified live: single-agent scan -> 'Scanning 1 host'.
- scan/isolate REQUIRE explicit --agent ids; empty list refused (tenant-wide footgun).
- isolate rides Agents/scan with the Host Isolation extension in options.extensions;
  resolves --extension-name -> id via /Extensions.
- New subcommands: tasks, task, cancel, create-group, mint-key.
- deploy-cmd emits full -URL (not -InstanceName; cname 'azcomp4587' trips the
  install script's .com regex and leaves --url empty).
- Docs (SKILL.md + api-reference.md) rewritten to the verified endpoints + footguns.

Lifecycle verified end-to-end on RMM-TEST-MACHINE (create-group/mint-key/install/
register/scan/cancel).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-25 13:55:22 -07:00
778e12d83e fix(bitdefender): third-pass - companies pagination, connect-retry, Retry-After ceiling, ctx mgr
Remaining LOW/NIT items from the second review pass:
- list_all_companies() paginates the company list; sweep-all + refresh_inventory
  no longer truncate a >100-company tenant.
- Pre-send connection failures (httpx ConnectError/ConnectTimeout; urllib URLError
  not wrapping a timeout) are now retried as 'connect' - always safe (no side
  effect) even for non-idempotent writes; ambiguous read-timeouts stay idempotent-gated.
- Explicit Retry-After honored up to RETRY_AFTER_MAX_SECONDS (120s) instead of the
  30s exponential cap, so a server-mandated cooldown isn't cut short.
- GravityZoneClient is now a context manager (__enter__/__exit__ -> close()).
- incident-status/note reject an empty --set-json (rc2), matching account-update/notif.
- selftest: +connect/Retry-After/ctx-mgr unit coverage, incident empty-json assertion. 79/79.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-25 13:52:33 -07:00
a96a15a1d2 sync: auto-sync from HOWARD-HOME at 2026-06-25 13:43:47
Author: Howard Enos
Machine: HOWARD-HOME
Timestamp: 2026-06-25 13:43:47
2026-06-25 13:44:18 -07:00
1852f755ad fix(bitdefender): doc LOWs + sweep pagination + selftest alignment
Batched the audit doc/LOW findings plus the two pagination LOWs:
- Pagination (gz_client): security_sweep and refresh_inventory stopped on a
  'total' field some responses omit, truncating after page 1. Now page until a
  short page (< per_page) - the reliable last-page signal.
- isolate/restore docstrings (gz_client): removed the stale 'v1.2 takes an ARRAY
  endpointIds' lines that contradicted the verified single-endpointId code.
- Cache 'no PII' wording corrected (gz_client header + SKILL.md): cache holds infra
  identifiers (hostnames/FQDNs); no secrets. Dead _require_company_for_sweep removed.
- Doc drift fixed: delete-package is '--id <packageId>' not '--package <name>'
  (SKILL.md + api-reference.md, verified live); module docstring + sweep --company
  help corrected (sweep with no --company fans out to ALL companies).
- selftest aligned to the improved behavior: malformed ids now exit rc2 client-side
  (H3) instead of rc1; gate-refusal 'Would' messages now assert on stderr (they
  moved off stdout so --json isn't polluted). 75/75 pass; live sweep verified.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-25 13:28:38 -07:00
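The "page until a short page" rule from 1852f755ad can be sketched as below. `fetch_page` is a hypothetical callable standing in for whatever the client uses to request one page; nothing here is the real gz_client code.

```python
from typing import Callable, Dict, Iterator, List


def iter_all_pages(
    fetch_page: Callable[[int, int], List[Dict]],
    per_page: int = 100,
) -> Iterator[Dict]:
    """Yield every item, stopping at the first page shorter than per_page.

    Does not rely on a 'total' field, which some responses omit.
    """
    page = 1
    while True:
        items = fetch_page(page, per_page)
        yield from items
        if len(items) < per_page:
            return  # short (or empty) page: this was the last one
        page += 1
```

When the item count is an exact multiple of `per_page`, this costs one extra request that returns an empty page, which is the price of not trusting `total`.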
5b3dd84fb9 synology: fix SSH backend syno* CLI resolution (full pre-test verification)
Found during a full command-surface recheck: every privileged SSH recipe
(shares/users/groups/acl) was broken — sudo secure_path drops /usr/syno/{bin,sbin}
so synoshare/synouser/synogroup/synoacltool were "command not found" (non-sudo
plain recipes worked because the admin login PATH has them).

- Inject SYNO_PATH into priv()/plain(); run priv via `sh -c` so operators work.
- synouser/synogroup use `--enum local` (not the invalid `--list`).
- acl quotes the share path (handles spaces, e.g. "Sandra Fish").
- services repointed to Web API (no synoservice on DSM 7.2; synosystemctl has no list-all).

Verified live: all Web API reads, all SSH reads (acl returns real Windows ACEs),
write path (share create/delete), and every destructive command correctly gated.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-25 13:28:38 -07:00
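One way to picture the PATH injection in 5b3dd84fb9: build the remote command string so the syno* directories are on PATH even when sudo's secure_path strips them. The helper names follow the commit message (`priv`, `plain`, `SYNO_PATH`), but the bodies below are an assumed sketch and the share path in the usage note is made up.

```python
import shlex

# Directories named in the commit message as dropped by sudo's secure_path.
SYNO_PATH = "/usr/syno/bin:/usr/syno/sbin"


def plain(cmd: str) -> str:
    """Prefix a command so the syno* tools resolve in a non-sudo shell."""
    return f"export PATH={SYNO_PATH}:$PATH; {cmd}"


def priv(cmd: str) -> str:
    """Run the same command under sudo via `sh -c`, so shell operators still work."""
    return "sudo sh -c " + shlex.quote(plain(cmd))
```

For example, `priv("synoacltool -get " + shlex.quote("/volume1/Sandra Fish"))` keeps a path with spaces intact through both layers of quoting.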
4d7508382d fix(bitdefender): atomic cache writes + advisory lock on write-through (M3)
Audit fix M3: _write_cache did a non-atomic CACHE_FILE.write_text and the
write-through helpers did unlocked read-modify-write, so a crash mid-write could
truncate inventory.json and two concurrent gz.py runs could lose an update.
- _write_cache now writes a temp file (fsync) then os.replace() - atomic on the
  same filesystem; a reader/crash can never see a partial file, and a failed
  write leaves the prior cache intact and no .tmp residue.
- Added a best-effort cross-platform advisory lock (_cache_lock) around the
  read-modify-write in _cache_add_group/_cache_add_package; steals a stale lock,
  proceeds unlocked on timeout (a lost update is tolerable, a hang is not).
- Dropped the dead cache.setdefault('companies', ...) line in _cache_add_group.
- Verified: compile clean; unit tests for round-trip, lock acquire/release/steal,
  write-through, temp cleanup on failure, and prior-cache survival.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-25 12:54:50 -07:00
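The temp-file, fsync, then `os.replace()` sequence described in 4d7508382d can be sketched as follows. `write_json_atomic` is a hypothetical name and the advisory lock is left out; this only illustrates the atomic-write half.

```python
import json
import os
import tempfile
from pathlib import Path


def write_json_atomic(path: Path, payload: dict) -> None:
    """Replace path with payload so readers see the old or the new file, never a partial one."""
    path.parent.mkdir(parents=True, exist_ok=True)
    # The temp file lives next to the target: os.replace is only atomic
    # within a single filesystem.
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, prefix=path.name + ".", suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(payload, handle)
            handle.flush()
            os.fsync(handle.fileno())  # push the bytes to disk before the rename
        os.replace(tmp_name, path)
    except BaseException:
        # Failed write: drop the temp file and leave the prior cache untouched.
        try:
            os.unlink(tmp_name)
        except FileNotFoundError:
            pass
        raise
```

A failed serialization leaves the previous file in place and no `.tmp` residue, matching the behavior the commit says it verified.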
d4fd71baab sync: auto-sync from HOWARD-HOME at 2026-06-25 12:53:21
Author: Howard Enos
Machine: HOWARD-HOME
Timestamp: 2026-06-25 12:53:21
2026-06-25 12:54:04 -07:00
51751e6473 fix(bitdefender): retry 429/5xx/timeout with backoff + reuse one httpx client
Audit fix H2 (+ M2): the live GravityZone tenant is rate-limited and sweeps fan
out one getManagedEndpointDetails per endpoint across every company, which hit a
real HTTP 429 (errorlog 2026-06-21). _post had zero retry and opened a fresh
httpx.Client (new TLS handshake) per request.
- _post now retries 429/500/502/503/504/timeout up to RETRY_MAX_ATTEMPTS with
  bounded exponential backoff + jitter, honoring Retry-After (numeric or HTTP-date).
  Retry notices go to stderr (don't pollute --json). Terminal errors still raise.
- M2: a single httpx.Client is created lazily and reused (connection pooling),
  closed via client.close() in main()'s finally. Makes the docstring's pooling
  claim true and cuts handshake overhead + 429 pressure during sweeps.
- Verified: compile clean; offline unit tests (persistent 429 -> 4 attempts then
  raise, flaky 503 -> recovers, Retry-After honored); live status read OK.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-25 12:51:08 -07:00
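A rough sketch of the delay calculation behind the retry loop in 51751e6473. The 30s backoff cap and the 120s `RETRY_AFTER_MAX_SECONDS` ceiling are the values quoted elsewhere in this log; the base delay, the full-jitter formula, and both function names are assumptions.

```python
import random
import time
from datetime import timezone
from email.utils import parsedate_to_datetime
from typing import Callable, Optional

BACKOFF_BASE_SECONDS = 1.0       # assumed starting delay
BACKOFF_CAP_SECONDS = 30.0       # exponential cap quoted in the log
RETRY_AFTER_MAX_SECONDS = 120.0  # Retry-After ceiling quoted in the log


def parse_retry_after(value: Optional[str], now: Optional[float] = None) -> Optional[float]:
    """Parse a Retry-After header (seconds or HTTP-date) into non-negative seconds."""
    if not value:
        return None
    try:
        seconds = float(value.strip())
    except ValueError:
        try:
            when = parsedate_to_datetime(value.strip())
        except (TypeError, ValueError):
            return None  # unparseable header: caller falls back to backoff
        if when.tzinfo is None:
            when = when.replace(tzinfo=timezone.utc)
        seconds = when.timestamp() - (time.time() if now is None else now)
    return max(0.0, seconds)  # never hand time.sleep() a negative value


def retry_delay(
    attempt: int,
    retry_after: Optional[str] = None,
    rng: Callable[[], float] = random.random,
) -> float:
    """Delay before retry number `attempt` (0-based)."""
    server = parse_retry_after(retry_after)
    if server is not None:
        return min(server, RETRY_AFTER_MAX_SECONDS)
    ceiling = min(BACKOFF_CAP_SECONDS, BACKOFF_BASE_SECONDS * (2 ** attempt))
    return ceiling * rng()  # full jitter: uniform in [0, ceiling)
```

Passing `rng` in makes the jitter deterministic in tests, for example `retry_delay(3, rng=lambda: 1.0)`.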
d8f0974e0f fix(bitdefender): gate move/scan/create-package/make-group + validate object IDs
Audit cluster C1/C2/H1/H3/M1 on the live GravityZone tenant:
- C1/H1/M1: move, scan, create-package, make-group called the live API with
  no --confirm; added _gated() + a --confirm flag to each (move can change an
  endpoint's inherited policy posture).
- C2: extend raw's destructive-method denylist with moveEndpoints/moveCustomGroup/
  createScanTask/createPackage/createCustomGroup so 'raw' can't bypass the gates.
- H3: add _require_oid() 24-char-hex validation to endpoint/policy/endpoints +
  the gated handlers, so malformed ids no longer hit the tenant or get mislogged
  as functional errors (source of the 2026-06-21 errorlog noise).
- Gate refusals now print to stderr (don't pollute --json). SKILL.md gating list
  updated. Verified: compile clean; gates exit 3, bad ids exit 2, raw denylist hits.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-25 12:47:45 -07:00
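The 24-character hex check from d8f0974e0f might look like this sketch. The name `require_oid` mirrors the `_require_oid()` helper mentioned above, but the signature and message text are assumed; only the format rule and the exit code 2 come from the commit message.

```python
import re
import sys

_OID_RE = re.compile(r"[0-9a-fA-F]{24}")


def require_oid(value: str, flag: str) -> str:
    """Return value if it is a 24-char hex id, otherwise exit 2 before any API call."""
    if not _OID_RE.fullmatch(value):
        # stderr keeps --json output on stdout clean.
        print(f"[ERROR] {flag}: expected a 24-character hex id, got {value!r}", file=sys.stderr)
        raise SystemExit(2)
    return value
```

Rejecting the id client-side keeps a typo from reaching the tenant and from being recorded as a functional error.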
e9ece35c2a sync: auto-sync from HOWARD-HOME at 2026-06-25 12:45:08
Author: Howard Enos
Machine: HOWARD-HOME
Timestamp: 2026-06-25 12:45:08
2026-06-25 12:45:43 -07:00
bd1e84d32c skills: add datto-edr (Datto EDR / Infocyte HUNT) + syncro-rmm memory
New /datto-edr skill — standalone CLI for the Datto EDR REST API (azcomp4587,
rebranded Infocyte HUNT, raw-token LoopBack). Live-verified reads across the whole
MSP fleet: orgs/sites/agents/detections/sweep + deploy-cmd. Scan/isolate gated
behind --confirm (shape-verified, not run against prod). Token vaulted at
msp-tools/datto-edr.sops.yaml.

Also: reference_syncro_rmm_api_gui_only memory (Syncro RMM policy mgmt is
GUI-only) and the guru-rmm submodule pointer bump (Feature 6 EDR research).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-25 12:39:58 -07:00
e61b39b5c8 sync: auto-sync from GURU-5070 at 2026-06-25 12:35:22
Author: Mike Swanson
Machine: GURU-5070
Timestamp: 2026-06-25 12:35:22
2026-06-25 12:37:54 -07:00
0f803c2d9c sync: auto-sync from HOWARD-HOME at 2026-06-25 12:31:56
Author: Howard Enos
Machine: HOWARD-HOME
Timestamp: 2026-06-25 12:31:56
2026-06-25 12:32:31 -07:00
b9d4cfde98 sync: auto-sync from HOWARD-HOME at 2026-06-25 12:30:38
Author: Howard Enos
Machine: HOWARD-HOME
Timestamp: 2026-06-25 12:30:38
2026-06-25 12:31:14 -07:00
974be13f4c synology: code-review hardening + trim over-budget description
Addresses 5 verified findings from /code-review high:
- SynoError carries DSM code + handled flag; call() no longer logs eagerly.
  Top-level handler logs only genuine unhandled failures, so the handled
  FileStation denial + VPN-down connect errors stop polluting errorlog.md
  (was a CLAUDE.md rule violation: don't log handled conditions).
- FileStation-denial detection is numeric (code in 400/407), not substring.
- SSH hint now also fires on the generic `call` path, not just `ls`.
- `services` falls back get->list on 103 for older DSM builds (multi-device).
- BrokenPipe flush moved inside try so small piped output can't leak a traceback.
- Trim SKILL description 755->515 chars (was the longest of 32 skills; self-check
  registry-budget WARN).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-25 12:23:43 -07:00
21ef1f2570 synology: fix --confirm arg position + verify write path live
- Fix argparse: --confirm/--vault were only accepted BEFORE the subcommand, so
  every documented gated-write (e.g. `call X set k=v --confirm`) failed. Moved to
  a shared parent parser (SUPPRESS defaults) -> both flags work in either position.
- Verified the CSRF write path live on cascadesDS: Share create -> verify ->
  delete -> verify gone. Both mutating calls succeeded; device left pristine.
- SKILL.md: write/setter path marked VERIFIED; confirmed share-create signature.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-25 11:56:32 -07:00
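The argparse fix in 21ef1f2570 relies on a shared parent parser whose options default to `argparse.SUPPRESS`, so a subparser that did not see the flag does not overwrite a value set before the subcommand. A minimal sketch, with an assumed `syno` program name and a trimmed-down `call` subcommand:

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Shared options: SUPPRESS means "leave the attribute unset when absent".
    common = argparse.ArgumentParser(add_help=False)
    common.add_argument("--confirm", action="store_true", default=argparse.SUPPRESS)
    common.add_argument("--vault", default=argparse.SUPPRESS)

    parser = argparse.ArgumentParser(prog="syno", parents=[common])
    sub = parser.add_subparsers(dest="command", required=True)

    # The same parent is attached to the subcommand, so both positions work.
    call = sub.add_parser("call", parents=[common])
    call.add_argument("api")
    call.add_argument("method")
    call.add_argument("params", nargs="*")
    return parser


def is_confirmed(args: argparse.Namespace) -> bool:
    # The attribute may be missing entirely, so read it with a default.
    return getattr(args, "confirm", False)
```

Both `syno call X set k=v --confirm` and `syno --confirm call X set k=v` then yield `is_confirmed(args) == True`.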
4a63b583b7 sync: auto-sync from HOWARD-HOME at 2026-06-25 11:42:29
Author: Howard Enos
Machine: HOWARD-HOME
Timestamp: 2026-06-25 11:42:29
2026-06-25 11:42:58 -07:00
fc36f98450 synology: full read-surface sweep + FileStation ls graceful fallback
Exercised every Web API + SSH read command live against cascadesDS.
- All reads OK; `ls <folder>` (FileStation list) is 407-denied for the admin
  account on this box (confirmed on-box as SYSTEM_ADMIN) -> now catches the
  400/407 and prints an SSH file-browse hint. `ls` share-roots still works.
- SSH backend (info/df/run + privileged synowebapi) verified.
- Documented MSYS path-mangling of bare `ls /path` arg on Windows.
- SKILL.md: per-command results; flagged write/setter path as not-yet-live.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-25 11:19:04 -07:00
7173f4dc12 synology: live-verify against cascadesDS, fix services method + pipe handling
First live exercise of the skill against the Cascades DS718+ (DSM 7.2.1, VPN up).
- Fix `services`: SYNO.Core.Service.list 103s on DSM 7.2.1 -> method is `get`.
- Fix `apis | head` BrokenPipeError traceback -> caught, clean exit.
- SKILL.md status PLUMBED -> VERIFIED 2026-06-25 with live device facts.
- wiki/cascades-tucson: add NAS specs, resolve model/RAM/DSM TODO.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-25 11:08:28 -07:00
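For the `apis | head` fix in 7173f4dc12, a common way to exit cleanly when the reader closes the pipe is to catch `BrokenPipeError` around both the writes and the final flush. The sketch below follows the pattern from the Python `signal` documentation; `emit_lines` is a hypothetical helper, not the skill's code.

```python
import os
import sys
from typing import Iterable, Optional, TextIO


def emit_lines(lines: Iterable[str], out: Optional[TextIO] = None) -> bool:
    """Write lines; return False instead of raising if the reader went away."""
    out = sys.stdout if out is None else out
    try:
        for line in lines:
            out.write(line + "\n")
        # Flush inside the try: small outputs only fail here, not on write().
        out.flush()
    except BrokenPipeError:
        if out is sys.stdout:
            # Point stdout at devnull so the interpreter's exit-time flush
            # does not raise a second BrokenPipeError.
            devnull = os.open(os.devnull, os.O_WRONLY)
            os.dup2(devnull, sys.stdout.fileno())
        return False
    return True
```

Keeping the flush inside the `try` is the same point a later commit (974be13f4c) makes about small piped output.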
2a1a275511 sync: auto-sync from HOWARD-HOME at 2026-06-24 17:37:00
Author: Howard Enos
Machine: HOWARD-HOME
Timestamp: 2026-06-24 17:37:00
2026-06-24 17:37:35 -07:00
9d68db953f sync: auto-sync from HOWARD-HOME at 2026-06-24 15:39:19
Author: Howard Enos
Machine: HOWARD-HOME
Timestamp: 2026-06-24 15:39:19
2026-06-24 15:39:54 -07:00