583 Commits

Author SHA1 Message Date
d5bfe76780 Merge remote-tracking branch 'origin/ad2' 2026-06-18 18:59:31 -07:00
783c5f653a fix(wiki-compile): release coord lock by ID, not resource path
coord.py 'lock release' takes the lock ID; the documented path form no-ops
and strands the lock until TTL. Capture the lock ID at claim (5.0), release
it in Phase 6. Recurring friction (errorlog 2x).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-18 15:31:38 -07:00
248eb2c049 sync: auto-sync from HOWARD-HOME at 2026-06-18 15:31:12
Author: Howard Enos
Machine: HOWARD-HOME
Timestamp: 2026-06-18 15:31:12
2026-06-18 15:31:20 -07:00
9c4181aea5 sync: auto-sync from AD2 at 2026-06-18 14:07:45
Author: Mike Swanson
Machine: AD2
Timestamp: 2026-06-18 14:07:45
2026-06-18 14:08:42 -07:00
9c04c23ab0 dataforth(datasheet): wire DSCA33/45 Hoffman-mined templates (gated; accuracy-data WIP)
Per the 5070 handoff (DSCA33-45-HOFFMAN-RECOVERY): the lost DSCA33/45 specs are
recoverable from Hoffman, not John. Wired the mined dsca33-45-templates.json (56
models) into the renderer:

- datasheet-exact.js: load DSCA3345_TEMPLATES; for family DSCA, the Hoffman-mined
  template takes PRECEDENCE over the stale staged-extraction entry (which shadowed 25
  models with accOut "?"/no accHeader). Emit the verbatim 2-line accHeader for these
  families (Vin (mVAC)/Iin (AAC)/Frequency (Hz), Output (VDC)/(mADC)). Per-model
  `validated` GATE: a DSCA33/45 model renders only after byte-matching its Hoffman
  original; until then it returns null (skipped) so an unverified render can never
  overwrite a pristine live original. DSCA_VALIDATE_MODE env opens the gate for the
  validation harness only. Exposed rendersWithoutSpecs().
- render-datasheet.js: allow a null-specs render for DSCA33/45 (their spec files were
  lost; template-driven) instead of bailing on missing specs.
- derive-dsca-slotmaps.js: DSCA_TPL env to target the 3345 templates; derived 43 slot
  maps into them (22 models need none, 8 DSCA33 still below threshold).
- validate-dsca3345.js (new): renders each model's _srcSerial, fetches the live
  Hoffman original (GET TestReportDataFiles/{serial}, deployed uploader token — no
  vault needed), content-normalized compare; --apply marks validated.

STATUS: gate is CLOSED — 0 models validated, all DSCA33/45 still render null, nothing
published, no risk. Final-Test block + accuracy headers now byte-match the Hoffman
originals for all 56 models; the remaining blocker is accuracy-DATA numeric quirks that
must match to pass the gate:
  - DSCA33 calc column stored in A but displayed in mADC (x1000); measured stored in
    mA (not scaled) — an original-software unit quirk.
  - sign conventions differ per layout (DSCA33 stim/calc/meas unsigned, error signed;
    DSCA45 stim unsigned, calc/meas/error signed).
  - DSCA45 frequency-input stim formatting.
These need per-layout reverse-engineering against the originals (the validation harness
is the oracle). 8 DSCA33 models (DSCA33-02/03/03A/04/04A/05/05A/1642) also lack a slot
map (below threshold). DSCA33-1948 + DSCA45-1746 (24 units) have no Hoffman original.
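
The per-model `validated` gate described above can be sketched as follows. This is a minimal Python illustration only; the real logic lives in datasheet-exact.js, and the function name `render_or_skip` and its `render` callback are hypothetical. Only the `validated` flag and the DSCA_VALIDATE_MODE variable come from the commit message.

```python
def render_or_skip(template, render, env=None):
    """Render a model only if its gate is open; otherwise return None (skipped).

    template: dict for one DSCA33/45 model (assumed shape; 'validated' is the
              per-model flag named in the commit message).
    render:   hypothetical callable that produces the datasheet text.
    env:      mapping of environment variables (pass os.environ in real use).
    """
    env = {} if env is None else env
    gate_open = bool(template.get("validated")) or bool(env.get("DSCA_VALIDATE_MODE"))
    if not gate_open:
        # Unverified model: skip, so a render can never overwrite a live original.
        return None
    return render(template)
```

With the gate closed for every model, every call returns None, which matches the "nothing published" status; the validation harness would pass an environment with DSCA_VALIDATE_MODE set.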

Cleanups: deleted superseded memory project_dsca33_45_spec_gap; struck the obsolete
"ask John" TODO 2 from the handoff note.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-18 13:32:37 -07:00
b71626a36d memory: DSCA33/DSCA45 spec gap (missing main specs, not a bug)
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-18 13:02:34 -07:00
b00dbb8311 memory: AD2 sync.sh pushes main not ad2 (fork push gotcha)
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-18 13:02:32 -07:00
3c0ec0d390 memory: record AD2 Dataforth-fork structure + sync gotchas
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-18 13:02:31 -07:00
6f4cadb16f sync: auto-sync from GURU-5070 at 2026-06-18 12:49:38
Author: Mike Swanson
Machine: GURU-5070
Timestamp: 2026-06-18 12:49:38
2026-06-18 12:51:08 -07:00
c5643ee419 dataforth/dsca33-45: recover lost specs from Hoffman API (56/58 models)
The DSCA33/DSCA45 main spec files lost in the cryptolocker wipe are recoverable:
the original software published correct certs to the Hoffman product API before
the wipe and our null-skipping renderer never overwrote them. Mine per-model
Final-Test templates (names + specs + verbatim accuracy headers) straight from
those originals instead of requesting spec files from Dataforth/John.

- dsca33-45-templates.json: 56 models (DSCA33 34/35, DSCA45 22/23); only
  DSCA33-1948 + DSCA45-1746 (24 units) lack an original.
- mine-hoffman-dsca.py: the re-runnable miner.
- DSCA33-45-HOFFMAN-RECOVERY handoff for the AD2 session (incl. the gate:
  validate each render vs its Hoffman original before enabling live rendering).
- memories: Hoffman recovery (supersedes the spec-gap "need John" note) and the
  AD2 SSH MTU-blackhole root cause/fix; errorlog entries (syncro jq, ssh correction).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-18 12:50:43 -07:00
bfe375044d sync: auto-sync from GURU-5070 at 2026-06-18 05:58:48
Author: Mike Swanson
Machine: GURU-5070
Timestamp: 2026-06-18 05:58:48
2026-06-18 05:59:05 -07:00
f36fb97eb8 sync: auto-sync from HOWARD-HOME at 2026-06-17 22:46:27
Author: Howard Enos
Machine: HOWARD-HOME
Timestamp: 2026-06-17 22:46:27
2026-06-17 22:46:37 -07:00
dc4560cf27 sync: auto-sync from HOWARD-HOME at 2026-06-17 17:49:01
Author: Howard Enos
Machine: HOWARD-HOME
Timestamp: 2026-06-17 17:49:01
2026-06-17 17:49:20 -07:00
ed2819ac87 sync: auto-sync from GURU-5070 at 2026-06-17 16:18:26
Author: Mike Swanson
Machine: GURU-5070
Timestamp: 2026-06-17 16:18:26
2026-06-17 16:18:44 -07:00
cabbc0eb6e sync: auto-sync from HOWARD-HOME at 2026-06-17 12:34:44
Author: Howard Enos
Machine: HOWARD-HOME
Timestamp: 2026-06-17 12:34:44
2026-06-17 12:35:36 -07:00
2b792ee5d1 agy(gemini): fix false auth-abort in retry loop + add quota fallback to default model
While using the new 3-retry gemini path for live VPN research, two bugs surfaced:
- emit_or_fail checked auth_failed INSIDE the retry loop; a benign mid-run token-refresh line
  matched the over-broad auth regex (bare login|credential|authenticat|oauth|401) and aborted the
  retries with a false "auth error" - even though `gemini -p` auth tested fine. Moved auth-classify
  to AFTER the retries (it only picks the final error message now) and tightened auth_failed to real
  signatures (invalid_grant, not authenticated, login with google, token expired, ...).
- Added quota_exhausted() + a QUOTA FALLBACK: the pinned strong model (gemini-3.1-pro-preview) hit
  "exhausted your capacity on this model" mid-session; emit_or_fail now retries once on the default
  (lighter) model by stripping -m (separate quota). Validated: capped pro run -> fell back -> 2.9KB answer.

CT_THOUGHTS Thought 2 Resolution updated with both. (Search-bot reliability hardening continues.)
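
A rough Python sketch of the control flow described above (the wrapper itself is a shell script, so this is an illustration, not its code). The `run` callable, the argument-list shape, the error-message text, and the placement of the fallback inside the loop are assumptions; the regex signatures, the 3-retry count, and the 3s/6s backoff are taken from the commit messages.

```python
import re
import time

# Tightened auth signatures and the quota message quoted in the commit text.
AUTH_RE = re.compile(r"invalid_grant|not authenticated|login with google|token expired", re.I)
QUOTA_RE = re.compile(r"exhausted your capacity on this model", re.I)


def emit_or_fail(run, args, retries=3, backoff=(3, 6), sleep=time.sleep):
    """run(args) -> (ok, text) is a hypothetical stand-in for one CLI call."""
    last = ""
    for attempt in range(retries):
        ok, text = run(args)
        if ok and text.strip():
            return text
        last = text
        if QUOTA_RE.search(text) and "-m" in args:
            # Quota fallback: retry once on the default model by stripping "-m <model>".
            i = args.index("-m")
            ok, text = run(args[:i] + args[i + 2:])
            if ok and text.strip():
                return text
            last = text
            break
        if attempt < retries - 1:
            sleep(backoff[min(attempt, len(backoff) - 1)])
    # Auth classification happens only AFTER the retries; it just labels the final error.
    kind = "auth error" if AUTH_RE.search(last) else "no result"
    raise RuntimeError(f"{kind}: {last[:200]}")
```

Because classification runs after the loop, a benign token-refresh line in an early attempt no longer aborts the remaining retries.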

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-17 12:09:58 -07:00
315f45bf7c search-bots: fix reliability (diagnosed) - gemini 3-retry + grok xsearch auto-fallback to gemini
Mike's must-fix. Diagnosed from RAW output of failing queries (not guessed):
- grok xsearch = TIMEOUT: grok-4.20-multi-agent web_search runs past budget on multi-part queries
  (286s/280s, rc=124, still searching - 183 thoughts, only progress-noise text); buffered json => total loss.
- gemini search = INTERMITTENT empty turn (a clean re-run gave a real 2.6KB answer in 122s); the wrapper
  retried only once, so two empties in a row failed spuriously.

Fixes:
- ask-gemini.sh emit_or_fail: retry up to 3x with 3s/6s backoff (was 1).
- ask-grok.sh xsearch: --output-format streaming-json (salvage partials) + AUTO-FALLBACK to
  ask-gemini.sh search when grok doesn't finish (rc!=0 or empty). Validated e2e: grok timed out
  (rc=124) -> fell back -> gemini returned a real sourced answer (UniFi Teleport invite-link API).

grok's own multi-agent timeout is an xAI-side limitation; the fallback makes xsearch reliable regardless.
Docs: grok SKILL.md xsearch row + CT_THOUGHTS Thought 2 Resolution.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-17 10:38:44 -07:00
1dd2f208a0 ct-thoughts: web-search bots reliability = MUST FIX (Mike) + research-method correction
Mike's correction: web search (grok xsearch + gemini search) carries at least as much weight as
live API probing - the searches gave the real leads this session (connector proxy, teleport setting
path); blind endpoint-probing is "highly suspect" (mostly 404s). And the search bots MUST be properly
fixed - both returned empty repeatedly on UniFi research despite the same-day partial grok fix.

- docs/CT_THOUGHTS.md: Thought 2 (HIGH PRIORITY) - web-search reliability must-fix, with the observed
  failures + a proper-fix investigation plan (capture failing-query JSON; max-turns/streaming-json/
  retry; cross-fallback grok<->gemini; 5/5 acceptance).
- memory feedback_web_search_over_probing: lead with web search/docs; probe only to CONFIRM a
  hypothesis, never as primary discovery. Reading our own config is fine; guessing paths is not.
- errorlog correction logged.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-17 09:36:36 -07:00
8f0e576c49 unifi-wifi: correct the Teleport finding - config API IS reachable via the connector
Earlier "no usable Teleport API" was wrong (probed /rest/teleport, /stat/teleport, /v1/teleport).
Gemini research + live verification: Teleport config lives at /api/s/<site>/rest/setting/teleport
(GET/PUT, also under /get/setting key 'teleport') - reachable via the connector. Brooklyn confirmed
enabled, subnet 192.168.1.1/24. Invite generate/revoke is reportedly POST /api/s/<site>/cmd/teleport
{"cmd":"generate-invite"|"revoke-invite"} (untested - it creates a live VPN access link; gate as a
write). Invites are WiFiman-app-only. Proxy path is /v1/connector/consoles/{id}/proxy/... (Gemini's
/v1/hosts/{id}/proxy form 404s). Doc updated.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-17 09:22:05 -07:00
80723d159d unifi-wifi: neighbor-collect connector-capable (remote disables) + document VPN/Teleport reach
neighbor-collect.sh: add `--console <name> [--site <short>]` so the AP name/BSSID/IP map can come
from the cloud connector (/v1/connector/.../stat/device) instead of a UOS direct-login -- lets the
disable-analysis collector run against ANY console we have AP-VLAN reach to (the AP SSH harvest of
/proc/ui_neighbor is unchanged and still needs L3 reach). UOS path untouched. Validated against
Cascades via connector: source=CONNECTOR, built 77-mac + 450-bssid map for the 75 online APs.

This completes the hybrid (don't-lose-functionality): connector for airtime everywhere +
neighbor-collect (any source) for the SNR matrix -> NEIGHBOR_JSON -> optimize-radios disables on remote sites.

Documented (references/site-manager-api.md): the neighbor-collect --console flow, and the gateway
VPN/Teleport reach -- connector reaches /rest/networkconf (VPN servers: wireguard-server/
openvpn-server, site-to-site) read+writable in principle (gate writes like gw-control); Teleport has no
usable API (v1/ea/teleport 404, per-console /teleport 403).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-17 09:05:50 -07:00
dccd381820 unifi-wifi: validate connector RF analysis vs UOS (Cascades) - macs[] fix + --site passthrough
Validated the cloud-connector analysis against a KNOWN entity (Cascades, normally UOS-Mongo).
The connector reaches the self-hosted "UOS Server" host; Cascades is its site `va6iba3v`.

Two fixes from the validation:
- rf-analyze.py: pass macs:[<all uap macs>] to /stat/report/*.ap. The UniFi report endpoint
  returns only a small DEFAULT subset otherwise -- Cascades came back as 10 of 77 APs until the
  MAC list was supplied. Now profiles all 75 (uaps with 2.4 radios), matching the UOS path.
- model-rank.sh / optimize-radios.sh: --console now accepts --site <name> (internal short name
  from /api/self/sites) for multi-site controllers like the UOS Server (Cascades = va6iba3v).

Result lines up with the known UOS-Mongo figures: 75 APs, 2.4GHz util 65-90% / interf 53-78% /
~1 client each, all power-down, 0 disables (roam graph absent via connector -> same coverage-safe
degradation; disables still need NEIGHBOR_JSON). Apples-to-apples confirmed.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-17 08:53:36 -07:00
1a90a48c82 unifi-wifi: model-rank + optimize-radios run on cloud-connector data (non-UOS consoles)
Both analyses now accept `--console "<name>"` and run against the UniFi cloud connector
instead of the UOS Mongo server, so RF airtime tuning works on standalone/non-UOS consoles
(e.g. Brooklyn/Skybar). The UOS Mongo path is unchanged.

- New shared analyzer scripts/rf-analyze.py: pulls per-AP/band airtime history via the
  connector POST /stat/report/hourly.ap (SAME schema as ace_stat.stat_hourly) + /stat/device
  for names/zones, derives cu_interf = cu_total - cu_self_rx - cu_self_tx, and runs the SAME
  model-rank ranking and optimize-radios greedy power-down/disable logic (ported faithfully).
- Roam graph (/stat/event) is usually empty on small/stationary sites -> graceful degrade:
  model-rank ranks by airtime pressure; optimize-radios returns power-down candidates + 0
  disables (coverage-safe). NEIGHBOR_JSON (SNR matrix) still enables disables, as on UOS.
- model-rank.sh / optimize-radios.sh: added the `--console` route (resolves the key from
  vault services/unifi-site-manager, execs rf-analyze.py). Validated on Brooklyn/Skybar:
  2.4GHz saturated (Yoga AP cu 63%/interf 55%), 5GHz idle (1-5%) - the expected pain-band split.
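
The interference derivation named above (cu_interf = cu_total - cu_self_rx - cu_self_tx) can be written out as a small Python sketch. The sample dict shape and the helper name `mean_interf` are assumptions for illustration; only the three field names and the formula come from the commit message.

```python
def cu_interf(sample):
    """Channel utilization not attributable to this AP's own RX/TX."""
    return sample["cu_total"] - sample["cu_self_rx"] - sample["cu_self_tx"]


def mean_interf(samples):
    """Average interference over a list of hourly samples (hypothetical helper)."""
    if not samples:
        return 0.0
    return sum(cu_interf(s) for s in samples) / len(samples)
```

For instance, a sample with cu_total 63 and hypothetical self RX/TX values of 5 and 3 yields an interference share of 55.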

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-17 08:43:09 -07:00
7e7358957c unifi-wifi: cloud Site Manager backend (gw-sitemanager.sh) + UOS-parity connector tier
New backend reaching ANY of the ~36 ACG UniFi consoles remotely via api.ui.com with the
account key (vault services/unifi-site-manager) - no UOS server, no LAN/VPN. Mapped the API
surface empirically (key live), corroborated by grok+gemini web search:

- Tier 1 (Site Manager): fleet/devices/sites/isp commands - inventory, site health (counts,
  IPS, ISP/ASN), and WAN/ISP time-series (latency/throughput/downtime).
- Tier 2 (CLOUD CONNECTOR -> console LOCAL Network API = UOS PARITY): the `net` command proxies
  /v1/connector/consoles/<id>/proxy/network/api/s/<site>/stat/{device,sta}, returning the SAME
  ace_stat depth as the UOS Mongo path - per-radio cu_total airtime/channel/bw/tx_power/num_sta/
  satisfaction and per-client rssi/signal/noise/satisfaction/rates. Verified live on Brooklyn/
  Skybar (standalone UDM, WAN-firewalled): `net brooklyn radios` + `net brooklyn clients` work.

This achieves parity with (and broader coverage than) the UOS server for non-UOS consoles.
Added references/site-manager-api.md (full catalog + 3 tiers), a Plane 3 note in SKILL.md, and
updated the reference memory. Read-only; POST actions (device restart, client block) exist, not wired.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-17 08:32:00 -07:00
7ac55e56fe sync: auto-sync from HOWARD-HOME at 2026-06-16 21:34:19
Author: Howard Enos
Machine: HOWARD-HOME
Timestamp: 2026-06-16 21:34:19
2026-06-16 21:34:40 -07:00
294ee5f8f6 unifi-wifi: fix apply-wlan wlan_bands 6e->6g; add 5GHz + 6GHz phases to Cascades runbook
- apply-wlan.sh: wlan_bands token was "6e" but this controller stores "6g" (verified live on Cascades
  Guest SSID) -> setting 6 GHz membership would have failed. Fixed band values + option names (5g6g/6g/all).
- Cascades 2.4 runbook: folded in Phase 5 (5 GHz: width 80->40 on 76 radios; channel plan with the
  DFS decision flagged -- DFS empirically clean here, so including clean-DFS gives ~20 channels vs ~5
  non-DFS-only for 77 APs) and Phase 6 (6 GHz: root cause = production SSID CSCNet not on 6 GHz [bands
  2g,5g only]; add 6g + enable bss-transition; band-steering already on). Per Howard.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-16 21:34:40 -07:00
ef4577cf75 skills(brainstorming): make request-only, not auto-trigger
Upstream description ("You MUST use this before any creative work...") would
auto-fire the brainstorming skill on routine feature/code work. Rewrote the
frontmatter description to invoke ONLY when the user explicitly asks to
brainstorm/design. Methodology body (incl. HARD-GATE) unchanged. Noted in SOURCE.md.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-16 20:22:59 -07:00
7f243e15b8 skills: harvest 4 MIT dev skills from obra/superpowers (awesome-claude-skills)
From the ComposioHQ/awesome-claude-skills list. Checked licenses BEFORE copying:
- threat-hunting-with-sigma-rules: repo is gone (GitHub 404) -- not harvested.
- forensics (mhattingpete): repo restructured, those skills no longer exist -- not harvested.
- pdf / mcp-builder (Anthropic official): LICENSE.txt FORBIDS copying out of the
  Service / derivatives / redistribution -- NOT harvestable into this repo (install via
  the official Claude Code marketplace instead if wanted).
- obra/superpowers: MIT -> the only legally harvestable set; imported with attribution.

Imported (each with its own MIT LICENSE copy + SOURCE.md provenance, commit a21956e48c13,
ASCII-normalized to house style, no emojis):
- using-git-worktrees
- test-driven-development (+ testing-anti-patterns.md)
- root-cause-tracing (+ find-polluter.sh helper, emojis -> ASCII markers)
- brainstorming (methodology only; upstream visual websocket server intentionally omitted)

Faithful imports -- content not reworded beyond ASCII typography/emoji normalization.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-16 20:22:59 -07:00
38cff65fba ollama: fix broken endpoint auto-detect in OLLAMA.md one-liner (RTFM audit)
Audited the Ollama reference (no wrapper script — it's the OLLAMA.md doc + inline
HTTP-API call pattern) against the live server (Ollama 0.30.8 on GURU-5070):
- /api/chat + think:false + res['message']['content'] confirmed working (clean
  output, no thinking leak) -- the core call pattern is correct.
- All referenced models exist on the server (qwen3:8b, qwen3.6:latest, qwen3:14b,
  codestral:22b, nomic-embed-text).

Real bug found + fixed: the "Preferred one-liner" auto-detected the endpoint with
`urlopen(...)` used as a truthiness test. urlopen RAISES URLError on a down host
(proven), so the ternary's fallback branch was dead code -- it crashed on a down
localhost instead of failing over to Beast, and it did a per-call probe that
contradicts the doc's own "read endpoint from identity.json, no probe" rule 30 lines
above. Replaced with the identity.json endpoint+model pattern (also swaps the
hardcoded qwen3:14b for the per-machine prose_model). Validated verbatim end-to-end.
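
To illustrate the bug class described above: `urllib.request.urlopen` raises `URLError` for an unreachable host rather than returning a falsy value, so a truthiness test never reaches its fallback branch. The sketch below shows a probe that does fail over. It is not the fix the commit applied (that was reading the endpoint from identity.json with no probe); `pick_endpoint` and its `opener` parameter are hypothetical names.

```python
import urllib.error
import urllib.request


def pick_endpoint(primary, fallback, opener=urllib.request.urlopen):
    """Return primary if it answers, else fallback.

    The broken pattern was equivalent to:
        primary if opener(primary) else fallback
    which raises on a down host instead of evaluating the else branch.
    """
    try:
        opener(primary, timeout=2).close()
        return primary
    except (urllib.error.URLError, OSError):
        # URLError is what urlopen raises for a refused or unreachable host.
        # Note: an HTTP error status (HTTPError) also lands here.
        return fallback
```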

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-16 20:22:59 -07:00
f031ef9137 agy(gemini): RTFM audit — confirmed healthy, version + verified-date refresh
Audited the Gemini wrapper against the CLI's bundled help/README (gemini 0.45.2),
same pass as the grok skill. Unlike grok, found NO functional bug:
- All flags correct and real: -p, --skip-trust, -o json, --approval-mode plan|yolo,
  --include-directories, -m (verified against `gemini --help`).
- JSON schema {session_id, response, stats} -> .response confirmed via live probe.
- Pinned model gemini-3.1-pro-preview STILL VALID (live PONG); the GA-looking
  gemini-3.1-pro and gemini-3-pro both ModelNotFoundError -> keep the -preview suffix.
- Default text model is gemini-3.1-flash-lite (by design; verify/review/search/image
  pin pro). No thought-suppression flag exists in the CLI, so the gresponse()
  reasoning-leak scrub stays (justified, signature-gated, byte-exact otherwise).
- Live `search` re-validated end-to-end through the wrapper (58s, grounded sources).

Only change: version 0.45.1 -> 0.45.2 in SKILL.md + wrapper header, and refreshed the
verified-date notes with the 2026-06-17 re-validation findings.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-16 19:25:07 -07:00
a3ce9434de grok: fix xsearch (multi-agent web_search), pin grok-build, RTFM doc sweep
Root-caused the long-standing `ask-grok.sh xsearch` "no result (stopReason=)"
failure by reading Grok's bundled docs (~/.grok/docs/user-guide + README) instead
of probing:
- web_search runs a SEPARATE multi-agent model (grok-4.20-multi-agent), so the
  wrapper's blanket --no-subagents strangled it -> indefinite hang, 0 bytes. Scoped
  --no-subagents OFF xsearch; use --yolo (documented headless tool-run posture).
- xsearch prompt mandated X/Twitter search on every call (slow multi-agent) and the
  budget was 240s -> still timed out. Now web-primary (X only when relevant), 300s.
  Validated end-to-end through the wrapper: 23s, correct answer + 3 sources.

Model: pin -m grok-build (xAI flagship, 512k, the documented default) for the
reasoning modes (text/verify/review*) so quality is deterministic and not at the
mercy of the runtime default (this machine drifted to grok-composer-2.5-fast, a fast
Cursor coding model). xsearch + image/video keep the runtime default. Validated text
mode on grok-build (13s).

Doc accuracy (SKILL.md): corrected the model facts (default, the separate web_search
model, --effort unsupported on grok-build per supports_reasoning_effort:false);
documented the xsearch subagent exception. Fixed a stale in-script comment claiming
--rules/--disallowed-tools "tripped the CLI" (both are valid headless flags).

memory: add feedback_interview_ai_read_docs (read bundled docs / interview the model
before probing) + index; errorlog correction.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-16 19:25:07 -07:00
b3c8ee2828 unifi-wifi: pfSense gateway access via SSH (pfSense-ssh.sh) + pfSense health section; layer OFF HOLD
DECISION (Mike, 2026-06-16): drop the RESTAPI package — VPN + SSH shell reads the same data and makes
changes. Confirmed Cascades pfSense is Plus 25.07-RELEASE (current; the "too old" premise was wrong) and
admin SSH = real shell (no menu). The upgrade/package blocker is moot; compat layer is off hold.

- NEW scripts/pfsense-ssh.sh: audit (version/WAN-media/gateway-events/DHCP-exhaustion/states/DNS/load/NIC),
  dhcp (pool utilization + no-free-leases), run "<cmd>" (arbitrary, incl changes; operator-gated). Cred
  from clients/<slug>/pfsense-firewall; system OpenSSH via askpass. Validated live on Cascades.
- audit report: added "pfSense health check (2026-06-16)" — DHCP NOT exhausted (192.168.0.0/22 pool 270/507,
  0 no-free-leases), DNS up, dual-WAN stable (no gateway flaps), states/load healthy => gateway is NOT a
  WiFi factor; the 2.4 GHz RF work is the sole fix. (Minor: igc3/WAN2 I225 2.5G counter quirk, not a fault.)
- ROADMAP §E + SKILL.md updated to the SSH backend decision; REST pfsense-backend.sh kept dormant/optional.
- Remaining: named gated CONTROL verbs over SSH (easyrule block-ips, pf/fw toggles) + optional gw-* dispatch.
- Closed obsolete coord todo (upgrade-pfSense-for-RESTAPI).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-16 18:42:54 -07:00
a5e851a144 sync: auto-sync from HOWARD-HOME at 2026-06-16 18:23:40
Author: Howard Enos
Machine: HOWARD-HOME
Timestamp: 2026-06-16 18:23:40
2026-06-16 18:23:49 -07:00
db38e50e49 sync: auto-sync from HOWARD-HOME at 2026-06-16 18:10:13
Author: Howard Enos
Machine: HOWARD-HOME
Timestamp: 2026-06-16 18:10:13
2026-06-16 18:10:43 -07:00
bf04924f2c harness: PS2 guard for onboarding probe + Windows quote-stripping memory
onboarding-diagnostic.ps1: add a PowerShell-version guard. The probe is PS3+ by
design (Get-CimInstance, [ordered], ConvertTo-Json); on stock PS2 (Win7 SP1 /
2008 R2 without WMF) it crashed with cryptic [ordered] errors and emitted empty
DIAG-JSON (first hit: AMT-PC). Now on PS<3 it emits a legible, parseable result
inside the DIAG-JSON markers (hand-built JSON) with a WMF 5.1 / KB3191566
remediation hint instead. Parses clean. True PS2-native probe stays an RMM Thought.

memory: add feedback_windows_quote_stripping (+ index) consolidating the two
recent embedded-double-quote incidents (PowerShell->curl.exe CommandLineToArgvW,
RMM->cmd.exe shutdown /c) into one root cause + fix, so future ref= entries land.

errorlog: the two self-logged entries from #32333 (preview-skip friction,
AMT-PC/Scileppi conflation correction).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-16 18:10:11 -07:00
25b5d060d6 unifi-wifi: coverage-thin mesh-awareness — never disable wireless-mesh APs or their parents
Howard caught a real hazard: coverage-thin was mesh-blind. At Cascades, 2nd Floor Atrium is the
wireless-mesh PARENT for CC Bridge + salon (backhaul ch36/5GHz), and 206 U7 Pro carries 108. The tool
had listed 2nd Floor Atrium / CC Bridge / 206 as 2.4 disable targets. Although the backhaul is 5GHz
(so a 2.4-radio disable wouldn't drop it), touching infra APs that feed others is needless risk.

Fix: fetch live uplink topology (stat/device); build the mesh set = wireless-uplink APs UNION their
parents; exclude them from disable (kept as coverers if their 2.4 is on); print MESH-PROTECTED line.
Falls back with a clear WARNING if no controller cred. Cascades now auto-excludes 108/206/2nd Atrium/
CC Bridge/salon; resilient plan 34->33. Also verified SSIDs are not AP-pinned (broadcasting_aps off),
so no client is orphaned by a radio disable.
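
The mesh set described above (wireless-uplink APs UNION their parents) can be sketched in Python as below. The device dict shape, including the `uplink.type` and `uplink.uplink_mac` field names, is an assumption about the stat/device payload, and `mesh_protected` is a hypothetical name; the tool itself is a shell script.

```python
def mesh_protected(devices):
    """Return the MACs that must never be proposed for disable.

    devices: list of dicts, each with 'mac' and an optional 'uplink' dict
             (assumed fields: 'type' == 'wireless' and 'uplink_mac').
    """
    protected = set()
    for dev in devices:
        uplink = dev.get("uplink") or {}
        if uplink.get("type") == "wireless":
            protected.add(dev["mac"])          # the mesh child itself
            parent = uplink.get("uplink_mac")
            if parent:
                protected.add(parent)          # and the AP that feeds it
    return protected
```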

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-16 16:43:08 -07:00
203fe95680 unifi-wifi: coverage-thin apply hint -> per --ap (was --zone, which would disable a whole floor)
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-16 16:28:35 -07:00
4459189f7d unifi-wifi: add coverage-thin.sh — 2.4 coverage-redundancy disable planner (active-2.4 aware)
Answers "which 2.4 radios can we turn OFF given over-coverage, based on AP proximity." Greedy
dominating-set on the AP-to-AP 2.4 SNR layer: disables radios whose area stays covered by a nearby
ACTIVE-2.4 neighbor, maximizing interference-airtime removed without opening a 2.4 hole. Caps per-zone,
guards coverer capacity, flags single-coverer (low-resilience) disables, reports co-channel before/after.

Why separate from optimize-radios: optimize uses band-AGNOSTIC physical adjacency, so it counts an AP
whose ng radio is DISABLED as a "coverer" via its 5/6 GHz (observed: it proposed disabling 127/229/330/428
"covered by 128" — but 128's 2.4 is already disabled => those would be 2.4 holes). coverage-thin uses the
2.4 SNR layer specifically and only counts neighbors whose 2.4 stays ON.

Cascades (live): aggressive MINCOV=1 -> disable 36/76; resilient MINCOV=2 -> disable 34/76 with >=2 active
2.4 coverers each; co-channel ch6 28->13, ch11 25->13, ch1 20->13; ~2400 interference-airtime pts removed.
Read-only; needs NEIGHBOR_JSON. SKILL.md step 3b.
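
A simplified Python sketch of the greedy idea described above: disable the highest-interference radios first, but only while every disabled radio keeps at least MINCOV active-2.4 coverers. The data shapes, the `min_snr` threshold of 25, and the function name are assumptions; the per-zone caps, coverer-capacity guard, and co-channel report from the commit message are omitted.

```python
def plan_disables(snr, interf, active, mincov=2, min_snr=25):
    """Greedy 2.4 GHz disable plan.

    snr[a][b]: SNR at which AP a hears AP b on 2.4 GHz (assumed shape).
    interf[a]: interference-airtime score for AP a.
    active:    APs whose 2.4 radio is currently ON.
    """
    on = set(active)
    disabled = []

    def coverers(ap):
        # Only neighbors whose 2.4 radio stays ON count as coverers.
        return {n for n, s in snr.get(ap, {}).items() if s >= min_snr and n in on}

    for ap in sorted(active, key=lambda a: interf.get(a, 0), reverse=True):
        if len(coverers(ap)) < mincov:
            continue
        on.discard(ap)
        # Undo if this would leave an earlier disable under-covered.
        if any(len(coverers(d)) < mincov for d in disabled):
            on.add(ap)
            continue
        disabled.append(ap)
    return disabled
```

Counting only neighbors that remain in `on` is the point of difference from a band-agnostic adjacency check: an AP whose 2.4 radio is already off never qualifies as a coverer.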

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-16 16:17:54 -07:00
55ff67bfe3 unifi-wifi: radio-usage.sh --ap mode — per-device 2.4 history + steerable-vs-legacy tagging
Adds `radio-usage.sh <site> <band> --ap "<AP name>"`: lists the devices on one AP's band by merging
live clients (stat/sta) with recent association events (wifi_connectivity_event, band-aware), enriched
from ace.user identity. Tags each device steerable vs legacy:
  - from events: DUAL (also seen on 5/6 GHz -> steerable) vs NG-ONLY (2.4-only -> legacy/IoT)
  - fallback when no event in the (short ~1d) retention window: randomized MAC = modern phone/laptop
    (likely 5G/steerable) vs fixed vendor OUI = likely IoT/legacy.
Decision value: steerable -> fix via band-steering/min-RSSI; a legacy/IoT device present argues AGAINST
disabling that 2.4 radio. Needs controller cred for the live BSSID (vap_table) map; honest about the
short event retention. Validated live on Cascades (347, Dining Room).
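
The steerable-vs-legacy tagging above can be sketched in Python. A randomized (locally administered) MAC has the 0x02 bit set in its first octet, which is standard MAC addressing; the band codes ("ng" for 2.4 GHz), the tag strings, and the function names are assumptions for illustration.

```python
def is_randomized_mac(mac):
    """True if the locally-administered bit (0x02 of the first octet) is set."""
    first_octet = int(mac.replace("-", ":").split(":")[0], 16)
    return bool(first_octet & 0x02)


def tag_device(mac, bands_seen):
    """bands_seen: set of band codes from association events (may be empty)."""
    if bands_seen:
        # DUAL: also seen off 2.4 GHz -> steerable; NG-ONLY -> legacy/IoT.
        return "steerable" if bands_seen - {"ng"} else "legacy"
    # Fallback when no event falls inside the retention window.
    return "likely-steerable" if is_randomized_mac(mac) else "likely-legacy"
```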

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-16 15:51:13 -07:00
1f2575640e unifi-wifi: add radio-usage.sh — per-AP band client-usage history (disable-safe vs power-down)
Answers "is this 2.4 radio actually used?" from accumulated controller stats (ace_stat.stat_daily,
~77d). Reports per-AP time-avg concurrent users (<radio>-num_sta_avg) + peak station snapshot
(<band>-num_sta), distinguishing avg~0/peak>0 (takes bursts -> POWER-DOWN) from peak==0 (genuinely
unused -> disable-safe). With NEIGHBOR_JSON it crosses low-use APs against the AP-to-AP SNR matrix to
emit a defensible safe-to-disable shortlist (low-use AP + strong overlapping neighbor with headroom),
noting mutual-coverage conflicts and deferring conflict-free selection to optimize-radios.

Validated live on Cascades: of 76 APs only 1 has peak==0 over 77d (the offline AP 108); every other
2.4 radio takes real client bursts (peaks 5-58) at very low avg (12 APs <0.5 concurrent). I.e. the
usage history independently CONFIRMS the conservative power-down-not-disable call. Read-only (Mongo
plane). Uses var-assignment to avoid the legacy-mongo REPL echo. SKILL.md documents it as step 2b.
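
The avg/peak distinction above reduces to a small decision rule, sketched here in Python. The 0.5 cutoff echoes the "<0.5 concurrent" figure in the message but is an assumed threshold, and the "keep" label and function name are hypothetical.

```python
def classify_radio(avg_sta, peak_sta, low_avg=0.5):
    """Classify one AP radio from its long-window usage history.

    avg_sta:  time-averaged concurrent stations.
    peak_sta: highest station count seen in any snapshot.
    """
    if peak_sta == 0:
        return "disable-safe"   # genuinely unused over the whole window
    if avg_sta < low_avg:
        return "power-down"     # near-idle on average but still takes bursts
    return "keep"
```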

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-16 15:38:26 -07:00
412c0eff31 unifi-wifi: skill health pass — fix optimize-radios stray REPL echo + ASCII-clean all output
Full verification of the skill against Cascades (live):
- All 19 scripts syntax-clean.
- Controller-side read-only validated live: sites, audit-site, switch-audit, live-stats, model-rank,
  optimize-radios, monitor-run, gw-audit. Dry-run apply paths validated: apply-radio, apply-wlan,
  client-control, device-control. AP-side mechanism validated: SSH auth + /proc/ui_neighbor read on a
  sample AP; full neighbor-collect (74-AP SNR sweep) -> channel-plan end-to-end produced a 1/6/11 plan.

Fixes:
- optimize-radios.sh: the `for(k in prof)` loop's numeric completion value was REPL-echoed by the legacy
  mongo shell (stray "94.56..." line in output). Terminated the loop body with `void 0` to suppress it.
- ASCII-clean printed output (CLAUDE.md no-non-ASCII): replaced em-dashes / Unicode arrows / § that
  reached stdout and rendered as `?`/mojibake on the Windows console, across optimize-radios,
  neighbor-collect, survey-collect, dfs-check, audit-site, sites, monitor-run, apply-radio, apply-wlan,
  pfsense-backend. (Comment-only non-ASCII left as-is; never printed.)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-16 15:08:35 -07:00
1defd51c66 unifi-wifi: pfSense compat layer ON HOLD — Cascades pfSense too old for RESTAPI pkg, needs upgrade first
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-16 14:11:33 -07:00
fdcb20d7c9 unifi-wifi: pfSense gateway compat layer (§E) — REST backend + dispatch inside gw-audit/gw-control
Per Howard's decision (2026-06-16, "try what Mike wanted"): Mike's §E open decisions resolved as
REST API package backend + dispatch INSIDE the existing gateway verbs (his lean), not sibling scripts.

- NEW scripts/pfsense-backend.sh: pfSense REST API (pfSense-pkg-RESTAPI v2, X-API-Key) driver exposing
  the same verbs as gw-control (audit, pf-list/disable/enable/delete/set-ports, fw-list/disable/enable,
  block-ips) + a `setup` helper. Writes --apply-gated with per-object rollback to .claude/tmp + firewall/apply.
- gw-audit.sh: when num_gw=0 and a clients/<slug>/pfsense-api cred is vaulted (or --pfsense <slug>),
  appends the pfSense WAN/DHCP/firewall audit; else prints the setup hint. (captures num_gw to gate.)
- gw-control.sh: same-verb auto-dispatch to pfsense-backend when a pfSense cred resolves for the site.
- SKILL.md [PROPOSED]->[SCAFFOLDED]; ROADMAP §E open decisions marked resolved.

STATUS: scaffolded. BLOCKED/setup/no-cred paths tested; gw-audit dispatch validated live (Cascades
num_gw=0 -> hint). Live REST calls pending a reachable pfSense with the API pkg + a vaulted key; v2
endpoint paths must be verified against the installed API version on first live run.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-16 13:51:10 -07:00
e89d815896 sync: auto-sync from HOWARD-HOME at 2026-06-16 13:12:16
Author: Howard Enos
Machine: HOWARD-HOME
Timestamp: 2026-06-16 13:12:16
2026-06-16 13:12:26 -07:00
4651bd52a6 sync: auto-sync from GURU-5070 at 2026-06-16 09:02:24
Author: Mike Swanson
Machine: GURU-5070
Timestamp: 2026-06-16 09:02:24
2026-06-16 09:02:39 -07:00
cc15177ce3 syncro: invoice-note policy — block hours remaining, low-block (<4hr) renew + Winter tag, recurring sweep
Extends the invoice Message (note) automation into a single reusable helper
set_invoice_note <invoice_id> <customer_id> [pre_billing_prepay]:
  - no block (prepay_hours==0)  -> "Interested in discounted labor? Ask us about block-rate pricing."
  - block, >=4 hrs left         -> "Block hours remaining: N."
  - block, <4 hrs left          -> remaining + renew line, AND tags Winter (<@624666486362996755>)
                                   in #bot-alerts (low-block heads-up; mentions ping, no allowed_mentions)
Pre-billing prepay arg keeps a just-depleted block counted as a block customer (shows renew, not upsell).
Never clobbers a non-empty note.
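
A minimal Python sketch of the branch logic above, for illustration only. `plan_invoice_note` and `RENEW_LINE` are hypothetical names, the renew wording is a placeholder because the commit does not quote it, and skipping the alert when an existing note is preserved is an assumption.

```python
from typing import Optional, Tuple

UPSELL = "Interested in discounted labor? Ask us about block-rate pricing."
RENEW_LINE = "<renew line>"  # placeholder: the real wording is not in the commit
LOW_BLOCK_HOURS = 4


def plan_invoice_note(existing_note: Optional[str],
                      prepay_hours: float,
                      pre_billing_prepay: Optional[float] = None
                      ) -> Tuple[Optional[str], bool]:
    """Return (note_to_set, alert_winter); note_to_set is None when nothing changes."""
    if existing_note and existing_note.strip():
        return None, False  # never clobber a non-empty note
    # A block that was just depleted by this invoice still counts as a block.
    is_block = prepay_hours > 0 or (pre_billing_prepay or 0) > 0
    if not is_block:
        return UPSELL, False
    remaining = f"Block hours remaining: {prepay_hours:g}."
    if prepay_hours >= LOW_BLOCK_HOURS:
        return remaining, False
    return f"{remaining} {RENEW_LINE}", True  # low block: renew line + alert
```

Returning the decision instead of performing the write keeps the policy testable without touching a live invoice or firing a real ping.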

Wired into billing Step 3 (set_invoice_note "$INVOICE_ID" "$CUST_ID" "$PREPAY"), and a new
"Recurring invoice note sweep" applies the same policy to Syncro's auto-generated recurring invoices
(schedule_id != null, recent, current balance) — idempotent, run after each recurring run.

Branch logic + a real e2e note set/restore validated on the ACG internal test account (#67741); the
<4hr Winter alert was stubbed in testing so no real ping fired.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-16 08:56:42 -07:00
f5c284444b syncro: document Invoice Message (note field) + auto block-rate hint for non-block customers
The on-screen "Invoice Message" text block IS the invoice `note` field, editable via
PUT /invoices/{id} {"note": "..."} (response {"invoice": {...}}). Verified on the ACG
internal test account (#67741: set/verify/restore).

Billing flow now sets a one-line upsell hint on the invoice note — "Interested in
discounted labor? Ask us about block-rate pricing." — ONLY for customers with no prepaid
block (prepay_hours == 0). Block customers (prepay_hours > 0) get no hint; never clobber
a non-empty note.
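
The note update described above could be built like this. A hedged Python sketch: `BASE_URL` is a placeholder rather than a real host, `build_note_request` is a hypothetical helper, and authentication is deliberately omitted because the commit does not describe it. The request is constructed but not sent.

```python
import json
from urllib.request import Request

BASE_URL = "https://example.invalid/api/v1"  # placeholder, not a real host


def build_note_request(invoice_id: int, note: str) -> Request:
    """Build (but do not send) PUT /invoices/{id} with a {"note": ...} body."""
    body = json.dumps({"note": note}).encode("utf-8")
    return Request(
        f"{BASE_URL}/invoices/{invoice_id}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="PUT",
    )
```

A caller would fetch the invoice first and skip the PUT whenever the existing note is non-empty, per the no-clobber rule.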

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-16 08:48:25 -07:00
f75882987a onboarding-diagnostic: fix two Server-SKU false positives
Both surfaced on GND-SERVER (Server 2019 DC), would mis-grade every Windows Server:

1. OS EOL: build numbers are SHARED between client and server SKUs (17763 = Win10
   1809 AND Server 2019; 14393 = 1607/Server2016; 26100 = 24H2/Server2025). The map
   only had client dates, so Server 2019 (supported to 2029) was flagged EOL-2020 =
   false critical. Now branch on SKU ($caption -match 'Server') with a Server EOL map.

2. Stability disk errors: ids 7/51/153 are shared across providers; provider 'disk'
   = real I/O error, but 'Microsoft-Windows-Kernel-Boot' id 153 = "VBS disabled" boot
   noise. The unfiltered fallback counted that noise as disk errors (false warning on
   healthy boxes). Now count only true storage providers, no unfiltered fallback.
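
Both fixes can be sketched in Python as below. This is an illustration, not the diagnostic's PowerShell: the table and function names are hypothetical, only build 17763 is filled in because its end-of-life years are the only ones stated here, and `STORAGE_PROVIDERS` holds just the one provider the commit names.

```python
from typing import Iterable, Optional, Tuple

# Same build number, different lifecycle depending on SKU.
CLIENT_EOL_YEAR = {17763: 2020}  # Win10 1809
SERVER_EOL_YEAR = {17763: 2029}  # Server 2019


def eol_year(build: int, caption: str) -> Optional[int]:
    """Pick the EOL table by SKU; case-insensitive like PowerShell's -match."""
    table = SERVER_EOL_YEAR if "server" in caption.lower() else CLIENT_EOL_YEAR
    return table.get(build)


DISK_EVENT_IDS = {7, 51, 153}
STORAGE_PROVIDERS = {"disk"}  # assumption: the real list may include more


def count_disk_errors(events: Iterable[Tuple[str, int]]) -> int:
    """Count (provider, event_id) pairs that are genuine storage errors."""
    return sum(
        1
        for provider, event_id in events
        if event_id in DISK_EVENT_IDS and provider.lower() in STORAGE_PROVIDERS
    )
```

Filtering on provider and id together is what excludes the Kernel-Boot id 153 noise; there is no unfiltered fallback path.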

Parses clean. Re-run on GND-SERVER should drop from RED to AMBER (both false findings gone).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-16 08:18:27 -07:00
9f760c1724 memory: AAD Connect AdminSDHolder writeback-permission pattern
Reference memory + index entry: diagnosing/fixing AAD Connect "completed-export-errors"
(8344 INSUFF_ACCESS_RIGHTS) where AdminSDHolder strips the connector account's write
permission on a protected admin object. Covers msDS-KeyCredentialLink (Russo) and
msExchSafeSendersHash (Glaztech); csexport /f:x diagnosis + dsacls AdminSDHolder grant.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-16 07:45:32 -07:00
e79dd49636 sync: auto-sync from HOWARD-HOME at 2026-06-16 07:44:03
Author: Howard Enos
Machine: HOWARD-HOME
Timestamp: 2026-06-16 07:44:03
2026-06-16 07:44:15 -07:00
4ebdff7d23 unifi-wifi: roadmap — pfSense gateway compatibility layer (§ E)
Capture the "UniFi APs/switches behind a pfSense gateway" topology (Cascades, our
office, several clients) as a first-class roadmap item: make the gateway verbs
(gw-audit / gw-control / VPN) work against pfSense via a thin driver behind the
same verbs (gw-audit already detects num_gw=0 = third-party firewall).

Includes the verb->pfSense mapping (NAT port-forwards, filter rules,
easyrule block-ips, native OpenVPN/IPsec/WireGuard), ranked backend options
(REST-API pkg vs stock SSH easyrule/pfSsh.php vs diag_command.php vs config.xml),
existing vaulted pfSense creds (Cascades + office), and open decisions. SKILL.md
status block notes the proposed layer.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-16 07:39:53 -07:00