claudetools/clients/dataforth/session-logs/2026-07/2026-07-01-mike-dataforth-test-data-chain-audit.md
Mike Swanson 7f897ce93f sync: auto-sync from GURU-5070 at 2026-07-01 13:09:08
Author: Mike Swanson
Machine: GURU-5070
Timestamp: 2026-07-01 13:09:08
2026-07-01 13:10:01 -07:00

Dataforth test-data-chain audit via RMM-spawned AD2 Claude + multi-AI verification

User

  • User: Mike Swanson (mike)
  • Machine: GURU-5070
  • Role: admin

Session Summary

Ran a full audit of the Dataforth test-data chain in support of Syncro #32489 (DOS test stations not pulling updated spec files). The pivotal enabler was proving a new capability: spawning a headless claude -p on the AD2 domain controller via its GuruRMM agent rather than the async git-sync handoff. AD2 (192.168.0.6) is isolated from the ACG coord API but its RMM agent phones home, and it is online (agent cfa93bb6-...). A staged probe confirmed: sysadmin is logged into the console (so context: user_session works and returns an elevated token), Claude Code v2.1.181 is at C:\Users\sysadmin\.local\bin\claude.exe, node v20 is system-wide, and the repo is at C:\ClaudeTools. Headless claude -p initially failed with Invalid API key because a stale machine-level ANTHROPIC_API_KEY (108 chars) shadowed sysadmin's good OAuth creds; unsetting it (Remove-Item Env:\ANTHROPIC_API_KEY) let it fall back to .credentials.json and return PROBE_OK.
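
The credential shadowing described above can be illustrated with a minimal Python sketch. It assumes a launcher that builds the child environment itself; the function name `env_without_api_key` is hypothetical and not part of the actual runner, which did the equivalent in PowerShell with Remove-Item.

```python
import os


def env_without_api_key(base_env=None, var="ANTHROPIC_API_KEY"):
    """Return a copy of the environment with the shadowing variable removed.

    Hypothetical helper: the real runner unset the variable in PowerShell.
    """
    env = dict(os.environ if base_env is None else base_env)
    # Windows environment variable names are case-insensitive, so compare
    # names case-insensitively rather than relying on exact spelling.
    for name in list(env):
        if name.upper() == var.upper():
            del env[name]
    return env


if __name__ == "__main__":
    cleaned = env_without_api_key({"ANTHROPIC_API_KEY": "stale", "PATH": "C:\\Windows"})
    print(sorted(cleaned))  # ['PATH']
```

A caller would pass the result as `env=` to `subprocess.run`. Like the PowerShell form, this only affects the child process; the machine-level value stays in place until it is removed separately.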

With the pattern proven, launched a strictly read-only autonomous Claude auditor on AD2, detached (runner pid 4748) so it survived the RMM command-timeout window, writing to C:\Users\sysadmin\ad2-audit\FINDINGS.md with a DONE.txt marker. A background monitor polled the marker; the run completed in ~17 min (exit 0, 25 KB report). The report is excellent, evidence-backed, and corrected several stale assumptions in CLAUDE.dataforth.md (dated 2026-03-29).
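
The detach-and-poll pattern can be sketched as follows. This is a simplified stand-in: it checks a marker on a local path, whereas the session polled AD2 through RMM. The function name, the default timeout, and the polling interval are assumptions for illustration only.

```python
import time
from pathlib import Path


def wait_for_marker(marker, timeout_s=1800.0, interval_s=15.0,
                    sleep=time.sleep, clock=time.monotonic):
    """Poll until `marker` exists. Return True if seen, False on timeout.

    `sleep` and `clock` are injectable so the loop can be tested quickly.
    """
    deadline = clock() + timeout_s
    while True:
        if Path(marker).exists():
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)
```

Writing the marker only after the report is complete means the poller never reads a half-written FINDINGS.md, which is why the marker is a separate file rather than a check on the report itself.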

Key findings. F1 (HIGH): the deployed NWTOC.BAT v5.0 (COMMON\ProdSW, verified on AD2 and NAS) copies only *.BAT and *.EXE and zero .DAT files, which is the confirmed root cause of #32489. No NWTOC version ever distributed the shared COMMON masters, so v5.1 must ADD a DATA copy, not restore one. F2 (HIGH, new): NWTOC.BAT v5.0 and CTONWTXT.BAT v2.3 use COPY /Y, which is not a valid MS-DOS 6.22 switch (it was introduced in MS-DOS 7.0/Win95); on true 6.22 this errors and copies nothing, meaning NWTOC may be failing entirely. F3: a stray TS-21\ProdSW file breaks the rsync push on every run. F4: the server generates from a frozen 2026-03-27 specdata snapshot. F5: plaintext creds in scripts. Stale-assumption corrections: the datastore is now PostgreSQL 18 (SQLite is a 4.4 GB archive), the scheduled task runs Sync-FromNAS-rsync.ps1, web delivery is a live HTTP API uploader (472,290 records flagged), not the dead For_Web, and CTONWTXT IS invoked.

Then ran the requested multi-AI cross-verification of the load-bearing DOS-6.22 claim. Grok (verify mode) independently confirmed F2 in full: 6.22 COPY supports only /A /B /V; /Y arrived in MS-DOS 7.0/Win95; COPY /Y src dst → Invalid switch - /Y, 0 files copied; plain COPY overwrites silently on 6.22 (the correct form). Gemini was unavailable this session (quota exhausted on gemini-3.1-pro-preview + an OAuth fallback error; logged). The pivotal unresolved question is empirical and can only be answered on a station: do the stations run genuine 6.22 (→ NWTOC has been copying nothing since 2026-03-16) or MS-DOS 7.x (→ /Y is fine, F1 is the sole cause)? Stations have no RMM agent (they are DOS), so VER + a COPY /Y test must be done on a station.
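
As an illustration of the F2 check, the following Python sketch scans batch text for COPY switches outside the /A /B /V set confirmed above. It is a rough lint, not a DOS parser: it skips REM lines and otherwise treats everything after the word COPY as arguments. The function name and the sample paths (T:\, C:\BAT, C:\DATA) are hypothetical.

```python
import re

# Per the Grok verification in this log: MS-DOS 6.22 COPY accepts only these.
DOS622_COPY_SWITCHES = {"/A", "/B", "/V"}

_COPY_CMD = re.compile(r"\bCOPY\b(.*)", re.IGNORECASE)  # does not match XCOPY
_SWITCH = re.compile(r"/[A-Za-z?]")


def find_bad_copy_switches(batch_text):
    """Return (line_number, switch) for each COPY switch outside the 6.22 set."""
    problems = []
    for lineno, line in enumerate(batch_text.splitlines(), start=1):
        stripped = line.strip().lstrip("@").strip()
        words = stripped.split(None, 1)
        if not words or words[0].upper() == "REM":
            continue  # blank line or comment
        found = _COPY_CMD.search(stripped)
        if found is None:
            continue
        for switch in _SWITCH.findall(found.group(1)):
            if switch.upper() not in DOS622_COPY_SWITCHES:
                problems.append((lineno, switch.upper()))
    return problems


if __name__ == "__main__":
    sample = "\n".join([
        "@ECHO OFF",
        "REM COPY /Y is discussed here",
        r"COPY /Y T:\COMMON\ProdSW\*.BAT C:\BAT",
        r"IF EXIST X.DAT COPY X.DAT C:\DATA /V",
    ])
    print(find_bad_copy_switches(sample))  # [(3, '/Y')]
```

A lint like this only shows which lines would fail on 6.22; it cannot say which DOS version the stations actually run.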

Key Decisions

  • Used RMM-spawned headless Claude on AD2 (not the sync handoff) for live ground truth, since the committed docs were 3 months stale and AD2's RMM agent is reachable despite coord isolation.
  • Ran the AD2 agent strictly READ-ONLY with an ironclad brief (no writes/git/state changes, deliverable file only) and detached + polled, given it runs on a production domain controller.
  • Cross-verified only the highest-value, easy-to-get-wrong claim (DOS-6.22 COPY /Y) with a second vendor, rather than re-running a redundant Claude fan-out over an already-thorough report.
  • Did NOT apply any fix or touch #32489 yet — the v5.1 design and severity both hinge on the unresolved station DOS-version question.

Problems Encountered

  • Headless claude -p on AD2 returned Invalid API key — a stale machine ANTHROPIC_API_KEY shadowed the OAuth creds. Fixed by unsetting the env var before invoking (OAuth fallback).
  • RMM dispatch failed once with /mingw64/bin/curl: Permission denied (transient AV lock on the Git-Bash curl); nothing dispatched. Retried using /c/Windows/System32/curl.exe.
  • Gemini (agy) verify failed — gemini-3.1-pro-preview quota exhausted and the default-model fallback errored on OAuth _doSetupUser. Logged; used Grok as the cross-vendor check.

Configuration Changes

Created:

  • clients/dataforth/docs/audits/2026-07-01-test-data-chain-audit-AD2.md — the full AD2 audit report (pulled back via RMM, gzip+base64).
  • .claude/memory/reference_rmm_spawn_headless_claude.md — the validated RMM-spawn-Claude pattern + the ANTHROPIC_API_KEY-shadow gotcha.
  • MEMORY.md index line for the above.
  • On AD2 (NOT in repo): C:\Users\sysadmin\ad2-audit\{brief.txt,run.ps1,run.log,FINDINGS.md,DONE.txt}.
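
The gzip+base64 pull-back used for the audit report can be sketched as a round trip. This assumes the order implied by the name (compress first, then base64-encode so the payload survives a text-only command channel); the exact commands used in the session are not recorded here, and the function names are hypothetical.

```python
import base64
import gzip


def pack_report(raw: bytes) -> str:
    """Compress, then base64-encode, so the result is plain ASCII text."""
    return base64.b64encode(gzip.compress(raw)).decode("ascii")


def unpack_report(encoded: str) -> bytes:
    """Reverse of pack_report: base64-decode, then decompress."""
    return gzip.decompress(base64.b64decode(encoded))


if __name__ == "__main__":
    original = b"# FINDINGS\n" + b"evidence line\n" * 100
    encoded = pack_report(original)
    assert unpack_report(encoded) == original
    print(len(original), len(encoded))
```

Compressing before encoding matters because base64 inflates its input by about a third, so it should be applied to the smaller compressed bytes.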

Modified:

  • errorlog.md — logged the Gemini unavailability.

Credentials & Secrets

  • No new credentials created. The AD2 audit surfaced pre-existing plaintext creds in Dataforth scripts (F5) — flagged for rotation + vaulting, NOT copied anywhere: rsync daemon (rsync/IQ2...19) in Sync-FromNAS-rsync.ps1; NAS root SSH password in the dormant Sync-FromNAS.ps1; Postgres testdatadb_app password in testdatadb\database\db.js.
  • AD2 has a stale machine-level ANTHROPIC_API_KEY (invalid) that must be unset before running headless Claude there; sysadmin's OAuth creds at C:\Users\sysadmin\.claude\.credentials.json are the working auth.

Infrastructure & Servers

  • AD2: 192.168.0.6, Dataforth DC, Windows Server 2019 (10.0.17763), RMM agent cfa93bb6-0cdc-4d4e-a29e-1609cda6f047, client "Dataforth Corp". claude.exe v2.1.181, node v20.10.0, repo C:\ClaudeTools (branch ad2).
  • NAS D2TESTNAS 192.168.0.9 (Linux/Samba), rsync daemon port 873 module test/data/test.
  • TestDataDB: PostgreSQL 18.3 service postgresql-18 on AD2 (::1:5432), db testdatadb (test_records 475,553; work_orders 34,149). App at C:\Shares\testdatadb\.
  • Shares on AD2: C:\Shares\test\ (COMMON\ProdSW deployed batch set; Ate\ProdSW<type>DATA master specs), C:\Shares\webshare\ (Test_Datasheets active; For_Web dead since 2026-05-11).
  • Engineering master 5BMAIN.DAT 83,200 B mtime 2026-06-26 present on AD2 + NAS; server-side testdatadb\specdata\5BMAIN.DAT is a frozen 2026-03-27 copy (F4).

Commands & Outputs

  • RMM dispatch to AD2 with context:user_session, timeout_seconds (not timeout), via /c/Windows/System32/curl.exe. Headless launch: detached Start-Process powershell -File run.ps1 -WindowStyle Hidden; runner unsets ANTHROPIC_API_KEY, cd C:\ClaudeTools, runs claude -p <brief> --permission-mode bypassPermissions --output-format text.
  • Probe result: PROBE_OK after unsetting the env key (exit 0, 13s).
  • Audit result: DONE exit=0, FINDINGS.md 25,337 B, ~17 min.
  • Grok verify verdict: claim correct; nitpicks = "Invalid switch" not "Invalid parameter", and the error is visible (not literally "silent").
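
The `timeout_seconds` versus `timeout` gotcha in the dispatch notes above can be guarded with a small check. This sketch is not the GuruRMM API: only the `context` value and the `timeout_seconds` field name come from this log, while the `command` key and both function names are hypothetical.

```python
def build_dispatch_payload(command, timeout_seconds, context="user_session"):
    """Build a dispatch body. The 'command' key is a hypothetical field name."""
    if not isinstance(timeout_seconds, int) or timeout_seconds <= 0:
        raise ValueError("timeout_seconds must be a positive integer")
    return {
        "command": command,
        "context": context,
        "timeout_seconds": timeout_seconds,
    }


def check_payload_keys(payload):
    """Reject the 'timeout' key, which this log notes is the wrong name."""
    if "timeout" in payload:
        raise KeyError("use 'timeout_seconds', not 'timeout'")
    return payload


if __name__ == "__main__":
    body = check_payload_keys(build_dispatch_payload("whoami", 60))
    print(sorted(body))  # ['command', 'context', 'timeout_seconds']
```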

Pending / Incomplete Tasks

  • PIVOTAL: confirm station DOS version (VER + COPY /Y NUL C:\TEST.TXT on a station). Determines whether F2 is catastrophic (6.22 → NWTOC copies nothing since 2026-03-16) or moot (7.x). Stations have no RMM agent — reach one via NAS/console.
  • Draft DOS-6.22-safe NWTOC v5.1: plain COPY (no /Y), one-way pull of master .DATs into a distinct local dir (avoids the cyclic overwrite that v5.0 guarded against), 6.22-valid IF EXIST/GOTO. The result must be correct on both 6.22 and 7.x. Get a Grok review before it touches a station.
  • Update Syncro #32489 with the confirmed root cause (F1 + F2) and plan.
  • Address F3 (remove stray TS-21\ProdSW file), F4 (feed specdata from masters), F5 (rotate/vault creds) — recommendations only, nothing applied.
  • Retry Gemini verification later for true two-vendor triangulation.

Reference Information

  • Audit report: clients/dataforth/docs/audits/2026-07-01-test-data-chain-audit-AD2.md.
  • Syncro #32489 (id 113201089, Scheduled, due 2026-07-02); John Lehman contact 2851723; Dataforth Corp customer 578095; appointment 5626864474.
  • AD2 RMM agent cfa93bb6-0cdc-4d4e-a29e-1609cda6f047.
  • Memory: reference_rmm_spawn_headless_claude, gururmm-command-timeout-seconds.