Files
claudetools/projects/dataforth-dos/DATASHEET-FIX2-5-HANDOFF-2026-06-18.md
Mike Swanson 67de39a9d0 dataforth: handoff doc for AD2 session — Fix 2 (DSCA rebuild) STAGE 2-3 + Fix 5 + cleanup
Remote SSH/VPN to AD2 keeps flapping; hand the remaining datasheet fixes to the
local AD2 session. Includes the per-subtype approach (DSCA_TEMPLATES from staged
originals — STAGE 1 done, dsca-templates.json on AD2 = 126 models), the render-wiring
+ per-subtype byte-validation gate, Fix 5 (379 backfill via legacy_cert_text), the
discipline (backup/save-state/validate-before-publish), and the derive-dsca-templates
tool. Ref ticket #32441.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-18 05:59:05 -07:00

7.8 KiB
Raw Blame History

Dataforth Datasheet Fixes — Handoff to the AD2 Local Session (2026-06-18)

For: the Claude session running locally on AD2 (C:\Shares\testdatadb). Why you: the remote operator's SSH/VPN to AD2 keeps flapping; you run on the box, so you have stable access. Pick up the remaining work below. Ref Syncro ticket #32441.

You are working on the deployed pipeline at C:\Shares\testdatadb (Node + PostgreSQL 18). The repo copies under projects/dataforth-dos/datasheet-pipeline/implementation/ are STALE — edit the DEPLOYED files, reconcile the repo after.


What's already DONE (live on the website)

  • Fix 1 (RTD label): datasheet-exact.js folds RTD (sensorNum 7) into the temperature path (Temp. (C), signed). Deployed; all ~24k RTD certs re-pushed (8B35/DSCA34/SCM5B34/SCM5B35). Audit finding resolved.
  • Fix 4 (encoded serials): parsers/multiline.js decode rule ^[A-Z]\d{3,}- + raw_serial_number; import.js cross-model guard. Recovered 603 units, 3 quarantined (test_records_quarantine). Schema added: raw_serial_number, test_records_quarantine.
  • Fix 3 (retest latest-wins): import.js conflict rule = same-model AND (FAIL OR newer-by-date OR (same-date AND raw_data IS DISTINCT AND higher ingest_seq)). Parser stamps ingest_seq = source-mtime*1e6 + line. Schema: ingest_seq. Settled + validated (old-vs-new render diff vs the 16:30 dump: no corruption).
  • Save-states on AD2: datasheet-exact.js.bak-2026-06-17-1646, multiline.js/import.js .bak-2026-06-17-1713 (Fix4) and .bak-2026-06-17-1726 (Fix3).

DISCIPLINE (follow exactly — these are customer calibration certs)

  1. Backup first: a pg_dump exists at C:\Shares\testdatadb\_backups\testdatadb-2026-06-17-1630.dump. Take a FRESH pg_dump + a VSS shadow before any schema change or re-import. (superuser postgres / Paper123!@#; app testdatadb_app / DfTestDB2026! — both vaulted at clients/dataforth/testdatadb-postgres.)
  2. Per-file save-state: copy any file to <file>.bak-YYYY-MM-DD-HHMM before editing.
  3. Atomic, asserted patches: prepare all string-replacements in memory, assert each matches exactly once, then write; require() load-check after.
  4. Validate before WRITE; validate before PUBLISH. Never re-push to Hoffman until the per-subtype byte-validation passes.
  5. Re-publication: the testdatadb service caches the template — restart it (Restart-Service testdatadb) after changing datasheet-exact.js so the live service uses the new template; do re-pushes from a fresh node process via uploadBySerialNumbers(serials). Hoffman is idempotent (returns Unchanged/Updated/Created). Re-push in batches; the DSCA fleet is ~78k certs.

FIX 2 — DSCA Final-Test rebuild (Defect B). STAGE 1 done; do STAGE 23.

Root cause: datasheet-exact.js has ONE hardcoded DATA_LINES['DSCA'] + a single DSCA branch in buildTSpecs(). Real DSCA modules have 26 distinct Final-Test layouts (per subtype), so names/specs/row-alignment are wrong and lines drop (e.g. Output Noise on DSCA38-05). The ACCURACY block also uses 5B/8B titles (Vout (V) + ====) instead of DSCA's (Output (V|mA) + ----).

STAGE 1 (DONE): C:\Shares\testdatadb\dsca-templates.json already exists — 126 DSCA models, each { "accOut": "Output (V)"|"Output (mA)", "rows": [ {"name": "...", "spec": "..."}, ... ] }, extracted byte-accurately from the staged originals (the extractor used the === separator under the Final-Test header for exact column spans). Verified DSCA38-05 now has Output Noise | <= 2000 uVrms. This JSON is the authoritative template source — use it, don't reverse DSCFIN.DAT.

STAGE 2 (do this): wire it into templates/datasheet-exact.js.

  1. Read the current DSCA render path FIRST: parseRawData() (how the Final-Test STATUS groups are parsed from raw_data — they're 5-per-line groups like "PASS 28.42","PASS","PASS 252.2",...), buildTSpecs() DSCA branch, the Final-Test render loop in generateExactDatasheet(), and the accuracy header line (~line 567, currently inputHeader + ' Vout (V) Vout (V)* Error (%) Status' with ========== separators).
  2. Load dsca-templates.json at module top. For family === 'DSCA':
    • Replace the param names and specs source: instead of DATA_LINES['DSCA'] + buildTSpecs DSCA specs, use DSCA_TEMPLATES[record.model_number].rows (each row gives the name + spec text directly — no spec-file lookup needed for DSCA).
    • Map the parsed raw_data STATUS groups positionally onto the template rows for the measured value + PASS/FAIL status. The staged template row order == the raw_data group order (same DOS source). Reconcile the existing skip rule (if (status.length <= 4) continue): rows like 240VAC Withstand/Hi-Pot have NO measured value and an empty spec — they must still render (blank measured + blank spec + PASS), so don't drop them; align by position, show the value when the group carries one.
    • Fix the ACCURACY block for DSCA: use DSCA_TEMPLATES[model].accOut (Output (V) or Output (mA)) in place of Vout (V), and ---------- dash separators in place of ==========. (The input column is already correct post-Fix-1.)
    • If a model is missing from dsca-templates.json (no staged original): do not guess — skip/flag it.
  3. Save-state + atomic patch + load-check + restart the service.

STAGE 3 — validate per subtype (the gate): render ALL DSCA certs and byte-compare vs the staged originals, grouped by the 26 layouts; require zero content-delta per layout before any re-push. Use a content-normalized compare (strip the leading === letterhead-separator line and trailing whitespace — those are the known, deferred cosmetic gaps; focus on the Final-Test + ACCURACY content matching). Report match/mismatch per layout; investigate any mismatch before publishing. Then re-push DSCA in batches via uploadBySerialNumbers, diff-gated, watching Updated/Unchanged.

Scope: ~78k DSCA certs across DSCLOG. Do NOT re-push until STAGE 3 is clean per subtype.


FIX 5 — Backfill 379 cryptolocker-era units (operational)

379 units (Oct 2025Jan 2026, 3 stations) have no surviving .DAT but their staged .TXT still exist — and those were rendered by the ORIGINAL (correct) software, so publish them directly, don't round-trip through raw_data.

  1. Add column legacy_cert_text TEXT to test_records (backup first).
  2. Identify the 379 (staged .TXT present, serial absent from DB — the MISSING-UNITS-REPORT-FOR-JOHN-2026-06-17.md on the ad2 branch has the method/list; Cause 2 set).
  3. Insert them: raw_data = NULL, legacy_cert_text = the staged .TXT content, plus model/serial/test_date/station/overall_result parsed from the .TXT.
  4. Modify render-datasheet.renderContent() to return legacy_cert_text when raw_data IS NULL.
  5. Validate: cross-reference the 379 serials against the ERP/work-order system to confirm valid shipped units before publishing; spot-check rendered vs staged. Then publish.

Minor cleanup

  • 1 Hoffman push error during the Fix-3 publish — find which serial (check notify logs / re-run that batch with per-record fallback) and resolve.
  • ~420 "skipped" units (no spec entry / model not registered in Hoffman — upload-to-api.js UNREGISTERED_MODELS + null-render skips). They're safely in the DB; decide per model whether to register in Hoffman or add spec coverage.

Report back

Commit your work to the ad2 branch and update ticket #32441 (hidden internal notes for engineering detail; the customer-facing thread is John Lehman). The remote operator will see your commits on sync. Byte-for-byte DOS fidelity (a leading === line + ~1-space input-column spacing, on ALL families) is intentionally deferred — there's a ticket note to disclose it to John when the whole effort is done.