claudetools/projects/dataforth-dos/DATASHEET-FIX2-5-HANDOFF-2026-06-18.md
Mike Swanson 67de39a9d0 dataforth: handoff doc for AD2 session — Fix 2 (DSCA rebuild) STAGE 2-3 + Fix 5 + cleanup
Remote SSH/VPN to AD2 keeps flapping; hand the remaining datasheet fixes to the
local AD2 session. Includes the per-subtype approach (DSCA_TEMPLATES from staged
originals — STAGE 1 done, dsca-templates.json on AD2 = 126 models), the render-wiring
+ per-subtype byte-validation gate, Fix 5 (379 backfill via legacy_cert_text), the
discipline (backup/save-state/validate-before-publish), and the derive-dsca-templates
tool. Ref ticket #32441.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-18 05:59:05 -07:00

# Dataforth Datasheet Fixes — Handoff to the AD2 Local Session (2026-06-18)
**For:** the Claude session running locally on AD2 (`C:\Shares\testdatadb`).
**Why you:** the remote operator's SSH/VPN to AD2 keeps flapping; you run on the box, so you have stable access. Pick up the remaining work below. **Ref Syncro ticket #32441.**
You are working on the **deployed** pipeline at `C:\Shares\testdatadb` (Node + PostgreSQL 18). The repo copies under `projects/dataforth-dos/datasheet-pipeline/implementation/` are **STALE** — edit the DEPLOYED files, reconcile the repo after.
---
## What's already DONE (live on the website)
- **Fix 1 (RTD label):** `datasheet-exact.js` folds RTD (sensorNum 7) into the temperature path (`Temp. (C)`, signed). Deployed; **all ~24k RTD certs re-pushed** (8B35/DSCA34/SCM5B34/SCM5B35). Audit finding resolved.
- **Fix 4 (encoded serials):** `parsers/multiline.js` decode rule `^[A-Z]\d{3,}-` + `raw_serial_number`; `import.js` cross-model guard. Recovered 603 units, 3 quarantined (`test_records_quarantine`). Schema added: `raw_serial_number`, `test_records_quarantine`.
- **Fix 3 (retest latest-wins):** `import.js` conflict rule = same-model AND (FAIL OR newer-by-date OR (same-date AND `raw_data IS DISTINCT` AND higher `ingest_seq`)). Parser stamps `ingest_seq = source-mtime*1e6 + line`. Schema: `ingest_seq`. Settled + validated (old-vs-new render diff vs the 16:30 dump: no corruption).
- Save-states on AD2: `datasheet-exact.js.bak-2026-06-17-1646`, `multiline.js`/`import.js` `.bak-2026-06-17-1713` (Fix4) and `.bak-2026-06-17-1726` (Fix3).
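
The Fix 3 latest-wins rule and the `ingest_seq` stamp can be sketched as below. This is an illustration under stated assumptions, not the deployed `import.js` logic: the record field names (`model`, `result`, `testDate`, `rawData`, `ingestSeq`) are hypothetical, `testDate` is assumed to be an ISO `YYYY-MM-DD` string, the source mtime is assumed to be an integer, and the bare `FAIL` term is assumed to refer to the stored record (a retest may overwrite a stored FAIL).

```javascript
// Sketch only: field names are hypothetical, not the deployed schema.

// ingest_seq = source-mtime * 1e6 + line. BigInt keeps large mtime values
// exact regardless of whether mtime is in seconds or milliseconds.
function ingestSeq(sourceMtime, lineNumber) {
  return BigInt(sourceMtime) * 1000000n + BigInt(lineNumber);
}

// Returns true when the incoming record should replace the stored one.
// ASSUMPTION: "FAIL" in the rule means the stored record's result is FAIL.
function incomingWins(stored, incoming) {
  if (stored.model !== incoming.model) return false; // same-model guard
  if (stored.result === 'FAIL') return true;
  if (incoming.testDate > stored.testDate) return true; // newer by date
  return (
    incoming.testDate === stored.testDate &&
    incoming.rawData !== stored.rawData && // mirrors IS DISTINCT for null/string
    incoming.ingestSeq > stored.ingestSeq
  );
}
```

Checking the deployed conflict clause against a predicate like this one is a quick way to confirm which side the `FAIL` term applies to before relying on the sketch.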
## DISCIPLINE (follow exactly — these are customer calibration certs)
1. **Backup first:** a `pg_dump` exists at `C:\Shares\testdatadb\_backups\testdatadb-2026-06-17-1630.dump`. **Take a FRESH `pg_dump` + a VSS shadow before any schema change or re-import.** (superuser `postgres` / `Paper123!@#`; app `testdatadb_app` / `DfTestDB2026!` — both vaulted at `clients/dataforth/testdatadb-postgres`.)
2. **Per-file save-state:** copy any file to `<file>.bak-YYYY-MM-DD-HHMM` before editing.
3. **Atomic, asserted patches:** prepare all string-replacements in memory, assert each matches exactly once, then write; `require()` load-check after.
4. **Validate before WRITE; validate before PUBLISH.** Never re-push to Hoffman until the per-subtype byte-validation passes.
5. **Re-publication:** the `testdatadb` service caches the template — **restart it** (`Restart-Service testdatadb`) after changing `datasheet-exact.js` so the live service uses the new template; do re-pushes from a **fresh node process** via `uploadBySerialNumbers(serials)`. Hoffman is idempotent (returns Unchanged/Updated/Created). Re-push in batches; the DSCA fleet is ~78k certs.
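
Rules 2 and 3 above (save-state, then an atomic, asserted patch with a load-check) could look roughly like this. The helper names `saveStateName`, `applyPatches`, and `patchFile` are hypothetical, and `file` is assumed to be an absolute path so that `require()` can resolve it.

```javascript
const fs = require('node:fs');

// Build the "<file>.bak-YYYY-MM-DD-HHMM" save-state name (local time).
function saveStateName(file, now = new Date()) {
  const p = (n) => String(n).padStart(2, '0');
  const stamp =
    `${now.getFullYear()}-${p(now.getMonth() + 1)}-${p(now.getDate())}` +
    `-${p(now.getHours())}${p(now.getMinutes())}`;
  return `${file}.bak-${stamp}`;
}

// Apply every replacement in memory; throw unless each `find` string
// occurs exactly once in the text it is applied to.
function applyPatches(source, patches) {
  let out = source;
  for (const { find, replace } of patches) {
    const count = out.split(find).length - 1;
    if (count !== 1) {
      throw new Error(`expected 1 match, found ${count}: ${find}`);
    }
    // Function form avoids "$&"-style pattern expansion in the replacement.
    out = out.replace(find, () => replace);
  }
  return out;
}

// Save-state, patch, write, then load-check with require().
function patchFile(file, patches) {
  const original = fs.readFileSync(file, 'utf8');
  const patched = applyPatches(original, patches); // throws before any write
  // COPYFILE_EXCL refuses to overwrite an existing save-state.
  fs.copyFileSync(file, saveStateName(file), fs.constants.COPYFILE_EXCL);
  fs.writeFileSync(file, patched);
  delete require.cache[require.resolve(file)];
  require(file); // throws if the patched module no longer loads
}
```

Because all assertions run before the first write, a failed match leaves the deployed file untouched; if the final `require()` throws, the save-state is the copy to restore.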
---
## FIX 2 — DSCA Final-Test rebuild (Defect B). STAGE 1 done; do STAGE 2-3.
**Root cause:** `datasheet-exact.js` has ONE hardcoded `DATA_LINES['DSCA']` + a single DSCA branch in `buildTSpecs()`. Real DSCA modules have **26 distinct Final-Test layouts** (per subtype), so names/specs/row-alignment are wrong and lines drop (e.g. Output Noise on DSCA38-05). The ACCURACY block also uses 5B/8B titles (`Vout (V)` + `====`) instead of DSCA's (`Output (V|mA)` + `----`).
**STAGE 1 (DONE):** `C:\Shares\testdatadb\dsca-templates.json` already exists — **126 DSCA models**, each `{ "accOut": "Output (V)"|"Output (mA)", "rows": [ {"name": "...", "spec": "..."}, ... ] }`, extracted byte-accurately from the staged originals (the extractor used the `===` separator under the Final-Test header for exact column spans). Verified DSCA38-05 now has `Output Noise | <= 2000 uVrms`. **This JSON is the authoritative template source — use it, don't reverse DSCFIN.DAT.**
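
Before wiring the JSON in, a shape check along these lines may catch a malformed entry early. It is a sketch based only on the structure described above (an object keyed by model number, each entry with `accOut` and `rows`); the function names are hypothetical, and the sample entry's `accOut` value is illustrative rather than taken from the real file.

```javascript
// Sketch: check one dsca-templates.json entry against the documented shape.
function validateTemplate(model, tpl) {
  if (!tpl || typeof tpl !== 'object') return [`${model}: not an object`];
  const errors = [];
  if (tpl.accOut !== 'Output (V)' && tpl.accOut !== 'Output (mA)') {
    errors.push(`${model}: unexpected accOut ${JSON.stringify(tpl.accOut)}`);
  }
  if (!Array.isArray(tpl.rows) || tpl.rows.length === 0) {
    errors.push(`${model}: rows missing or empty`);
    return errors;
  }
  tpl.rows.forEach((row, i) => {
    // An empty spec string is legal (e.g. rows with no measured value).
    if (!row || typeof row.name !== 'string' || typeof row.spec !== 'string') {
      errors.push(`${model}: row ${i} needs string name and spec`);
    }
  });
  return errors;
}

// Collect errors across every model in the parsed JSON.
function validateAll(templates) {
  return Object.entries(templates).flatMap(([m, t]) => validateTemplate(m, t));
}

// Illustrative entry; only the Output Noise row comes from the text above.
const sample = {
  'DSCA38-05': {
    accOut: 'Output (V)',
    rows: [{ name: 'Output Noise', spec: '<= 2000 uVrms' }],
  },
};
console.log(validateAll(sample)); // []
```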
**STAGE 2 (do this):** wire it into `templates/datasheet-exact.js`.
1. Read the current DSCA render path FIRST: `parseRawData()` (how the Final-Test STATUS groups are parsed from `raw_data` — they're 5-per-line groups like `"PASS 28.42","PASS","PASS 252.2",...`), `buildTSpecs()` DSCA branch, the Final-Test render loop in `generateExactDatasheet()`, and the accuracy header line (~line 567, currently `inputHeader + ' Vout (V) Vout (V)* Error (%) Status'` with `==========` separators).
2. Load `dsca-templates.json` at module top. For `family === 'DSCA'`:
- Replace the param **names** and **specs** source: instead of `DATA_LINES['DSCA']` + `buildTSpecs` DSCA specs, use `DSCA_TEMPLATES[record.model_number].rows` (each row gives the name + spec text directly — no spec-file lookup needed for DSCA).
- Map the parsed `raw_data` STATUS groups **positionally** onto the template rows for the **measured value + PASS/FAIL status**. The staged template row order == the raw_data group order (same DOS source). **Reconcile the existing skip rule** (`if (status.length <= 4) continue`): rows like `240VAC Withstand`/`Hi-Pot` have NO measured value and an empty spec — they must still render (blank measured + blank spec + PASS), so don't drop them; align by position, show the value when the group carries one.
- Fix the ACCURACY block for DSCA: use `DSCA_TEMPLATES[model].accOut` (`Output (V)` or `Output (mA)`) in place of `Vout (V)`, and `----------` dash separators in place of `==========`. (The input column is already correct post-Fix-1.)
- If a model is missing from `dsca-templates.json` (no staged original): **do not guess** — skip/flag it.
3. Save-state + atomic patch + load-check + restart the service.
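
The positional mapping in step 2 can be sketched as follows. It assumes each parsed STATUS group is a string such as `"PASS 28.42"` or `"PASS"`; `splitGroup` and `buildFinalTestRows` are hypothetical names, not functions in `datasheet-exact.js`. The sketch treats a group/row count mismatch as an error, which may be too strict if `raw_data` carries padding groups; check that against `parseRawData()` first.

```javascript
// Sketch: split one STATUS group into status and optional measured value.
function splitGroup(group) {
  const m = /^(PASS|FAIL)\b\s*(.*)$/.exec(String(group ?? '').trim());
  return m ? { status: m[1], value: m[2] } : { status: '', value: '' };
}

// Align parsed groups onto template rows by position.
// Returns null when the model has no template (flag it, do not guess).
function buildFinalTestRows(templates, model, groups) {
  const tpl = templates[model];
  if (!tpl) return null;
  if (groups.length !== tpl.rows.length) {
    throw new Error(
      `${model}: ${groups.length} groups vs ${tpl.rows.length} template rows`
    );
  }
  // Rows with no measured value (e.g. Hi-Pot) are kept with value ''.
  return tpl.rows.map((row, i) => ({
    name: row.name,
    spec: row.spec,
    ...splitGroup(groups[i]),
  }));
}
```

Unlike the existing `status.length <= 4` skip, nothing is dropped here: a bare `PASS` group still yields a row, with blank value and whatever spec the template carries.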
**STAGE 3 — validate per subtype (the gate):** render ALL DSCA certs and **byte-compare vs the staged originals, grouped by the 26 layouts**; require zero content-delta per layout before any re-push. Use a content-normalized compare (strip the leading `===` letterhead-separator line and trailing whitespace — those are the known, deferred cosmetic gaps; focus on the Final-Test + ACCURACY content matching). Report match/mismatch per layout; investigate any mismatch before publishing. Then re-push DSCA in batches via `uploadBySerialNumbers`, diff-gated, watching Updated/Unchanged.
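
One way to express the STAGE 3 gate is a normalize-then-compare pass tallied per layout. The input shape (`serial`, `layout`, `rendered`, `staged`) and the function names are assumptions for illustration; how a cert is assigned to one of the 26 layouts is left to the caller.

```javascript
// Sketch: normalize a cert for the content compare.
function normalizeCert(text) {
  // trimEnd() also removes the "\r" of DOS line endings.
  const lines = text.split('\n').map((l) => l.trimEnd());
  // Drop the leading "===" letterhead-separator line if present.
  if (lines.length && /^=+$/.test(lines[0])) lines.shift();
  // Drop trailing blank lines.
  while (lines.length && lines[lines.length - 1] === '') lines.pop();
  return lines.join('\n');
}

// `certs`: [{ serial, layout, rendered, staged }] (hypothetical shape).
function compareByLayout(certs) {
  const report = new Map();
  for (const { serial, layout, rendered, staged } of certs) {
    const entry = report.get(layout) ?? { match: 0, mismatch: [] };
    if (normalizeCert(rendered) === normalizeCert(staged)) entry.match += 1;
    else entry.mismatch.push(serial);
    report.set(layout, entry);
  }
  return report;
}

// Gate: true only when every layout has zero mismatches.
function gateIsClean(report) {
  return [...report.values()].every((e) => e.mismatch.length === 0);
}
```

Keeping the mismatching serials per layout, rather than a bare count, gives a direct list to investigate before any re-push.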
**Scope:** ~78k DSCA certs across DSCLOG. **Do NOT re-push until STAGE 3 is clean per subtype.**
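
A gated, batched re-push might be structured like this. The `upload` parameter stands in for `uploadBySerialNumbers`; its return shape (one status string per serial) and the batch size of 500 are assumptions, so adapt both to what the deployed function actually returns.

```javascript
// Sketch: split a list into fixed-size batches.
function chunk(items, size) {
  const out = [];
  for (let i = 0; i < items.length; i += size) {
    out.push(items.slice(i, i + size));
  }
  return out;
}

// Re-push in batches, refusing to run unless the STAGE 3 gate is clean.
// ASSUMPTION: upload(batch) resolves to an array of status strings.
async function rePushInBatches(serials, upload, opts = {}) {
  const { batchSize = 500, gateClean = false } = opts;
  if (!gateClean) {
    throw new Error('STAGE 3 gate not clean: refusing to re-push');
  }
  const tally = { Unchanged: 0, Updated: 0, Created: 0, Other: 0 };
  for (const batch of chunk(serials, batchSize)) {
    const results = await upload(batch);
    for (const status of results) {
      if (Object.hasOwn(tally, status)) tally[status] += 1;
      else tally.Other += 1; // anything unexpected is surfaced, not hidden
    }
  }
  return tally;
}
```

A second run over the same serials should come back almost entirely `Unchanged` if Hoffman is idempotent as described, which makes the tally a cheap sanity check.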
---
## FIX 5 — Backfill 379 cryptolocker-era units (operational)
379 units (Oct 2025 to Jan 2026, 3 stations) have no surviving `.DAT`, but their staged `.TXT` files still exist. Those were rendered by the ORIGINAL (correct) software, so **publish them directly; don't round-trip through raw_data**.
1. Add column `legacy_cert_text TEXT` to `test_records` (backup first).
2. Identify the 379 (staged `.TXT` present, serial absent from DB — the `MISSING-UNITS-REPORT-FOR-JOHN-2026-06-17.md` on the `ad2` branch has the method/list; Cause 2 set).
3. Insert them: `raw_data = NULL`, `legacy_cert_text` = the staged `.TXT` content, plus model/serial/test_date/station/overall_result parsed from the `.TXT`.
4. Modify `render-datasheet.renderContent()` to return `legacy_cert_text` when `raw_data IS NULL`.
5. **Validate:** cross-reference the 379 serials against the ERP/work-order system to confirm valid shipped units before publishing; spot-check rendered vs staged. Then publish.
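
Steps 2 and 4 reduce to a set difference and a render fallback, sketched below. `missingSerials` and the `renderFromRaw` callback are hypothetical names; the real `renderContent()` signature in `render-datasheet` may differ, so treat this as the intended behavior rather than a drop-in patch.

```javascript
// Sketch: serials that have a staged .TXT but no row in the DB.
function missingSerials(stagedSerials, dbSerials) {
  const inDb = new Set(dbSerials);
  return [...new Set(stagedSerials)].filter((s) => !inDb.has(s));
}

// Sketch of the render fallback: legacy text is used only when raw_data
// is NULL. `renderFromRaw` stands in for the existing render path.
function renderContent(record, renderFromRaw) {
  if (record.raw_data === null || record.raw_data === undefined) {
    // Null result when neither source exists, so callers can skip the unit.
    return record.legacy_cert_text ?? null;
  }
  return renderFromRaw(record);
}
```

Records that still have `raw_data` never touch `legacy_cert_text`, so the existing fleet renders exactly as before.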
---
## Minor cleanup
- **1 Hoffman push error** during the Fix-3 publish — find which serial (check `notify` logs / re-run that batch with per-record fallback) and resolve.
- **~420 "skipped" units** (no spec entry / model not registered in Hoffman — `upload-to-api.js` `UNREGISTERED_MODELS` + null-render skips). They're safely in the DB; decide per model whether to register in Hoffman or add spec coverage.
## Report back
Commit your work to the `ad2` branch and update ticket #32441 (hidden internal notes for engineering detail; the customer-facing thread is with John Lehman). The remote operator will see your commits on sync. **Byte-for-byte DOS fidelity** (a leading `===` line + ~1-space input-column spacing, on ALL families) is intentionally deferred; a ticket note already flags it for disclosure to John once the whole effort is done.