Remote SSH/VPN to AD2 keeps flapping; hand the remaining datasheet fixes to the local AD2 session. Includes the per-subtype approach (DSCA_TEMPLATES from staged originals — STAGE 1 done, dsca-templates.json on AD2 = 126 models), the render-wiring + per-subtype byte-validation gate, Fix 5 (379 backfill via legacy_cert_text), the discipline (backup/save-state/validate-before-publish), and the derive-dsca-templates tool. Ref ticket #32441. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
# Dataforth Datasheet Fixes: Handoff to the AD2 Local Session (2026-06-18)

**For:** the Claude session running locally on AD2 (`C:\Shares\testdatadb`).
**Why you:** the remote operator's SSH/VPN to AD2 keeps flapping; you run on the box, so you have stable access. Pick up the remaining work below. **Ref Syncro ticket #32441.**

You are working on the **deployed** pipeline at `C:\Shares\testdatadb` (Node + PostgreSQL 18). The repo copies under `projects/dataforth-dos/datasheet-pipeline/implementation/` are **STALE**: edit the DEPLOYED files, then reconcile the repo afterwards.

---
## What's already DONE (live on the website)
- **Fix 1 (RTD label):** `datasheet-exact.js` folds RTD (sensorNum 7) into the temperature path (`Temp. (C)`, signed). Deployed; **all ~24k RTD certs re-pushed** (8B35/DSCA34/SCM5B34/SCM5B35). Audit finding resolved.
- **Fix 4 (encoded serials):** `parsers/multiline.js` decode rule `^[A-Z]\d{3,}-` + `raw_serial_number`; `import.js` cross-model guard. Recovered 603 units, 3 quarantined (`test_records_quarantine`). Schema added: `raw_serial_number`, `test_records_quarantine`.
- **Fix 3 (retest latest-wins):** `import.js` conflict rule = same-model AND (FAIL OR newer-by-date OR (same-date AND `raw_data IS DISTINCT` AND higher `ingest_seq`)). Parser stamps `ingest_seq = source-mtime*1e6 + line`. Schema: `ingest_seq`. Settled + validated (old-vs-new render diff vs the 16:30 dump: no corruption).
- Save-states on AD2: `datasheet-exact.js.bak-2026-06-17-1646`, `multiline.js`/`import.js` `.bak-2026-06-17-1713` (Fix4) and `.bak-2026-06-17-1726` (Fix3).

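The Fix 3 tie-break relies on `ingest_seq` ordering records by source file first and by line second. The following is a minimal sketch of that arithmetic, not the deployed parser code. The name `ingestSeq` is hypothetical, and it assumes the source mtime is an integer timestamp and that a file has fewer than 1,000,000 lines. `BigInt` is used because a millisecond mtime multiplied by 1e6 would exceed `Number.MAX_SAFE_INTEGER`.

```javascript
// Hypothetical helper mirroring: ingest_seq = source-mtime * 1e6 + line.
// Assumptions: sourceMtime is an integer timestamp, and line is a line
// number below 1,000,000 so it never spills into the mtime digits.
function ingestSeq(sourceMtime, line) {
  if (!Number.isInteger(sourceMtime) || !Number.isInteger(line)) {
    throw new TypeError('sourceMtime and line must be integers');
  }
  if (line < 0 || line >= 1000000) {
    throw new RangeError('line must be in [0, 1000000)');
  }
  return BigInt(sourceMtime) * 1000000n + BigInt(line);
}

// Same file: the later line wins. Newer file: wins regardless of line.
const earlier = ingestSeq(1781715600, 12);
const laterLine = ingestSeq(1781715600, 13);
const newerFile = ingestSeq(1781715601, 1);
console.log(earlier < laterLine, laterLine < newerFile); // true true
```

With a seconds-based mtime the product still fits in a plain JavaScript number, so the `BigInt` is only a precaution against the unit being milliseconds.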
## DISCIPLINE (follow exactly — these are customer calibration certs)
1. **Backup first:** a `pg_dump` exists at `C:\Shares\testdatadb\_backups\testdatadb-2026-06-17-1630.dump`. **Take a FRESH `pg_dump` + a VSS shadow before any schema change or re-import.** (superuser `postgres` / `Paper123!@#`; app `testdatadb_app` / `DfTestDB2026!` — both vaulted at `clients/dataforth/testdatadb-postgres`.)
2. **Per-file save-state:** copy any file to `<file>.bak-YYYY-MM-DD-HHMM` before editing.
3. **Atomic, asserted patches:** prepare all string replacements in memory, assert that each matches exactly once, then write; run a `require()` load-check afterwards.
4. **Validate before WRITE; validate before PUBLISH.** Never re-push to Hoffman until the per-subtype byte-validation passes.
5. **Re-publication:** the `testdatadb` service caches the template, so **restart it** (`Restart-Service testdatadb`) after changing `datasheet-exact.js`; otherwise the live service keeps using the old template. Do re-pushes from a **fresh node process** via `uploadBySerialNumbers(serials)`. Hoffman is idempotent (returns Unchanged/Updated/Created). Re-push in batches; the DSCA fleet is ~78k certs.

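Discipline items 2 and 3 can be sketched as two small pure functions. This is an illustration, not an existing tool on AD2. The names `saveStateName` and `applyAssertedPatches` and the `{ find, replace }` patch shape are hypothetical. Patches are checked one after another against the text as already patched, which is one possible reading of "each matches exactly once". The caller would copy the file to the save-state name, write the returned text, then load-check it with `require()`.

```javascript
// Hypothetical helper: build <file>.bak-YYYY-MM-DD-HHMM from local time.
function saveStateName(file, now = new Date()) {
  const p = (n) => String(n).padStart(2, '0');
  const date = `${now.getFullYear()}-${p(now.getMonth() + 1)}-${p(now.getDate())}`;
  const time = `${p(now.getHours())}${p(now.getMinutes())}`;
  return `${file}.bak-${date}-${time}`;
}

// Apply every patch in memory. Throws before returning anything unless each
// `find` string occurs exactly once, so a bad patch never reaches the disk.
function applyAssertedPatches(source, patches) {
  let text = source;
  for (const { find, replace } of patches) {
    if (typeof find !== 'string' || find.length === 0) {
      throw new Error('each patch needs a non-empty `find` string');
    }
    const count = text.split(find).length - 1; // non-overlapping occurrences
    if (count !== 1) {
      throw new Error(`expected exactly 1 match, found ${count}: ${JSON.stringify(find)}`);
    }
    // A replacer function keeps any `$` sequences in `replace` literal.
    text = text.replace(find, () => replace);
  }
  return text;
}
```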
---
## FIX 2: DSCA Final-Test rebuild (Defect B). STAGE 1 done; do STAGE 2–3.

**Root cause:** `datasheet-exact.js` has ONE hardcoded `DATA_LINES['DSCA']` + a single DSCA branch in `buildTSpecs()`. Real DSCA modules have **26 distinct Final-Test layouts** (per subtype), so names/specs/row-alignment are wrong and lines drop (e.g. Output Noise on DSCA38-05). The ACCURACY block also uses 5B/8B titles (`Vout (V)` + `====`) instead of DSCA's (`Output (V|mA)` + `----`).

**STAGE 1 (DONE):** `C:\Shares\testdatadb\dsca-templates.json` already exists: **126 DSCA models**, each `{ "accOut": "Output (V)"|"Output (mA)", "rows": [ {"name": "...", "spec": "..."}, ... ] }`, extracted byte-accurately from the staged originals (the extractor used the `===` separator under the Final-Test header for exact column spans). Verified DSCA38-05 now has `Output Noise | <= 2000 uVrms`. **This JSON is the authoritative template source: use it, don't reverse-engineer DSCFIN.DAT.**

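Before wiring the JSON in, a quick shape check can confirm that every model entry matches the structure described above. This is a sketch under assumptions: `checkTemplates` is a hypothetical helper, and the `accOut` value in the sample is illustrative rather than read from the real file. An empty `spec` string is allowed because some rows carry no spec. On AD2 the input would be the parsed `dsca-templates.json`, where `Object.keys(...).length` should come out at 126.

```javascript
// Hypothetical validator: returns a list of problems, empty when the shape
// matches { accOut: "Output (V)" | "Output (mA)", rows: [{ name, spec }] }.
function checkTemplates(templates) {
  const problems = [];
  for (const [model, t] of Object.entries(templates)) {
    if (t === null || typeof t !== 'object') {
      problems.push(`${model}: entry is not an object`);
      continue;
    }
    if (t.accOut !== 'Output (V)' && t.accOut !== 'Output (mA)') {
      problems.push(`${model}: unexpected accOut ${JSON.stringify(t.accOut)}`);
    }
    if (!Array.isArray(t.rows) || t.rows.length === 0) {
      problems.push(`${model}: rows missing or empty`);
      continue;
    }
    t.rows.forEach((row, i) => {
      const ok = row && typeof row.name === 'string' && typeof row.spec === 'string';
      if (!ok) problems.push(`${model}: row ${i} needs string name and spec`);
    });
  }
  return problems;
}

const sample = {
  'DSCA38-05': {
    accOut: 'Output (V)', // illustrative value, not taken from the real file
    rows: [{ name: 'Output Noise', spec: '<= 2000 uVrms' }],
  },
};
console.log(checkTemplates(sample)); // []
```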
**STAGE 2 (do this):** wire it into `templates/datasheet-exact.js`.

1. Read the current DSCA render path FIRST: `parseRawData()` (how the Final-Test STATUS groups are parsed from `raw_data`; they're 5-per-line groups like `"PASS 28.42","PASS","PASS 252.2",...`), the `buildTSpecs()` DSCA branch, the Final-Test render loop in `generateExactDatasheet()`, and the accuracy header line (~line 567, currently `inputHeader + ' Vout (V) Vout (V)* Error (%) Status'` with `==========` separators).
2. Load `dsca-templates.json` at module top. For `family === 'DSCA'`:
   - Replace the param **names** and **specs** source: instead of `DATA_LINES['DSCA']` + `buildTSpecs` DSCA specs, use `DSCA_TEMPLATES[record.model_number].rows` (each row gives the name + spec text directly, so no spec-file lookup is needed for DSCA).
   - Map the parsed `raw_data` STATUS groups **positionally** onto the template rows for the **measured value + PASS/FAIL status**. The staged template row order == the raw_data group order (same DOS source). **Reconcile the existing skip rule** (`if (status.length <= 4) continue`): rows like `240VAC Withstand`/`Hi-Pot` have NO measured value and an empty spec. They must still render (blank measured + blank spec + PASS), so don't drop them: align by position and show the value only when the group carries one.
   - Fix the ACCURACY block for DSCA: use `DSCA_TEMPLATES[model].accOut` (`Output (V)` or `Output (mA)`) in place of `Vout (V)`, and `----------` dash separators in place of `==========`. (The input column is already correct post-Fix-1.)
   - If a model is missing from `dsca-templates.json` (no staged original): **do not guess**; skip it and flag it.
3. Save-state + atomic patch + load-check + restart the service.

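The positional mapping in step 2 could look roughly like the sketch below. It is not the deployed render loop: `parseGroup` and `buildFinalTestRows` are hypothetical names, and the group strings are assumed to be already split out of `raw_data` in the `"PASS 28.42"` form. A value-less group still yields a row, a missing model yields `null` so the caller can skip and flag it, and a group/row count difference is reported rather than silently tolerated (whether `raw_data` pads its last line is an open question this sketch does not settle).

```javascript
// Split one STATUS group such as "PASS 28.42" or "PASS" into its parts.
function parseGroup(group) {
  const [status = '', ...rest] = String(group).trim().split(/\s+/);
  return { status, value: rest.join(' ') };
}

// Pair template rows with STATUS groups by index. Names and specs come from
// the template; measured value and status come from the group.
function buildFinalTestRows(templates, model, groups) {
  const template = templates[model];
  if (!template) return null; // no staged original: do not guess
  const rows = template.rows.map((row, i) => {
    const parsed = i < groups.length ? parseGroup(groups[i]) : { status: '', value: '' };
    // Rows like Hi-Pot carry no value: keep them, with a blank measured value.
    return { name: row.name, spec: row.spec, value: parsed.value, status: parsed.status };
  });
  return {
    accOut: template.accOut,
    rows,
    countMismatch: groups.length !== template.rows.length,
  };
}
```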
**STAGE 3 (the gate): validate per subtype.** Render ALL DSCA certs and **byte-compare vs the staged originals, grouped by the 26 layouts**; require zero content-delta per layout before any re-push. Use a content-normalized compare (strip the leading `===` letterhead-separator line and trailing whitespace; those are the known, deferred cosmetic gaps, so focus on the Final-Test + ACCURACY content matching). Report match/mismatch per layout and investigate any mismatch before publishing. Then re-push DSCA in batches via `uploadBySerialNumbers`, diff-gated, watching Updated/Unchanged.

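A content-normalized compare along these lines would implement the gate. This is a sketch with assumptions: `normalizeCert`, `compareByLayout`, and `gateIsClean` are hypothetical names, the caller supplies a `layout` key for each cert (how the 26 layouts are identified is not defined here), and the separator is assumed to be a first line consisting only of `=` characters. Stripping trailing whitespace per line also drops any `\r`, which is a slightly broader normalization than the text above states.

```javascript
// Drop a leading letterhead separator (a first line made only of '=') and
// strip trailing whitespace from every line and from the end of the text.
function normalizeCert(text) {
  const lines = text.split('\n').map((line) => line.trimEnd());
  if (lines.length > 0 && /^=+$/.test(lines[0])) lines.shift();
  while (lines.length > 0 && lines[lines.length - 1] === '') lines.pop();
  return lines.join('\n');
}

// items: [{ layout, serial, rendered, staged }]. Returns per-layout counts
// plus the serials that need investigation.
function compareByLayout(items) {
  const report = {};
  for (const { layout, serial, rendered, staged } of items) {
    if (!report[layout]) report[layout] = { match: 0, mismatch: 0, mismatchedSerials: [] };
    if (normalizeCert(rendered) === normalizeCert(staged)) {
      report[layout].match += 1;
    } else {
      report[layout].mismatch += 1;
      report[layout].mismatchedSerials.push(serial);
    }
  }
  return report;
}

// The gate passes only when every layout has zero mismatches.
function gateIsClean(report) {
  return Object.values(report).every((entry) => entry.mismatch === 0);
}
```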
**Scope:** ~78k DSCA certs across DSCLOG. **Do NOT re-push until STAGE 3 is clean per subtype.**

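Once the gate is clean, the batched re-push could be driven by a loop like the one below. This is only a sketch. The `upload` parameter stands in for `uploadBySerialNumbers`, and its resolved shape (an object of `Unchanged`/`Updated`/`Created` counts) is an assumption, as is the default batch size of 500. Batches run one at a time so the Updated/Unchanged tally can be watched as it grows.

```javascript
// Split an array into consecutive batches of at most `size` items.
function chunk(items, size) {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError('size must be a positive integer');
  }
  const batches = [];
  for (let i = 0; i < items.length; i += size) batches.push(items.slice(i, i + size));
  return batches;
}

// `upload` is injected; the assumed result shape is { Unchanged, Updated, Created }.
async function rePushInBatches(serials, upload, batchSize = 500) {
  const totals = { Unchanged: 0, Updated: 0, Created: 0 };
  for (const batch of chunk(serials, batchSize)) {
    const counts = await upload(batch); // sequential on purpose
    for (const key of Object.keys(totals)) totals[key] += counts[key] || 0;
    console.log(`batch of ${batch.length}:`, counts);
  }
  return totals;
}
```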
---
## FIX 5: Backfill 379 cryptolocker-era units (operational)
379 units (Oct 2025–Jan 2026, 3 stations) have no surviving `.DAT`, but their staged `.TXT` files still exist. Those were rendered by the ORIGINAL (correct) software, so **publish them directly; don't round-trip through raw_data**.

1. Add column `legacy_cert_text TEXT` to `test_records` (backup first).
2. Identify the 379 (staged `.TXT` present, serial absent from DB). `MISSING-UNITS-REPORT-FOR-JOHN-2026-06-17.md` on the `ad2` branch has the method and the list (the Cause 2 set).
3. Insert them: `raw_data = NULL`, `legacy_cert_text` = the staged `.TXT` content, plus model/serial/test_date/station/overall_result parsed from the `.TXT`.
4. Modify `render-datasheet.renderContent()` to return `legacy_cert_text` when `raw_data IS NULL`.
5. **Validate:** cross-reference the 379 serials against the ERP/work-order system to confirm they are valid shipped units before publishing; spot-check rendered vs staged. Then publish.

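Steps 2 and 4 reduce to a set difference and a render fallback, sketched below. Both function names are hypothetical, and the real `renderContent()` signature is not shown in this handoff, so the existing renderer is passed in as a callback instead of being assumed. The record fields `raw_data` and `legacy_cert_text` follow the column names above.

```javascript
// Step 2: serials that have a staged .TXT but no row in the database.
function missingSerials(stagedSerials, dbSerials) {
  const known = new Set(dbSerials);
  return [...new Set(stagedSerials)].filter((serial) => !known.has(serial));
}

// Step 4: publish the stored original text when there is no raw_data to
// render from. `renderFromRawData` stands in for the existing render path.
function renderContentWithLegacy(record, renderFromRawData) {
  if (record.raw_data === null || record.raw_data === undefined) {
    return record.legacy_cert_text ?? null; // null when neither source exists
  }
  return renderFromRawData(record);
}
```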
---
## Minor cleanup
- **1 Hoffman push error** during the Fix-3 publish: find which serial (check `notify` logs, or re-run that batch with per-record fallback) and resolve.
- **~420 "skipped" units** (no spec entry, or model not registered in Hoffman; see `upload-to-api.js` `UNREGISTERED_MODELS` + null-render skips). They're safely in the DB; decide per model whether to register in Hoffman or add spec coverage.

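A per-record fallback for the failed batch can be sketched as follows. It assumes the uploader (passed in as `upload`, standing in for `uploadBySerialNumbers`) rejects or throws when a push fails; if the real function reports errors in its return value instead, the check would need to change. `isolateFailures` is a hypothetical name.

```javascript
// Re-run a batch one serial at a time and collect the ones that fail.
async function isolateFailures(serials, upload) {
  const failed = [];
  for (const serial of serials) {
    try {
      await upload([serial]); // one record per call isolates the bad serial
    } catch (err) {
      failed.push({ serial, message: err instanceof Error ? err.message : String(err) });
    }
  }
  return failed;
}
```

Because Hoffman is described as idempotent, re-sending the records that already succeeded should only come back as Unchanged.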
## Report back
Commit your work to the `ad2` branch and update ticket #32441 (hidden internal notes for engineering detail; the customer-facing thread is with John Lehman). The remote operator will see your commits on sync. **Byte-for-byte DOS fidelity** (a leading `===` line + ~1-space input-column spacing, on ALL families) is intentionally deferred; there's a ticket note to disclose it to John when the whole effort is done.