dataforth: handoff doc for AD2 session — Fix 2 (DSCA rebuild) STAGE 2-3 + Fix 5 + cleanup
Remote SSH/VPN to AD2 keeps flapping, so hand the remaining datasheet fixes to the local AD2 session. Includes the per-subtype approach (DSCA_TEMPLATES from staged originals; STAGE 1 is done and dsca-templates.json on AD2 holds 126 models), the render wiring plus the per-subtype byte-validation gate, Fix 5 (379-unit backfill via legacy_cert_text), the discipline (backup, save-state, validate before publish), and the derive-dsca-templates tool. Ref ticket #32441.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@@ -0,0 +1,61 @@
# Dataforth Datasheet Fixes: Handoff to the AD2 Local Session (2026-06-18)

**For:** the Claude session running locally on AD2 (`C:\Shares\testdatadb`).
**Why you:** the remote operator's SSH/VPN connection to AD2 keeps flapping; you run on the box, so you have stable access. Pick up the remaining work below. **Ref Syncro ticket #32441.**

You are working on the **deployed** pipeline at `C:\Shares\testdatadb` (Node + PostgreSQL 18). The repo copies under `projects/dataforth-dos/datasheet-pipeline/implementation/` are **STALE**: edit the DEPLOYED files and reconcile the repo afterwards.

---

## What's already DONE (live on the website)
- **Fix 1 (RTD label):** `datasheet-exact.js` folds RTD (sensorNum 7) into the temperature path (`Temp. (C)`, signed). Deployed; **all ~24k RTD certs re-pushed** (8B35/DSCA34/SCM5B34/SCM5B35). Audit finding resolved.
- **Fix 4 (encoded serials):** `parsers/multiline.js` decode rule `^[A-Z]\d{3,}-` + `raw_serial_number`; `import.js` cross-model guard. Recovered 603 units, 3 quarantined (`test_records_quarantine`). Schema added: `raw_serial_number`, `test_records_quarantine`.
- **Fix 3 (retest latest-wins):** `import.js` conflict rule = same-model AND (FAIL OR newer-by-date OR (same-date AND `raw_data IS DISTINCT` AND higher `ingest_seq`)). Parser stamps `ingest_seq = source-mtime*1e6 + line`. Schema added: `ingest_seq`. Settled + validated (old-vs-new render diff vs the 16:30 dump: no corruption).
- Save-states on AD2: `datasheet-exact.js.bak-2026-06-17-1646`, `multiline.js`/`import.js` `.bak-2026-06-17-1713` (Fix 4) and `.bak-2026-06-17-1726` (Fix 3).
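For orientation, the Fix 3 conflict rule above can be read as a predicate. The sketch below is an illustration only, not the code in `import.js`. The field names (`model`, `result`, `testDate`, `rawData`, `ingestSeq`) are hypothetical, reading "FAIL" as "the stored record failed" is an assumption, and the mtime is assumed to be in whole seconds.

```javascript
'use strict';

// Hypothetical helper: ingest_seq = source-mtime * 1e6 + line number.
// BigInt keeps the value exact even if the real mtime unit is finer than seconds.
function ingestSeq(mtimeSeconds, lineNumber) {
  return BigInt(mtimeSeconds) * 1000000n + BigInt(lineNumber);
}

// Returns true when the incoming record should replace the stored one.
// Dates are assumed to be ISO 'YYYY-MM-DD' strings, which sort correctly as text.
function incomingWins(stored, incoming) {
  if (stored.model !== incoming.model) return false; // same-model is required
  if (stored.result === 'FAIL') return true; // ASSUMPTION: FAIL refers to the stored record
  if (incoming.testDate > stored.testDate) return true; // newer by date
  return incoming.testDate === stored.testDate
    && incoming.rawData !== stored.rawData // stands in for SQL IS DISTINCT FROM
    && incoming.ingestSeq > stored.ingestSeq;
}
```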

## DISCIPLINE (follow exactly; these are customer calibration certs)
1. **Backup first:** a `pg_dump` exists at `C:\Shares\testdatadb\_backups\testdatadb-2026-06-17-1630.dump`. **Take a FRESH `pg_dump` + a VSS shadow before any schema change or re-import.** (superuser `postgres` / `Paper123!@#`; app `testdatadb_app` / `DfTestDB2026!` — both vaulted at `clients/dataforth/testdatadb-postgres`.)
2. **Per-file save-state:** copy any file to `<file>.bak-YYYY-MM-DD-HHMM` before editing.
3. **Atomic, asserted patches:** prepare all string replacements in memory, assert that each pattern matches exactly once, then write; run a `require()` load-check afterwards.
4. **Validate before WRITE; validate before PUBLISH.** Never re-push to Hoffman until the per-subtype byte-validation passes.
5. **Re-publication:** the `testdatadb` service caches the template, so **restart it** (`Restart-Service testdatadb`) after changing `datasheet-exact.js`; otherwise the live service keeps using the old template. Do re-pushes from a **fresh node process** via `uploadBySerialNumbers(serials)`. Hoffman is idempotent (returns Unchanged/Updated/Created). Re-push in batches; the DSCA fleet is ~78k certs.
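Rules 2 and 3 can be combined into one small helper. The following is a minimal sketch under stated assumptions, not an existing tool in the pipeline: `applyAssertedPatches` and `patchFile` are hypothetical names, patches are plain `{ find, replace }` string pairs, and `stamp` is the `YYYY-MM-DD-HHMM` suffix. Nothing is written to disk unless every pattern matched exactly once.

```javascript
'use strict';
const fs = require('fs');
const path = require('path');

// Count non-overlapping occurrences of a literal string.
function countOccurrences(haystack, needle) {
  if (needle === '') throw new Error('empty search string');
  return haystack.split(needle).length - 1;
}

// Apply every { find, replace } patch in memory; throw if any `find`
// string does not occur exactly once at the time it is applied.
function applyAssertedPatches(source, patches) {
  let text = source;
  for (const { find, replace } of patches) {
    const n = countOccurrences(text, find);
    if (n !== 1) throw new Error(`expected exactly 1 match for ${JSON.stringify(find)}, found ${n}`);
    // Function form so '$' sequences in the replacement are taken literally.
    text = text.replace(find, () => replace);
  }
  return text;
}

// Hypothetical wrapper: patch in memory, save-state, write, then load-check.
function patchFile(file, patches, stamp) {
  const abs = path.resolve(file);
  const original = fs.readFileSync(abs, 'utf8');
  const patched = applyAssertedPatches(original, patches); // throws before any write
  fs.copyFileSync(abs, `${abs}.bak-${stamp}`); // per-file save-state
  fs.writeFileSync(abs, patched);
  delete require.cache[require.resolve(abs)];
  require(abs); // load-check: throws if the patched module no longer loads
}
```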

---

## FIX 2: DSCA Final-Test rebuild (Defect B). STAGE 1 done; do STAGE 2–3.

**Root cause:** `datasheet-exact.js` has ONE hardcoded `DATA_LINES['DSCA']` plus a single DSCA branch in `buildTSpecs()`. Real DSCA modules have **26 distinct Final-Test layouts** (per subtype), so names, specs, and row alignment are wrong and lines are dropped (e.g. Output Noise on DSCA38-05). The ACCURACY block also uses the 5B/8B titles (`Vout (V)` + `====`) instead of DSCA's (`Output (V|mA)` + `----`).

**STAGE 1 (DONE):** `C:\Shares\testdatadb\dsca-templates.json` already exists and holds **126 DSCA models**, each `{ "accOut": "Output (V)"|"Output (mA)", "rows": [ {"name": "...", "spec": "..."}, ... ] }`, extracted byte-accurately from the staged originals (the extractor used the `===` separator under the Final-Test header for exact column spans). Verified: DSCA38-05 now has `Output Noise | <= 2000 uVrms`. **This JSON is the authoritative template source. Use it; don't reverse-engineer DSCFIN.DAT.**

**STAGE 2 (do this):** wire it into `templates/datasheet-exact.js`.
1. Read the current DSCA render path FIRST: `parseRawData()` (how the Final-Test STATUS groups are parsed from `raw_data`; they're 5-per-line groups like `"PASS 28.42","PASS","PASS 252.2",...`), the `buildTSpecs()` DSCA branch, the Final-Test render loop in `generateExactDatasheet()`, and the accuracy header line (~line 567, currently `inputHeader + ' Vout (V) Vout (V)* Error (%) Status'` with `==========` separators).
2. Load `dsca-templates.json` at module top. For `family === 'DSCA'`:
   - Replace the param **names** and **specs** source: instead of `DATA_LINES['DSCA']` + `buildTSpecs` DSCA specs, use `DSCA_TEMPLATES[record.model_number].rows` (each row gives the name + spec text directly, so no spec-file lookup is needed for DSCA).
   - Map the parsed `raw_data` STATUS groups **positionally** onto the template rows for the **measured value + PASS/FAIL status**. The staged template row order == the raw_data group order (same DOS source). **Reconcile the existing skip rule** (`if (status.length <= 4) continue`): rows like `240VAC Withstand`/`Hi-Pot` have NO measured value and an empty spec, but they must still render (blank measured + blank spec + PASS). Don't drop them; align by position and show the value only when the group carries one.
   - Fix the ACCURACY block for DSCA: use `DSCA_TEMPLATES[model].accOut` (`Output (V)` or `Output (mA)`) in place of `Vout (V)`, and `----------` dash separators in place of `==========`. (The input column is already correct post-Fix-1.)
   - If a model is missing from `dsca-templates.json` (no staged original): **do not guess**; skip/flag it.
3. Save-state + atomic patch + load-check + restart the service.
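The positional mapping in step 2 can be sketched as follows. This is an illustration under assumptions, not the deployed render loop: `splitGroup` and `mapGroupsToRows` are hypothetical names, each parsed group is assumed to be a string such as `"PASS 28.42"` or `"PASS"`, and the real output of `parseRawData()` should be checked before relying on that shape. Rows without a measured value are kept rather than skipped, and an unknown model returns `null` so the caller can skip and flag it.

```javascript
'use strict';

// Split one parsed group such as "PASS 28.42" or "PASS" into status + measured value.
function splitGroup(group) {
  const m = /^\s*(PASS|FAIL)\b\s*(.*)$/.exec(group || '');
  if (!m) return { status: '', measured: '' };
  return { status: m[1], measured: m[2].trim() };
}

// Align groups onto template rows by position. Rows with no measured value
// (e.g. Hi-Pot) are kept and rendered with a blank measured column.
function mapGroupsToRows(templates, model, groups) {
  const tpl = templates[model];
  if (!tpl) return null; // no staged original: skip and flag, do not guess
  return tpl.rows.map((row, i) => ({
    name: row.name,
    spec: row.spec,
    ...splitGroup(groups[i]),
  }));
}

// Usage with a made-up two-row template (not a real DSCA layout):
const exampleTemplates = {
  'DSCA-EXAMPLE': {
    accOut: 'Output (V)',
    rows: [{ name: 'Output Noise', spec: '<= 2000 uVrms' }, { name: 'Hi-Pot', spec: '' }],
  },
};
const exampleRows = mapGroupsToRows(exampleTemplates, 'DSCA-EXAMPLE', ['PASS 252.2', 'PASS']);
```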

**STAGE 3 (validate per subtype; this is the gate):** render ALL DSCA certs and **byte-compare them against the staged originals, grouped by the 26 layouts**; require zero content-delta per layout before any re-push. Use a content-normalized compare: strip the leading `===` letterhead-separator line and trailing whitespace (those are the known, deferred cosmetic gaps) and focus on the Final-Test + ACCURACY content matching. Report match/mismatch per layout; investigate any mismatch before publishing. Then re-push DSCA in batches via `uploadBySerialNumbers`, diff-gated, watching Updated/Unchanged.
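A content-normalized compare along these lines might look like the sketch below. The function names and the `{ layout, serial, rendered, staged }` input shape are hypothetical, and the `===` line is assumed to be the first line of the file and to consist only of `=` characters. Stripping trailing whitespace per line also removes a trailing `\r`, so CRLF and LF files compare equal.

```javascript
'use strict';

// Normalize a cert for content comparison: strip trailing whitespace per line,
// drop a leading '===' separator line, and drop trailing blank lines.
function normalizeCert(text) {
  const lines = text.split('\n').map(l => l.replace(/\s+$/, ''));
  if (lines.length && /^\s*=+$/.test(lines[0])) lines.shift();
  while (lines.length && lines[lines.length - 1] === '') lines.pop();
  return lines.join('\n');
}

// pairs: [{ layout, serial, rendered, staged }] -> per-layout match/mismatch tallies.
function compareByLayout(pairs) {
  const report = {};
  for (const { layout, serial, rendered, staged } of pairs) {
    const r = report[layout] || (report[layout] = { match: 0, mismatch: 0, failing: [] });
    if (normalizeCert(rendered) === normalizeCert(staged)) r.match++;
    else { r.mismatch++; r.failing.push(serial); }
  }
  return report;
}

// The gate: every layout must show zero mismatches before any re-push.
function gatePasses(report) {
  return Object.values(report).every(r => r.mismatch === 0);
}
```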

**Scope:** ~78k DSCA certs across DSCLOG. **Do NOT re-push until STAGE 3 is clean per subtype.**
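Once the gate is clean, the batched re-push could be driven by a loop like this sketch. `repushInBatches` is a hypothetical name and the upload function is injected; in the deployed pipeline it would be `uploadBySerialNumbers`. Its return value is ASSUMED here to be an object of counts keyed `Created`/`Updated`/`Unchanged`; confirm the real shape in `upload-to-api.js` first. The batch size of 500 is arbitrary.

```javascript
'use strict';

// Split an array into consecutive batches of at most `size` items.
function chunk(items, size) {
  const out = [];
  for (let i = 0; i < items.length; i += size) out.push(items.slice(i, i + size));
  return out;
}

// Re-push serials batch by batch and total the per-status counts.
// ASSUMPTION: upload(batch) resolves to { Created, Updated, Unchanged } counts.
async function repushInBatches(serials, upload, batchSize = 500) {
  const totals = { Created: 0, Updated: 0, Unchanged: 0 };
  for (const [i, batch] of chunk(serials, batchSize).entries()) {
    const res = await upload(batch);
    for (const k of Object.keys(totals)) totals[k] += (res && res[k]) || 0;
    console.log(`batch ${i + 1}: ${batch.length} serials`, res);
  }
  return totals;
}
```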

---

## FIX 5: Backfill 379 cryptolocker-era units (operational)
379 units (Oct 2025–Jan 2026, 3 stations) have no surviving `.DAT`, but their staged `.TXT` files still exist. Those were rendered by the ORIGINAL (correct) software, so **publish them directly; don't round-trip through raw_data**.
1. Add column `legacy_cert_text TEXT` to `test_records` (backup first).
2. Identify the 379 (staged `.TXT` present, serial absent from the DB). The `MISSING-UNITS-REPORT-FOR-JOHN-2026-06-17.md` on the `ad2` branch has the method and list (Cause 2 set).
3. Insert them: `raw_data = NULL`, `legacy_cert_text` = the staged `.TXT` content, plus model/serial/test_date/station/overall_result parsed from the `.TXT`.
4. Modify `render-datasheet.renderContent()` to return `legacy_cert_text` when `raw_data IS NULL`.
5. **Validate:** cross-reference the 379 serials against the ERP/work-order system to confirm they are valid shipped units before publishing; spot-check rendered vs staged. Then publish.
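Step 4 amounts to a guard in front of the existing render path. The sketch below shows the idea as a wrapper. `makeRenderContent` and `renderFromRawData` are hypothetical stand-ins, since the real signature of `renderContent()` is not shown here. Returning `null` when neither field is usable is also an assumption, chosen to line up with the null-render skips mentioned under cleanup.

```javascript
'use strict';

// Wrap an existing renderer so legacy backfill rows bypass it.
// `renderFromRawData` stands in for the current raw_data render path.
function makeRenderContent(renderFromRawData) {
  return function renderContent(record) {
    if (record.raw_data == null) {
      // Legacy row: publish the staged original text verbatim.
      const legacy = record.legacy_cert_text;
      return typeof legacy === 'string' && legacy !== '' ? legacy : null;
    }
    return renderFromRawData(record);
  };
}

// Usage with a stub renderer:
const renderContent = makeRenderContent(r => 'rendered:' + r.raw_data);
```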

---

## Minor cleanup
- **1 Hoffman push error** during the Fix-3 publish: find which serial (check the `notify` logs, or re-run that batch with per-record fallback) and resolve it.
- **~420 "skipped" units** (no spec entry, or model not registered in Hoffman; see `upload-to-api.js` `UNREGISTERED_MODELS` + null-render skips). They're safely in the DB; decide per model whether to register it in Hoffman or add spec coverage.

## Report back
Commit your work to the `ad2` branch and update ticket #32441 (hidden internal notes for engineering detail; the customer-facing thread is with John Lehman). The remote operator will see your commits on sync. **Byte-for-byte DOS fidelity** (a leading `===` line + ~1-space input-column spacing, on ALL families) is intentionally deferred; there's a ticket note to disclose it to John when the whole effort is done.
53
projects/dataforth-dos/tools/derive-dsca-templates.js
Normal file
@@ -0,0 +1,53 @@
// Fix 2 STAGE 1 (read-only build): extract per-model DSCA Final-Test templates from staged originals.
// Uses the '===' separator line under the Final-Test header to get exact column spans.
const fs = require('fs'), path = require('path');
const STAGE = 'C:/Shares/test/STAGE';
const OUT = 'C:/Shares/testdatadb/dsca-templates.json';

// Recursively collect every .txt file under d; unreadable directories are skipped.
function walk(d, out) {
  let it = [];
  try { it = fs.readdirSync(d, { withFileTypes: true }); } catch { return out; }
  for (const e of it) {
    const p = path.join(d, e.name);
    if (e.isDirectory()) walk(p, out);
    else if (/\.txt$/i.test(e.name)) out.push(p);
  }
  return out;
}

// Return [start, end) offsets for each run of '=' in the separator line.
function colSpans(sep) {
  const cols = [];
  const re = /=+/g;
  let m;
  while ((m = re.exec(sep))) cols.push([m.index, m.index + m[0].length]);
  return cols;
}

// Extract { accOut, rows } from one staged sheet, or null if the Final-Test table is not found.
function extract(t) {
  const lines = t.replace(/\r\n/g, '\n').split('\n');
  const accHdr = lines.find(l => /Error \(%\)/.test(l) && /Status/.test(l)) || '';
  const accOut = (accHdr.match(/Output \((?:V|mA)\)|Vout \(V\)/) || ['?'])[0];
  const fi = lines.findIndex(l => /FINAL TEST RESULTS/.test(l));
  if (fi < 0) return null;
  let hi = -1;
  for (let i = fi + 1; i < lines.length; i++) {
    if (/Parameter\s+Measured/.test(lines[i])) { hi = i; break; }
  }
  if (hi < 0) return null;
  const sep = lines[hi + 1] || '';
  if (!/=/.test(sep)) return null;
  const cols = colSpans(sep);
  if (cols.length < 4) return null;
  const [pc, mc, sc, stc] = cols; // parameter, measured, spec, status columns
  const rows = [];
  for (let i = hi + 2; i < lines.length; i++) {
    const l = lines[i];
    if (/Check List|^\s*_{5,}/.test(l)) break; // end of the Final-Test table
    if (!l.trim()) continue;
    const name = l.slice(pc[0], mc[0]).trim();
    const spec = l.slice(sc[0], stc[0]).trim();
    if (!name && !spec) continue;
    rows.push({ name, spec });
  }
  return { accOut, rows };
}

(async () => {
  const files = walk(STAGE, []);
  const byModel = {};
  for (const f of files) {
    let t;
    try { t = fs.readFileSync(f, 'utf8'); } catch { continue; }
    const model = (t.match(/^\s*Model:\s*(\S+)/m) || [])[1] || '';
    if (!/^DSCA/i.test(model)) continue;
    const tpl = extract(t);
    if (!tpl) continue;
    // keep the sheet with the MOST rows per model (most complete; avoids truncated samples)
    if (!byModel[model] || tpl.rows.length > byModel[model].rows.length) byModel[model] = { ...tpl, sheets: (byModel[model] ? byModel[model].sheets : 0) + 1 };
    else byModel[model].sheets++;
  }
  const models = Object.keys(byModel).sort();
  console.log('DSCA models templated: ' + models.length);
  const out = {};
  for (const m of models) out[m] = { accOut: byModel[m].accOut, rows: byModel[m].rows };
  fs.writeFileSync(OUT, JSON.stringify(out));
  console.log('wrote ' + OUT + ' (' + fs.statSync(OUT).size + ' bytes)');
  // Summarize how many models share each Final-Test row count.
  const rc = {};
  for (const m of models) { const n = byModel[m].rows.length; rc[n] = (rc[n] || 0) + 1; }
  console.log('row-count distribution (rows:models): ' + Object.entries(rc).sort((a, b) => a[0] - b[0]).map(([n, c]) => n + ':' + c).join(' '));
  // Print a few known models as a spot check.
  for (const probe of ['DSCA38-05', 'DSCA34-01', 'DSCA38-08C', 'DSCA30-01']) {
    const s = out[probe];
    if (s) { console.log('\n' + probe + ' accOut=' + s.accOut + ' rows=' + s.rows.length); s.rows.forEach(r => console.log(' ' + r.name.padEnd(28) + ' | ' + r.spec)); }
    else console.log('\n' + probe + ': NOT FOUND');
  }
})().catch(e => { console.error('ERR ' + e.message); process.exitCode = 1; }); // non-zero exit on failure