Consolidates AD2's diagnosis + independent Grok/Gemini review into an implementation spec for the 5 fixes (RTD label, DSCA Final-Test rebuild, retest supersede rule, encoded-serial importer decode, 379 backfill) with per-fix validation gates and a cross-cutting re-publication discipline. Drives the AD2-side implementation. Ref ticket #32441. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Dataforth Test-Datasheet Pipeline — Fix Spec (hardened)
Date: 2026-06-17 · Host: AD2 (C:\Shares\testdatadb, Node + PostgreSQL 18) · Status: SPEC for review — implementation driven on AD2
Inputs: AD2 diagnosis (DATASHEET-RTD-BUG-DIAGNOSIS, PARSING-FIDELITY-VERDICT, MISSING-UNITS-REPORT, CONFLICT-RULE-FIX-PROPOSAL) + independent multi-AI review (Grok adversarial + Gemini). Owner direction on retest handling (Mike, 2026-06-17).
All defects are in the regeneration/ingestion pipeline that replaced the cryptolocker-destroyed original parser/publisher. Source test data is intact (DB matches staged originals across 11,239 records, 0 parse faults).
0. Ground truth (verified live, 2026-06-17)
- `test_records`: 473,780 rows = 473,780 distinct `serial_number` → exactly one row per serial.
- Unique constraints: `uq_test_records_sn` UNIQUE(serial_number) [operative]; redundant UNIQUE(log_type,model_number,serial_number,test_date,test_station).
- Columns incl. `raw_data` (verbatim .DAT), `overall_result`, `api_uploaded_at`, `forweb_exported_at`, `datasheet_exported_at`, `work_order`.
- Deployed `database/import.js` uses `ON CONFLICT (serial_number)` with `WHERE overall_result='FAIL' OR (EXCLUDED PASS AND EXCLUDED.test_date > test_records.test_date)`.
- WARNING (repo drift): the repo copy `…/implementation/database/import.js` is STALE (shows a 5-tuple `ON CONFLICT`). Edit the DEPLOYED file; reconcile the repo copy after. Verify every file's deployed content before changing it.
0a. CROSS-CUTTING — re-publication discipline (MANDATORY for any fix that changes cert text)
Every fix below that alters rendered output must be published deliberately, not by blanket cache-clear:
- Diff before re-push. For each candidate serial, render OLD vs NEW and only act where output actually changes. Do not clear `api_uploaded_at`/`forweb_exported_at` for unchanged renders.
- Re-POST semantics. Hoffman bulk API is idempotent: it returns `Unchanged` when content matches and overwrites when it differs (per diagnosis §6). Confirm this holds before bulk re-push; watch for dedup/version behavior.
- Targeted, staged rollout. Re-publish in bounded batches (start with the Phytec 102), confirm counts, then widen. Log every batch.
- Rollback. Keep the prior rendered text (or the prior template commit) for every re-published serial so a bad batch can be reverted.
- Audit framing. A cert's text changing after initial publication is a bug correction — record it (ticket #32441 + an internal change log of affected serial ranges) so it's defensible in an audit.
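The diff-gating and batching steps above can be sketched as follows. This is a minimal illustration, not the deployed publisher: the function names and the `Map<serial, text>` input shape are assumptions made for the example. Keeping `oldText` alongside each candidate is what makes the rollback step possible.

```javascript
// Hypothetical helpers; names and data shapes are illustrative only.
// oldRenders / newRenders: Map<serial_number, rendered cert text>.
function selectChangedSerials(oldRenders, newRenders) {
  const changed = [];
  for (const [serial, newText] of newRenders) {
    const oldText = oldRenders.get(serial);
    // Only serials whose rendered output actually differs become re-push candidates.
    // A serial with no prior render (oldText undefined) is also treated as changed.
    if (oldText !== newText) {
      changed.push({ serial, oldText: oldText ?? null, newText });
    }
  }
  return changed;
}

// Split candidates into bounded batches so each batch can be logged and confirmed.
function toBatches(items, batchSize) {
  if (!Number.isInteger(batchSize) || batchSize < 1) {
    throw new RangeError('batchSize must be a positive integer');
  }
  const batches = [];
  for (let i = 0; i < items.length; i += batchSize) {
    batches.push(items.slice(i, i + batchSize));
  }
  return batches;
}
```

Timestamps such as `api_uploaded_at` would then be cleared only for the serials returned by `selectChangedSerials`, one batch at a time.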
Fix 1 — Defect A: RTD input labeled resistance, not temperature (the audit finding)
File: templates/datasheet-exact.js · Scope: ~24,000 certs (8B35, DSCA34, SCM5B34/35) · Status: fix written, needs scope/ground-truth proof.
Root cause: getSensorNum() returns 7 for RTD sentypes (s.includes('RTD')); two branches on sensorNum===7 emit ' Rin (ohms)' header + unsigned value. Dataforth RTD certs report the input as Temperature (deg C); raw_data stimulus is already deg C.
Approach (AD2 diff): fold sensorNum===7 into the temperature branch (3–6) for header (' Temp. (C)') and value (formatSigned). Leave the i===13 ohm/ohm Lead-R override intact.
Hardening (multi-AI):
- "15 renders changed / 0 non-RTD" proves a branch moved, not correctness. Before deploy: (a) byte-compare the fixed render against the staged original `.TXT` for a real RTD sample (8B35 incl. SN 179553-13, DSCA34, SCM5B34/35) and require an exact match; (b) confirm RTD-detection coverage: count RTD-family rows in the DB (`model_number LIKE '%34%'`/`'%35%'` etc.) and confirm the regenerator's `s.includes('RTD')` actually classifies all of them as 7 (only 15 changing across 184 renders may mean the sample was RTD-thin, or some RTD sentypes aren't matched).
Risk: LOW-MODERATE (localized; main risk is under-detecting which modules are RTD). Deploy: after byte-match + coverage check; then targeted re-push (Phytec 102 first).
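The byte-match gate in (a) could look like the sketch below. The helper name is hypothetical, and the `latin1` default is an assumption about how the staged `.TXT` files are encoded; confirm the real encoding before relying on it. Returning the first mismatching offset rather than a boolean makes a failing cert easier to inspect.

```javascript
// Hypothetical helper: compares a rendered cert against the staged original bytes.
// ASSUMPTION: the staged .TXT is single-byte encoded (latin1); adjust if it is not.
// Returns -1 on an exact match, otherwise the offset of the first differing byte
// (or the shorter length when one side is a strict prefix of the other).
function firstByteMismatch(renderedText, originalBytes, encoding = 'latin1') {
  const rendered = Buffer.from(renderedText, encoding);
  const limit = Math.min(rendered.length, originalBytes.length);
  for (let i = 0; i < limit; i++) {
    if (rendered[i] !== originalBytes[i]) return i;
  }
  return rendered.length === originalBytes.length ? -1 : limit;
}
```

In use, `originalBytes` would come from reading the staged file as a `Buffer` (for example with `fs.readFileSync(path)` and no encoding argument), so that line endings and trailing whitespace are compared too.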
Fix 2 — Defect B: DSCA Final-Test table wrong / dropped lines
File: templates/datasheet-exact.js (DATA_LINES['DSCA'], buildTSpecs() DSCA branch, accuracy-block titles) · Scope: up to ~78,000 DSCA certs · Status: NEEDS DESIGN — highest structural risk.
Root cause: a single hardcoded DATA_LINES['DSCA'] + single DSCA buildTSpecs branch; real DSCA modules have per-subtype Final-Test layouts → wrong names, garbage specs (< 0 mA, +/- 0 %), rows misaligned, lines dropped (e.g. Output Noise on DSCA38-05). Accuracy block also uses 5B/8B titles (Vout (V) / ====) instead of DSCA's (Output (V|mA) / ----).
Approach (multi-AI consensus — REVISED from AD2's DSCFIN.DAT idea):
- Do NOT reverse-engineer `DSCFIN.DAT` (legacy DOS config; QB writer has hardcoded overrides outside the config → guaranteed edge-case drift).
- Derive per-subtype templates from the staged original `.TXT` (the actual correct customer certs are ground truth): group staged DSCA `.TXT` by subtype, extract each subtype's Final-Test parameter name/unit/spec list and accuracy titles directly.
- Key subtype selection on `model_number` (prefix) + `SENTYPE` / output-signal type. Build an explicit subtype→layout map.
- Fix the DSCA accuracy block titles/separators (`Output (V|mA)`, `----`).
Validation (hard gate): generate ALL DSCA certs and byte-for-byte diff vs the staged originals across every subtype; zero-delta required. Any subtype with no staged original → flag, do not guess.
Risk: HIGH (largest population + wrong numeric labels/limits). The longer pole; do after 1/3/4. Deploy: only after zero-delta validation per subtype; re-publish in batches with diff-gating.
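One possible shape for the explicit subtype→layout lookup, with the "flag, do not guess" rule built in, is sketched here. The key format (`prefix|sentype`), the prefix regex, and the function names are all assumptions for illustration; the layout entries themselves would have to be extracted from the staged `.TXT` and are deliberately not shown.

```javascript
// Hypothetical key builder. ASSUMPTION: the subtype prefix is "DSCA" plus the
// digits that follow it (e.g. "DSCA38-05" -> "DSCA38"); the real rule may differ.
function layoutKey(modelNumber, sentype) {
  const m = /^(DSCA\d+)/.exec(modelNumber);
  return m ? `${m[1]}|${sentype}` : null;
}

// layouts: Map<key, layout>, populated from the staged originals (not shown here).
// A missing layout is reported as 'flagged' instead of falling back to a default,
// so an unknown subtype can never silently render with another subtype's table.
function resolveDscaLayout(layouts, modelNumber, sentype) {
  const key = layoutKey(modelNumber, sentype);
  const layout = key === null ? undefined : layouts.get(key);
  if (layout === undefined) {
    return { status: 'flagged', key, layout: null };
  }
  return { status: 'ok', key, layout };
}
```

The design choice worth keeping, whatever the final key looks like, is that there is no default branch: every render either resolves to a layout derived from a staged original or is held back.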
Fix 3 — Retest handling: latest test supersedes (OWNER DIRECTION + hardening)
File: deployed database/import.js (ON CONFLICT (serial_number) WHERE clause) · Scope: ~311 stuck units now + all future retests · Status: design owner-set, implementation hardened.
Owner rule (Mike): one row per serial; a new test on the SAME unit (same model / "everything else checks out") supersedes anything prior — latest test wins. A reused serial on a DIFFERENT product is NOT the same unit — recognize the collision, don't blindly overwrite.
Why the current rule fails: strictly-greater date + date-only granularity → same-day reruns can't replace (~311 stuck on a non-final run).
Approach (hardened — both AIs refute pure scan-order as the recency signal):
- Conflict on `serial_number`. Update when the incoming row is the SAME unit and is genuinely newer: `EXCLUDED.model_number = test_records.model_number` (same unit) AND `EXCLUDED.raw_data IS DISTINCT FROM test_records.raw_data` (real change; avoids re-push churn) AND incoming is at least as new.
- Recency must not rely on import scan order alone. Use `EXCLUDED.test_date >= test_records.test_date`, and break same-date ties with a monotonic signal captured at parse time: source `.DAT` mtime or an ingest sequence number (add a column, e.g. `ingest_seq`/`source_mtime`). Last-by-(date, tiebreaker) wins.
- Collision handling (different `model_number`, same serial): do NOT overwrite. Route to a `test_records_quarantine` table (or a flagged status) + alert. These are the reused generic serials (`1-1`, `1-2`), which are genuinely different units.
- Keep the `FAIL → PASS` override.
Validation: ingest the owner's 4-retest sample IN REVERSE chronological order; final DB state MUST be the mathematically newest run. Re-run tools/validate-parsing.js; same-day violations → ~0. Confirm the 311 settle on the latest run.
Risk: HIGH if scan-order is trusted (a future bulk re-import could overwrite newer with older across the whole DB). MITIGATED by the date+tiebreaker rule. Deploy: after the reverse-order sample passes; then re-import + diff-gated re-push of the 311.
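The decision logic above can be modeled in plain JavaScript before it is committed to the `ON CONFLICT` clause. This is a sketch under stated assumptions: `test_date` is taken to be an ISO `YYYY-MM-DD` string (so string comparison orders correctly), `source_mtime` is taken to be a number populated from the source `.DAT` mtime, and the function names are hypothetical. The `FAIL → PASS` override is deliberately left out, because its precedence relative to recency is not settled in this spec.

```javascript
// ASSUMPTIONS: test_date is an ISO 'YYYY-MM-DD' string; source_mtime is a number.
// Last-by-(date, tiebreaker) wins; equal (date, tiebreaker) is not "newer".
function isNewer(incoming, existing) {
  if (incoming.test_date !== existing.test_date) {
    return incoming.test_date > existing.test_date;
  }
  return incoming.source_mtime > existing.source_mtime;
}

// Returns one of: 'insert', 'update', 'skip', 'quarantine'.
function decideUpsert(existing, incoming) {
  if (!existing) return 'insert';
  // Same serial, different model: a reused serial on a different unit.
  if (incoming.model_number !== existing.model_number) return 'quarantine';
  // Identical raw data: no real change, so avoid re-push churn.
  if (incoming.raw_data === existing.raw_data) return 'skip';
  return isNewer(incoming, existing) ? 'update' : 'skip';
}

// Applies rows in the given order to an in-memory table (Map keyed by serial).
// Used here only to check that the final state is independent of scan order.
function ingest(table, rows) {
  const quarantined = [];
  for (const row of rows) {
    const action = decideUpsert(table.get(row.serial_number), row);
    if (action === 'insert' || action === 'update') table.set(row.serial_number, row);
    else if (action === 'quarantine') quarantined.push(row);
  }
  return quarantined;
}
```

Feeding the same set of runs to `ingest` in forward and in reverse chronological order should leave the same row in the table, which is the property the reverse-order validation checks.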
Fix 4 — Importer drops letter-prefixed encoded serials
File: parsers/multiline.js (serial/date regex) · Scope: ~9,510 records / 840 serials / 141 models · Status: needs design (PK-boundary mutation).
Root cause: line.match(/^"(\d+-\d+[A-Za-z]?)","(\d{2}-\d{2}-\d{4})"$/) — \d+-\d+ requires leading digits, so DOS 8.3-encoded serials (10243-1 → A243-1; first two digits → letter, prefix = charCodeAt(0)-55) never match → whole record silently dropped.
Approach (hardened — keep the literal, both AIs):
- Widen regex to allow an optional leading letter: `/^"([A-Za-z]?\d+-\d+[A-Za-z]?)","(\d{2}-\d{2}-\d{4})"$/`.
- Store BOTH: add `raw_serial_number` (the literal file bytes, e.g. `A243-1`) and keep `serial_number` = decoded numeric (`10243-1`). UNIQUE stays on `serial_number`. Preserves a perfect audit trail.
- Decode only when the captured serial matches `^[A-Za-z]\d`.
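A minimal sketch of the widened match plus decode follows. The regex and the `charCodeAt(0) - 55` rule are taken from the text above; the function names and the returned object shape are hypothetical. One assumption is made explicit: a lowercase prefix letter is upper-cased before decoding, since the spec only shows an uppercase example.

```javascript
// Widened serial/date line pattern from the approach above.
const SERIAL_LINE = /^"([A-Za-z]?\d+-\d+[A-Za-z]?)","(\d{2}-\d{2}-\d{4})"$/;

// Decode a DOS 8.3-encoded serial: leading letter -> two-digit prefix
// ('A' -> 10, 'B' -> 11, ...). Serials without a leading letter pass through.
// ASSUMPTION: lowercase prefixes are treated the same as uppercase.
function decodeSerial(rawSerial) {
  if (!/^[A-Za-z]\d/.test(rawSerial)) return rawSerial;
  const prefix = rawSerial[0].toUpperCase().charCodeAt(0) - 55;
  return String(prefix) + rawSerial.slice(1);
}

// Returns null when the line is not a serial/date line; otherwise keeps both
// the literal serial and the decoded one, as the approach requires.
function parseSerialLine(line) {
  const m = SERIAL_LINE.exec(line);
  if (!m) return null;
  return {
    raw_serial_number: m[1],
    serial_number: decodeSerial(m[1]),
    test_date: m[2], // left as the captured string; date parsing is out of scope here
  };
}
```

For example, `parseSerialLine('"A243-1","06-17-2026"')` yields `raw_serial_number` `A243-1` and `serial_number` `10243-1`, while a purely numeric serial such as `10243-1` comes back unchanged in both fields.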
Pre-flight (MANDATORY before any import): run the parser over the ~9,510 dropped records read-only → emit CSV raw_serial, decoded_serial, model_number. Search decoded serials for collisions against the existing 473,780 rows. For each collision decide policy (a decoded A243-1 colliding with a genuine 10243-1 of a different model = a real conflict — quarantine, don't merge two physical units). Only proceed once collisions are enumerated and a policy set.
Risk: MODERATE (transforms the uniqueness key + customer lookup id). Deploy: after the collision CSV is clean/resolved; the re-import re-exercises the upsert path → run under the Fix-3 rule, diff-gated re-push.
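The read-only pre-flight could be prototyped along these lines. The helper names and the `existingBySerial` map (serial → model already in `test_records`) are assumptions for the example; the CSV columns match the ones named in the pre-flight step, with two extra hypothetical columns to make the policy decision easier.

```javascript
// decodedRows: [{ raw_serial, decoded_serial, model_number }] from the dry-run parse.
// existingBySerial: Map<serial_number, model_number> for rows already in the DB.
// Every decoded serial that already exists is reported; same_model separates a
// likely retest of the same unit from a genuine two-unit conflict.
function findCollisions(decodedRows, existingBySerial) {
  const collisions = [];
  for (const row of decodedRows) {
    if (!existingBySerial.has(row.decoded_serial)) continue;
    const existingModel = existingBySerial.get(row.decoded_serial);
    collisions.push({
      ...row,
      existing_model: existingModel,
      same_model: existingModel === row.model_number,
    });
  }
  return collisions;
}

// Minimal CSV writer: quotes a field only when it contains a comma, quote, or newline.
function toCsv(rows, columns) {
  const esc = (value) => {
    const s = String(value ?? '');
    return /[",\n]/.test(s) ? `"${s.replace(/"/g, '""')}"` : s;
  };
  const lines = rows.map((r) => columns.map((c) => esc(r[c])).join(','));
  return [columns.join(','), ...lines].join('\n');
}
```

Rows with `same_model` false are the ones the pre-flight says to quarantine rather than merge; rows with `same_model` true would fall through to the Fix-3 rule.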
Fix 5 — Backfill 379 cryptolocker-era units from staged originals
Scope: 379 units (Oct 2025–Jan 2026, 3 stations), no surviving .DAT; staged .TXT exist · Status: operational.
Key fact: the staged .TXT were produced by the ORIGINAL (pre-crypto) renderer → already correct (no Defect A/B).
Approach (both AIs — publish directly, don't round-trip):
- Add `legacy_cert_text` column. Insert the 379 with `raw_data = NULL`, `legacy_cert_text` = the staged `.TXT` content.
- Publisher serves `legacy_cert_text` when `raw_data IS NULL` (bypasses regeneration).
- Do NOT reverse `.TXT`→`raw_data`→re-render (two translation-loss points; guarantees drift).
Validation: cross-reference the 379 serials against the ERP / work-order system to confirm they are valid shipped units before exposing via the API. Spot-check rendered vs staged text.
Consistency note: these rows are permanently "original-renderer" output, divergent from what the fixed template would emit. Acceptable (and safer) given no raw source; document the class.
Risk: LOW (bounded 379, immutable text) — but zero tolerance for wrong text (no raw fallback). Deploy: after ERP cross-check.
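The publisher branch might be as small as the sketch below. The function name and the injected `render` callback are hypothetical stand-ins for the existing regeneration path. Because there is no raw fallback for these rows, the sketch fails loudly when both sources are missing rather than publishing an empty cert.

```javascript
// Hypothetical publisher branch. `render` stands in for the existing template
// renderer and is only called for rows that still have raw_data.
function certTextFor(row, render) {
  if (row.raw_data === null || row.raw_data === undefined) {
    // Backfilled unit: serve the staged original text verbatim, never regenerate.
    if (!row.legacy_cert_text) {
      throw new Error(`serial ${row.serial_number}: no raw_data and no legacy_cert_text`);
    }
    return row.legacy_cert_text;
  }
  return render(row);
}
```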
Risk ranking (combined) & recommended order
| Rank | Fix | Why |
|---|---|---|
| 1 (tie) | Fix 3 retest | scan-order recency could corrupt DB-wide on any future re-import (Gemini #1) |
| 1 (tie) | Fix 2 DSCA | largest scope + wrong numeric labels/limits; design-heavy (Grok #1) |
| 3 | Fix 4 serials | mutates the uniqueness/lookup key; merge risk |
| 4 | Fix 1 RTD | localized; risk is under-scoping RTD detection |
| 5 | Fix 5 backfill | small, immutable, but no raw fallback |
Suggested execution order (lowest-risk customer win first, hardest last):
- Fix 1 (RTD label) — after byte-match + scope check → clears the audit + corrects the Phytec 102 (deploy tonight candidate).
- Re-publish Phytec 102 (diff-gated).
- Fix 4 (serial decode) — after clean collision CSV.
- Fix 3 (retest rule) — after reverse-order sample passes.
- Fix 2 (DSCA rebuild) — after per-subtype zero-delta validation.
- Fix 5 (backfill 379) — after ERP cross-check.
Open items needing an owner decision
- Fix 3: add a real tie-breaker column (`ingest_seq`/`source_mtime`) vs accept `test_date >=` + scan-order? (recommend the column.)
- Fix 3/4 collisions: quarantine table + alert vs reject-and-log? (recommend quarantine table.)
- Fix 4: add `raw_serial_number` column (recommended); schema change.
- Fix 5: add `legacy_cert_text` column + publisher branch (recommended); schema + publisher change.
- Tonight: scope = Fix 1 + Phytec re-push only (smallest safe win); the rest staged after.