claudetools/projects/dataforth-dos/PARSING-FIDELITY-VERDICT-2026-06-17.md
Mike Swanson d58d1dd76c dataforth(datasheet): same-day retest faithfulness — exposure sweep + fix proposal
Whole-source sweep (981,716 records / 406,549 serials): 6,515 same-day multi-run
events; DB holds a NON-latest run for 311 (the strictly-greater-date conflict rule
freezes on an arbitrary same-day run). Corrects the verdict doc to flag same-day
retests as a latest-wins faithfulness violation (not benign). Adds the proposed
>= -with-data-differs conflict-rule fix (diagnose-only) and the sweep tool.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-18 13:02:32 -07:00


Parsing Fidelity Verdict — testdatadb ingestion vs original staged datasheets

Date: 2026-06-17 · Host: AD2 · Scope: all 11,922 staged original .TXT datasheets vs the PostgreSQL test_records table
Raw report: PARSING-FIDELITY-REPORT-2026-06-17.txt · Tool: datasheet-pipeline/implementation/tools/validate-parsing.js

Verdict

Two distinct questions, two answers:

  1. Is the parser faithful to the .DAT record it reads? YES — 0 genuine parse faults across 11,239 comparable records. Every value the importer stores is byte-exact; no misreads, no mis-segmentation.
  2. Does each DB row faithfully reproduce the unit's final test sheet? NOT always. The DB is one-row-per-serial, and for units re-tested on the same calendar date the conflict rule (strictly-greater date) freezes on an arbitrary same-day run instead of the latest. Whole-source sweep: 311 (serial,date) groups where the DB holds a non-latest same-day run (see SAMEDAY-RETEST-EXPOSURE-2026-06-17.txt). This is a data-model / conflict-rule defect, not a parser fault — fix proposed in CONFLICT-RULE-FIX-PROPOSAL-2026-06-17.md.

The remaining staged-sample "mismatches" were explained by legitimate later-date retests (latest-wins working), reused generic serials, VAS format, or legacy out-of-scope units.

Method

Compared each staged original .TXT (the DOS-station ground truth, written before ingestion) against the DB record's raw_data (parsed from the .DAT). The cross-check keyed on scale-invariant data:

  • Error (%) — dimensionless, identical in .DAT and .TXT for every family (immune to mV scaling and current-output V→mA conversion). The primary fidelity signal.
  • Stim setpoints (scale-aware) — used to confirm the same unit/test when error values differed (retest vs wrong-record).
  • Serial (with hex-prefix decode), model (SCM-prefix normalized), date.
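The cross-check keys above can be sketched as follows. This is a minimal illustration, not the logic of validate-parsing.js: every function name, field name, and sample value is hypothetical. The hex-prefix decode is inferred from the single example given in this report (A243-1 = 10243-1), and the SCM-prefix normalization is assumed to mean dropping a leading "SCM".

```javascript
// Illustrative only: helper names, field names, and sample values are
// assumptions, not taken from validate-parsing.js.

// Assumed hex-prefix decode: a leading hex letter becomes its decimal value
// ("A243-1" -> "10243-1", the one example given in this report).
function decodeSerial(sn) {
  return sn.replace(/^[A-F]/i, (ch) => String(parseInt(ch, 16)));
}

// Assumed SCM-prefix normalization: drop a leading "SCM" from the model name.
function normalizeModel(model) {
  return model.replace(/^SCM/i, "");
}

// Error (%) is dimensionless, so values from the two sources compare directly.
function sameErrors(a, b, eps = 1e-9) {
  return a.length === b.length && a.every((v, i) => Math.abs(v - b[i]) <= eps);
}

// "Consistent" here means serial, model, date, and all error rows agree.
function isConsistent(txt, db) {
  return (
    decodeSerial(txt.serial) === decodeSerial(db.serial) &&
    normalizeModel(txt.model) === normalizeModel(db.model) &&
    txt.date === db.date &&
    sameErrors(txt.errorPct, db.errorPct)
  );
}

// Made-up sample pair (not a real unit).
const txtSheet = {
  serial: "B100-7",
  model: "SCM5B99-00",
  date: "2026-06-17",
  errorPct: [0.01, -0.02, 0.0, 0.015, -0.005],
};
const dbRow = {
  serial: "11100-7",
  model: "5B99-00",
  date: "2026-06-17",
  errorPct: [0.01, -0.02, 0.0, 0.015, -0.005],
};

console.log(isConsistent(txtSheet, dbRow));
```

Keying on Error (%) first means the check needs no knowledge of each family's output scaling; the stim comparison is only consulted when the error rows disagree.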

Results (11,921 staged files with an SN)

| Outcome | Count | Meaning |
|---|---|---|
| Consistent (SN+model+date+5×error%) | 11,226 | Faithful parse, confirmed |
| Retest: DB date newer than .TXT | 35 | ON-CONFLICT updated DB to a later test (expected) |
| Retest: same date, stim matches, run differs | 42 | FAITHFULNESS VIOLATION: unit tested 2+ times same day; DB froze on a non-latest run (strictly-greater-date rule). Staged subset of the 311 whole-source cases. |
| VAS/single-point format | 5 | No 5-row accuracy block (SCMVAS); not comparable by this method |
| Serial collision (generic SN, diff family) | 2 | 1-1/1-2 reused across products; unique-on-serial keeps one |
| Genuine parse fault | 0 | |
| Model variant mismatch (same family) | 2 | A819-1/A821-2: reused serial across 8B35/8B36 (collision) |
| DB older than .TXT | 1 | A821-1: same collision pair |
| Accuracy-row-count diff | 0 | |
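As a quick arithmetic check, the outcome counts above appear to reconcile with the 11,921 staged files once the 608 staged originals with no DB record (listed under the follow-up items) are added back. The object keys below are shorthand labels chosen for this sketch; the numbers are the ones reported in this document.

```javascript
// Outcome counts copied from the results table; keys are shorthand labels.
const outcomeCounts = {
  consistent: 11226,
  retestLaterDate: 35,
  retestSameDay: 42,
  vasSinglePoint: 5,
  serialCollision: 2,
  genuineParseFault: 0,
  modelVariantMismatch: 2,
  dbOlderThanTxt: 1,
  accuracyRowCountDiff: 0,
};

const classifiedTotal = Object.values(outcomeCounts).reduce((sum, n) => sum + n, 0);
const noDbRecord = 608; // staged originals with no DB record
const stagedWithSerial = 11921; // staged files with an SN

console.log(classifiedTotal); // 11313
console.log(classifiedTotal + noDbRecord === stagedWithSerial); // true
```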

Why the "genuine" bucket collapsed to 0

The last 16 suspects were all SCM5B37K-1530 (K-thermocouple). Their stim values matched the same 5 nominal setpoints (-50/112.5/275/437.5/600 °C) but differed by ~0.06 °C run-to-run, because the thermocouple input is a measured analog value, not an exact setpoint. A scale-aware, relative stim tolerance correctly classifies them as same-day retests. A real segmentation fault would show a different setpoint structure; none did.
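A scale-aware relative tolerance of this kind might look like the sketch below. The function name, the 0.5% tolerance, and the candidate scale factors are all assumptions made for illustration; the actual thresholds used by the validation tool are not stated in this report. The measured values in the example are invented to mimic a ~0.06 °C run-to-run offset.

```javascript
// Hypothetical helper: do two stim columns describe the same setpoint
// structure, allowing a unit scale factor and a relative tolerance?
// relTol and scales are illustrative defaults, not the tool's real settings.
function sameSetpoints(a, b, { relTol = 0.005, scales = [1, 1e3, 1e-3] } = {}) {
  if (a.length !== b.length) return false;
  return scales.some((scale) =>
    a.every((x, i) => {
      const y = b[i] * scale;
      // Relative tolerance only; a setpoint of exactly 0 would also need an
      // absolute floor, which this sketch omits.
      return Math.abs(x - y) <= relTol * Math.max(Math.abs(x), Math.abs(y));
    })
  );
}

const nominal = [-50, 112.5, 275, 437.5, 600]; // nominal setpoints, °C
const measured = [-49.94, 112.56, 275.05, 437.44, 600.06]; // invented readings
const otherStructure = [-100, 0, 100, 200, 300]; // a different setpoint layout

console.log(sameSetpoints(nominal, measured)); // true
console.log(sameSetpoints(nominal, otherStructure)); // false
```

A relative tolerance keeps the check meaningful across families whose stim values span very different magnitudes, which a single absolute threshold would not.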

Two follow-up items (NOT parsing-correctness bugs)

  1. 608 staged originals have no DB record (mostly A-prefix 10xxx serials, e.g. A243-1 = 10243-1, model 5B45-25D). These exist as staged .TXT but are absent from the DB under both decoded and encoded serial. This is an ingestion-completeness question (the source .DAT for these units appears to be out of the import scan scope, or these are custom -NND variants), separate from parsing fidelity. Worth a completeness pass: confirm which .DAT paths the importer scans and whether these models' .DAT files are present.
  2. Same-day retests don't apply "latest wins" (PRIMARY DEFECT). The ON CONFLICT rule updates only when EXCLUDED.test_date > test_records.test_date (strictly greater). For a unit tested 2+ times on one date, the rule freezes on whichever same-day run the import processed first and never advances to the latest, so the DB (and the website cert) can show non-final measured values. Whole-source exposure (981,716 records / 406,549 serials): 6,515 same-day multi-run events across 5,977 serials. Of those events, the DB holds a non-latest run for 311; 3,803 are already on the latest run, 984 were superseded by a later-date retest, and for 1,417 the serial is absent from the DB. Fix proposed in CONFLICT-RULE-FIX-PROPOSAL-2026-06-17.md. Directly audit-relevant: same-day runs are typically trim/re-test iterations and the last run is the accepted cert result.
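The freezing behavior of the strictly-greater rule can be reproduced with a small in-memory simulation. This is a sketch only: it models one row per serial in a Map rather than running SQL, the record shape is hypothetical, and the second predicate is one reading of the ">= with data differs" wording in the commit message, not the text of the fix proposal.

```javascript
// Simulates a one-row-per-serial upsert. shouldUpdate(incoming, existing)
// plays the role of the ON CONFLICT ... WHERE condition.
function importRuns(runs, shouldUpdate) {
  const db = new Map();
  for (const run of runs) {
    const existing = db.get(run.serial);
    if (existing === undefined || shouldUpdate(run, existing)) {
      db.set(run.serial, run);
    }
  }
  return db;
}

// Current rule as described above: update only on a strictly greater date.
const strictlyGreater = (incoming, existing) => incoming.testDate > existing.testDate;

// Assumed reading of the proposed rule: same-or-later date AND data differs.
const sameOrLaterAndDiffers = (incoming, existing) =>
  incoming.testDate >= existing.testDate && incoming.rawData !== existing.rawData;

// Three made-up runs of one unit on the same calendar date (ISO date strings
// compare correctly as plain strings).
const sameDayRuns = [1, 2, 3].map((n) => ({
  serial: "S-1",
  testDate: "2026-06-17",
  run: n,
  rawData: `values from run ${n}`,
}));

const frozen = importRuns(sameDayRuns, strictlyGreater).get("S-1").run;
const advanced = importRuns(sameDayRuns, sameOrLaterAndDiffers).get("S-1").run;

console.log(frozen); // 1 (first processed run is kept)
console.log(advanced); // 3 (last processed run is kept)
```

Under this reading, the same-or-later rule keeps the last run processed, which is the latest run only if same-day runs are imported in chronological order; with the input reversed, it would settle on run 1.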

How to re-run

cd C:\Shares\testdatadb
node <path-to>/validate-parsing.js [optional-report-path]
# Reads C:\Shares\test\STAGE\**\*.TXT and compares to test_records. Read-only.