Files
claudetools/projects/dataforth-dos/PARSING-FIDELITY-VERDICT-2026-06-17.md
Mike Swanson d58d1dd76c dataforth(datasheet): same-day retest faithfulness — exposure sweep + fix proposal
Whole-source sweep (981,716 records / 406,549 serials): 6,515 same-day multi-run
events; DB holds a NON-latest run for 311 (the strictly-greater-date conflict rule
freezes on an arbitrary same-day run). Corrects the verdict doc to flag same-day
retests as a latest-wins faithfulness violation (not benign). Adds the proposed
>= -with-data-differs conflict-rule fix (diagnose-only) and the sweep tool.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-18 13:02:32 -07:00

51 lines
4.9 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
# Parsing Fidelity Verdict — testdatadb ingestion vs original staged datasheets
**Date:** 2026-06-17 · **Host:** AD2 · **Scope:** all 11,922 staged original `.TXT` datasheets vs the PostgreSQL `test_records`
**Raw report:** `PARSING-FIDELITY-REPORT-2026-06-17.txt` · **Tool:** `datasheet-pipeline/implementation/tools/validate-parsing.js`
## Verdict
**Two distinct questions, two answers:**
1. **Is the parser faithful to the `.DAT` record it reads?** YES — 0 genuine parse faults across 11,239 comparable records. Every value the importer stores is byte-exact; no misreads, no mis-segmentation.
2. **Does each DB row faithfully reproduce the unit's *final* test sheet?** NOT always. The DB is one-row-per-serial, and for units re-tested **on the same calendar date** the conflict rule (strictly-greater date) freezes on an arbitrary same-day run instead of the latest. Whole-source sweep: **311 (serial,date) groups where the DB holds a non-latest same-day run** (see `SAMEDAY-RETEST-EXPOSURE-2026-06-17.txt`). This is a data-model / conflict-rule defect, not a parser fault — fix proposed in `CONFLICT-RULE-FIX-PROPOSAL-2026-06-17.md`.
The remaining staged-sample "mismatches" were explained by legitimate later-date retests (latest-wins working), reused generic serials, VAS format, or legacy out-of-scope units.
## Method
Compared each staged original `.TXT` (the DOS-station ground truth, written *before* ingestion) against the DB record's `raw_data` (parsed from the `.DAT`). The cross-check keyed on **scale-invariant data**:
- **Error (%)** — dimensionless, identical in `.DAT` and `.TXT` for every family (immune to mV scaling and current-output V→mA conversion). The primary fidelity signal.
- **Stim setpoints** (scale-aware) — used to confirm the *same unit/test* when error values differed (retest vs wrong-record).
- Serial (with hex-prefix decode), model (SCM-prefix normalized), date.
## Results (11,921 staged files with an SN)
| Outcome | Count | Meaning |
|---|---:|---|
| **Consistent** (SN+model+date+5×error%) | **11,226** | Faithful parse, confirmed |
| Retest — DB date newer than `.TXT` | 35 | ON-CONFLICT updated DB to a later test (expected) |
| Retest — same date, stim matches, run differs | 42 | **FAITHFULNESS VIOLATION** — unit tested 2+ times same day; DB froze on a non-latest run (strictly-greater-date rule). Staged subset of the 311 whole-source cases. |
| VAS/single-point format | 5 | No 5-row accuracy block (SCMVAS) — not comparable by this method |
| Serial collision (generic SN, diff family) | 2 | `1-1`/`1-2` reused across products; unique-on-serial keeps one |
| **Genuine parse fault** | **0** | — |
| Model variant mismatch (same family) | 2 | `A819-1`/`A821-2` — reused serial across 8B35/8B36 (collision) |
| DB older than `.TXT` | 1 | `A821-1` — same collision pair |
| Accuracy-row-count diff | 0 | — |
### Why the "genuine" bucket collapsed to 0
The last 16 suspects were all `SCM5B37K-1530` (K-thermocouple). Their stim values matched the same 5 nominal setpoints (-50/112.5/275/437.5/600 °C) but differed by ~0.06 °C run-to-run — because the thermocouple input is a *measured analog value*, not an exact setpoint. A scale+relative stim tolerance correctly classifies them as same-day retests. A real segmentation fault would show a different setpoint *structure*; none did.
## Two follow-up items (NOT parsing-correctness bugs)
1. **608 staged originals have no DB record** (mostly A-prefix `10xxx` serials, e.g. `A243-1` = `10243-1`, model `5B45-25D`). These exist as staged `.TXT` but are absent from the DB under both decoded and encoded serial. This is an **ingestion-completeness** question (the source `.DAT` for these units appears to be out of the import scan scope, or these are custom `-NND` variants), separate from parsing fidelity. Worth a completeness pass: confirm which `.DAT` paths the importer scans and whether these models' `.DAT` files are present.
2. **Same-day retests don't apply "latest wins" (PRIMARY DEFECT).** The `ON CONFLICT` rule updates only when `EXCLUDED.test_date > test_records.test_date` (strictly greater). For a unit tested 2+ times on one date, the rule freezes on whichever same-day run the import processed first and never advances to the latest — so the DB (and the website cert) can show non-final measured values. Whole-source exposure (981,716 records / 406,549 serials): **6,515 same-day multi-run events across 5,977 serials; the DB holds a non-latest run for 311 of them** (3,803 already on latest, 984 superseded by a later-date retest, 1,417 serial absent). Fix proposed in `CONFLICT-RULE-FIX-PROPOSAL-2026-06-17.md`. Directly audit-relevant: same-day runs are typically trim/re-test iterations and the **last** run is the accepted cert result.
## How to re-run
```bash
cd C:\Shares\testdatadb
node <path-to>/validate-parsing.js [optional-report-path]
# Reads C:\Shares\test\STAGE\**\*.TXT and compares to test_records. Read-only.
```