Whole-source sweep (981,716 records / 406,549 serials): 6,515 same-day multi-run events; DB holds a NON-latest run for 311 (the strictly-greater-date conflict rule freezes on an arbitrary same-day run). Corrects the verdict doc to flag same-day retests as a latest-wins faithfulness violation (not benign). Adds the proposed >= -with-data-differs conflict-rule fix (diagnose-only) and the sweep tool. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
51 lines
4.9 KiB
Markdown
51 lines
4.9 KiB
Markdown
# Parsing Fidelity Verdict — testdatadb ingestion vs original staged datasheets
|
||
|
||
**Date:** 2026-06-17 · **Host:** AD2 · **Scope:** all 11,922 staged original `.TXT` datasheets vs the PostgreSQL `test_records`
|
||
**Raw report:** `PARSING-FIDELITY-REPORT-2026-06-17.txt` · **Tool:** `datasheet-pipeline/implementation/tools/validate-parsing.js`
|
||
|
||
## Verdict
|
||
|
||
**Two distinct questions, two answers:**
|
||
|
||
1. **Is the parser faithful to the `.DAT` record it reads?** YES — 0 genuine parse faults across 11,239 comparable records. Every value the importer stores is byte-exact; no misreads, no mis-segmentation.
|
||
2. **Does each DB row faithfully reproduce the unit's *final* test sheet?** NOT always. The DB is one-row-per-serial, and for units re-tested **on the same calendar date** the conflict rule (strictly-greater date) freezes on an arbitrary same-day run instead of the latest. Whole-source sweep: **311 (serial,date) groups where the DB holds a non-latest same-day run** (see `SAMEDAY-RETEST-EXPOSURE-2026-06-17.txt`). This is a data-model / conflict-rule defect, not a parser fault — fix proposed in `CONFLICT-RULE-FIX-PROPOSAL-2026-06-17.md`.
|
||
|
||
The remaining staged-sample "mismatches" were explained by legitimate later-date retests (latest-wins working), reused generic serials, VAS format, or legacy out-of-scope units.
|
||
|
||
## Method
|
||
|
||
Compared each staged original `.TXT` (the DOS-station ground truth, written *before* ingestion) against the DB record's `raw_data` (parsed from the `.DAT`). The cross-check keyed on **scale-invariant data**:
|
||
- **Error (%)** — dimensionless, identical in `.DAT` and `.TXT` for every family (immune to mV scaling and current-output V→mA conversion). The primary fidelity signal.
|
||
- **Stim setpoints** (scale-aware) — used to confirm the *same unit/test* when error values differed (retest vs wrong-record).
|
||
- Serial (with hex-prefix decode), model (SCM-prefix normalized), date.
|
||
|
||
## Results (11,921 staged files with an SN)
|
||
|
||
| Outcome | Count | Meaning |
|
||
|---|---:|---|
|
||
| **Consistent** (SN+model+date+5×error%) | **11,226** | Faithful parse, confirmed |
|
||
| Retest — DB date newer than `.TXT` | 35 | ON-CONFLICT updated DB to a later test (expected) |
|
||
| Retest — same date, stim matches, run differs | 42 | **FAITHFULNESS VIOLATION** — unit tested 2+ times same day; DB froze on a non-latest run (strictly-greater-date rule). Staged subset of the 311 whole-source cases. |
|
||
| VAS/single-point format | 5 | No 5-row accuracy block (SCMVAS) — not comparable by this method |
|
||
| Serial collision (generic SN, diff family) | 2 | `1-1`/`1-2` reused across products; unique-on-serial keeps one |
|
||
| **Genuine parse fault** | **0** | — |
|
||
| Model variant mismatch (same family) | 2 | `A819-1`/`A821-2` — reused serial across 8B35/8B36 (collision) |
|
||
| DB older than `.TXT` | 1 | `A821-1` — same collision pair |
|
||
| Accuracy-row-count diff | 0 | — |
|
||
|
||
### Why the "genuine" bucket collapsed to 0
|
||
The last 16 suspects were all `SCM5B37K-1530` (K-thermocouple). Their stim values matched the same 5 nominal setpoints (-50/112.5/275/437.5/600 °C) but differed by ~0.06 °C run-to-run — because the thermocouple input is a *measured analog value*, not an exact setpoint. A scale+relative stim tolerance correctly classifies them as same-day retests. A real segmentation fault would show a different setpoint *structure*; none did.
|
||
|
||
## Two follow-up items (NOT parsing-correctness bugs)
|
||
|
||
1. **608 staged originals have no DB record** (mostly A-prefix `10xxx` serials, e.g. `A243-1` = `10243-1`, model `5B45-25D`). These exist as staged `.TXT` but are absent from the DB under both decoded and encoded serial. This is an **ingestion-completeness** question (the source `.DAT` for these units appears to be out of the import scan scope, or these are custom `-NND` variants), separate from parsing fidelity. Worth a completeness pass: confirm which `.DAT` paths the importer scans and whether these models' `.DAT` files are present.
|
||
2. **Same-day retests don't apply "latest wins" (PRIMARY DEFECT).** The `ON CONFLICT` rule updates only when `EXCLUDED.test_date > test_records.test_date` (strictly greater). For a unit tested 2+ times on one date, the rule freezes on whichever same-day run the import processed first and never advances to the latest — so the DB (and the website cert) can show non-final measured values. Whole-source exposure (981,716 records / 406,549 serials): **6,515 same-day multi-run events across 5,977 serials; the DB holds a non-latest run for 311 of them** (3,803 already on latest, 984 superseded by a later-date retest, 1,417 serial absent). Fix proposed in `CONFLICT-RULE-FIX-PROPOSAL-2026-06-17.md`. Directly audit-relevant: same-day runs are typically trim/re-test iterations and the **last** run is the accepted cert result.
|
||
|
||
## How to re-run
|
||
|
||
```bash
|
||
cd C:\Shares\testdatadb
|
||
node <path-to>/validate-parsing.js [optional-report-path]
|
||
# Reads C:\Shares\test\STAGE\**\*.TXT and compares to test_records. Read-only.
|
||
```
|