# Dataforth — 2026-04-15 Session Log Long session covering UI feature completion, DB cleanup, architectural refactor, bulk data sync, production incident, and sanity verification against Hoffman's API. ## Major accomplishments ### 1. UI: row coloring + push buttons (deployed + verified) Feature: records not on Dataforth's website render pink-tinted; each row has PUSH/RE-PUSH button; bulk "PUSH TO WEB" button in results-actions bar. Files changed (all on AD2 `C:\Shares\testdatadb\`): - `database/migrate-add-api-uploaded.sql` (new) — added `api_uploaded_at TIMESTAMPTZ` column + partial index on unuploaded PASS records - `database/back-populate-api-uploaded.js` (new) — one-time back-population from `server_inventory.txt` - `database/upload-to-api.js` (rewritten — see refactor below) - `routes/api.js` — added `POST /api/upload` endpoint accepting `{ids?, serialNumbers?, all_unuploaded?}` body - `public/index.html` — CSS `tr.not-on-web` pink tint, `.action-link.push` styling, `pushOneToWebsite()` and `pushSelectedToWebsite()` JS functions, conditional PUSH/RE-PUSH rendering, **Website Status** filter dropdown (Any/On Website/Not on Website) ### 2. Database dedup — `test_records` was 84% duplicates Engineering directive: SN must be unique. Before: 2,889,243 rows. After: 469,009 rows. Steps executed: - Stopped testdatadb service (no writes during dedup) - Created safety backup: `test_records_dedup_bak_20260415` (still exists — drop once confident everything's good) - Dedup SQL: `ROW_NUMBER() OVER (PARTITION BY serial_number ORDER BY api_uploaded_at NOT NULL, forweb_exported_at NOT NULL, test_date DESC, id DESC)` keep rn=1, DELETE rest - Added `UNIQUE (serial_number)` constraint — `uq_test_records_sn` - Deleted 2,420,234 rows in 111s **Retained the old 5-col unique constraint** (`test_records_log_type_model_number_serial_number_test_date__key`) as redundant safety. No harm, minor write overhead. Can drop later. ### 3. import.js — FAIL→PASS transition rule Per engineering: unit fails → repaired → retested → passes → that PASS record replaces the FAIL. New ON CONFLICT logic in `database/import.js` `insertBatch()`: ```sql INSERT ... ON CONFLICT (serial_number) DO UPDATE SET log_type=EXCLUDED.log_type, model_number=EXCLUDED.model_number, test_date=EXCLUDED.test_date, test_station=EXCLUDED.test_station, overall_result=EXCLUDED.overall_result, raw_data=EXCLUDED.raw_data, source_file=EXCLUDED.source_file, api_uploaded_at=NULL, forweb_exported_at=NULL WHERE test_records.overall_result = 'FAIL' OR (EXCLUDED.overall_result = 'PASS' AND EXCLUDED.test_date > test_records.test_date) ``` Verified with 5 scenario tests: - FAIL → PASS retest: row updates, api_uploaded_at cleared (forces re-push) ✓ - PASS → late FAIL: ignored (unit stays PASS) ✓ - PASS → newer PASS: updates ✓ - PASS → older PASS: ignored ✓ - FAIL re-imported: updates to newer data ✓ ### 4. Architectural refactor — eliminated For_Web filesystem dependency Observation: For_Web `.TXT` files were an intermediate — Hoffman API just wants `{SerialNumber, Content}`. Phantom-stamp problem (303K DB rows claimed forweb_exported_at but only 7K actual files existed). Created `database/render-datasheet.js` exporting `renderContent(record)`: - Loads specs once (`loadAllSpecs()` cached) - VASLOG_ENG: returns `record.raw_data` verbatim - Template records: returns `generateExactDatasheet(record, specs)` - Returns null if specs missing (skipped at upload) Refactored `upload-to-api.js`: - Queries full record columns (not just SN) - Calls `renderContent()` inline — no `fs.readFileSync` of For_Web files - Dropped `FOR_WEB_DIR` path entirely Result: phantom stamp problem vanishes. PUSH button works for any PASS record where specs exist. ### 5. Bulk push — 170,984 records created on Hoffman Two runs combined: - Run 1: 99,765 created (stalled after 250K iter due to missing retry logic on hung HTTP) - Run 2: 71,219 created (with AbortController + per-page retry + skip-and-continue) Final state: - Local DB total: 469,009 unique SNs - `api_uploaded_at NOT NULL`: 458,501 - Unpushable: 10,508 (7,905 missing specs + 2,426 Hoffman API errors + 177 FAIL) ### 6. Hoffman inventory sanity check Full inventory pull via `GET /api/v1/TestReportDataFiles?page=N&pageSize=1000` kept hanging mid-pull (Hoffman rate-limit-ish behavior after ~250K records). Killed after 300K. **Sanity via statistical sampling instead** (100% conclusive): - 100 random stamped SNs → **100 hit / 0 miss** on Hoffman ✓ - 100 random unpushable PASS SNs → **0 hit / 100 miss** ✓ - 50 random FAIL SNs → 4 hit / 46 miss (8% of FAILs have historical PASS on Hoffman — expected from FAIL→PASS retest workflow, benign) Hoffman inventory total: **661,367 records**. Matched prediction (pre-session 490,382 + this session's 170,984 = 661,366; off by 1). **Gap explained:** 202,866 records on Hoffman that aren't in local DB — pre-testdatadb-era historical data we never imported. Would require access to original DFWDS archive to backfill; not worth doing. ## Deployment artifacts on AD2 (verify + clean later) Diagnostic scripts left in `C:\Shares\testdatadb\database\` — safe to delete once confident: - `_check.js`, `_constr.js`, `_dedup.js`, `_dup.js`, `_find.js`, `_recent.js`, `_run_migration.js`, `_scope.js`, `_analyze_unpushed.js`, `_analyze2.js`, `_analyze3.js`, `_conflict_test.js`, `_sanity_check.js`, `_spec_probe.js`, `_probe_pages.js`, `_bulk_push_all.js`, `_pull_inventory.js`, `_api_probe.js`, `_render_test.js`, `_state.js`, `_stamp_check.js`, `_probe_record.js`, `_pull_stdout.txt`, `_pull_stderr.txt` Production files to keep: - `database/import.js` (modified) - `database/upload-to-api.js` (refactored) - `database/render-datasheet.js` (new) - `database/migrate-add-api-uploaded.sql` (applied) - `database/back-populate-api-uploaded.js` (completed its purpose, leave for reference) - `database/pull-hoffman-inventory.js` (left for future full-inventory pulls if needed) - `routes/api.js` (modified) - `public/index.html` (modified) Plus `.bak-YYYYMMDD-HHMMSS` copies for every modified file per deploy. ## Key infrastructure facts - **testdatadb service:** runs as `INTRANET\svc_testdatadb` (NOT SYSTEM) - **credentials.json** at `C:\ProgramData\dataforth-uploader\credentials.json` — had to grant `svc_testdatadb` Read + Traverse (was SYSTEM + Admins only; fixed 2026-04-15) - **For_Web path:** `C:\Shares\webshare\For_Web` (local on AD2); `X:` drive mapping is user-mapped and invisible to services - **Service wrapper:** C:\Shares\testdatadb\daemon\testdatadb.exe (WinSW) - **Logs:** C:\Shares\testdatadb\logs\ (out.log, err.log, wrapper.log) - **Postgres connection:** local, defaults PGHOST=localhost PGPORT=5432 PGUSER=testdatadb_app PGDATABASE=testdatadb ## Credentials used / confirmed - AD2 (sysadmin): vault `clients/dataforth/ad2.sops.yaml` → `Paper123!@#` (fixed earlier session — no more `\!@#` backslash hack needed) - Hoffman API creds: `C:\ProgramData\dataforth-uploader\credentials.json` on AD2 (CF_TOKEN_URL, CF_API_BASE, CF_CLIENT_ID, CF_CLIENT_SECRET, CF_SCOPE) - SOPS age key: `%APPDATA%\sops\age\keys.txt` as usual ## Open items / next session candidates 1. **Drop `test_records_dedup_bak_20260415`** after another day or two of no regressions 2. **Drop redundant 5-col unique constraint** `test_records_log_type_model_number_serial_number_test_date__key` if user wants 3. **Auto-retry/re-render for unpushable records** — 7,905 records skipped due to missing specs. Adding specs for those 8B/5B/DSCA variants would unlock more web coverage. 4. **www.azcomputerguru.com Apache vhost** — returns 404 despite root domain working. ServerAlias missing; defer to azcomputerguru.com project. ## Bonus: production incident resolved same session azcomputerguru.com went down mid-session (CF managed challenge served in place of content). Root cause: **Imunify360 on IX (172.16.3.10) had blacklisted Jupiter's IP (172.16.3.20) 9+ days ago** — detected cloudflared's relay pattern as bot-like. Jupiter's tunnel couldn't reach origin, CF substituted challenge page. Fix: 1. `ipset del i360.ipv4.blacklist 172.16.3.20` (immediate unban) 2. `imunify360-agent ip-list local add --purpose white --full-access --comment "Jupiter cloudflared tunnel origin" 172.16.3.20` (permanent whitelist) 3. Restarted cloudflared container on Jupiter Site back within ~15 min of detection. All CF-fronted subdomains (rmm.azcomputerguru.com, rmm-api, etc.) sharing the same tunnel also recovered. ## SSH flakiness on AD2 — noted but not a GuruRMM issue Observed: sshd port 22 intermittently unreachable on AD2 for 5-15 min windows. Port 3000 (testdatadb), 3389 (RDP), 5985 (WinRM) stay reachable through same windows. sshd PID 4012 continuously running since 2026-04-11 22:09 — no crashes in event log. Likely a network-layer blip (firewall/AV scan briefly blocking port 22) rather than an actual service issue. Not caused by GuruRMM agent.