wiki: seed Dataforth client + dataforth-dos project articles

wiki/clients/dataforth.md — 278 lines: prepaid block contract, all
servers/IPs, full contact table, M365/CA policy details, GuruRMM
enrollment, patterns (RDS/SAGE-SQL quirks, AD anomalies, C2 iptables
not persistent, Win7 EOL), security incident history table.

wiki/projects/dataforth-dos.md — 474 lines: DOS update system +
TestDataDB pipeline, PostgreSQL schema, FAIL→PASS retest rule,
H-prefix decode table, security incident (DF-JOEL2/MFA/IC3), D2TESTNAS
role, Neptune SBR email routing, Hoffman API, all anti-patterns.

wiki/index.md — Dataforth added to Clients + Projects tables and
Cross-Reference; d2testnas added to compilation queue.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-24 17:56:25 -07:00
parent 18c5a89abc
commit 63109d9033
3 changed files with 757 additions and 1 deletions


@@ -0,0 +1,474 @@
---
type: project
name: dataforth-dos
display_name: Dataforth DOS — Test Datasheet Pipeline
last_compiled: 2026-05-24
compiled_by: DESKTOP-0O8A1RL/claude-main
sources:
- projects/dataforth-dos/CONTEXT.md
- projects/dataforth-dos/PROJECT_STATE.md
- projects/dataforth-dos/PROJECT_INDEX.md
- projects/dataforth-dos/TEST-DATASHEET-PROCESS.md
- projects/dataforth-dos/Sync-FromNAS.ps1
- projects/dataforth-dos/session-logs/2026-01-20-session.md
- projects/dataforth-dos/session-logs/2026-01-21-session.md
- projects/dataforth-dos/session-logs/2026-03-11-testdatadb-investigation.md
- projects/dataforth-dos/session-logs/2026-03-12-session.md
- projects/dataforth-dos/session-logs/2026-03-13-import-fix.md
- projects/dataforth-dos/session-logs/2026-03-16-session.md
- projects/dataforth-dos/session-logs/2026-04-11-discovery-session.md
- projects/dataforth-dos/session-logs/2026-04-12-session.md
- projects/dataforth-dos/session-logs/2026-04-15-session.md
- projects/dataforth-dos/session-logs/2026-05-12-session.md
- clients/dataforth/session-logs/2026-03-27-security-incident-mfa-datasheets.md
- clients/dataforth/session-logs/SESSION-SUMMARY.md
- clients/dataforth/session-logs/MEMORY.md
- clients/dataforth/session-logs/2026-04-12-session.md
- clients/dataforth/session-logs/2026-04-13-session.md
- clients/dataforth/session-logs/2026-04-14-session.md
- clients/dataforth/docs/manufacturing.md
- .claude/memory/project_datasheet_pipeline.md
- .claude/memory/project_dataforth_incident_2026-03-27.md
- .claude/memory/feedback_d2testnas_ssh.md
backlinks:
- clients/dataforth
- systems/jupiter
---
# Dataforth DOS — Test Datasheet Pipeline
Dataforth Corporation manufactures signal conditioning and data acquisition modules and operates 64 automated test stations running MS-DOS 6.22. This project covers the full infrastructure that routes test data from those DOS stations into a modern, web-published test datasheet system. The "DOS" name reflects the project's origin (the DOS Update System, 2026-01-20), but the scope has since expanded to include the TestDataDB service, the Hoffman API integration, and a major security incident response.
**Current state (2026-05-12):** Production pipeline fully operational: 469K records in PostgreSQL, 458.5K live on www.dataforth.com. The daily scheduled task runs at 02:30 AM.
---
## Summary
Dataforth's 64 QuickBASIC 4.5 test programs produce binary test logs for every module that passes through manufacturing. These logs must be converted to formatted "test data sheets" — text documents matching a proprietary QuickBASIC template — and published on the public Dataforth website for customer download.
The original pipeline involved DOS batch files, a VB6 filename decoder (DFWDS.exe), and a VB.NET web uploader, all of which were destroyed in a 2025 ransomware attack that wiped AD2. ACG rebuilt the pipeline in March 2026 using Node.js and a modern REST API and has continued iterating since: PostgreSQL migration, SCMVAS/SCMHVAS product line support, DB dedup, FAIL→PASS retest logic, direct API upload, and email notifications.
The project also encompasses the original DOS Update System (batch file infrastructure for over-the-network software updates to the 64 test stations, production-ready since 2026-01-20).
---
## Architecture
### Component Overview
```
DOS Test Stations (64)
MS-DOS 6.22, QuickBASIC 4.5 ATE
Writes binary .DAT files per test run
|
| CTONW.BAT — uploads .DAT via SMB1
v
D2TESTNAS (192.168.0.9) — Samba SMB1 bridge
rsync daemon port 873, module "test"
/data/test → serves as T:\ for DOS stations
|
| Sync-FromNAS.ps1 — every 15 min on AD2
v
AD2 (192.168.0.6) — C:\Shares\test\ (NAS mirror)
|
| testdatadb service — import.js
v
PostgreSQL 18 (local on AD2) — testdatadb DB
469,009 unique records (UNIQUE on serial_number)
|
| upload-to-api.js — real-time after import
| run-pipeline.ps1 — daily 02:30 AM fallback
v
Hoffman Product API — https://www.dataforth.com
/api/v1/TestReportDataFiles/bulk
OAuth2 client_credentials
458,501 records live on website (as of 2026-04-15)
661,367 total on Hoffman (includes 202,866 pre-testdatadb historical)
```
### Components
| Component | Location | Tech | State |
|---|---|---|---|
| DOS test stations | Physical (64 stations, TS-1 through TS-30 + variants) | MS-DOS 6.22, QuickBASIC 4.5 | Production |
| D2TESTNAS NAS | 192.168.0.9 | Linux (Samba, rsync daemon) | Production |
| Sync-FromNAS.ps1 | AD2 C:\Shares\test\scripts\ | PowerShell, rsync over SSH | Production — every 15 min |
| testdatadb service | AD2 C:\Shares\testdatadb\ | Node.js v20, Express, WinSW | Production — runs as INTRANET\svc_testdatadb |
| PostgreSQL 18 | AD2 localhost:5432 | PostgreSQL | Production |
| Dashboard UI | http://192.168.0.6:3000/ | Node.js/Express/HTML | Production — internal LAN only |
| Daily scheduled task | AD2 (Task Scheduler) | PowerShell + Node.js | Production — 02:30 AM as SYSTEM |
| Hoffman Product API | https://www.dataforth.com | REST/OAuth2 | Third-party (Ken Hoffman) |
| DOS Update System | AD2 C:\Shares\test\ + D2TESTNAS /data/test/ | DOS batch files, PS deployment | Production since 2026-01-20 |
### Two Parallel Upload Paths
**Path 1 — Real-time (preferred):**
import.js ingests new .DAT records → PostgreSQL → upload-to-api.js immediately pushes PASS records to Hoffman API. `api_uploaded_at` stamped on success.
**Path 2 — Daily scheduled fallback (02:30 AM):**
`DataforthTestDatasheetUploader` task runs `C:\ProgramData\dataforth-uploader\run-pipeline.ps1`. Runs dfwds-process.js (moves Test_Datasheets → For_Web), then upload-delta.js (pushes all For_Web files). Acts as safety net for real-time path failures. Sends summary email via Graph API on completion.
### Key File Paths
**On AD2 (192.168.0.6):**
| Path | Purpose |
|---|---|
| `C:\Shares\testdatadb\` | testdatadb Node.js application root |
| `C:\Shares\testdatadb\database\import.js` | Parses .DAT files, inserts to PostgreSQL |
| `C:\Shares\testdatadb\database\upload-to-api.js` | Pushes records to Hoffman API |
| `C:\Shares\testdatadb\database\render-datasheet.js` | In-memory datasheet rendering (no FS dependency) |
| `C:\Shares\testdatadb\database\export-datasheets.js` | Legacy For_Web .TXT writer (retained for compat) |
| `C:\Shares\testdatadb\parsers\` | Binary .DAT parsers per log type (multiline.js, vaslog.js, etc.) |
| `C:\Shares\testdatadb\specdata\` | QuickBASIC binary spec files (5BMAIN.DAT, 7BMAIN.DAT, etc.) |
| `C:\Shares\testdatadb\templates\datasheet-exact.js` | Datasheet formatter replicating QuickBASIC output |
| `C:\Shares\testdatadb\logs\` | Service logs (out.log, err.log, wrapper.log) |
| `C:\Shares\testdatadb\daemon\testdatadb.exe` | WinSW service wrapper |
| `C:\Shares\webshare\For_Web\` | ~7,517 legacy .TXT datasheet files |
| `C:\Shares\webshare\Test_Datasheets\` | Staging dir for new datasheets from stations |
| `C:\Shares\webshare\Bad_Datasheets\` | ~18,801 invalid/quarantined files (historical) |
| `C:\Shares\webshare\Datasheets_Log\` | DFWDS processing run logs |
| `C:\Shares\test\` | NAS mirror of test station data |
| `C:\Shares\test\Ate\HISTLOGS\` | Central consolidated test logs (per log type, per model) |
| `C:\Shares\test\scripts\Sync-FromNAS-rsync.ps1` | NAS sync script |
| `C:\ProgramData\dataforth-uploader\` | Scheduled task scripts, credentials.json, pipeline logs |
| `C:\ProgramData\dataforth-uploader\credentials.json` | Hoffman API OAuth2 creds + Graph API creds; ACL: SYSTEM + Admins + svc_testdatadb |
| `C:\Users\sysadmin\Documents\dataforth-uploader\` | dfwds-process.js and upload-delta.js (used by daily task) |
**On AD1 (192.168.0.27):**
| Path | Purpose |
|---|---|
| `\\AD1\Engineering\ENGR\ATE\` | QuickBASIC source, spec files per product family |
| `\\AD1\Engineering\ENGR\ATE\High Voltage Input Module Test\HVDATA\hvin.dat` | SCMVAS/SCMHVAS spec binary (33 records, engineering MODNAMEs — NOT marketing names) |
**Repo (D:\claudetools):**
| Path | Purpose |
|---|---|
| `projects/dataforth-dos/` | Project root |
| `projects/dataforth-dos/datasheet-pipeline/` | Pipeline scripts and research |
| `projects/dataforth-dos/datasheet-pipeline/implementation/` | Production-staged code (deployer: deploy-to-ad2.py) |
| `projects/dataforth-dos/datasheet-pipeline/scmvas-hvas-research/` | Discovery artifacts for SCMVAS/SCMHVAS extension |
| `projects/dataforth-dos/deploy/` | Deployment scripts |
| `projects/dataforth-dos/TEST-DATASHEET-PROCESS.md` | End-user process documentation (audience: Dataforth Engineering) |
---
## Data Model & Datasheet Pipeline Detail
### PostgreSQL Schema (key table)
```sql
CREATE TABLE test_records (
id SERIAL PRIMARY KEY,
log_type VARCHAR(20) NOT NULL, -- 5BLOG, 7BLOG, 8BLOG, DSCLOG, SCTLOG, VASLOG, VASLOG_ENG, etc.
model_number VARCHAR(100) NOT NULL,
serial_number VARCHAR(100) NOT NULL,
test_date DATE,
test_station VARCHAR(50),
overall_result VARCHAR(10), -- 'PASS' or 'FAIL'
raw_data TEXT, -- decoded record content
source_file TEXT,
work_order VARCHAR(50),
datasheet_exported_at TIMESTAMPTZ,
forweb_exported_at TIMESTAMPTZ, -- legacy For_Web file write
api_uploaded_at TIMESTAMPTZ, -- when pushed to Hoffman
import_date TIMESTAMPTZ DEFAULT NOW(),
search_vector tsvector,
CONSTRAINT uq_test_records_sn UNIQUE (serial_number)
);
```
### FAIL→PASS Retest Rule
Engineering directive: if a unit fails, is repaired, and passes a later retest, the PASS record replaces the FAIL. But if a unit already has a PASS and later tests FAIL (e.g., post-ship retest), the PASS is kept. Encoded as `ON CONFLICT (serial_number) DO UPDATE ... WHERE test_records.overall_result = 'FAIL' OR (EXCLUDED.overall_result = 'PASS' AND EXCLUDED.test_date > test_records.test_date)`.
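The retest rule above can be restated as a small predicate. What follows is a minimal Python sketch for reasoning about the rule, not the production `import.js` code; `Record` and `should_replace` are hypothetical names, and the sketch assumes both test dates are present (the schema allows a NULL `test_date`, which is not modeled here).

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Record:
    overall_result: str  # 'PASS' or 'FAIL'
    test_date: date


def should_replace(existing: Record, incoming: Record) -> bool:
    """Mirror the WHERE clause of the ON CONFLICT ... DO UPDATE upsert."""
    # A stored FAIL is always overwritten by whatever arrives next.
    if existing.overall_result == "FAIL":
        return True
    # A stored PASS only yields to a PASS with a later test date.
    return (
        incoming.overall_result == "PASS"
        and incoming.test_date > existing.test_date
    )


stored_pass = Record("PASS", date(2026, 1, 5))
late_fail = Record("FAIL", date(2026, 2, 1))
print(should_replace(stored_pass, late_fail))  # False: the PASS is kept
```

Read this way, the clause also lets a stored FAIL be overwritten by another FAIL, because the first condition does not inspect the incoming result.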
### Log Types and Spec Files
| Log Type | Product Family | Spec File(s) | Notes |
|---|---|---|---|
| 5BLOG | SCM5B (isolated signal conditioning) | 5BMAIN.DAT, 5B45DATA.DAT, 5B49_2.DAT, DB5B48.DAT | 481 + 56 + 15 + 3 = 555 models |
| 7BLOG | SCM7B | 7BMAIN.DAT | 276 models |
| 8BLOG | 8B | 8BMAIN.DAT | 148 models |
| DSCLOG | DSCA | DSCMAIN4.DAT, DSCOUT.DAT | 391 + 23 = 414 models |
| SCTLOG | DSCT transmitters | SCTMAIN.DAT | 103 models |
| VASLOG | SCMVAS, SCMHVAS (production ATE path) | No spec file — accuracy-only format | Marketing names (SCMHVAS-M0100 etc.); do NOT look up in hvin.dat |
| VASLOG_ENG | SCMHVAS (engineering-tested variant) | None — verbatim file passthrough | 434 files imported; stored as raw_data, copied byte-exact to output |
| PWRLOG | Power supplies | Specs embedded in parser | — |
| SHT | Short-form records | Specs embedded | — |
**Total spec models: 1,470+** (stored at C:\Shares\testdatadb\specdata\ on AD2; source: \\AD1\Engineering\ENGR\ATE\<family>\<DATA>\)
### H-prefix Filename Decode
DOS QuickBASIC programs encode serial numbers using a letter prefix for the leading two digits:
A=10, B=11, C=12, D=13, E=14, F=15, G=16, H=17, I=18, J=19.
Example: `H8601-6.TXT` → serial `178601-6`.
**Always extract serial numbers from the DAT record data content, never from the 8.3 filename.**
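For cross-checking a legacy filename against the serial stored in a record, the decode table can be expressed in a few lines. This is an illustrative Python sketch only: `decode_serial_from_filename` is a hypothetical helper, it assumes every encoded name starts with one of the letters A through J, and it is not a substitute for reading the serial from the DAT content.

```python
import os

# A=10, B=11, ... J=19, as listed in the decode table above.
PREFIX_TO_DIGITS = {chr(ord("A") + i): str(10 + i) for i in range(10)}


def decode_serial_from_filename(filename: str) -> str:
    """Expand the letter prefix of an 8.3 datasheet name into two digits."""
    stem, _extension = os.path.splitext(filename)
    prefix = stem[:1].upper()
    if prefix not in PREFIX_TO_DIGITS:
        # Names outside the A-J scheme are not covered by the table.
        raise ValueError(f"no A-J prefix in {filename!r}")
    return PREFIX_TO_DIGITS[prefix] + stem[1:]


print(decode_serial_from_filename("H8601-6.TXT"))  # 178601-6
```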
### Dashboard Features
URL: `http://192.168.0.6:3000/` (internal LAN only)
- Filter by Serial Number, Work Order, Model Number, Result, Product Line, Website Status (Any/On Website/Not on Website), Test Station, Date range, full-text search
- **Pink tint** on rows = record not yet on Dataforth website (api_uploaded_at IS NULL)
- Per-row PUSH / RE-PUSH buttons; bulk "PUSH TO WEB" button
- CSV export of current filter set
- SHEET preview — renders the exact datasheet text that was/would be sent to website
---
## Security Incident History (2026-03-27)
### DF-JOEL2 Compromise
Joel Lohr's workstation was compromised via a phishing email sent to his personal Yahoo/Comcast account (it appeared to come from the Arizona Technology Council). Attacker "Angel Raya" installed ScreenConnect from a phishing link, then deployed two C2 backdoor ScreenConnect clients and used a tool to hide them from the uninstall list.
**Timeline (2026-03-27 MST):**
- 08:25 — Joel clicks phishing link
- 08:28 — ScreenConnect.ClientSetup.msi downloaded to C:\Users\jlohr\Downloads\
- 08:29 — "Angel Raya" connected via cloud relay `instance-wlb9ga-relay.screenconnect.com`
- 08:29 — Two C2 backdoor clients deployed via PowerShell
- 08:31 — Sordum "Hide From Uninstall List" tool downloaded
- 08:32 — Rogue clients hidden; "Angel Raya" disconnected
- 11:55 — "Administrator" connected via C2 IP 80.76.49.18
- 18:51 — Successful unauthorized M365 sign-in from Istanbul, Turkey (91.93.232.236)
**Attacker Infrastructure:**
- C2 Server 1: 80.76.49.18:8040/8041
- C2 Server 2: 45.88.91.99:8040/8041
- ASN: AS399486, Virtuo (12651980 CANADA INC.), Montreal QC
- Abuse: abuses@virtuo.host (automated suspension confirmed)
- Cloud relay: instance-wlb9ga-relay.screenconnect.com
- ConnectWise case: 03464184
**Rogue ScreenConnect clients (all removed):**
- `0cad93610010625f` — "Angel Raya" initial access (cloud relay)
- `0dfe1abae029411c` — C2 backdoor (80.76.49.18:8041)
- `a897d9a21259d116` — C2 backdoor (45.88.91.99:8041)
- Legitimate ACG client: `1912bf3444b41a08` (instance-kgc7jt) — NOT removed
**M365 Compromise (jlohr@dataforth.com):**
- Brute-force for 7+ days; successful logins from Istanbul Turkey, Croydon UK, Germany
- Azure AD PowerShell and Azure CLI used by attacker
- No malicious inbox rules, forwarding, or OAuth consents found
**Remediation completed 2026-03-27:**
- C2 IPs blocked at UDM firewall (iptables INPUT + FORWARD — not permanent; add to UniFi UI)
- 3 rogue ScreenConnect clients uninstalled via WinRM
- HideUL tool deleted from C:\Users\Public\Pictures\Backup\
- jlohr AD password reset; Entra sessions revoked
- 32 machines scanned clean (28 unreachable/offline — status unknown)
- No lateral movement detected
**Reports filed:**
- FBI IC3: Submission ID `1c32ade367084be9acd548f23705736f` (filed 2026-03-27 5:11 PM EST)
- Virtuo Hosting: abuses@virtuo.host
- ConnectWise: Case #03464184
- Artifacts: `clients/dataforth/docs/IC3-Complaint-2026-03-27.pdf`, `clients/dataforth/docs/incident-2026-03-27-abuse-report-virtuo.md`, `clients/dataforth/docs/incident-2026-03-27-abuse-report-connectwise.md`
### MFA Deployment (same day)
Three Conditional Access policies deployed (report-only initially, enforced 2026-04-04):
- `dc920ee4-22e6-402b-b5e3-4f3662d26227` — Require MFA (skip from office IP 67.206.163.122)
- `3405f7db-91b6-48da-b3fb-2e0ef1e44d17` — Block Foreign Sign-Ins (US only)
- `82ebbe3b-d151-4cb7-aff7-af893a4915e3` — Block Legacy Auth
Named locations:
- `0a3e61d7-a544-4a47-961a-a98cd4804613` — Dataforth Office - Tucson (67.206.163.122/32)
- `12706cec-c91b-454e-a24d-c801284b79f7` — Allowed Countries - US Only
Security groups:
- `75ac10ae-d49e-42b1-aa87-04908a983495` — MFA-Excluded-BreakGlass
- `094b12c5-b39a-4287-943a-f1175ce61a6f` — MFA-Travel-Bypass
---
## D2TESTNAS — Role and Access
D2TESTNAS (192.168.0.9) is a Linux machine serving as the critical bridge between the DOS 6.22 test stations and the modern Windows/IP infrastructure. DOS 6.22 only supports SMB1 (not SMB2+), so D2TESTNAS runs Samba as an SMB1 share while also running an rsync daemon for efficient syncing to/from AD2.
**D2TESTNAS also physically houses Neptune Exchange Server** — ACG's Exchange Server 2016 instance that hosts mail for multiple ACG clients. Neptune's internal IP (172.16.3.11) is within the ACG LAN range, but physically the machine is at Dataforth's D2 facility. Access requires routing through D2TESTNAS because Dataforth's UDM uses a 172.16.x.x subnet that overlaps the ACG office LAN, making direct Tailscale routing to Neptune ambiguous.
**D2TESTNAS access:**
- SSH: `root@192.168.0.9` — vault: `clients/dataforth/d2testnas.sops.yaml`
- [WARNING] Use `root`, NOT `sysadmin`; `sysadmin` SSH fails on D2TESTNAS
- SSH key from acg-guru-5070 (ed25519) is authorized
- rsync daemon: port 873, module `test`, user `rsync` — vault contains rsync credentials
- Tailscale: provides the 172.16.0.0/22 route from Dataforth network to ACG office
**Quirk:** Dataforth UDM runs a 172.16.x.x subnet that overlaps ACG's office LAN (172.16.0.0/22). This creates routing ambiguity for direct Tailscale connections to 172.16.3.x (Neptune's range). Until Dataforth UDM is resubnetted, always access Neptune via D2TESTNAS.
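The overlap described above can be checked with Python's standard `ipaddress` module. This is a small sketch: the two addresses come from this article, while `route_is_ambiguous` is a hypothetical helper that takes whatever subnet the Dataforth UDM actually uses (its exact prefix is not recorded here).

```python
import ipaddress

ACG_OFFICE_LAN = ipaddress.ip_network("172.16.0.0/22")
NEPTUNE_IP = ipaddress.ip_address("172.16.3.11")


def route_is_ambiguous(local_subnet: str) -> bool:
    """True when a local subnet shares addresses with the ACG office LAN."""
    return ipaddress.ip_network(local_subnet).overlaps(ACG_OFFICE_LAN)


# The /22 spans 172.16.0.0 through 172.16.3.255, so Neptune falls inside it.
print(NEPTUNE_IP in ACG_OFFICE_LAN)  # True
```

Any local subnet for which `route_is_ambiguous` returns `True` would compete with the Tailscale route for the same destination addresses, which is why access goes through D2TESTNAS.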
---
## Neptune SBR Email Routing
Neptune Exchange Server routes outbound mail through the MailProtector smarthost via the SBR (Sender-Based Routing) transport agent. This is ACG infrastructure physically hosted at Dataforth D2.
**Outbound chain:**
1. User sends from Neptune mailbox (172.16.3.11)
2. SBR transport agent (Priority 12) fires on `OnResolved` event
3. SBR reads config files at `C:\Program Files\Microsoft\Exchange Server\V15\TransportRoles\agents\Custom\`: `InternalDomains.config`, `OverrideSettings.config`, `IgnoreAuthAs.config`
4. SBR rewrites routing to `.sbr` domain (e.g., `rieussetcorp.sbr`)
5. Exchange matches to send connector → smarthosts via MailProtector (`domain-com.outbound.emailservice.io`)
6. MailProtector relays to final destination
**Common issue:** When Neptune's IP changes or a new client domain is added, MailProtector must have the sending server IP authorized. Without this, MailProtector silently drops the message.
**IP:** Neptune outbound uses `67.206.163.122` (sometimes `.124` — SNAT rule on Dataforth UDM should force outbound to `.124` but may not always be active). 67.206.163.122 has no PTR record and is blacklisted by some providers.
**Neptune access:** Requires Tailscale via D2TESTNAS (see above). WinRM to 172.16.3.11, `ACG\administrator` — vault: `clients/dataforth/neptune-exchange.sops.yaml`.
---
## Dataforth Product API (Hoffman)
Ken Hoffman built and maintains the public-facing API that serves datasheets on www.dataforth.com.
**Endpoints of interest:**
- `POST /api/v1/TestReportDataFiles` — single upload `{SerialNumber, Content}`
- `POST /api/v1/TestReportDataFiles/bulk` — batch upload `{Items: [{SerialNumber, Content}, ...]}`
- `GET /api/v1/TestReportDataFiles/{serialNumber}` — retrieve one
- `GET /api/v1/TestReportDataFiles/stats` returns `{TotalCount, LatestCreatedAtUtc, LatestUpdatedAtUtc}`
- Swagger: `https://www.dataforth.com/swagger/index.html`
**Authentication:** OAuth2 client_credentials. Token cached 1 hour. Refresh on 401.
- Token URL: `https://login.dataforth.com/connect/token`
- Client ID: `dataforth.onprem.sync`
- Client Secret: vault `clients/dataforth/api-oauth.sops.yaml`
- Scope: `dataforth.web`
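The caching behaviour described above (reuse the token for one hour, refresh on a 401) could look roughly like the following. This is a hedged Python sketch, not the code in `upload-to-api.js`: `TokenCache`, `call_with_retry`, and the injected `fetch_token` and `send` callables are all hypothetical, and the actual HTTP calls are deliberately left out.

```python
import time
from typing import Callable, Optional


class TokenCache:
    """Hold one bearer token and refetch it after ttl_seconds."""

    def __init__(
        self,
        fetch_token: Callable[[], str],
        ttl_seconds: float = 3600.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch_token = fetch_token
        self._ttl = ttl_seconds
        self._clock = clock
        self._token: Optional[str] = None
        self._expires_at = 0.0

    def get(self) -> str:
        # Fetch on first use or once the cached token has aged out.
        if self._token is None or self._clock() >= self._expires_at:
            self._token = self._fetch_token()
            self._expires_at = self._clock() + self._ttl
        return self._token

    def invalidate(self) -> None:
        self._token = None


def call_with_retry(cache: TokenCache, send: Callable[[str], int]) -> int:
    """Call send(token); on HTTP 401, refresh the token and retry once."""
    status = send(cache.get())
    if status == 401:
        cache.invalidate()
        status = send(cache.get())
    return status
```

Retrying only once keeps a genuinely revoked client from looping against the token endpoint.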
**Idempotency:** Server deduplicates by content hash. Same SN + same content → `Unchanged`. Same SN + different content → `Updated`. New SN → `Created`. Safe to re-push everything.
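To make the three outcomes concrete, here is a toy Python model of the dedup behaviour. It is only an illustration of the contract described above: the real server's hash algorithm and storage are not documented here, so SHA-256 and the in-memory `store` dictionary are assumptions, and `classify_upload` is a hypothetical name.

```python
import hashlib
from typing import Dict


def classify_upload(store: Dict[str, str], serial_number: str, content: str) -> str:
    """Return 'Created', 'Updated', or 'Unchanged' for one pushed datasheet."""
    digest = hashlib.sha256(content.encode("utf-8")).hexdigest()
    previous = store.get(serial_number)
    if previous is None:
        store[serial_number] = digest
        return "Created"
    if previous == digest:
        return "Unchanged"
    store[serial_number] = digest
    return "Updated"


store: Dict[str, str] = {}
print(classify_upload(store, "178601-6", "sheet v1"))  # Created
print(classify_upload(store, "178601-6", "sheet v1"))  # Unchanged
```

Because a repeat push of identical content changes nothing, a full re-push is a safe recovery step after a partial failure.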
**Throughput observed:** ~142 records/s sustained for bulk uploads (batches of 100).
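The batching itself is simple to sketch. The payload shape `{Items: [{SerialNumber, Content}, ...]}` and the batch size of 100 come from this section; the Python function name `build_bulk_payloads` is hypothetical, and sending each payload to `/api/v1/TestReportDataFiles/bulk` is left to the caller.

```python
from typing import Dict, Iterator, List, Sequence, Tuple


def build_bulk_payloads(
    records: Sequence[Tuple[str, str]], batch_size: int = 100
) -> Iterator[Dict[str, List[Dict[str, str]]]]:
    """Yield one bulk request body per batch of (serial_number, content) pairs."""
    for start in range(0, len(records), batch_size):
        chunk = records[start:start + batch_size]
        yield {
            "Items": [
                {"SerialNumber": serial, "Content": content}
                for serial, content in chunk
            ]
        }


sample = [(f"SN{i}", "datasheet text") for i in range(250)]
print([len(body["Items"]) for body in build_bulk_payloads(sample)])  # [100, 100, 50]
```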
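**Hoffman API state (2026-04-15):** 661,367 total records on the website. Of these, 202,866 are pre-testdatadb historical records uploaded by the old DFWDS toolchain; they are not reproducible from the current DB, so do not try to reconcile them. The remaining 458,501 (661,367 - 202,866) are the records pushed from testdatadb.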
---
## Known Issues and Anti-Patterns
### Anti-Patterns (from CONTEXT.md)
- **DO NOT hardcode Paper123!@#** — always fetch from vault: `bash D:/vault/scripts/vault.sh get-field clients/dataforth/ad2.sops.yaml credentials.password | sed 's/\\//g'`
- **DO NOT use X: drive in SSH sessions** — X: is only mapped under the service account. Use UNC `\\ad2\webshare\For_Web` instead.
- **DO NOT assume hvin.dat model lookup works for VASLOG** — marketing names (SCMHVAS-M0100) do NOT match engineering MODNAMEs (SCM5B41-1181) in hvin.dat. SCMVAS/SCMHVAS use simple accuracy-only template without hvin.dat.
- **DO NOT pass 50+ file paths on PowerShell command line** — hits "Command line too long". Use inline Node.js with `fs.readdirSync` instead.
- **DO NOT commit testdata.db or large samples** — 4.1GB database is in .gitignore.
- **DO NOT use SMB1 on AD2** — disabled for security. Use SSH/SFTP (port 22) or SMB2+ shares.
- **DO NOT expect immediate stdout from paramiko exec_command** — buffers until completion. Use progress markers or drain loop.
- **Vault entry ad2.sops.yaml has stale backslash escape** — password stored as `Paper123\!@#`, actual password is `Paper123!@#`. Strip with `sed 's/\\//g'` at read time until vault is cleaned.
- **DO NOT build GuruRMM agents manually** — all builds via Gitea webhook pipeline (push to main).
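For scripts that read the vault value from Python rather than through the shell pipeline, the backslash cleanup noted in the vault entry item above is a one-line string operation. This is a sketch under the assumption that the real password contains no legitimate backslash; `strip_vault_escape` is a hypothetical helper and the sample value is a placeholder.

```python
def strip_vault_escape(raw: str) -> str:
    # Same effect as the shell filter: sed 's/\\//g' (delete every backslash).
    return raw.replace("\\", "")


print(strip_vault_escape(r"Example\!@#"))  # Example!@#
```

Once the vault entry is cleaned, the helper becomes a no-op and can be removed.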
### Known Issues
- **Email notifications blocked on SMTP AUTH:** sysadmin@dataforth.com has SMTP AUTH disabled by Exchange Online. Email now uses Graph API (Mail.Send granted to Claude-Code-M365 app 2026-05-12). SMTP_USER/SMTP_PASS removed from credentials.json; replaced with GRAPH_TENANT_ID/CLIENT_ID/CLIENT_SECRET.
- **Vault stale backslash:** ad2.sops.yaml `credentials.password` contains `Paper123\!@#` (literal backslash). Must strip with `sed 's/\\//g'` at runtime. Cleanup pending.
- **Undocumented 2026-04-22 changes:** import.js, notify.js, upload-to-api.js modified that date with no session log. Changes appear stable but details unknown.
- **7B datasheet formatting:** SCM7B has ~830K records with specs loaded but needs a 7B-specific formatter layout. Not yet implemented.
- **SCM5B49 spec file empty:** 177000-15 (SCM5B49-05) needs empty spec file from John Lehman. Blocking one Quatronix datasheet.
- **New product lines not yet integrated:** MAQ20 (XLS format), PWRM10 (XLS), 10D (JSON, expected ~May 2026), DSCMHV — all need different parsers.
- **Diagnostic scripts on AD2:** `C:\Shares\testdatadb\database\_*.js` (~20 files from 2026-04-15 session) — safe to delete.
- **Service runs as INTRANET\svc_testdatadb** — credentials.json must be ACL'd to include Read + Traverse for this account (fixed 2026-04-15 but verify after changes).
- **psql not in PATH on AD2 for SSH sessions** — use Node.js pg module inline instead.
- **AD2 SSH port 22 intermittent** — sshd briefly unreachable for 5-15 min windows; ports 3000/3389/5985 stay up. Not a GuruRMM issue — likely network-layer (AV scan). Not caused by sshd crash.
### QuickBASIC STR$() Formatting Quirk
QuickBASIC's `STR$()` on a SINGLE emits two formats depending on magnitude:
- **Scientific with trailing test-status digit** (98.4%): `"PASS-7.005501E-033"` — trailing digit is test status code (2 or 3), not part of the value
- **Plain decimal, no trailing digit** (1.6%): `"PASS .01599373"` or `"PASS-.00499773"`
Both formats encode percent units. Regex must try scientific first, plain decimal as fallback. Patch applied 2026-04-12.
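The two-step match can be sketched as follows. This Python version is illustrative and is not the patched parser: the patterns assume the field starts with `PASS` or `FAIL`, that the sign column is a space or `-`, that the exponent is always two digits, and that the trailing status code is a single digit; `parse_reading` is a hypothetical name.

```python
import re
from typing import Optional, Tuple

# Assumed layout: result word, sign column (space or '-'), mantissa,
# two-digit signed exponent, then one trailing status digit.
SCIENTIFIC = re.compile(r"^(PASS|FAIL)([ -])(\d*\.?\d+)E([+-]\d{2})(\d)$")
# Fallback layout: result word, sign column, plain decimal, nothing after it.
PLAIN = re.compile(r"^(PASS|FAIL)([ -])(\d*\.\d+)$")


def parse_reading(field: str) -> Tuple[str, float, Optional[int]]:
    """Return (result, value_in_percent, status_code_or_None)."""
    match = SCIENTIFIC.match(field)  # scientific form is tried first
    if match:
        result, sign, mantissa, exponent, status = match.groups()
        value = float(f"{sign.strip()}{mantissa}E{exponent}")
        return result, value, int(status)
    match = PLAIN.match(field)  # plain decimal is the fallback
    if match:
        result, sign, digits = match.groups()
        return result, float(f"{sign.strip()}{digits}"), None
    raise ValueError(f"unrecognized STR$() field: {field!r}")


for sample in ("PASS-7.005501E-033", "PASS .01599373", "PASS-.00499773"):
    print(parse_reading(sample))
```

Keeping the status digit outside the exponent group is what prevents `E-033` from being misread as an exponent of -33.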
---
## Build & Deploy
### Deploying Code to AD2
```bash
# From projects/dataforth-dos/deploy/ or datasheet-pipeline/implementation/
python deploy-to-ad2.py
# What it does:
# 1. Fetches AD2 password from vault (30s timeout, fails loud)
# 2. Connects via paramiko SFTP to 192.168.0.6:22
# 3. Creates .bak-YYYYMMDD timestamped backups of existing files
# 4. Uploads modified files from implementation/
# 5. Restarts testdatadb service via SSH exec_command
# 6. Verifies API responds 200 OK on port 3000
```
Manual connection:
```bash
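# Fetch the AD2 password from the vault, stripping the stale backslash escape
AD2_PASS=$(bash D:/vault/scripts/vault.sh get-field clients/dataforth/ad2.sops.yaml credentials.password | sed 's/\\//g')
# ssh does not read AD2_PASS on its own; supply the value at the password prompt
ssh sysadmin@192.168.0.6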
```
### Checking Service Health
```powershell
# On AD2 via SSH
Get-Service testdatadb # should show Running
Get-Service postgresql-18 # should show Running
```
```bash
# From workstation (needs VPN)
curl http://192.168.0.6:3000/api/stats
# Returns: {"totalRecords":469009,...}
```
### Scheduled Task Reference
- Task name: `DataforthTestDatasheetUploader`
- Schedule: Daily 02:30 AM
- Runs as: SYSTEM
- Script: `C:\ProgramData\dataforth-uploader\run-pipeline.ps1`
- Logs: `C:\ProgramData\dataforth-uploader\logs\pipeline-YYYYMMDD.log` (60-day retention)
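The log naming and retention window above can be modeled in a few lines, which is handy when checking whether an old log should still exist. This Python sketch is an assumption-laden illustration: `pipeline_log_name` and `expired_logs` are hypothetical helpers, and it treats a log as expired only when it is strictly older than 60 days, which may differ from what `run-pipeline.ps1` actually does.

```python
import re
from datetime import date, datetime, timedelta
from typing import Iterable, List

LOG_NAME = re.compile(r"^pipeline-(\d{8})\.log$")


def pipeline_log_name(day: date) -> str:
    """Build the pipeline-YYYYMMDD.log file name for a given run date."""
    return f"pipeline-{day:%Y%m%d}.log"


def expired_logs(names: Iterable[str], today: date, retention_days: int = 60) -> List[str]:
    """Return the log names dated before the retention cutoff."""
    cutoff = today - timedelta(days=retention_days)
    expired = []
    for name in names:
        match = LOG_NAME.match(name)
        if not match:
            continue  # leave unrelated files alone
        logged_on = datetime.strptime(match.group(1), "%Y%m%d").date()
        if logged_on < cutoff:
            expired.append(name)
    return expired


print(pipeline_log_name(date(2026, 5, 12)))  # pipeline-20260512.log
```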
---
## Active State
**As of 2026-05-12 — pipeline is healthy.** See `projects/dataforth-dos/CONTEXT.md` for live detail.
**Pending:**
- Enable SMTP AUTH for sysadmin@dataforth.com (AJ to do in Exchange Admin Center) OR confirm Graph API email path works (deployed 2026-05-12, not yet confirmed for live pipeline run)
- After email confirmed: add jlehman@dataforth.com to TO list in notify.js and run-pipeline.ps1
- Clean diagnostic `_*.js` files from AD2
- Fix vault stale backslash in ad2.sops.yaml
- Implement 7B datasheet formatter
- Integrate MAQ20/PWRM10/10D new product lines
- Investigate undocumented 2026-04-22 session changes
---
## History Highlights
| Date | Event |
|---|---|
| 2026-01-19 | DEPLOY.BAT added to Sync-FromNAS root-level sync. |
| 2026-01-20 | **DOS Update System production-ready.** 9 BAT files fixed (XCOPY /D param errors, STARTNET path), 39 deployments, pilot machine TS-4R. |
| 2026-03-11 | TestDataDB investigation — max test_date stuck at 2026-01-19; parser mtime issue. |
| 2026-03-12 to 2026-03-16 | Import.js fixes, sync script improvements, Sync-FromNAS issues resolved. |
| 2026-03-27 | **Security incident — DF-JOEL2 compromised** (see Security Incident History section). MFA deployed. |
| 2026-03-27 to 2026-03-29 | **Pipeline rebuilt** after 2025 crypto attack. New Node.js pipeline replaces DFWDS.exe + TestDataSheetUploader. Spec parser (1470 models), exact-match formatter, auto-export. 72/73 Quatronix datasheets generated. Root cause: CTONWTXT.BAT not called in AUTOEXEC v4.1 since 2026-03-12. |
| 2026-04-11 | Discovery session — SCMVAS/SCMHVAS research. hvin.dat decoded. Decision: accuracy-only template, no hvin.dat lookup. |
| 2026-04-12 | **SCMVAS/SCMHVAS pipeline extension deployed.** vaslog.js parser, accuracy-only formatter. 27,503 records backfilled (438 stragglers from QB STR$() quirk — patched same day). 434 Engineering-Tested .txt imported. Commit `0dd3d82`. |
| 2026-04-12 | TestDataDB PostgreSQL migration verified complete (2.89M records). SQLite archived. |
| 2026-04-13 | Hoffman API architecture finalized — client_credentials grant `dataforth.onprem.sync`. |
| 2026-04-14 | **DFWDS logic ported to dfwds-process.js (Node).** 897 staged datasheets drained. 803 new records created on Hoffman. End-to-end pipeline working. |
| 2026-04-15 | **Major release:** DB dedup (2.89M→469K unique SNs), FAIL→PASS retest rule, For_Web filesystem dependency eliminated (render in-memory), 170,984 records bulk-pushed to Hoffman. Dashboard pink tint, push buttons, bulk push, website status filter. PostgreSQL backup `test_records_dedup_bak_20260415` retained. |
| 2026-04-22 | Undocumented changes to import.js, notify.js, upload-to-api.js. No session log. |
| 2026-05-12 | **Email notifications implemented** — nodemailer then replaced with Graph API (Mail.Send granted to Claude-Code-M365 app). PS double-quote stripping issue discovered and worked around. SMTP_USER/PASS removed from credentials.json; GRAPH_TENANT_ID/CLIENT_ID/CLIENT_SECRET added. Pipeline fully documented in TEST-DATASHEET-PROCESS.md. |
---
## Backlinks
- [[clients/dataforth]] — Client article; this project runs on Dataforth's AD2 server
- [[systems/jupiter]] — Neptune Exchange physically housed at Dataforth D2; D2TESTNAS bridges Tailscale routing