Author: Mike Swanson Machine: DESKTOP-0O8A1RL Timestamp: 2026-04-21 18:46:45
Dataforth DOS Project - Context
Last Updated: 2026-04-14
Status: Active - Datasheet Pipeline Extended for SCMVAS/SCMHVAS
Quick Start - Infrastructure Overview
| Component | IP/Location | Access | Notes |
|---|---|---|---|
| AD2 (Primary) | 192.168.0.6 | SSH: sysadmin / vault | Windows Server 2022, hosts testdatadb service |
| AD1 (Secondary) | 192.168.0.27 | SSH: sysadmin / vault | Hosts Engineering share at \\AD1\Engineering |
| D2TESTNAS | 192.168.0.9 | SMB1 only | Bridge for DOS test stations (TS-xx machines) |
| VPN | Required | FortiClient | Access to 192.168.0.x network |
Get credentials:
# AD2 password (has stale backslash escape - strip it)
bash D:/vault/scripts/vault.sh get-field clients/dataforth/ad2.sops.yaml credentials.password | sed 's/\\//g'
# AD1 password
bash D:/vault/scripts/vault.sh get-field clients/dataforth/ad1.sops.yaml credentials.password
All passwords: Paper123!@# (stored in vault, note backslash escape issue in ad2.sops.yaml)
Current State (READ THIS FIRST)
Recent Work (2026-04-11/12)
Extended Test Datasheet Pipeline for SCMVAS-Mxxx and SCMHVAS-Mxxxx families
- Added VASLOG parser support (multiline CSV .DAT format)
- Created accuracy-only datasheet template (simple format, no hvin.dat lookup)
- Implemented pass-through for Engineering-Tested .txt files
- Backfilled 27,503 historical records (438 required regex patch for QB STR$() format quirk)
- 434 Engineering .txt files imported and published
- Deployed to AD2, service restarted, web publishing verified
Status: ✅ Complete, production-deployed
Critical Files Changed: 5 modified, 1 new parser
- server/parsers/vaslog.js (new)
- server/templates/datasheet-exact.js (SCMVAS/SCMHVAS branch added)
- server/database/import.js (recursive flag fix, VASLOG_ENG support)
- server/parsers/spec-reader.js (stub for SCMVAS/SCMHVAS)
- deploy/deploy-to-ad2.py (vault-based credentials)
Session Logs:
- 2026-04-12-session.md - Implementation, deploy, backfill, patch (DEFINITIVE)
- 2026-04-11-discovery-session.md - Discovery phase
testdatadb Service (on AD2)
- Service Name: testdatadb
- Status: Running
- Service Account: INTRANET\svc_testdatadb
- Working Directory: C:\Shares\testdatadb
- API Port: 3000 (http://192.168.0.6:3000)
- Database: SQLite at C:\Shares\testdatadb\database\testdata.db (4.1GB)
- Web Output: X:\For_Web (UNC path: \\ad2\webshare\For_Web)
File Shares on AD2
C:\Shares\test\ # Mirror of D2TESTNAS test data
├── TS-xx\LOGS\ # Test logs from DOS stations
│ ├── 5BLOG\ # SCM5B family
│ ├── 8BLOG\ # 8B family
│ ├── VASLOG\ # SCMVAS/SCMHVAS .DAT files
│ │ ├── HVAS-M01.DAT # Production logs
│ │ ├── VAS-M100.DAT
│ │ └── VASLOG - Engineering Tested\ # 434 .txt files
│ └── ...
└── Corrected HVAS Files\ # 200 pre-generated datasheets
C:\Shares\testdatadb\ # Node.js application
├── server/
│ ├── parsers/ # Log file parsers
│ ├── templates/ # Datasheet formatters
│ └── database/ # Import/export scripts
├── database/
│ └── testdata.db # SQLite (4.1GB, not in git)
└── node_modules/
File Shares on AD1
\\AD1\Engineering\
└── ENGR\ATE\High Voltage Input Module Test\
├── HVDATA\
│ └── hvin.dat # Spec database (33 records, engineering MODNAMEs)
└── Released\
├── TESTHV3.BAS # Primary test program (2020)
├── TESTHV4.BAS # Alternate test program (2017)
├── NLIBATE3.BAS # ATE library
└── DBHV.BAS # Database editor (TYPE DBASE definition)
Email / SMTP
Dataforth is M365 hybrid — Exchange Online is the mail system. Use SMTP via M365:
- SMTP host: smtp.office365.com Port: 587 (STARTTLS)
- Auth: sysadmin@dataforth.com (vault: clients/dataforth/m365.sops.yaml→credentials.password)
- Tenant ID: 7dfa3ce8-c496-4b51-ab8d-bd3dcd78b584
- Neptune Exchange (neptune.acghosting.com): ACG infrastructure, NOT Dataforth's. Do not use.
Anti-Patterns (DON'T DO THIS)
❌ DO NOT hardcode Paper123!@# - Always fetch from vault:
bash D:/vault/scripts/vault.sh get-field clients/dataforth/ad2.sops.yaml credentials.password | sed 's/\\//g'
❌ DO NOT use the X: drive in SSH sessions - It is mapped only under the service account. Use the UNC path instead:
# Wrong:
node database/export-datasheets.js # Fails: "X:\For_Web does not exist"
# Right:
$env:OUTPUT_DIR = "\\ad2\webshare\For_Web"
node database/export-datasheets.js
❌ DO NOT assume hvin.dat lookup works - Marketing names (SCMHVAS-M0100) ≠ engineering MODNAMEs (SCM5B41-1181). SCMVAS/SCMHVAS use simplified accuracy-only template WITHOUT hvin.dat.
❌ DO NOT pass 50+ file paths on PowerShell command line - Hits "Command line too long". Use inline node script with fs.readdirSync instead.
❌ DO NOT commit testdata.db or large samples - 4.1GB database is in .gitignore. Keep research samples local only.
❌ DO NOT use SMB1 on AD2 - Disabled for security. Use SSH/SFTP (port 22) or SMB2+ shares.
❌ DO NOT expect immediate output from exec_command - paramiko buffers stdout. Use progress markers or drain at completion.
❌ DO NOT assume VPN is stable - Dataforth VPN can drop mid-session. Save work frequently, use local samples for offline analysis.
Where to Find Things
Codebase Structure
projects/dataforth-dos/
├── datasheet-pipeline/
│ ├── implementation/ # Staged code (approved by Code Review)
│ ├── scmvas-hvas-research/ # Discovery scripts and source files
│ │ ├── source/ # TESTHV3.BAS, hvin.dat, etc.
│ │ ├── samples/ # .DAT and .txt samples (local)
│ │ ├── parse_hvin.py # hvin.dat binary parser
│ │ └── pull-*.py # SSH download scripts
│ └── IMPLEMENTATION_PLAN.md # Approved plan (2026-04-11)
├── deploy/
│ └── deploy-to-ad2.py # Deployment script (vault-based auth)
├── session-logs/
│ ├── 2026-04-12-session.md # SCMVAS/SCMHVAS implementation (DEFINITIVE)
│ └── 2026-04-11-discovery-session.md
└── CONTEXT.md # This file
Production Files on AD2
C:\Shares\testdatadb\
├── server.js # Main entry point
├── server/
│ ├── parsers/
│ │ ├── multiline.js # Handles VASLOG .DAT (CSV format)
│ │ ├── vaslog.js # VASLOG-specific logic (new)
│ │ └── spec-reader.js # Spec DB loader (stub for SCMVAS/SCMHVAS)
│ ├── templates/
│ │ └── datasheet-exact.js # Datasheet formatter (SCMVAS/SCMHVAS branch added)
│ └── database/
│ ├── import.js # LOG_TYPES registry, importFiles()
│ └── export-datasheets.js # Batch export script
└── database/
└── testdata.db # SQLite (27k+ records after backfill)
Common Operations
Deploy Code to AD2
# From projects/dataforth-dos/deploy/
python3 deploy-to-ad2.py
# What it does:
# 1. Fetches password from vault (D:/vault/scripts/vault.sh)
# 2. Connects via paramiko SFTP to 192.168.0.6:22
# 3. Creates .bak-YYYYMMDD timestamped backups
# 4. Uploads modified files from implementation/
# 5. Restarts testdatadb service via SSH exec_command
# 6. Verifies API responds 200 OK on port 3000
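Steps 3 and 6 of the list above can be sketched in a few lines of Python. This is an illustration only, not the contents of deploy-to-ad2.py: the helper names backup_name and api_is_up are hypothetical, and the opener parameter is an assumption added so the check can be exercised without a network.

```python
import urllib.request
from datetime import date


def backup_name(filename, day=None):
    """Build the .bak-YYYYMMDD name used for pre-deploy backups."""
    day = day or date.today()
    return f"{filename}.bak-{day:%Y%m%d}"


def api_is_up(url, timeout=5.0, opener=urllib.request.urlopen):
    """Return True only if a GET on `url` answers with HTTP 200."""
    try:
        with opener(url, timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # URLError, HTTPError, refused connections and timeouts
        # are all subclasses of OSError.
        return False
```

Typical use after a restart would be api_is_up("http://192.168.0.6:3000"), retried a few times if the service is slow to come back.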
Manual deployment (if script unavailable):
# Get password
AD2_PASS=$(bash D:/vault/scripts/vault.sh get-field clients/dataforth/ad2.sops.yaml credentials.password | sed 's/\\//g')
# Connect
sshpass -p "${AD2_PASS}" ssh sysadmin@192.168.0.6
# Backup + copy
cd C:\Shares\testdatadb\server\parsers
copy multiline.js multiline.js.bak-20260414
# ... upload new files via SFTP ...
# Restart service
Restart-Service -Name testdatadb
# Verify
curl http://localhost:3000
Import New Test Data
# SSH to AD2
ssh sysadmin@192.168.0.6
# Run import for specific log type
cd C:\Shares\testdatadb
node database/import.js
# Import specific files (avoid "Command line too long")
node -e "
const importFiles = require('./server/database/import').importFiles;
const fs = require('fs');
const files = fs.readdirSync('C:/Shares/test/TS-3R/LOGS/VASLOG/VASLOG - Engineering Tested')
.filter(f => f.endsWith('.txt'))
.map(f => 'C:/Shares/test/TS-3R/LOGS/VASLOG/VASLOG - Engineering Tested/' + f);
importFiles(files, 'VASLOG_ENG').then(() => console.log('Done'));
"
Export Datasheets for Web
# SSH to AD2
ssh sysadmin@192.168.0.6
# Export all pending datasheets
cd C:\Shares\testdatadb
$env:OUTPUT_DIR = "\\ad2\webshare\For_Web" # NOT X:\For_Web in SSH
node database/export-datasheets.js
# Export specific model family
node database/export-datasheets.js --family SCMHVAS
Backfill Historical Data
# SSH to AD2, run as inline script to avoid command-line length limits
node -e "
const db = require('./server/database/db');
const exportDatasheet = require('./server/templates/datasheet-exact');
db.all(\`
SELECT * FROM test_records
WHERE log_type IN ('VASLOG', 'VASLOG_ENG')
AND exported_at IS NULL
ORDER BY id
\`, (err, rows) => {
if (err) throw err;
console.log(\`[INFO] Found \${rows.length} records to export\`);
let count = 0;
rows.forEach(row => {
try {
exportDatasheet(row);
count++;
if (count % 100 === 0) console.log(\`[PROGRESS] \${count}/\${rows.length}\`);
} catch (e) {
console.error(\`[SKIP] \${row.model_name}: \${e.message}\`);
}
});
console.log(\`[DONE] Exported \${count} datasheets\`);
});
"
Check Service Status
# On AD2 (via SSH or RDP)
Get-Service testdatadb
# View service logs (if logging enabled)
Get-EventLog -LogName Application -Source testdatadb -Newest 50
# Test API
Invoke-WebRequest http://localhost:3000 | Select-Object StatusCode
# Check process
Get-Process | Where-Object { $_.ProcessName -like "*node*" }
Access Shares from macOS/Linux
# Mount AD2 share (SMB2+). mount_smbfs is the macOS command; on Linux use mount -t cifs instead.
mkdir -p ~/mnt/ad2-testdatadb
mount_smbfs //sysadmin:Password@192.168.0.6/testdatadb ~/mnt/ad2-testdatadb
# Mount AD1 Engineering share
mkdir -p ~/mnt/ad1-engineering
mount_smbfs //sysadmin:Password@192.168.0.27/Engineering ~/mnt/ad1-engineering
# Unmount
umount ~/mnt/ad2-testdatadb
Key Technical Decisions (ADRs)
2026-04-12: Use Option C (simple accuracy-only template, no hvin.dat lookup)
- Reason: Marketing names (SCMHVAS-M0100) ≠ engineering MODNAMEs (SCM5B41-1181) in hvin.dat
- Sample datasheets show simple 1-parameter format (Accuracy only)
- Spec-reader stub lets SCMVAS/SCMHVAS pass through pipeline without schema changes
2026-04-12: Pass-through for VASLOG_ENG .txt files (not re-render)
- Reason: Engineering-Tested files already match target format exactly
- fs.copyFileSync() guarantees byte-level fidelity, avoids encoding round-trip
- Fallback to writeFileSync(raw_data, 'utf8') if source file missing
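The copy-then-fallback behavior of this decision, restated as a minimal Python sketch. The production code is JavaScript (fs.copyFileSync / writeFileSync); the function name publish_engineering_txt and its return values are hypothetical and exist only to show the control flow.

```python
import shutil
from pathlib import Path


def publish_engineering_txt(source_path, raw_data, dest_path):
    """Copy the tested .txt verbatim; fall back to the stored text if it is gone."""
    source, dest = Path(source_path), Path(dest_path)
    if source.is_file():
        # Byte-for-byte copy: no decode/encode round trip, line endings untouched.
        shutil.copyfile(source, dest)
        return "copied"
    # Fallback: the source file is missing, so write the stored text as UTF-8.
    dest.write_bytes(raw_data.encode("utf-8"))
    return "rewritten"
```

Writing bytes in the fallback avoids newline translation on Windows, which keeps the fallback output as close as possible to what was stored.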
2026-04-12: Fix recursive=false default regression with config.recursive !== false
- Reason: Adding the recursive field to LOG_TYPES must not break 7 pre-existing families
- Treats absent/undefined as true (legacy behavior), explicit false as false
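The three-state check is easy to get wrong, so here is the same rule as a small Python sketch. It mirrors the JavaScript expression config.recursive !== false; the function names are hypothetical, and the second function is a guess at the kind of default that causes the regression, not the original code.

```python
def is_recursive(config):
    """Only an explicit False disables recursion; a missing key means True."""
    return config.get("recursive") is not False


def is_recursive_regressed(config):
    """Assumed shape of the regression: a missing key silently becomes False."""
    return bool(config.get("recursive", False))
```

With an entry that predates the field (an empty dict here), is_recursive keeps the legacy recursive behavior while the regressed form turns it off.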
2026-04-12: Vault-based credentials in deploy script (no hardcoding, no prompts)
- Reason: Never commit passwords, even to private repo
- deploy-to-ad2.py calls vault.sh with 30s timeout, fails loud if unavailable
- No env-var fallback, no interactive prompt
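A minimal sketch of a vault lookup with these properties, assuming vault.sh is invoked exactly as in the Get credentials commands. The function name get_vault_field and the strip_backslashes and run parameters are hypothetical; this is not the code in deploy-to-ad2.py.

```python
import subprocess

VAULT_SCRIPT = "D:/vault/scripts/vault.sh"


def get_vault_field(entry, field, strip_backslashes=False, run=subprocess.run):
    """Return one vault field; raise if the helper fails, times out, or is empty."""
    result = run(
        ["bash", VAULT_SCRIPT, "get-field", entry, field],
        capture_output=True,
        text=True,
        timeout=30,   # raises subprocess.TimeoutExpired
        check=True,   # raises subprocess.CalledProcessError on non-zero exit
    )
    value = result.stdout.strip()
    if not value:
        raise RuntimeError(f"vault returned an empty value for {entry}:{field}")
    if strip_backslashes:
        # Same effect as piping through sed 's/\\//g' (needed for ad2.sops.yaml).
        value = value.replace("\\", "")
    return value
```

There is deliberately no except clause and no fallback source: any failure propagates to the caller and stops the deploy.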
2026-04-12: MM/DD/YYYY date normalization for datasheet Date field
- Reason: Matches newest Engineering-Tested samples
- Older "Corrected HVAS Files" used MM-DD-YYYY (hyphens) - backfill rewrites with slashes
- Intentional visible change, documented in implementation plan
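The normalization itself is small. A sketch in Python, assuming only the two input spellings mentioned above occur (the function name normalize_date is hypothetical; the production formatter is JavaScript):

```python
from datetime import datetime


def normalize_date(text):
    """Return MM/DD/YYYY for a date written as MM-DD-YYYY or MM/DD/YYYY."""
    for fmt in ("%m-%d-%Y", "%m/%d/%Y"):
        try:
            return datetime.strptime(text.strip(), fmt).strftime("%m/%d/%Y")
        except ValueError:
            continue
    raise ValueError(f"unrecognized date: {text!r}")
```

Parsing through datetime rather than replacing hyphens also rejects impossible dates such as 13-40-2026 instead of passing them through.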
2026-04-12: Patch regex with plain-decimal fallback for QuickBASIC STR$() quirk
- Reason: QB STR$() emits scientific notation for most values, plain decimal for ~1.6%
- Not a version difference or bug - purely QB float-to-string formatting threshold
- Two-regex approach: try scientific first, fall back to plain decimal
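The ordering matters, which a short Python sketch makes visible. The patterns below are assumptions written for this note, not the expressions in vaslog.js. The plain-decimal input "PASS .01599373" is a hypothetical status field built from the STR$() output shown in this document, and whether a plain-decimal value is also followed by a status digit is not covered here.

```python
import re

# Scientific notation with a two-digit exponent; a trailing status digit is left unmatched.
SCIENTIFIC = re.compile(r"^(PASS|FAIL)\s*([-+]?\d*\.?\d+E[-+]\d{2})")
# Plain decimal as QB prints it: optional sign, no leading zero required.
PLAIN = re.compile(r"^(PASS|FAIL)\s*([-+]?\d*\.\d+)")


def parse_status(field):
    """Return (result, accuracy) from a VASLOG status field."""
    # Scientific must be tried first: PLAIN would also match "-7.005501"
    # inside "-7.005501E-033" and silently drop the exponent.
    for pattern in (SCIENTIFIC, PLAIN):
        match = pattern.match(field)
        if match:
            return match.group(1), float(match.group(2))
    raise ValueError(f"unrecognized status field: {field!r}")
```

For example, parse_status("PASS-7.005501E-033") yields the result PASS and the value -7.005501E-03, with the final 3 left over as the status digit.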
QuickBASIC Artifacts & Log Formats
VASLOG .DAT Structure
"SCMHVAS-M0100 " # Header: model name (marketing, NOT engineering MODNAME)
20,0.0034 # CSV line 1: measurement data
40,0.0126 # CSV line 2
60,-0.0046 # CSV line 3
80,0.0141 # CSV line 4
100,-0.00325 # CSV line 5
"PASS-7.005501E-033",... # Status line: PASS/FAIL + accuracy (scientific OR plain decimal)
"179379-1","04-09-2026" # Footer: serial number, test date (MM-DD-YYYY)
VASLOG_ENG .txt Structure (Engineering-Tested)
SCMHVAS - M0100
SN: 171087-1
Date: 04/08/2024
Test: PASS
Accuracy: -7.0055E-03 %
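Both layouts above can be read with the standard library. The following is a sketch under stated assumptions, not the logic of vaslog.js: it handles one production record at a time, keeps only the first field of the status line (the remaining fields are elided in the sample), and treats the inline # annotations in the .DAT sample as documentation rather than file content. The function names are hypothetical.

```python
import csv


def parse_vaslog_record(lines):
    """Parse one production record from a VASLOG .DAT file."""
    rows = [row for row in csv.reader(lines) if row]
    header, *points, status, footer = rows
    return {
        "model": header[0].strip(),   # marketing name, space-padded by QB
        "points": [(int(x), float(y)) for x, y in points],
        "status": status[0],          # e.g. "PASS-7.005501E-033"
        "serial": footer[0],
        "date": footer[1],            # MM-DD-YYYY in production logs
    }


def parse_engineering_txt(text):
    """Parse the model line and 'Key: value' lines of an Engineering-Tested .txt."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    fields = {"Model": lines[0]}
    for line in lines[1:]:
        key, _, value = line.partition(":")
        fields[key.strip()] = value.strip()
    return fields


DAT_SAMPLE = [
    '"SCMHVAS-M0100 "',
    "20,0.0034", "40,0.0126", "60,-0.0046", "80,0.0141", "100,-0.00325",
    '"PASS-7.005501E-033"',
    '"179379-1","04-09-2026"',
]
TXT_SAMPLE = (
    "SCMHVAS - M0100\nSN: 171087-1\nDate: 04/08/2024\n"
    "Test: PASS\nAccuracy: -7.0055E-03 %\n"
)

record = parse_vaslog_record(DAT_SAMPLE)
sheet = parse_engineering_txt(TXT_SAMPLE)
print(record["model"], record["serial"], len(record["points"]))
print(sheet["SN"], sheet["Date"], sheet["Test"])
```

Using csv.reader rather than splitting on commas handles the quoted header, status and footer fields without special cases.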
QuickBASIC STR$() Formatting Quirk
' QB emits TWO formats for floats:
PRINT STR$(-7.005501E-03) ' → "-7.005501E-033" (scientific + status digit)
PRINT STR$(0.01599373) ' → " .01599373" (plain decimal, leading space)
' Threshold: ~0.01 magnitude
' Affects ~1.6% of records (438/27503)
' NOT a bug - documented QB behavior
hvin.dat Binary Format
TYPE DBASE (from DBHV.BAS)
MODNAME AS STRING * 13 ' Engineering ID: "SCM5B41-1181 "
INTYPE AS STRING * 3
OUTSIGTYPE AS STRING * 7
WAVESHPCAL AS STRING * 8
' ... 42 SINGLE floats (IEEE 754, 4 bytes each) ...
END TYPE
' Total: 13+3+7+8 + (42*4) = 199 bytes/record
' File size: 6567 bytes = 33 records
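The record layout maps directly onto a struct format string. The sketch below is not the repository's parse_hvin.py; it assumes little-endian IEEE 754 singles and ASCII text fields, and because the 42 float field names are elided in the TYPE definition it returns them as an unnamed list.

```python
import struct

# 13 + 3 + 7 + 8 bytes of fixed-length strings, then 42 little-endian singles.
RECORD = struct.Struct("<13s3s7s8s42f")


def parse_hvin(data):
    """Split hvin.dat contents into one dict per 199-byte record."""
    if len(data) % RECORD.size:
        raise ValueError(f"{len(data)} bytes is not a multiple of {RECORD.size}")
    records = []
    for modname, intype, outsigtype, waveshpcal, *values in RECORD.iter_unpack(data):
        records.append({
            "MODNAME": modname.decode("ascii", "replace").rstrip(" \x00"),
            "INTYPE": intype.decode("ascii", "replace").rstrip(" \x00"),
            "OUTSIGTYPE": outsigtype.decode("ascii", "replace").rstrip(" \x00"),
            "WAVESHPCAL": waveshpcal.decode("ascii", "replace").rstrip(" \x00"),
            "values": values,  # 42 floats, names not documented here
        })
    return records
```

RECORD.size evaluates to 199, so a 6567-byte file yields 33 records, matching the figures above.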
Troubleshooting
"Output directory does not exist: X:\For_Web"
- Cause: X: drive only mapped under service account, not in SSH session
- Fix: Use UNC path: \\ad2\webshare\For_Web
$env:OUTPUT_DIR = "\\ad2\webshare\For_Web"
node database/export-datasheets.js
"Command line is too long" (PowerShell)
- Cause: Passing 50+ file paths as arguments exceeds PowerShell limit
- Fix: Use inline node script with fs.readdirSync (see Common Operations above)
VPN Drops Mid-Session
- Symptom: AD2/AD1 become unreachable, SSH hangs
- Fix:
- Work offline on local samples for analysis
- Restore VPN (FortiClient)
- Resume deployment/import when connection stable
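Before resuming, a quick reachability probe avoids starting a deploy that will hang. A minimal sketch, assuming that an open TCP connection to the SSH port is a good enough signal that the VPN is back; the function name host_reachable and the connect parameter are hypothetical.

```python
import socket


def host_reachable(host, port=22, timeout=3.0, connect=socket.create_connection):
    """Return True if a TCP connection to host:port opens within the timeout."""
    try:
        with connect((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections and unreachable networks.
        return False
```

For AD2 this would be host_reachable("192.168.0.6"); a short timeout keeps the check fast when the tunnel is down.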
Vault Returns Paper123\!@# (Backslash)
- Cause: Legacy shell escape stored in ad2.sops.yaml
- Fix: Strip backslash at read-time: sed 's/\\//g'
- TODO: Clean vault entry to remove backslash
Paramiko "No Output" for Long-Running Commands
- Cause: exec_command buffers stdout until completion
- Fix: Either:
- Accept final output when command completes
- Add progress markers that flush every N records
- Drain channel periodically:
while not channel.exit_status_ready(): channel.recv(1024)
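A fuller version of the drain loop that keeps the output instead of discarding it might look like the sketch below. It uses only methods that paramiko's Channel provides (recv_ready, recv, exit_status_ready, recv_exit_status). The function name stream_command_output is hypothetical, and it assumes the remote command flushes its progress lines as it goes.

```python
import sys
import time


def stream_command_output(channel, sink=sys.stdout, poll_seconds=0.2):
    """Forward stdout from a paramiko Channel while the remote command runs."""
    while True:
        if channel.recv_ready():
            # Data is waiting: forward it right away so progress markers show up live.
            sink.write(channel.recv(4096).decode("utf-8", errors="replace"))
            sink.flush()
        elif channel.exit_status_ready():
            # Nothing left to read and the command has finished.
            break
        else:
            time.sleep(poll_seconds)
    return channel.recv_exit_status()
```

The channel comes from the stdout object returned by exec_command, for example stdin, stdout, stderr = client.exec_command(cmd) followed by stream_command_output(stdout.channel). Checking recv_ready before exit_status_ready means buffered output is still read after the command exits.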
438 Records Skipped During Backfill
- Cause: Plain-decimal format not matching scientific-notation-only regex
- Fix: Already patched (2026-04-12). Regex now tries both formats.
- Verification: Rerun backfill on stragglers → 438/438 rendered
Recent Commit History
2026-04-12 (commit 0dd3d82): SCMVAS/SCMHVAS pipeline extension
- 114 files changed, 35,486 insertions
- 5 production files modified, 1 new parser
- All research scripts sanitized (vault-based credentials)
- .gitignore updated (exclude testdata.db)
Useful Links
- Latest Session: session-logs/2026-04-12-session.md (DEFINITIVE)
- Discovery Session: session-logs/2026-04-11-discovery-session.md
- Implementation Plan: datasheet-pipeline/scmvas-hvas-research/IMPLEMENTATION_PLAN.md
- Credentials (vault): D:\vault\clients\dataforth\
Quick Reference - Log Types
| Family | Log Type | Format | Parser | Location |
|---|---|---|---|---|
| SCM5B | 5BLOG | Multiline CSV .DAT | multiline.js | TS-xx/LOGS/5BLOG |
| 8B | 8BLOG | Multiline CSV .DAT | multiline.js | TS-xx/LOGS/8BLOG |
| DSCA | DSCLOG | Multiline CSV .DAT | multiline.js | TS-xx/LOGS/DSCLOG |
| SCMVAS | VASLOG | Multiline CSV .DAT | vaslog.js | TS-3R/LOGS/VASLOG |
| SCMHVAS (prod) | VASLOG | Multiline CSV .DAT | vaslog.js | TS-3R/LOGS/VASLOG |
| SCMHVAS (eng) | VASLOG_ENG | .txt (pass-through) | vaslog.js | TS-3R/LOGS/VASLOG/VASLOG - Engineering Tested |
Before starting work: Read session-logs/2026-04-12-session.md for complete context
For AD2 access: Ensure Dataforth VPN connected (FortiClient)
For credentials: Always use vault - never hardcode passwords