# Dataforth DOS Project - Context

**Last Updated:** 2026-04-14
**Status:** Active - Datasheet Pipeline Extended for SCMVAS/SCMHVAS

## Quick Start - Infrastructure Overview

| Component | IP/Location | Access | Notes |
|-----------|-------------|--------|-------|
| **AD2** (Primary) | 192.168.0.6 | SSH: sysadmin / vault | Windows Server 2022, hosts testdatadb service |
| **AD1** (Secondary) | 192.168.0.27 | SSH: sysadmin / vault | Hosts Engineering share at `\\AD1\Engineering` |
| **D2TESTNAS** | 192.168.0.9 | SMB1 only | Bridge for DOS test stations (TS-xx machines) |
| **VPN** | Required | FortiClient | Access to 192.168.0.x network |

**Get credentials:**
```bash
# AD2 password (has stale backslash escape - strip it)
bash D:/vault/scripts/vault.sh get-field clients/dataforth/ad2.sops.yaml credentials.password | sed 's/\\//g'

# AD1 password
bash D:/vault/scripts/vault.sh get-field clients/dataforth/ad1.sops.yaml credentials.password
```

**All passwords:** `Paper123!@#` (stored in vault, note backslash escape issue in ad2.sops.yaml)

## Current State (READ THIS FIRST)

### Recent Work (2026-04-11/12)
**Extended Test Datasheet Pipeline for SCMVAS-Mxxx and SCMHVAS-Mxxxx families**
- Added VASLOG parser support (multiline CSV .DAT format)
- Created accuracy-only datasheet template (simple format, no hvin.dat lookup)
- Implemented pass-through for Engineering-Tested .txt files
- **Backfilled 27,503 historical records** (438 of them required a regex patch for the QB STR$() format quirk)
- **434 Engineering .txt files** imported and published
- Deployed to AD2, service restarted, web publishing verified

**Status:** ✅ Complete, production-deployed

**Critical Files Changed:** 5 modified, 1 new parser
- server/parsers/vaslog.js (new)
- server/templates/datasheet-exact.js (SCMVAS/SCMHVAS branch added)
- server/database/import.js (recursive flag fix, VASLOG_ENG support)
- server/parsers/spec-reader.js (stub for SCMVAS/SCMHVAS)
- deploy/deploy-to-ad2.py (vault-based credentials)

**Session Logs:**
- **2026-04-12-session.md** - Implementation, deploy, backfill, patch (DEFINITIVE)
- **2026-04-11-discovery-session.md** - Discovery phase

### testdatadb Service (on AD2)
- **Service Name:** testdatadb
- **Status:** Running
- **Service Account:** INTRANET\svc_testdatadb
- **Working Directory:** C:\Shares\testdatadb
- **API Port:** 3000 (http://192.168.0.6:3000)
- **Database:** SQLite at C:\Shares\testdatadb\database\testdata.db (4.1GB)
- **Web Output:** X:\For_Web (= `\\ad2\webshare\For_Web` UNC path)

### File Shares on AD2
```
C:\Shares\test\                           # Mirror of D2TESTNAS test data
├── TS-xx\LOGS\                           # Test logs from DOS stations
│   ├── 5BLOG\                            # SCM5B family
│   ├── 8BLOG\                            # 8B family
│   ├── VASLOG\                           # SCMVAS/SCMHVAS .DAT files
│   │   ├── HVAS-M01.DAT                  # Production logs
│   │   ├── VAS-M100.DAT
│   │   └── VASLOG - Engineering Tested\  # 434 .txt files
│   └── ...
└── Corrected HVAS Files\                 # 200 pre-generated datasheets

C:\Shares\testdatadb\                     # Node.js application
├── server/
│   ├── parsers/                          # Log file parsers
│   ├── templates/                        # Datasheet formatters
│   └── database/                         # Import/export scripts
├── database/
│   └── testdata.db                       # SQLite (4.1GB, not in git)
└── node_modules/
```

### File Shares on AD1
```
\\AD1\Engineering\
└── ENGR\ATE\High Voltage Input Module Test\
    ├── HVDATA\
    │   └── hvin.dat        # Spec database (33 records, engineering MODNAMEs)
    └── Released\
        ├── TESTHV3.BAS     # Primary test program (2020)
        ├── TESTHV4.BAS     # Alternate test program (2017)
        ├── NLIBATE3.BAS    # ATE library
        └── DBHV.BAS        # Database editor (TYPE DBASE definition)
```

## Anti-Patterns (DON'T DO THIS)

❌ **DO NOT hardcode Paper123!@#** - Always fetch from vault:
```bash
bash D:/vault/scripts/vault.sh get-field clients/dataforth/ad2.sops.yaml credentials.password | sed 's/\\//g'
```

❌ **DO NOT use X: drive in SSH sessions** - It is mapped only under the service account. Use the UNC path instead:
```powershell
# Wrong:
node database/export-datasheets.js  # Fails: "X:\For_Web does not exist"

# Right:
$env:OUTPUT_DIR = "\\ad2\webshare\For_Web"
node database/export-datasheets.js
```

❌ **DO NOT assume hvin.dat lookup works** - Marketing names (SCMHVAS-M0100) ≠ engineering MODNAMEs (SCM5B41-1181). SCMVAS/SCMHVAS use the simplified accuracy-only template WITHOUT hvin.dat.

❌ **DO NOT pass 50+ file paths on the PowerShell command line** - Hits "Command line too long". Use an inline node script with fs.readdirSync instead.

❌ **DO NOT commit testdata.db or large samples** - The 4.1GB database is in .gitignore. Keep research samples local only.

❌ **DO NOT use SMB1 on AD2** - Disabled for security. Use SSH/SFTP (port 22) or SMB2+ shares.

❌ **DO NOT expect immediate output from exec_command** - paramiko buffers stdout. Use progress markers or drain at completion.

❌ **DO NOT assume VPN is stable** - The Dataforth VPN can drop mid-session. Save work frequently and use local samples for offline analysis.

## Where to Find Things

### Codebase Structure
```
projects/dataforth-dos/
├── datasheet-pipeline/
│   ├── implementation/            # Staged code (approved by Code Review)
│   ├── scmvas-hvas-research/      # Discovery scripts and source files
│   │   ├── source/                # TESTHV3.BAS, hvin.dat, etc.
│   │   ├── samples/               # .DAT and .txt samples (local)
│   │   ├── parse_hvin.py          # hvin.dat binary parser
│   │   └── pull-*.py              # SSH download scripts
│   └── IMPLEMENTATION_PLAN.md     # Approved plan (2026-04-11)
├── deploy/
│   └── deploy-to-ad2.py           # Deployment script (vault-based auth)
├── session-logs/
│   ├── 2026-04-12-session.md      # SCMVAS/SCMHVAS implementation (DEFINITIVE)
│   └── 2026-04-11-discovery-session.md
└── CONTEXT.md                     # This file
```

### Production Files on AD2
```
C:\Shares\testdatadb\
├── server.js                        # Main entry point
├── server/
│   ├── parsers/
│   │   ├── multiline.js             # Handles VASLOG .DAT (CSV format)
│   │   ├── vaslog.js                # VASLOG-specific logic (new)
│   │   └── spec-reader.js           # Spec DB loader (stub for SCMVAS/SCMHVAS)
│   ├── templates/
│   │   └── datasheet-exact.js       # Datasheet formatter (SCMVAS/SCMHVAS branch added)
│   └── database/
│       ├── import.js                # LOG_TYPES registry, importFiles()
│       └── export-datasheets.js     # Batch export script
└── database/
    └── testdata.db                  # SQLite (27k+ records after backfill)
```

## Common Operations

### Deploy Code to AD2
```bash
# From projects/dataforth-dos/deploy/
python3 deploy-to-ad2.py

# What it does:
# 1. Fetches password from vault (D:/vault/scripts/vault.sh)
# 2. Connects via paramiko SFTP to 192.168.0.6:22
# 3. Creates .bak-YYYYMMDD timestamped backups
# 4. Uploads modified files from implementation/
# 5. Restarts testdatadb service via SSH exec_command
# 6. Verifies API responds 200 OK on port 3000
```

**Manual deployment (if script unavailable):**
```bash
# Get password
AD2_PASS=$(bash D:/vault/scripts/vault.sh get-field clients/dataforth/ad2.sops.yaml credentials.password | sed 's/\\//g')

# Connect
sshpass -p "${AD2_PASS}" ssh sysadmin@192.168.0.6

# Backup + copy (this and the following commands run in the remote session on AD2)
cd C:\Shares\testdatadb\server\parsers
copy multiline.js multiline.js.bak-20260414
# ... upload new files via SFTP ...

# Restart service
Restart-Service -Name testdatadb

# Verify
curl http://localhost:3000
```

### Import New Test Data
```bash
# SSH to AD2
ssh sysadmin@192.168.0.6

# Run import for specific log type
cd C:\Shares\testdatadb
node database/import.js

# Import specific files (avoid "Command line too long")
node -e "
const importFiles = require('./server/database/import').importFiles;
const fs = require('fs');
const files = fs.readdirSync('C:/Shares/test/TS-3R/LOGS/VASLOG/VASLOG - Engineering Tested')
  .filter(f => f.endsWith('.txt'))
  .map(f => 'C:/Shares/test/TS-3R/LOGS/VASLOG/VASLOG - Engineering Tested/' + f);
importFiles(files, 'VASLOG_ENG').then(() => console.log('Done'));
"
```

### Export Datasheets for Web
```bash
# SSH to AD2
ssh sysadmin@192.168.0.6

# Export all pending datasheets
cd C:\Shares\testdatadb
$env:OUTPUT_DIR = "\\ad2\webshare\For_Web"  # NOT X:\For_Web in SSH
node database/export-datasheets.js

# Export specific model family
node database/export-datasheets.js --family SCMHVAS
```

### Backfill Historical Data
```bash
# SSH to AD2, run as inline script to avoid command-line length limits
node -e "
const db = require('./server/database/db');
const exportDatasheet = require('./server/templates/datasheet-exact');

db.all(\`
  SELECT * FROM test_records
  WHERE log_type IN ('VASLOG', 'VASLOG_ENG')
    AND exported_at IS NULL
  ORDER BY id
\`, (err, rows) => {
  if (err) throw err;
  console.log(\`[INFO] Found \${rows.length} records to export\`);
  let count = 0;
  rows.forEach(row => {
    try {
      exportDatasheet(row);
      count++;
      if (count % 100 === 0) console.log(\`[PROGRESS] \${count}/\${rows.length}\`);
    } catch (e) {
      console.error(\`[SKIP] \${row.model_name}: \${e.message}\`);
    }
  });
  console.log(\`[DONE] Exported \${count} datasheets\`);
});
"
```

### Check Service Status
```powershell
# On AD2 (via SSH or RDP)
Get-Service testdatadb

# View service logs (if logging enabled)
Get-EventLog -LogName Application -Source testdatadb -Newest 50

# Test API
Invoke-WebRequest http://localhost:3000 | Select-Object StatusCode

# Check process
Get-Process | Where-Object { $_.ProcessName -like "*node*" }
```

### Access Shares from macOS/Linux
```bash
# Mount AD2 share (SMB2+). mount_smbfs is the macOS command.
mkdir -p ~/mnt/ad2-testdatadb
mount_smbfs //sysadmin:Password@192.168.0.6/testdatadb ~/mnt/ad2-testdatadb

# Mount AD1 Engineering share
mkdir -p ~/mnt/ad1-engineering
mount_smbfs //sysadmin:Password@192.168.0.27/Engineering ~/mnt/ad1-engineering

# Linux equivalent (requires cifs-utils; prompts for the password):
# sudo mount -t cifs //192.168.0.6/testdatadb ~/mnt/ad2-testdatadb -o username=sysadmin,vers=3.0

# Unmount
umount ~/mnt/ad2-testdatadb
```

## Key Technical Decisions (ADRs)

**2026-04-12:** Use Option C (simple accuracy-only template, no hvin.dat lookup)
- Reason: Marketing names (SCMHVAS-M0100) ≠ engineering MODNAMEs (SCM5B41-1181) in hvin.dat
- Sample datasheets show simple 1-parameter format (Accuracy only)
- Spec-reader stub lets SCMVAS/SCMHVAS pass through pipeline without schema changes

**2026-04-12:** Pass-through for VASLOG_ENG .txt files (not re-render)
- Reason: Engineering-Tested files already match target format exactly
- fs.copyFileSync() guarantees byte-level fidelity, avoids encoding round-trip
- Fallback to writeFileSync(raw_data, 'utf8') if source file missing

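
The pass-through decision above can be sketched as follows. This is a minimal illustration, not the code in `datasheet-exact.js`: the helper name `publishEngDatasheet` and the `source_path` field are hypothetical, and only `raw_data` is named in this document.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical helper (not the production code): publish an Engineering-Tested
// datasheet by copying the source file byte-for-byte, falling back to the
// stored text when the source file is missing.
function publishEngDatasheet(record, destPath) {
  if (record.source_path && fs.existsSync(record.source_path)) {
    fs.copyFileSync(record.source_path, destPath); // no decode/encode round-trip
    return 'copied';
  }
  fs.writeFileSync(destPath, record.raw_data, 'utf8');
  return 'rewritten';
}

// Demo against a throwaway temp directory.
const tmp = fs.mkdtempSync(path.join(os.tmpdir(), 'vaslog-eng-'));
const src = path.join(tmp, 'source.txt');
const destA = path.join(tmp, 'copied.txt');
const destB = path.join(tmp, 'fallback.txt');
fs.writeFileSync(src, 'SN: 171087-1\r\n');

const viaCopy = publishEngDatasheet({ source_path: src, raw_data: '' }, destA);
const viaFallback = publishEngDatasheet(
  { source_path: path.join(tmp, 'missing.txt'), raw_data: 'SN: 171087-1\r\n' },
  destB
);
console.log(viaCopy, viaFallback);
```

Returning which path was taken makes it easy to log how often the fallback fires during a backfill.
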

**2026-04-12:** Fix recursive=false default regression with `config.recursive !== false`
- Reason: Adding `recursive` field to LOG_TYPES must not break 7 pre-existing families
- Treats absent/undefined as true (legacy behavior), explicit false as false

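
A small sketch of why `!== false` preserves legacy behavior. The registry entries and function names here are hypothetical; the real LOG_TYPES lives in server/database/import.js and its fields are not shown in this document.

```javascript
// Hypothetical registry entries for illustration only.
const LOG_TYPES = {
  LEGACY_TYPE: { dir: 'LEGACY' },                // pre-existing entry, no recursive field
  FLAT_TYPE: { dir: 'FLAT', recursive: false },  // explicit opt-out
  DEEP_TYPE: { dir: 'DEEP', recursive: true },   // explicit opt-in
};

// Absent/undefined counts as true; only an explicit false disables recursion.
function shouldRecurse(config) {
  return config.recursive !== false;
}

// The kind of check that causes the regression: a missing field becomes false.
function shouldRecurseNaive(config) {
  return Boolean(config.recursive);
}

for (const [name, config] of Object.entries(LOG_TYPES)) {
  console.log(name, shouldRecurse(config), shouldRecurseNaive(config));
}
```

The two checks disagree only for the entry with no `recursive` field, which is exactly the case the pre-existing families fall into.
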

**2026-04-12:** Vault-based credentials in deploy script (no hardcoding, no prompts)
- Reason: Never commit passwords, even to private repo
- deploy-to-ad2.py calls vault.sh with 30s timeout, fails loud if unavailable
- No env-var fallback, no interactive prompt

**2026-04-12:** MM/DD/YYYY date normalization for datasheet Date field
- Reason: Matches newest Engineering-Tested samples
- Older "Corrected HVAS Files" used MM-DD-YYYY (hyphens) - backfill rewrites with slashes
- Intentional visible change, documented in implementation plan

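
The date normalization can be illustrated with a short sketch. The function name `normalizeDate` is hypothetical and the production template may handle more input shapes; this version accepts only the two formats mentioned above.

```javascript
// Hypothetical helper: accept MM-DD-YYYY or MM/DD/YYYY and always emit MM/DD/YYYY.
function normalizeDate(raw) {
  const m = /^(\d{2})[-\/](\d{2})[-\/](\d{4})$/.exec(raw.trim());
  if (!m) {
    throw new Error(`Unrecognized date format: ${raw}`);
  }
  return `${m[1]}/${m[2]}/${m[3]}`;
}

console.log(normalizeDate('04-09-2026')); // hyphenated input, as in VASLOG .DAT footers
console.log(normalizeDate('04/08/2024')); // already in the target format
```

Throwing on anything else keeps an unexpected format from being published silently.
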

**2026-04-12:** Patch regex with plain-decimal fallback for QuickBASIC STR$() quirk
- Reason: QB STR$() emits scientific notation for most values, plain decimal for ~1.6%
- Not a version difference or bug - purely QB float-to-string formatting threshold
- Two-regex approach: try scientific first, fall back to plain decimal

## QuickBASIC Artifacts & Log Formats

### VASLOG .DAT Structure
```
"SCMHVAS-M0100 "            # Header: model name (marketing, NOT engineering MODNAME)
20,0.0034                   # CSV line 1: measurement data
40,0.0126                   # CSV line 2
60,-0.0046                  # CSV line 3
80,0.0141                   # CSV line 4
100,-0.00325                # CSV line 5
"PASS-7.005501E-033",...    # Status line: PASS/FAIL + accuracy (scientific OR plain decimal)
"179379-1","04-09-2026"     # Footer: serial number, test date (MM-DD-YYYY)
```

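
To make the record layout concrete, here is a rough sketch of reading one record. It is not the logic in `vaslog.js`: the function name `parseVaslogRecord` and the output field names are hypothetical, the status-line fields elided above are ignored, and the `#` annotations are assumed not to be part of the file.

```javascript
// Hypothetical single-record reader for the layout shown above.
function parseVaslogRecord(lines) {
  const unquote = (s) => s.replace(/^"|"$/g, '');
  const record = { model: null, points: [], status: null, serial: null, date: null };

  for (const line of lines.map((l) => l.trim()).filter(Boolean)) {
    const point = /^(-?\d+(?:\.\d+)?),(-?\d*\.?\d+(?:E[+-]\d+)?)$/.exec(line);
    const status = /^"(PASS|FAIL)/.exec(line);
    if (point) {
      // Unquoted numeric pair: one measurement line.
      record.points.push([Number(point[1]), Number(point[2])]);
    } else if (status) {
      // Only PASS/FAIL is taken here; the accuracy value needs separate parsing.
      record.status = status[1];
    } else if (/^"[^"]*","\d{2}-\d{2}-\d{4}"$/.test(line)) {
      // Footer: quoted serial number and MM-DD-YYYY date.
      const [serial, date] = line.split(',').map(unquote);
      record.serial = serial;
      record.date = date;
    } else if (record.model === null) {
      // First unmatched line is treated as the header.
      record.model = unquote(line).trim();
    }
  }
  return record;
}

const sample = [
  '"SCMHVAS-M0100 "',
  '20,0.0034',
  '40,0.0126',
  '60,-0.0046',
  '80,0.0141',
  '100,-0.00325',
  '"PASS-7.005501E-033"',
  '"179379-1","04-09-2026"',
];
const parsedRecord = parseVaslogRecord(sample);
console.log(parsedRecord);
```
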
### VASLOG_ENG .txt Structure (Engineering-Tested)
```
SCMHVAS - M0100
SN: 171087-1
Date: 04/08/2024
Test: PASS
Accuracy: -7.0055E-03 %
```

### QuickBASIC STR$() Formatting Quirk
```basic
' QB emits TWO formats for floats:
PRINT STR$(-7.005501E-03)   ' → "-7.005501E-033" (scientific + status digit)
PRINT STR$(0.01599373)      ' → " .01599373" (plain decimal, leading space)

' Threshold: ~0.01 magnitude
' Affects ~1.6% of records (438/27503)
' NOT a bug - documented QB behavior
```

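
The two-regex approach (scientific first, plain decimal as fallback) might look like the sketch below. The patterns and the name `parseQbNumber` are illustrative assumptions, not the patched production regex. The scientific pattern assumes a two-digit exponent, which is what lets it ignore the trailing status digit; how that digit is delimited in the plain-decimal case is not shown in this document, so the fallback is demonstrated on bare STR$() output only.

```javascript
// Illustrative patterns (assumptions, not the production regex).
const SCIENTIFIC = /^\s*(-?\d*\.?\d+E[+-]\d{2})/; // e.g. -7.005501E-03, two-digit exponent
const PLAIN = /^\s*(-?\d*\.\d+)/;                 // e.g. " .01599373", no leading zero

// Try scientific notation first, then fall back to plain decimal.
function parseQbNumber(text) {
  const m = SCIENTIFIC.exec(text) || PLAIN.exec(text);
  if (!m) {
    throw new Error(`No QB number found in: ${JSON.stringify(text)}`);
  }
  return Number(m[1]);
}

console.log(parseQbNumber('-7.005501E-033')); // trailing status digit is not consumed
console.log(parseQbNumber(' .01599373'));     // leading space, no leading zero
```

Trying the scientific pattern first matters because the plain pattern would also match the mantissa of a scientific value and silently drop the exponent.
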
### hvin.dat Binary Format
```
TYPE DBASE (from DBHV.BAS)
  MODNAME AS STRING * 13      ' Engineering ID: "SCM5B41-1181 "
  INTYPE AS STRING * 3
  OUTSIGTYPE AS STRING * 7
  WAVESHPCAL AS STRING * 8
  ' ... 42 SINGLE floats (IEEE 754, 4 bytes each) ...
END TYPE

' Total: 13+3+7+8 + (42*4) = 199 bytes/record
' File size: 6567 bytes = 33 records
```

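
The record arithmetic above can be checked with a small fixed-width reader. This is a sketch under stated assumptions, separate from the project's `parse_hvin.py`: fields are taken to be stored contiguously in declaration order, the 42 SINGLE values are assumed to follow the four strings, and values are assumed little-endian. The function names and the demo buffer contents are made up for illustration.

```javascript
// Layout from the TYPE DBASE summary; field order and endianness are assumptions.
const STRING_FIELDS = [
  ['modname', 13],
  ['intype', 3],
  ['outsigtype', 7],
  ['waveshpcal', 8],
];
const FLOAT_COUNT = 42;
const STRING_BYTES = STRING_FIELDS.reduce((sum, [, len]) => sum + len, 0); // 31
const RECORD_SIZE = STRING_BYTES + FLOAT_COUNT * 4; // 199

function parseHvinRecords(buf) {
  if (buf.length % RECORD_SIZE !== 0) {
    throw new Error(`Size ${buf.length} is not a multiple of ${RECORD_SIZE}`);
  }
  const records = [];
  for (let base = 0; base < buf.length; base += RECORD_SIZE) {
    const record = { values: [] };
    let offset = base;
    for (const [name, len] of STRING_FIELDS) {
      // Fixed-length strings are space-padded; strip the padding.
      record[name] = buf.toString('latin1', offset, offset + len).trimEnd();
      offset += len;
    }
    for (let i = 0; i < FLOAT_COUNT; i++) {
      record.values.push(buf.readFloatLE(offset + i * 4));
    }
    records.push(record);
  }
  return records;
}

// Synthetic two-record buffer for the demo (not real hvin.dat content).
function makeRecord(modname, firstValue) {
  const rec = Buffer.alloc(RECORD_SIZE);
  rec.write(modname.padEnd(13) + 'abc' + 'defg'.padEnd(7) + 'hij'.padEnd(8), 0, 'latin1');
  rec.writeFloatLE(firstValue, STRING_BYTES);
  return rec;
}
const demo = Buffer.concat([makeRecord('SCM5B41-1181', 1.5), makeRecord('DEMO-0002', -0.25)]);
const hvinRecords = parseHvinRecords(demo);
console.log(hvinRecords.length, hvinRecords[0].modname, hvinRecords[0].values[0]);
console.log(6567 / RECORD_SIZE); // record count implied by the documented file size
```
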
## Troubleshooting

### "Output directory does not exist: X:\For_Web"
- **Cause:** X: drive only mapped under service account, not in SSH session
- **Fix:** Use UNC path: `\\ad2\webshare\For_Web`
```powershell
$env:OUTPUT_DIR = "\\ad2\webshare\For_Web"
node database/export-datasheets.js
```

### "Command line is too long" (PowerShell)
- **Cause:** Passing 50+ file paths as arguments exceeds PowerShell limit
- **Fix:** Use inline node script with fs.readdirSync (see Common Operations above)

### VPN Drops Mid-Session
- **Symptom:** AD2/AD1 become unreachable, SSH hangs
- **Fix:**
   1. Work offline on local samples for analysis
   2. Restore VPN (FortiClient)
   3. Resume deployment/import when connection stable

### Vault Returns `Paper123\!@#` (Backslash)
- **Cause:** Legacy shell escape stored in ad2.sops.yaml
- **Fix:** Strip backslash at read-time: `sed 's/\\//g'`
- **TODO:** Clean vault entry to remove backslash

### Paramiko "No Output" for Long-Running Commands
- **Cause:** exec_command buffers stdout until completion
- **Fix:** Either:
   1. Accept final output when command completes
   2. Add progress markers that flush every N records
   3. Drain channel periodically: `while not channel.exit_status_ready(): channel.recv(1024)`

### 438 Records Skipped During Backfill
- **Cause:** Plain-decimal format not matching scientific-notation-only regex
- **Fix:** Already patched (2026-04-12). Regex now tries both formats.
- **Verification:** Rerun backfill on stragglers → 438/438 rendered

## Recent Commit History

**2026-04-12 (commit 0dd3d82):** SCMVAS/SCMHVAS pipeline extension
- 114 files changed, 35,486 insertions
- 5 production files modified, 1 new parser
- All research scripts sanitized (vault-based credentials)
- .gitignore updated (exclude testdata.db)

## Useful Links

- **Latest Session:** session-logs/2026-04-12-session.md (DEFINITIVE)
- **Discovery Session:** session-logs/2026-04-11-discovery-session.md
- **Implementation Plan:** datasheet-pipeline/scmvas-hvas-research/IMPLEMENTATION_PLAN.md
- **Credentials (vault):** D:\vault\clients\dataforth\

## Quick Reference - Log Types

| Family | Log Type | Format | Parser | Location |
|--------|----------|--------|--------|----------|
| SCM5B | 5BLOG | Multiline CSV .DAT | multiline.js | TS-xx/LOGS/5BLOG |
| 8B | 8BLOG | Multiline CSV .DAT | multiline.js | TS-xx/LOGS/8BLOG |
| DSCA | DSCLOG | Multiline CSV .DAT | multiline.js | TS-xx/LOGS/DSCLOG |
| SCMVAS | VASLOG | Multiline CSV .DAT | vaslog.js | TS-3R/LOGS/VASLOG |
| SCMHVAS (prod) | VASLOG | Multiline CSV .DAT | vaslog.js | TS-3R/LOGS/VASLOG |
| SCMHVAS (eng) | VASLOG_ENG | .txt (pass-through) | vaslog.js | TS-3R/LOGS/VASLOG/VASLOG - Engineering Tested |

---

**Before starting work:** Read session-logs/2026-04-12-session.md for complete context
**For AD2 access:** Ensure Dataforth VPN connected (FortiClient)
**For credentials:** Always use vault - never hardcode passwords