diff --git a/projects/dataforth-dos/session-logs/2026-03-13-import-fix.md b/projects/dataforth-dos/session-logs/2026-03-13-import-fix.md
new file mode 100644
index 0000000..75b3fe3
--- /dev/null
+++ b/projects/dataforth-dos/session-logs/2026-03-13-import-fix.md
@@ -0,0 +1,153 @@
+# Dataforth TestDataDB Import Fix - 2026-03-13
+
+## Problem Identified
+
+Data is flowing correctly through most of the pipeline, but is NOT being imported into the database:
+
+| Stage | Status | Evidence |
+|-------|--------|----------|
+| DOS Machines → NAS | **[OK]** | Files from 2026-03-13 06:30 AM on D2TESTNAS |
+| NAS → AD2 (rsync) | **[OK]** | 9.8MB transferred at 06:30 & 07:15 per rsync logs |
+| AD2 → Database | **[BROKEN]** | Newest DB record: 2026-01-19 (nearly 2 months stale!) |
+
+**Root Cause:** The `import.js` script is either not being called after the rsync sync completes, or is failing silently.
+
+---
+
+## Diagnostic Steps (Run on AD2 or PC with AD2 access)
+
+### 1. Check the sync log for import activity
+```powershell
+Get-Content "C:\Shares\test\scripts\sync-from-nas.log" -Tail 100
+```
+
+Look for:
+- Lines mentioning "import" or "Import-ToDatabase"
+- Any error messages
+- Recent timestamps
+
+### 2. Check if import is configured in sync script
+```powershell
+Select-String -Path "C:\Shares\test\scripts\Sync-FromNAS-rsync.ps1" -Pattern "import|Import-ToDatabase" -Context 2,2
+```
+
+The sync script should call `node import.js --file [files]` after syncing.
+
+### 3. Verify TestDataDB service is running
+```powershell
+# Check if Node.js server is running
+Get-Process node -ErrorAction SilentlyContinue
+
+# Or check the API
+Invoke-RestMethod -Uri "http://localhost:3000/api/stats"
+```
+
+### 4. Check database current state
+```powershell
+cd C:\Shares\testdatadb
+node -e "const db=require('better-sqlite3')('database/testdata.db'); console.log(db.prepare('SELECT MAX(test_date) as latest_test, MAX(import_date) as latest_import, COUNT(*) as total FROM test_records').get())"
+```
+
+Expected output shows:
+- `latest_test`: Should be recent (today or yesterday)
+- `latest_import`: When the last import ran
+- `total`: Currently ~1.6 million records
+
+### 5. Test manual import of a single file
+```powershell
+cd C:\Shares\testdatadb\database
+
+# Pick the most recently modified file from the sync
+$recentFile = Get-ChildItem "C:\Shares\test\TS-3R\LOGS\DSCLOG\*.DAT" | Sort-Object LastWriteTime -Descending | Select-Object -First 1
+
+# Run import
+node import.js --file $recentFile.FullName
+```
+
+If this works, the import script is fine - the issue is that the sync script is not calling it.
+
+---
+
+## Fix Options
+
+### Option A: Fix the Sync Script to Call Import
+
+Edit `C:\Shares\test\scripts\Sync-FromNAS-rsync.ps1` and ensure it calls import after syncing:
+
+```powershell
+# After the PULL phase completes, add:
+if ($syncedFiles.Count -gt 0) {
+    $importScript = "C:\Shares\testdatadb\database\import.js"
+    $importArgs = @($importScript, "--file") + $syncedFiles  # full path, so the call works from any working directory
+    & node $importArgs 2>&1 | Tee-Object -Append -FilePath $LOG_FILE
+}
+```
+
+### Option B: Run a Catch-Up Import
+
+If the sync script is working but imports were missed, run a full re-import:
+
+```powershell
+cd C:\Shares\testdatadb\database
+
+# This will import ALL files (takes ~30 minutes for 1M+ records)
+node import.js
+```
+
+Or import only recent files:
+
+```powershell
+cd C:\Shares\testdatadb\database
+
+# Find files modified in the last 7 days and import them one at a time
+$recentFiles = Get-ChildItem "C:\Shares\test\TS-*\LOGS\*\*.DAT" -Recurse |
+    Where-Object { $_.LastWriteTime -gt (Get-Date).AddDays(-7) }
+
+foreach ($file in $recentFiles) {
+    node import.js --file $file.FullName
+}
+```
+
+### Option C: Create Scheduled Task for Import
+
+If sync and import should run separately:
+
+```powershell
+# Create a scheduled task to run import every hour (assumes node is on the SYSTEM account's PATH)
+$action = New-ScheduledTaskAction -Execute "node" -Argument "C:\Shares\testdatadb\database\import.js" -WorkingDirectory "C:\Shares\testdatadb\database"
+$trigger = New-ScheduledTaskTrigger -Once -At (Get-Date) -RepetitionInterval (New-TimeSpan -Hours 1)
+Register-ScheduledTask -TaskName "TestDataDB-Import" -Action $action -Trigger $trigger -User "SYSTEM"
+```
+
+---
+
+## Verification
+
+After fixing, verify the import is working:
+
+```powershell
+# Check API for updated stats
+Invoke-RestMethod -Uri "http://localhost:3000/api/stats" | ConvertTo-Json
+
+# The "newest" date should now be today or yesterday
+# The "total_records" should have increased
+```
+
+---
+
+## Files Referenced
+
+- **Sync Script:** `C:\Shares\test\scripts\Sync-FromNAS-rsync.ps1`
+- **Sync Log:** `C:\Shares\test\scripts\sync-from-nas.log`
+- **Import Script:** `C:\Shares\testdatadb\database\import.js`
+- **Database:** `C:\Shares\testdatadb\database\testdata.db`
+- **TestDataDB Server:** `C:\Shares\testdatadb\server.js`
+
+---
+
+## Context from Mac Investigation
+
+- SSH to D2TESTNAS (192.168.0.9) confirmed data arriving from DOS machines
+- TestDataDB API at http://192.168.0.6:3000/api/stats responds but shows stale data
+- rsync daemon logs show successful transfers (9.8MB) to AD2
+- The gap is between rsync completing and `import.js` being called