Add CONTEXT.md files for automatic context recovery

2026-04-14 20:45:46 -07:00
parent 04bdac0448
commit d0dbfed5ec
3 changed files with 1049 additions and 0 deletions

CONTEXT.md Normal file

@@ -0,0 +1,264 @@
# ClaudeTools - Project Context
**Last Updated:** 2026-04-14
**Status:** Active - Production Stable
## Quick Start - Infrastructure Overview
| Component | Location | Access | Notes |
|-----------|----------|--------|-------|
| **Production API** | http://172.16.3.30:8001 | Public access | ClaudeTools work tracking API |
| **Production DB** | MariaDB @ 172.16.3.30:3306/claudetools | Vault credentials | 38 tables, AES-256-GCM encryption |
| **Vault (SOPS)** | D:\vault\ | age-encrypted YAML | Primary credential store |
| **1Password** | Service account | Fallback | op://Projects/ClaudeTools * |
| **Gitea Repo** | git.azcomputerguru.com/azcomputerguru/claudetools | Active development | Main codebase |
**Get DB credentials:**
```bash
bash D:/vault/scripts/vault.sh get-field projects/claudetools/database.sops.yaml credentials.password
```
## Current State (READ THIS FIRST)
### Project Status
- **API:** 95+ endpoints, production-stable
- **Database:** 38 tables, fully encrypted sensitive fields
- **Authentication:** JWT-based, AES-256-GCM for API keys
- **Deployment:** Auto-deploy via Gitea webhooks (planned)
### Active Subprojects
1. **GuruRMM** - Remote monitoring system (see projects/msp-tools/guru-rmm/CONTEXT.md)
2. **Dataforth DOS** - Test datasheet pipeline (see projects/dataforth-dos/CONTEXT.md)
### Session Logs
- **Project-specific:** projects/*/session-logs/
- **Client work:** clients/*/session-logs/
- **General:** session-logs/ (root)
## Anti-Patterns (DON'T DO THIS)
**DO NOT query database directly** - Use Database Agent for ALL operations
**DO NOT write production code yourself** - Delegate to Coding Agent, coordinate as needed
**DO NOT use emojis** - ASCII markers only: [OK], [ERROR], [WARNING], [SUCCESS], [INFO]
**DO NOT hardcode credentials** - Always use SOPS vault (primary) or 1Password (fallback)
**DO NOT skip Code Review Agent** - MANDATORY after any code changes
**DO NOT execute >500 token operations directly** - Delegate to appropriate agent
## Where to Find Things
### Repository Structure
```
ClaudeTools/
├── .claude/
│ ├── CLAUDE.md # Project instructions (directives)
│ ├── REFERENCE.md # Technical reference
│ ├── CODING_GUIDELINES.md # Code standards
│ ├── FILE_PLACEMENT_GUIDE.md # Where to save files
│ ├── agents/ # Agent definitions
│ └── memory/ # Persistent facts (syncs via Git)
├── credentials.md # Infrastructure reference (migrating to vault)
├── session-logs/ # General session logs
├── projects/
│ ├── msp-tools/guru-rmm/ # GuruRMM (CONTEXT.md there)
│ ├── dataforth-dos/ # Dataforth (CONTEXT.md there)
│ └── claudetools-api/ # API codebase (legacy)
├── clients/
│ └── [client-name]/ # Client-specific work
└── D:\vault\ # SOPS encrypted credentials (separate repo)
```
### Credential Locations
**Primary: SOPS Vault (D:\vault\)**
```bash
# Search for keywords (no decryption needed)
bash D:/vault/scripts/vault.sh search "claudetools"
# Get specific field
bash D:/vault/scripts/vault.sh get-field projects/claudetools/database.sops.yaml credentials.password
# List all entries
bash D:/vault/scripts/vault.sh list
```
**Structure:**
- infrastructure/ - Servers, network gear
- clients/ - Client-specific credentials
- services/ - External services (GitHub, APIs)
- projects/ - Project databases, APIs
- msp-tools/ - MSP application credentials
**Fallback: 1Password**
```bash
# ClaudeTools database credentials
op read "op://Projects/ClaudeTools Database/password"
# ClaudeTools API auth
op read "op://Projects/ClaudeTools API Auth/credential"
```
## Common Operations
### Start Work on Subproject
```
User: "Let's work on GuruRMM tunnel Phase 2"
Claude should:
1. Read projects/msp-tools/guru-rmm/CONTEXT.md (the subproject's own context file)
2. Check recent session logs referenced in CONTEXT.md
3. Understand current state, infrastructure, anti-patterns
4. Proceed without asking user for context
DO NOT:
- Ask user "what's the server IP?"
- Ask user "where is the database?"
- Ask user "what credentials should I use?"
All of this is in CONTEXT.md
```
### Database Operations
```bash
# WRONG: Direct query
ssh user@172.16.3.30 "mysql -u claudetools -p claudetools -e 'SELECT * FROM ...'"
# RIGHT: Delegate to Database Agent
"Use Database Agent to query work_logs table for entries from last 7 days"
```
### Deploy Code Changes
```bash
# WRONG: Deploy yourself
scp file.js user@172.16.3.30:/path/
# RIGHT: Follow project deployment process
# (See project-specific CONTEXT.md for deployment steps)
```
## Project-Specific Context Files
**When user says "work on [project]", read that project's CONTEXT.md FIRST:**
| Project | CONTEXT.md Location | What It Contains |
|---------|---------------------|------------------|
| GuruRMM | projects/msp-tools/guru-rmm/CONTEXT.md | Server IPs, deployment, tunnel architecture, agent status |
| Dataforth DOS | projects/dataforth-dos/CONTEXT.md | AD2/AD1 infrastructure, testdatadb service, log formats |
| ClaudeTools API | (This file) | Main project overview, credential locations |
## Coordination Rules (My Role)
I am a **Coordinator**, NOT an executor:
| Operation | Delegate To |
|-----------|-------------|
| Database queries/updates | Database Agent |
| Production code generation | Coding Agent |
| Code review (MANDATORY) | Code Review Agent |
| Test execution | Testing Agent |
| Git commit/push | Gitea Agent |
| Backups/restore | Backup Agent |
| File exploration | Explore Agent |
| Semantic code search | deep-explore Agent (GrepAI) |
| Complex reasoning | General-purpose + Sequential Thinking |
**I do myself:** Simple responses, reading 1-2 files, presenting results, planning, decisions
**Rule:** >500 tokens of work = delegate
## Memory System
**Location:** `.claude/memory/` (syncs across machines via Git)
**Structure:**
- MEMORY.md - Index of all facts
- [topic]-context.md - Topic-specific persistent facts
**IMPORTANT:** Always write to `.claude/memory/` (repo-relative), NOT `~/.claude/projects/*/memory/`
## Session Log Locations
**Follow these rules:**
| Work Type | Save To |
|-----------|---------|
| Dataforth DOS work | projects/dataforth-dos/session-logs/ |
| ClaudeTools API code | projects/claudetools-api/session-logs/ |
| GuruRMM work | projects/msp-tools/guru-rmm/session-logs/ |
| Client work | clients/[client-name]/session-logs/ |
| General/mixed work | session-logs/ (root) |
**See:** .claude/FILE_PLACEMENT_GUIDE.md for full rules
## Auto-Invoke Skills
**Frontend Design:** Auto-invoke `/frontend-design` skill after ANY UI change (HTML/CSS/JSX/styling)
**Sequential Thinking:** Use for genuine complexity only:
- Rejection loops (3+ failed attempts)
- Critical architectural decisions
- Multi-step debugging with unclear root cause
- NOT for every task
## Key Commands
| Command | Purpose | When |
|---------|---------|------|
| `/checkpoint` | Git commit + database context save | After code changes |
| `/save` | Comprehensive session log | End of session |
| `/context` | Search session logs + credentials | User references previous work |
| `/1password` | 1Password operations | Manage secrets |
| `/sync` | Sync from Gitea | Update local config |
| `/refresh-directives` | Re-read CLAUDE.md | After summarization, large tasks |
## Reference Documents (Read On-Demand)
- **Project instructions:** .claude/CLAUDE.md (my role, agent delegation rules)
- **Technical reference:** .claude/REFERENCE.md (endpoints, database, workflows)
- **Coding standards:** .claude/CODING_GUIDELINES.md (agents read, not every session)
- **File placement:** .claude/FILE_PLACEMENT_GUIDE.md (where to save files)
- **MCP servers:** MCP_SERVERS.md (available integrations)
- **Agent definitions:** .claude/agents/*.md (agent capabilities)
## Troubleshooting
### "I don't know the database password"
- **Check:** D:\vault\ (SOPS encrypted)
- **Command:** `bash D:/vault/scripts/vault.sh get-field projects/claudetools/database.sops.yaml credentials.password`
- **Fallback:** credentials.md (has 1Password references)
### "Where is the GuruRMM server?"
- **Check:** projects/msp-tools/guru-rmm/CONTEXT.md
- **Answer:** 172.16.3.30 (listed in first table)
### "How do I deploy Dataforth changes?"
- **Check:** projects/dataforth-dos/CONTEXT.md
- **Section:** "Common Operations → Deploy Code to AD2"
### "I forgot my role as Coordinator"
- **Read:** .claude/CLAUDE.md
- **Command:** `/refresh-directives`
## Quick Links
- **Credentials (vault):** D:\vault\ (SOPS encrypted YAML)
- **Credentials (legacy):** credentials.md (migrating to vault)
- **Gitea:** http://172.16.3.20:3000/azcomputerguru/claudetools
- **API Docs:** http://172.16.3.30:8001/api/docs
---
**When user says "work on [project]":**
1. Read [project]/CONTEXT.md FIRST (don't ask user for context)
2. Check recent session logs mentioned in CONTEXT.md
3. Understand infrastructure, anti-patterns, current state
4. Proceed with work
**This eliminates:**
- "What's the server IP?" → In CONTEXT.md
- "Where's the database?" → In CONTEXT.md
- "What credentials?" → In CONTEXT.md
- "What did we do last time?" → In session logs referenced by CONTEXT.md


@@ -0,0 +1,439 @@
# Dataforth DOS Project - Context
**Last Updated:** 2026-04-14
**Status:** Active - Datasheet Pipeline Extended for SCMVAS/SCMHVAS
## Quick Start - Infrastructure Overview
| Component | IP/Location | Access | Notes |
|-----------|-------------|--------|-------|
| **AD2** (Primary) | 192.168.0.6 | SSH: sysadmin / vault | Windows Server 2022, hosts testdatadb service |
| **AD1** (Secondary) | 192.168.0.27 | SSH: sysadmin / vault | Hosts Engineering share at \\AD1\Engineering |
| **D2TESTNAS** | 192.168.0.9 | SMB1 only | Bridge for DOS test stations (TS-xx machines) |
| **VPN** | Required | FortiClient | Access to 192.168.0.x network |
**Get credentials:**
```bash
# AD2 password (has stale backslash escape - strip it)
bash D:/vault/scripts/vault.sh get-field clients/dataforth/ad2.sops.yaml credentials.password | sed 's/\\//g'
# AD1 password
bash D:/vault/scripts/vault.sh get-field clients/dataforth/ad1.sops.yaml credentials.password
```
**All passwords:** `Paper123!@#` (stored in vault, note backslash escape issue in ad2.sops.yaml)
## Current State (READ THIS FIRST)
### Recent Work (2026-04-11/12)
**Extended Test Datasheet Pipeline for SCMVAS-Mxxx and SCMHVAS-Mxxxx families**
- Added VASLOG parser support (multiline CSV .DAT format)
- Created accuracy-only datasheet template (simple format, no hvin.dat lookup)
- Implemented pass-through for Engineering-Tested .txt files
- **Backfilled 27,503 historical records** (438 required regex patch for QB STR$() format quirk)
- **434 Engineering .txt files** imported and published
- Deployed to AD2, service restarted, web publishing verified
**Status:** [OK] Complete, production-deployed
**Critical Files Changed:** 5 modified, 1 new parser
- server/parsers/vaslog.js (new)
- server/templates/datasheet-exact.js (SCMVAS/SCMHVAS branch added)
- server/database/import.js (recursive flag fix, VASLOG_ENG support)
- server/parsers/spec-reader.js (stub for SCMVAS/SCMHVAS)
- deploy/deploy-to-ad2.py (vault-based credentials)
**Session Logs:**
- **2026-04-12-session.md** - Implementation, deploy, backfill, patch (DEFINITIVE)
- **2026-04-11-discovery-session.md** - Discovery phase
### testdatadb Service (on AD2)
- **Service Name:** testdatadb
- **Status:** Running
- **Service Account:** INTRANET\svc_testdatadb
- **Working Directory:** C:\Shares\testdatadb
- **API Port:** 3000 (http://192.168.0.6:3000)
- **Database:** SQLite at C:\Shares\testdatadb\database\testdata.db (4.1GB)
- **Web Output:** X:\For_Web (= \\ad2\webshare\For_Web UNC path)
### File Shares on AD2
```
C:\Shares\test\ # Mirror of D2TESTNAS test data
├── TS-xx\LOGS\ # Test logs from DOS stations
│ ├── 5BLOG\ # SCM5B family
│ ├── 8BLOG\ # 8B family
│ ├── VASLOG\ # SCMVAS/SCMHVAS .DAT files
│ │ ├── HVAS-M01.DAT # Production logs
│ │ ├── VAS-M100.DAT
│ │ └── VASLOG - Engineering Tested\ # 434 .txt files
│ └── ...
└── Corrected HVAS Files\ # 200 pre-generated datasheets
C:\Shares\testdatadb\ # Node.js application
├── server/
│ ├── parsers/ # Log file parsers
│ ├── templates/ # Datasheet formatters
│ └── database/ # Import/export scripts
├── database/
│ └── testdata.db # SQLite (4.1GB, not in git)
└── node_modules/
```
### File Shares on AD1
```
\\AD1\Engineering\
└── ENGR\ATE\High Voltage Input Module Test\
├── HVDATA\
│ └── hvin.dat # Spec database (33 records, engineering MODNAMEs)
└── Released\
├── TESTHV3.BAS # Primary test program (2020)
├── TESTHV4.BAS # Alternate test program (2017)
├── NLIBATE3.BAS # ATE library
└── DBHV.BAS # Database editor (TYPE DBASE definition)
```
## Anti-Patterns (DON'T DO THIS)
**DO NOT hardcode Paper123!@#** - Always fetch from vault:
```bash
bash D:/vault/scripts/vault.sh get-field clients/dataforth/ad2.sops.yaml credentials.password | sed 's/\\//g'
```
**DO NOT use the X: drive in SSH sessions** - It is mapped only under the service account. Use the UNC path instead:
```powershell
# Wrong:
node database/export-datasheets.js # Fails: "X:\For_Web does not exist"
# Right:
$env:OUTPUT_DIR = "\\ad2\webshare\For_Web"
node database/export-datasheets.js
```
**DO NOT assume hvin.dat lookup works** - Marketing names (SCMHVAS-M0100) ≠ engineering MODNAMEs (SCM5B41-1181). SCMVAS/SCMHVAS use simplified accuracy-only template WITHOUT hvin.dat.
**DO NOT pass 50+ file paths on PowerShell command line** - Hits "Command line too long". Use inline node script with fs.readdirSync instead.
**DO NOT commit testdata.db or large samples** - 4.1GB database is in .gitignore. Keep research samples local only.
**DO NOT use SMB1 on AD2** - Disabled for security. Use SSH/SFTP (port 22) or SMB2+ shares.
**DO NOT expect immediate output from exec_command** - paramiko buffers stdout. Use progress markers or drain at completion.
**DO NOT assume VPN is stable** - Dataforth VPN can drop mid-session. Save work frequently, use local samples for offline analysis.
## Where to Find Things
### Codebase Structure
```
projects/dataforth-dos/
├── datasheet-pipeline/
│ ├── implementation/ # Staged code (approved by Code Review)
│ ├── scmvas-hvas-research/ # Discovery scripts and source files
│ │ ├── source/ # TESTHV3.BAS, hvin.dat, etc.
│ │ ├── samples/ # .DAT and .txt samples (local)
│ │ ├── parse_hvin.py # hvin.dat binary parser
│ │ └── pull-*.py # SSH download scripts
│ └── IMPLEMENTATION_PLAN.md # Approved plan (2026-04-11)
├── deploy/
│ └── deploy-to-ad2.py # Deployment script (vault-based auth)
├── session-logs/
│ ├── 2026-04-12-session.md # SCMVAS/SCMHVAS implementation (DEFINITIVE)
│ └── 2026-04-11-discovery-session.md
└── CONTEXT.md # This file
```
### Production Files on AD2
```
C:\Shares\testdatadb\
├── server.js # Main entry point
├── server/
│ ├── parsers/
│ │ ├── multiline.js # Handles VASLOG .DAT (CSV format)
│ │ ├── vaslog.js # VASLOG-specific logic (new)
│ │ └── spec-reader.js # Spec DB loader (stub for SCMVAS/SCMHVAS)
│ ├── templates/
│ │ └── datasheet-exact.js # Datasheet formatter (SCMVAS/SCMHVAS branch added)
│ └── database/
│ ├── import.js # LOG_TYPES registry, importFiles()
│ └── export-datasheets.js # Batch export script
└── database/
└── testdata.db # SQLite (27k+ records after backfill)
```
## Common Operations
### Deploy Code to AD2
```bash
# From projects/dataforth-dos/deploy/
python3 deploy-to-ad2.py
# What it does:
# 1. Fetches password from vault (D:/vault/scripts/vault.sh)
# 2. Connects via paramiko SFTP to 192.168.0.6:22
# 3. Creates .bak-YYYYMMDD timestamped backups
# 4. Uploads modified files from implementation/
# 5. Restarts testdatadb service via SSH exec_command
# 6. Verifies API responds 200 OK on port 3000
```
**Manual deployment (if script unavailable):**
```bash
# Get password
AD2_PASS=$(bash D:/vault/scripts/vault.sh get-field clients/dataforth/ad2.sops.yaml credentials.password | sed 's/\\//g')
# Connect (the commands below run on AD2 in PowerShell)
sshpass -p "${AD2_PASS}" ssh sysadmin@192.168.0.6
# Backup + copy
cd C:\Shares\testdatadb\server\parsers
copy multiline.js multiline.js.bak-20260414
# ... upload new files via SFTP ...
# Restart service
Restart-Service -Name testdatadb
# Verify
curl http://localhost:3000
```
### Import New Test Data
```bash
# SSH to AD2
ssh sysadmin@192.168.0.6
# Run import for specific log type
cd C:\Shares\testdatadb
node database/import.js
# Import specific files (avoid "Command line too long")
node -e "
const importFiles = require('./server/database/import').importFiles;
const fs = require('fs');
const files = fs.readdirSync('C:/Shares/test/TS-3R/LOGS/VASLOG/VASLOG - Engineering Tested')
.filter(f => f.endsWith('.txt'))
.map(f => 'C:/Shares/test/TS-3R/LOGS/VASLOG/VASLOG - Engineering Tested/' + f);
importFiles(files, 'VASLOG_ENG').then(() => console.log('Done'));
"
```
### Export Datasheets for Web
```bash
# SSH to AD2
ssh sysadmin@192.168.0.6
# Export all pending datasheets
cd C:\Shares\testdatadb
$env:OUTPUT_DIR = "\\ad2\webshare\For_Web" # NOT X:\For_Web in SSH
node database/export-datasheets.js
# Export specific model family
node database/export-datasheets.js --family SCMHVAS
```
### Backfill Historical Data
```bash
# SSH to AD2, run as inline script to avoid command-line length limits
node -e "
const db = require('./server/database/db');
const exportDatasheet = require('./server/templates/datasheet-exact');
db.all(\`
SELECT * FROM test_records
WHERE log_type IN ('VASLOG', 'VASLOG_ENG')
AND exported_at IS NULL
ORDER BY id
\`, (err, rows) => {
if (err) throw err;
console.log(\`[INFO] Found \${rows.length} records to export\`);
let count = 0;
rows.forEach(row => {
try {
exportDatasheet(row);
count++;
if (count % 100 === 0) console.log(\`[PROGRESS] \${count}/\${rows.length}\`);
} catch (e) {
console.error(\`[SKIP] \${row.model_name}: \${e.message}\`);
}
});
console.log(\`[DONE] Exported \${count} datasheets\`);
});
"
```
### Check Service Status
```powershell
# On AD2 (via SSH or RDP)
Get-Service testdatadb
# View service logs (if logging enabled)
Get-EventLog -LogName Application -Source testdatadb -Newest 50
# Test API
Invoke-WebRequest http://localhost:3000 | Select-Object StatusCode
# Check process
Get-Process | Where-Object { $_.ProcessName -like "*node*" }
```
### Access Shares from macOS/Linux
```bash
# Mount AD2 share (SMB2+); mount_smbfs is macOS-only, on Linux use `mount -t cifs` instead
mkdir -p ~/mnt/ad2-testdatadb
mount_smbfs //sysadmin:Password@192.168.0.6/testdatadb ~/mnt/ad2-testdatadb
# Mount AD1 Engineering share
mkdir -p ~/mnt/ad1-engineering
mount_smbfs //sysadmin:Password@192.168.0.27/Engineering ~/mnt/ad1-engineering
# Unmount
umount ~/mnt/ad2-testdatadb
```
## Key Technical Decisions (ADRs)
**2026-04-12:** Use Option C (simple accuracy-only template, no hvin.dat lookup)
- Reason: Marketing names (SCMHVAS-M0100) ≠ engineering MODNAMEs (SCM5B41-1181) in hvin.dat
- Sample datasheets show simple 1-parameter format (Accuracy only)
- Spec-reader stub lets SCMVAS/SCMHVAS pass through pipeline without schema changes
**2026-04-12:** Pass-through for VASLOG_ENG .txt files (not re-render)
- Reason: Engineering-Tested files already match target format exactly
- fs.copyFileSync() guarantees byte-level fidelity, avoids encoding round-trip
- Fallback to writeFileSync(raw_data, 'utf8') if source file missing
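The pass-through decision above can be sketched as follows. This is a minimal Python illustration, not the pipeline's Node.js code: `publish_passthrough` and its parameters are hypothetical names, and `raw_data` is assumed to be the file text stored at import time.

```python
import shutil
from pathlib import Path


def publish_passthrough(source: Path, dest: Path, raw_data: str) -> str:
    """Copy the source file byte-for-byte, or fall back to the stored text.

    Returns "copied" or "fallback" so the caller can log which path was taken.
    """
    dest.parent.mkdir(parents=True, exist_ok=True)
    if source.is_file():
        # Byte-level copy: no decode/encode round-trip, line endings untouched.
        shutil.copyfile(source, dest)
        return "copied"
    # Source missing: write the stored text as UTF-8 bytes (hypothetical fallback input).
    dest.write_bytes(raw_data.encode("utf-8"))
    return "fallback"
```

Writing bytes in the fallback branch avoids platform newline translation, which keeps the output as close as possible to what was stored.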
**2026-04-12:** Fix recursive=false default regression with `config.recursive !== false`
- Reason: Adding `recursive` field to LOG_TYPES must not break 7 pre-existing families
- Treats absent/undefined as true (legacy behavior), explicit false as false
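The default rule can be illustrated in Python. This is only an analogue of the JavaScript expression `config.recursive !== false`; the function name and the example dictionaries are hypothetical.

```python
def is_recursive(config: dict) -> bool:
    """Absent or None means recursive (legacy default); only an explicit False disables it."""
    return config.get("recursive") is not False


# Hypothetical registry entries: a legacy family without the field, and a new one that opts out.
legacy_entry = {"name": "5BLOG"}
opt_out_entry = {"name": "VASLOG", "recursive": False}
```

Comparing against `False` by identity, rather than testing truthiness, is what keeps entries that never defined the field on their old behavior.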
**2026-04-12:** Vault-based credentials in deploy script (no hardcoding, no prompts)
- Reason: Never commit passwords, even to private repo
- deploy-to-ad2.py calls vault.sh with 30s timeout, fails loud if unavailable
- No env-var fallback, no interactive prompt
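A fail-loud credential fetch along these lines might look like the sketch below. It is not the contents of deploy-to-ad2.py: `fetch_secret` is a hypothetical helper, and `VAULT_CMD` simply mirrors the vault.sh command shown earlier in this file.

```python
import subprocess


def fetch_secret(command: list[str], timeout_s: float = 30.0) -> str:
    """Run a secret-fetching command and return its trimmed stdout.

    Raises on timeout, non-zero exit, or empty output; there is no fallback.
    """
    result = subprocess.run(
        command,
        capture_output=True,
        text=True,
        timeout=timeout_s,  # subprocess.TimeoutExpired if the command hangs
        check=True,         # subprocess.CalledProcessError on non-zero exit
    )
    secret = result.stdout.strip()
    if not secret:
        raise RuntimeError("vault returned an empty value")
    return secret


# Assumed invocation, copied from the vault.sh usage documented above.
VAULT_CMD = [
    "bash", "D:/vault/scripts/vault.sh", "get-field",
    "clients/dataforth/ad2.sops.yaml", "credentials.password",
]
```

Letting the exceptions propagate is the point: a deploy that cannot read the vault stops immediately instead of prompting or guessing.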
**2026-04-12:** MM/DD/YYYY date normalization for datasheet Date field
- Reason: Matches newest Engineering-Tested samples
- Older "Corrected HVAS Files" used MM-DD-YYYY (hyphens) - backfill rewrites with slashes
- Intentional visible change, documented in implementation plan
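The normalization itself is small. A minimal sketch, assuming the input is always month-day-year with either hyphens or slashes (`normalize_date` is a hypothetical name):

```python
from datetime import datetime


def normalize_date(value: str) -> str:
    """Return the date as MM/DD/YYYY, accepting hyphen or slash separators."""
    cleaned = value.strip().replace("-", "/")
    # strptime validates the calendar date; strftime re-emits zero-padded fields.
    return datetime.strptime(cleaned, "%m/%d/%Y").strftime("%m/%d/%Y")
```

For example, `normalize_date("04-09-2026")` yields `04/09/2026`, while an impossible date such as `13-40-2026` raises `ValueError` rather than being passed through.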
**2026-04-12:** Patch regex with plain-decimal fallback for QuickBASIC STR$() quirk
- Reason: QB STR$() emits scientific notation for most values, plain decimal for ~1.6%
- Not a version difference or bug - purely QB float-to-string formatting threshold
- Two-regex approach: try scientific first, fall back to plain decimal
## QuickBASIC Artifacts & Log Formats
### VASLOG .DAT Structure
```
"SCMHVAS-M0100 " # Header: model name (marketing, NOT engineering MODNAME)
20,0.0034 # CSV line 1: measurement data
40,0.0126 # CSV line 2
60,-0.0046 # CSV line 3
80,0.0141 # CSV line 4
100,-0.00325 # CSV line 5
"PASS-7.005501E-033",... # Status line: PASS/FAIL + accuracy (scientific OR plain decimal)
"179379-1","04-09-2026" # Footer: serial number, test date (MM-DD-YYYY)
```
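Reading one record of this shape could be sketched as below. It assumes a record is exactly one header line, some two-field measurement lines, one status line, and one footer line, and it keeps the status field as raw text because the rest of that line is elided above. `parse_vaslog_record` is a hypothetical helper, not the vaslog.js parser.

```python
import csv


def parse_vaslog_record(lines: list[str]) -> dict:
    """Split one record into model, measurement pairs, raw status, serial, and date."""
    rows = list(csv.reader(lines))
    header, status, footer = rows[0], rows[-2], rows[-1]
    # Every line between header and status is assumed to hold two numeric fields.
    points = [(float(first), float(second)) for first, second in rows[1:-2]]
    return {
        "model": header[0].strip(),   # QB pads the name with trailing spaces
        "points": points,
        "status_raw": status[0],      # left unparsed here
        "serial": footer[0],
        "test_date": footer[1],       # MM-DD-YYYY as written by the test station
    }
```

Splitting a multi-record .DAT file into per-record line groups is left out, since the record delimiter is not described in this file.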
### VASLOG_ENG .txt Structure (Engineering-Tested)
```
SCMHVAS - M0100
SN: 171087-1
Date: 04/08/2024
Test: PASS
Accuracy: -7.0055E-03 %
```
### QuickBASIC STR$() Formatting Quirk
```basic
' QB emits TWO formats for floats:
PRINT STR$(-7.005501E-03) ' → "-7.005501E-033" (scientific + status digit)
PRINT STR$(0.01599373) ' → " .01599373" (plain decimal, leading space)
' Threshold: ~0.01 magnitude
' Affects ~1.6% of records (438/27503)
' NOT a bug - documented QB behavior
```
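The two-regex approach can be sketched in Python as follows. The patterns are illustrative assumptions, not the ones deployed in the pipeline: the scientific form is assumed to have a two-digit exponent (so the trailing status digit is ignored), and the plain-decimal status field is assumed to look like `PASS .01599373`.

```python
import re

# Tried first: scientific notation such as "-7.005501E-03" (two-digit exponent assumed).
SCIENTIFIC = re.compile(r"[-+]?\d*\.?\d+E[-+]\d{2}")
# Fallback: plain decimal such as ".01599373" (QB omits the leading zero).
PLAIN = re.compile(r"[-+]?\d*\.\d+")


def extract_accuracy(status_field: str) -> float:
    """Return the accuracy value embedded in a status field."""
    match = SCIENTIFIC.search(status_field) or PLAIN.search(status_field)
    if match is None:
        raise ValueError(f"no accuracy value in {status_field!r}")
    return float(match.group(0))
```

Order matters: the plain-decimal pattern would also match the mantissa of a scientific value, so it is only consulted when the scientific pattern finds nothing.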
### hvin.dat Binary Format
```
TYPE DBASE (from DBHV.BAS)
MODNAME AS STRING * 13 ' Engineering ID: "SCM5B41-1181 "
INTYPE AS STRING * 3
OUTSIGTYPE AS STRING * 7
WAVESHPCAL AS STRING * 8
' ... 42 SINGLE floats (IEEE 754, 4 bytes each) ...
END TYPE
' Total: 13+3+7+8 + (42*4) = 199 bytes/record
' File size: 6567 bytes = 33 records
```
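The record layout above maps directly onto a fixed-size `struct` format. The sketch below is an independent illustration, not the repository's parse_hvin.py; it assumes little-endian byte order (DOS on x86) and leaves the 42 SINGLE fields unnamed because their names are elided in the TYPE definition.

```python
import struct

# Little-endian, no padding: 13+3+7+8 bytes of text, then 42 IEEE 754 singles.
RECORD = struct.Struct("<13s3s7s8s42f")
assert RECORD.size == 199


def parse_records(blob: bytes) -> list[dict]:
    """Decode fixed-length DBASE records from the raw bytes of hvin.dat."""
    if len(blob) % RECORD.size:
        raise ValueError("file size is not a multiple of 199 bytes")
    records = []
    for fields in RECORD.iter_unpack(blob):
        modname, intype, outsigtype, waveshpcal, *values = fields
        records.append({
            "MODNAME": modname.decode("ascii", errors="replace").rstrip(),
            "INTYPE": intype.decode("ascii", errors="replace").rstrip(),
            "OUTSIGTYPE": outsigtype.decode("ascii", errors="replace").rstrip(),
            "WAVESHPCAL": waveshpcal.decode("ascii", errors="replace").rstrip(),
            "values": values,  # the 42 SINGLE fields, in file order
        })
    return records
```

The size check doubles as a sanity test of the layout: a 6567-byte file divides evenly into 33 records of 199 bytes.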
## Troubleshooting
### "Output directory does not exist: X:\For_Web"
- **Cause:** X: drive only mapped under service account, not in SSH session
- **Fix:** Use UNC path: `\\ad2\webshare\For_Web`
```powershell
$env:OUTPUT_DIR = "\\ad2\webshare\For_Web"
node database/export-datasheets.js
```
### "Command line is too long" (PowerShell)
- **Cause:** Passing 50+ file paths as arguments exceeds PowerShell limit
- **Fix:** Use inline node script with fs.readdirSync (see Common Operations above)
### VPN Drops Mid-Session
- **Symptom:** AD2/AD1 become unreachable, SSH hangs
- **Fix:**
1. Work offline on local samples for analysis
2. Restore VPN (FortiClient)
3. Resume deployment/import when connection stable
### Vault Returns `Paper123\!@#` (Backslash)
- **Cause:** Legacy shell escape stored in ad2.sops.yaml
- **Fix:** Strip backslash at read-time: `sed 's/\\//g'`
- **TODO:** Clean vault entry to remove backslash
### Paramiko "No Output" for Long-Running Commands
- **Cause:** exec_command buffers stdout until completion
- **Fix:** Either:
1. Accept final output when command completes
2. Add progress markers that flush every N records
3. Drain channel periodically: `while not channel.exit_status_ready(): channel.recv(1024)`
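Option 3 can be sketched as a small polling loop. `drain_channel` is a hypothetical helper; it only relies on `recv_ready`, `recv`, `exit_status_ready`, and `recv_exit_status`, which paramiko's `Channel` provides, and it checks `recv_ready()` before each read so the loop never blocks on an empty buffer.

```python
import time


def drain_channel(channel, on_chunk, poll_s: float = 0.1, bufsize: int = 4096) -> int:
    """Forward stdout chunks as they arrive and return the remote exit status.

    `channel` is expected to behave like paramiko.Channel.
    """
    while True:
        while channel.recv_ready():
            on_chunk(channel.recv(bufsize))
        if channel.exit_status_ready():
            # Command finished: pick up anything that arrived after the last poll.
            while channel.recv_ready():
                on_chunk(channel.recv(bufsize))
            return channel.recv_exit_status()
        time.sleep(poll_s)
```

With a real connection this would typically be called on `stdout.channel` from `exec_command`, passing a callback that prints or logs each chunk. stderr is not handled in this sketch.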
### 438 Records Skipped During Backfill
- **Cause:** Plain-decimal format not matching scientific-notation-only regex
- **Fix:** Already patched (2026-04-12). Regex now tries both formats.
- **Verification:** Rerun backfill on stragglers → 438/438 rendered
## Recent Commit History
**2026-04-12 (commit 0dd3d82):** SCMVAS/SCMHVAS pipeline extension
- 114 files changed, 35,486 insertions
- 5 production files modified, 1 new parser
- All research scripts sanitized (vault-based credentials)
- .gitignore updated (exclude testdata.db)
## Useful Links
- **Latest Session:** session-logs/2026-04-12-session.md (DEFINITIVE)
- **Discovery Session:** session-logs/2026-04-11-discovery-session.md
- **Implementation Plan:** datasheet-pipeline/scmvas-hvas-research/IMPLEMENTATION_PLAN.md
- **Credentials (vault):** D:\vault\clients\dataforth\
## Quick Reference - Log Types
| Family | Log Type | Format | Parser | Location |
|--------|----------|--------|--------|----------|
| SCM5B | 5BLOG | Multiline CSV .DAT | multiline.js | TS-xx/LOGS/5BLOG |
| 8B | 8BLOG | Multiline CSV .DAT | multiline.js | TS-xx/LOGS/8BLOG |
| DSCA | DSCLOG | Multiline CSV .DAT | multiline.js | TS-xx/LOGS/DSCLOG |
| SCMVAS | VASLOG | Multiline CSV .DAT | vaslog.js | TS-3R/LOGS/VASLOG |
| SCMHVAS (prod) | VASLOG | Multiline CSV .DAT | vaslog.js | TS-3R/LOGS/VASLOG |
| SCMHVAS (eng) | VASLOG_ENG | .txt (pass-through) | vaslog.js | TS-3R/LOGS/VASLOG/VASLOG - Engineering Tested |
---
**Before starting work:** Read session-logs/2026-04-12-session.md for complete context
**For AD2 access:** Ensure Dataforth VPN connected (FortiClient)
**For credentials:** Always use vault - never hardcode passwords

View File


@@ -0,0 +1,346 @@
# GuruRMM - Project Context
**Last Updated:** 2026-04-14
**Status:** Active Development - Tunnel Phase 1 Complete
## Quick Start - Infrastructure Overview
| Component | Location | Access |
|-----------|----------|--------|
| **Production Server** | 172.16.3.30 (gururmm) | SSH: op://Infrastructure/GuruRMM Server/username |
| **Public API** | https://rmm-api.azcomputerguru.com | Via Cloudflare Tunnel |
| **Internal API** | http://172.16.3.30:3001 | Direct access |
| **Database** | PostgreSQL @ 172.16.3.30:5432/gururmm | op://Infrastructure/GuruRMM Server/PostgreSQL * |
| **Build Server** | Same host (gururmm-build) | Linux native builds only |
| **Agent Downloads** | /var/www/gururmm/downloads/ | Nginx on port 80 |
| **Gitea Repo** | git.azcomputerguru.com/azcomputerguru/gururmm | Active (NOT guru-rmm) |
**All credentials:** `op read "op://Infrastructure/GuruRMM Server/[field]"`
## Current State (READ THIS FIRST)
### Version & Deployment
- **Server:** v0.6.0 (commit c7c8317) - deployed 2026-04-14
- **Agent:** v0.6.0 (Linux + Windows builds) - deployed 2026-04-14
- **Database:** Migrations 001-010 applied
- **Service Status:** gururmm-server.service running (PID 944198)
### Active Work
- **Phase 1 Complete:** Tunnel infrastructure (REST API, WebSocket protocol, database schema, agent state machine)
- **Phase 2 Pending:** Channel implementation (Terminal, File, Registry, Service)
- **Phase 3 Not Started:** Production hardening (rate limiting, timeouts, metrics)
### Agent Fleet Status (as of 2026-04-15 03:20 UTC)
- **Online:** 2/6 agents
- AD2 (Windows 10, v0.6.0) - ID: d28a1c90-47d7-448f-a287-197bc8892234
- DESKTOP-0O8A1RL (Windows 11, v0.6.0) - ID: 0b2527cc-ab3f-49d9-9a06-bfd0b4a613a7
- **Offline:** 4/6 agents
- SL-SERVER: **STUCK IN PENDING UPDATE** - requires manual service restart
### Recent Session Logs (MUST READ BEFORE CONTINUING WORK)
- **2026-04-14:** Tunnel API testing, authentication fix - `session-logs/2026-04-14-session.md`
- **2026-04-02:** Tunnel implementation, update bug fixes - See git history
- **2026-04-01:** Cloudflare Tunnel configuration - See credentials.md
## Anti-Patterns (DON'T DO THIS)
**DO NOT build on macOS** - Binaries won't run on Linux server. SSH to 172.16.3.30 and build natively.
**DO NOT query database directly** - Use Database Agent for ALL database operations (coordinator role).
**DO NOT point downloads URL to port 3001** - API server doesn't serve /downloads. Use nginx (port 80) or public URL.
**DO NOT hardcode credentials** - Always fetch from 1Password: `op read "op://Infrastructure/GuruRMM Server/..."`
**DO NOT create new password utilities** - Use the already-compiled `/tmp/target/release/hash_password`:
```bash
/tmp/target/release/hash_password "password_here"
# Output: $argon2id$v=19$m=19456,t=2,p=1$...[97 chars]
```
**DO NOT build in the ClaudeTools repo** - Active repo is `gururmm` on Gitea, not `guru-rmm`.
**DO NOT use emojis** - ASCII markers only: [OK], [ERROR], [WARNING], [SUCCESS], [INFO]
## Where to Find Things
### Codebase Structure
```
projects/msp-tools/guru-rmm/
├── server/ # Rust API server
│ ├── src/
│ │ ├── api/ # REST endpoints
│ │ │ ├── tunnel.rs # Tunnel API (Phase 1 complete)
│ │ │ ├── agents.rs # Agent management
│ │ │ └── auth.rs # Login/JWT
│ │ ├── db/ # Database operations
│ │ │ ├── tunnel.rs # Tunnel queries
│ │ │ └── agents.rs # Agent queries
│ │ ├── ws/ # WebSocket protocol
│ │ │ └── mod.rs # ServerMessage/AgentMessage enums
│ │ └── auth/ # Password hashing (Argon2id)
│ └── migrations/ # Database schema (001-010)
│ └── 010_tunnel_sessions.sql # Tunnel tables (tech_sessions, tunnel_audit)
├── agent/ # Rust agent binary
│ ├── src/
│ │ ├── tunnel/ # Tunnel manager (Phase 1 complete)
│ │ │ └── mod.rs # AgentMode state machine
│ │ ├── updater/ # Self-update system (v0.6.0 fixes applied)
│ │ └── transport/ # WebSocket client
│ └── Cargo.toml
├── session-logs/ # Work history (READ BEFORE STARTING)
└── ROADMAP.md # Feature roadmap
```
### Production Files on Server (172.16.3.30)
- **Binary:** /opt/gururmm/gururmm-server
- **Config:** /opt/gururmm/.env
- **Service:** systemctl status gururmm-server
- **Logs:** journalctl -u gururmm-server -n 100
- **Downloads:** /var/www/gururmm/downloads/ (served by nginx)
### Cloudflare Tunnel Config (Jupiter NAS)
- **Location:** /mnt/cache/appdata/cloudflared/config.yml
- **Hostname:** rmm-api.azcomputerguru.com
- **Target:** http://172.16.3.30 (nginx port 80, NOT API port 3001)
- **Container:** cloudflared (restart to apply changes)
## Common Operations
### Deploy Server Binary
```bash
# SSH to build server
SSH_USER=$(op read "op://Infrastructure/GuruRMM Server/username")
SSH_PASS=$(op read "op://Infrastructure/GuruRMM Server/password")
sshpass -p "${SSH_PASS}" ssh -o StrictHostKeyChecking=no ${SSH_USER}@172.16.3.30
# Build on Linux (native)
cd /opt/gururmm/server
cargo build --release
# Install
sudo systemctl stop gururmm-server
sudo cp target/release/gururmm-server /opt/gururmm/
sudo systemctl start gururmm-server
# Verify
systemctl status gururmm-server
curl http://localhost:3001/health # Should return "OK"
```
### Deploy Agent Binaries
```bash
# SSH to build server
ssh ${SSH_USER}@172.16.3.30
# Build Linux agent
cd /opt/gururmm/agent
cargo build --release --target x86_64-unknown-linux-gnu
# Build Windows agent (cross-compile)
cargo build --release --target x86_64-pc-windows-gnu
# Generate checksums
cd /var/www/gururmm/downloads/
sha256sum gururmm-agent-linux-x64 > gururmm-agent-linux-x64.sha256
sha256sum gururmm-agent-windows-x64.exe > gururmm-agent-windows-x64.exe.sha256
# Agents will auto-update on next heartbeat
```
### Test Tunnel API Endpoints
```bash
# Get JWT token
ADMIN_PASS=$(op read "op://Infrastructure/GuruRMM Server/Admin Password")
TOKEN=$(curl -s http://172.16.3.30:3001/api/auth/login \
-H "Content-Type: application/json" \
-d "{\"email\":\"admin@azcomputerguru.com\",\"password\":\"${ADMIN_PASS}\"}" | \
python3 -c "import sys, json; print(json.load(sys.stdin)['token'])")
# Open tunnel to AD2
curl -s http://172.16.3.30:3001/api/v1/tunnel/open \
-H "Authorization: Bearer ${TOKEN}" \
-H "Content-Type: application/json" \
-d '{"agent_id":"d28a1c90-47d7-448f-a287-197bc8892234"}' | jq '.'
# Get status (save session_id from above)
curl -s http://172.16.3.30:3001/api/v1/tunnel/status/SESSION_ID \
-H "Authorization: Bearer ${TOKEN}" | jq '.'
# Close tunnel
curl -s http://172.16.3.30:3001/api/v1/tunnel/close \
-H "Authorization: Bearer ${TOKEN}" \
-H "Content-Type: application/json" \
-d '{"session_id":"SESSION_ID"}' | jq '.'
```
**Full examples with output:** See session-logs/2026-04-14-session.md (lines 170-230)
### Check Agent Status
```bash
# Get list of agents
curl -s http://172.16.3.30:3001/api/agents \
-H "Authorization: Bearer ${TOKEN}" | jq '.'
# Filter online agents only
curl -s http://172.16.3.30:3001/api/agents \
-H "Authorization: Bearer ${TOKEN}" | \
jq '[.[] | select(.status == "online") | {hostname, agent_version, last_seen}]'
```
### Database Operations (USE DATABASE AGENT)
```bash
# DO NOT query directly - delegate to Database Agent
# Agent will handle credentials and connection automatically
# Example request to Database Agent:
# "Use Database Agent to query tech_sessions table for active tunnels"
```
### Access Database Manually (Emergency Only)
```bash
SSH_USER=$(op read "op://Infrastructure/GuruRMM Server/username")
PGPASS=$(op read "op://Infrastructure/GuruRMM Server/PostgreSQL Password")
# sshpass -e reads the password from SSHPASS instead of a command-line argument
export SSHPASS=$(op read "op://Infrastructure/GuruRMM Server/password")
# -t allocates a TTY so the interactive psql prompt works over SSH
sshpass -e ssh -t -o StrictHostKeyChecking=no "${SSH_USER}@172.16.3.30" \
  "PGPASSWORD='${PGPASS}' psql -h localhost -U gururmm -d gururmm"
unset SSHPASS
```
## Key Technical Decisions (ADRs)
**2026-04-14:** Use Argon2id for password hashing (not bcrypt)
- Library: argon2 crate v0.5
- Config: m=19456, t=2, p=1
- Output: 97-character hash string
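
The 97-character figure can be checked by arithmetic on the PHC string layout (`$argon2id$v=19$m=..,t=..,p=..$<salt>$<hash>`). The sketch below assumes a 16-byte salt and a 32-byte hash, both encoded as unpadded base64; those sizes are assumptions (believed to be the crate defaults), not values stated in this document. The helper names are hypothetical.

```rust
/// Length of the unpadded base64 encoding of `n` bytes.
fn b64_unpadded_len(n: usize) -> usize {
    (n * 4 + 2) / 3
}

/// Length of an Argon2id PHC string for the given parameters.
/// `salt_bytes` and `hash_bytes` are assumptions, not taken from the ADR.
fn phc_len(m: u32, t: u32, p: u32, salt_bytes: usize, hash_bytes: usize) -> usize {
    // Fixed prefix up to and including the `$` that precedes the salt.
    let prefix = format!("$argon2id$v=19$m={},t={},p={}$", m, t, p);
    // prefix + salt + `$` separator + hash
    prefix.len() + b64_unpadded_len(salt_bytes) + 1 + b64_unpadded_len(hash_bytes)
}
```

With `m=19456, t=2, p=1` the prefix is 31 characters, the salt 22, the separator 1, and the hash 43, which sums to 97. A column storing these hashes therefore needs at least that width, and more if the parameters ever grow by a digit.
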
**2026-04-02:** Tunnel sessions use tech_id FK to users table
- Enables session ownership validation
- Prevents cross-tech session access in multi-tenant environment
- Session status query returns 403 if not owned by requesting tech
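
A minimal sketch of the ownership rule, not the server's actual handler. `TechId` is a `u128` stand-in for the UUID type, `status_query_code` is a hypothetical helper, and the 404 for an unknown session is an assumption; only the 403 case comes from the decision above.

```rust
/// Stand-in for a UUID (the real schema uses UUID columns).
type TechId = u128;

struct TechSession {
    session_id: String,
    tech_id: TechId,
}

/// HTTP status a status query would produce for `requester`.
fn status_query_code(sessions: &[TechSession], session_id: &str, requester: TechId) -> u16 {
    match sessions.iter().find(|s| s.session_id == session_id) {
        // Assumption: unknown sessions are reported as "not found".
        None => 404,
        // From the ADR: a session owned by another tech is forbidden.
        Some(s) if s.tech_id != requester => 403,
        Some(_) => 200,
    }
}
```

Keeping the check next to the lookup means every code path that reads a session also compares the owner, which is the point of the `tech_id` foreign key.
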
**2026-04-01:** Downloads URL points to nginx (port 80), not API (port 3001)
- API server doesn't serve static files
- Nginx configured at /var/www/gururmm/downloads/
- Cloudflare Tunnel routes rmm-api.azcomputerguru.com to nginx
**2026-04-01:** Agent update system uses atomic rename pattern (Unix)
- Eliminates race condition between backup and install
- Copy to temp → chmod +x → rename (atomic)
- Includes rollback on restart failure (v0.6.0 fix)
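
The copy, chmod, rename sequence might look roughly like this on Unix. It is a sketch under assumptions, not the agent's real updater: `atomic_install` and the `.tmp` suffix are hypothetical, mode `0o755` stands in for `chmod +x`, and the rollback-on-restart-failure step is not shown.

```rust
use std::fs;
use std::io;
use std::os::unix::fs::PermissionsExt;
use std::path::Path;

/// Replace `target` with `new_binary` via copy-to-temp, chmod, rename.
fn atomic_install(new_binary: &Path, target: &Path) -> io::Result<()> {
    // Keep the temp file beside the target so the rename stays on one filesystem.
    let tmp = target.with_extension("tmp");
    fs::copy(new_binary, &tmp)?;
    // Make the staged copy executable before it becomes visible as the target.
    fs::set_permissions(&tmp, fs::Permissions::from_mode(0o755))?;
    // rename(2) swaps the target in a single step; readers see old or new, never partial.
    if let Err(e) = fs::rename(&tmp, target) {
        let _ = fs::remove_file(&tmp); // best-effort cleanup of the staged copy
        return Err(e);
    }
    Ok(())
}
```

Because the old binary stays in place until the rename succeeds, a failure during copy or chmod leaves the running install untouched.
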
## Tunnel Architecture (Phase 1 Complete)
### Session Lifecycle
1. Tech opens tunnel: POST /api/v1/tunnel/open → creates tech_session record
2. Server sends TunnelOpen via WebSocket → agent receives
3. Agent transitions Heartbeat → Tunnel mode → sends TunnelReady
4. Tech can now send channel operations (Phase 2, not implemented)
5. Tech closes tunnel: POST /api/v1/tunnel/close → updates tech_session.status='closed'
6. Server sends TunnelClose → agent transitions back to Heartbeat mode
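
The agent side of steps 2, 3 and 6 can be modelled as a small state machine. This is a simplified sketch: the message types are cut down from the real protocol (no `tech_id`, no data messages), `step` and `AgentReply` are hypothetical names, and ignoring unexpected messages is an assumption about behaviour this document does not specify.

```rust
#[derive(Debug, Clone, PartialEq)]
enum AgentMode {
    Heartbeat,
    Tunnel { session_id: String },
}

#[derive(Debug, Clone, PartialEq)]
enum ServerMessage {
    TunnelOpen { session_id: String },
    TunnelClose { session_id: String },
}

#[derive(Debug, Clone, PartialEq)]
enum AgentReply {
    TunnelReady { session_id: String },
}

/// Apply one server message; returns the new mode and an optional reply.
fn step(mode: AgentMode, msg: ServerMessage) -> (AgentMode, Option<AgentReply>) {
    match (mode, msg) {
        // Steps 2-3: Heartbeat -> Tunnel, then acknowledge with TunnelReady.
        (AgentMode::Heartbeat, ServerMessage::TunnelOpen { session_id }) => (
            AgentMode::Tunnel { session_id: session_id.clone() },
            Some(AgentReply::TunnelReady { session_id }),
        ),
        // Step 6: a close for the active session returns the agent to Heartbeat.
        (AgentMode::Tunnel { session_id: active }, ServerMessage::TunnelClose { session_id })
            if active == session_id =>
        {
            (AgentMode::Heartbeat, None)
        }
        // Assumption: any other combination leaves the mode unchanged.
        (mode, _) => (mode, None),
    }
}
```

Writing the transitions as a pure function keeps them easy to test without a WebSocket connection.
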
### Database Schema
```sql
-- tech_sessions: Active tunnel sessions
CREATE TABLE tech_sessions (
    id SERIAL PRIMARY KEY,
    session_id VARCHAR(36) UNIQUE NOT NULL,
    tech_id UUID REFERENCES users(id),
    agent_id UUID REFERENCES agents(id),
    status VARCHAR(20) DEFAULT 'active',
    opened_at TIMESTAMPTZ DEFAULT NOW(),
    closed_at TIMESTAMPTZ
);
-- Unique constraint: one active session per tech+agent
CREATE UNIQUE INDEX idx_tech_sessions_active
    ON tech_sessions(tech_id, agent_id, status) WHERE status = 'active';
-- tunnel_audit: Audit log for tunnel operations
CREATE TABLE tunnel_audit (
    id BIGSERIAL PRIMARY KEY,
    session_id VARCHAR(36) REFERENCES tech_sessions(session_id),
    channel_id VARCHAR(36),
    operation VARCHAR(50),
    details JSONB,
    created_at TIMESTAMPTZ DEFAULT NOW()
);
```
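
The effect of the partial unique index (`idx_tech_sessions_active`) can be illustrated with an in-memory model. This is only a sketch of the constraint's behaviour, not code from the server: `ActiveSessions` is hypothetical and `Id` is a `u128` stand-in for UUID.

```rust
use std::collections::HashSet;

/// Stand-in for a UUID.
type Id = u128;

/// Models rows with status = 'active'; closed rows fall outside the index.
#[derive(Default)]
struct ActiveSessions {
    active: HashSet<(Id, Id)>, // (tech_id, agent_id)
}

impl ActiveSessions {
    /// Returns false where the database would raise a unique violation.
    fn open(&mut self, tech_id: Id, agent_id: Id) -> bool {
        self.active.insert((tech_id, agent_id))
    }

    /// Closing removes the pair from the set, so the same tech may reopen later.
    fn close(&mut self, tech_id: Id, agent_id: Id) -> bool {
        self.active.remove(&(tech_id, agent_id))
    }
}
```

Because the index only covers rows where `status = 'active'`, any number of closed sessions for the same tech and agent can accumulate as history while a second concurrent active session is rejected.
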
### WebSocket Protocol
```rust
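use uuid::Uuid; // assumed: Uuid comes from the `uuid` crate

// TunnelDataPayload is not shown here; it is assumed to be defined elsewhere in the codebase.

// Server → Agent
enum ServerMessage {
    // Ask the agent to switch from Heartbeat to Tunnel mode
    TunnelOpen { session_id: String, tech_id: Uuid },
    // Return the agent to Heartbeat mode
    TunnelClose { session_id: String },
    // Channel traffic (Phase 2)
    TunnelData { channel_id: String, data: TunnelDataPayload },
}
// Agent → Server
enum AgentMessage {
    // Agent has entered Tunnel mode for this session
    TunnelReady { session_id: String },
    // Channel traffic (Phase 2)
    TunnelData { channel_id: String, data: TunnelDataPayload },
    // Error on a specific channel
    TunnelError { channel_id: String, error: String },
}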
```
## Roadmap
### Phase 2: Channel Implementation (Next)
- [ ] Terminal channel (shell command execution)
- [ ] File channel (upload/download with progress)
- [ ] Registry channel (Windows registry access)
- [ ] Service channel (Windows service management)
- [ ] WebSocket data forwarding (tech ↔ server ↔ agent)
- [ ] Dashboard UI for tunnel management
### Phase 3: Production Hardening
- [ ] Rate limiting on tunnel operations
- [ ] Session timeout enforcement (max duration)
- [ ] Concurrent session limits per tech
- [ ] Audit log cleanup/archival (retention policy)
- [ ] Metrics collection (session duration, data transferred)
- [ ] Alerting on suspicious tunnel activity
### Backlog
- [ ] Fix SL-SERVER stuck update (manual restart required)
- [ ] Investigate 4 duplicate agent records in database
- [ ] Windows update system testing (scheduled task timing)
- [ ] Agent reconnection on network failure
- [ ] Multi-tenant access control audit
## Useful Links
- **Roadmap:** projects/msp-tools/guru-rmm/ROADMAP.md
- **Latest Session:** session-logs/2026-04-14-session.md
- **Gitea Repo:** http://172.16.3.20:3000/azcomputerguru/gururmm
- **Credentials:** credentials.md (search for "GuruRMM Server")
## Quick Reference - API Endpoints
### Authentication
- POST /api/auth/login - Get JWT token
- POST /api/auth/register - Create first admin (disabled after first user)
- GET /api/auth/me - Get current user info
### Tunnel Management (Phase 1)
- POST /api/v1/tunnel/open - Open tunnel session
- GET /api/v1/tunnel/status/:session_id - Get session status
- POST /api/v1/tunnel/close - Close tunnel session
### Agents
- GET /api/agents - List all agents with details
- GET /api/agents/:id - Get specific agent
- POST /api/agents/:id/move - Move agent to different site
- DELETE /api/agents/:id - Delete agent
### Commands
- POST /api/agents/:id/command - Send command to agent
- GET /api/commands - List command history
- GET /api/commands/:id - Get command result
---
**Before starting work:** Read latest session log in session-logs/ directory
**For context recovery:** Use /context skill to search previous work
**For credentials:** Always use 1Password - never hardcode