Synced files: - Session logs updated - Latest context and credentials - Command/directive updates Machine: Mikes-MacBook-Air.local Timestamp: 2026-03-13 06:39:13 Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
# Session Log: 2026-03-12 - D2TESTNAS VM Build, NAS Migration, Rsync Sync Fix

## Session Summary

Major infrastructure session: replaced broken SCP-based sync with rsync, built a new Debian 13 VM to replace the aging ReadyNAS, transferred data, and performed IP cutover. Also investigated BTRFS snapshots on old NAS and began DOS machine testing against new Linux-based NAS.

### Key Accomplishments

1. **Fixed Sync-FromNAS.ps1 on AD2** - Replaced broken SCP with rsync daemon protocol, added guards for stray files (TS-21, TS-3R/HVLOG), added log file write retry for AV locking
2. **Disabled old SCP scheduled tasks on AD2** - Killed Sync-FromNAS and BulkSync-Catchup tasks
3. **Built D2TESTNAS replacement VM** on DF-HYPERV-B (Debian 13, Samba SMB1, rsync daemon, BTRFS 512GB data disk)
4. **Transferred data from old NAS** - test/ data (~24GB+), datasheets, home, 82 snapshots (partial ~43GB logical)
5. **IP cutover completed** - New VM now at 192.168.0.9, old NAS on DHCP at 192.168.0.117
6. **WINS/NetBIOS conflict resolved** - Killed nmbd on old NAS, removed auto-restart cron, blocked ports 137/138 via iptables

### Key Decisions

- Chose Hyper-V VM on DF-HYPERV-B over repurposing physical server DF-SVR-D2-SYNC
- Used BTRFS for data disk with subvolumes for test and datasheets
- Single rsync stream to avoid overloading old NAS (ARM processor)
- BTRFS snapshots from old NAS are being flattened (CoW -> full copies) which makes them much larger than ReadyNAS UI reported

### Problems Encountered and Solutions

- **TS-21 stray file**: A 1,129-byte DAT file from 2012 existed where a directory was expected. Renamed it and added a script guard.
- **TS-3R/LOGS/HVLOG stray file**: 56-byte file from 2013. Same fix.
- **Log file locking**: AV software was locking sync-from-nas.log. Added a 3-attempt retry with a 100ms delay.
- **AD2 high latency**: AV causing 685-1056ms ping times. Recommended exclusions.
- **NAS freezing under SSH load**: Power cycled, limited to single rsync stream.
- **nmbd auto-restart on old NAS**: Cron `*/5 * * * * pgrep -x nmbd || /usr/sbin/nmbd -D`. Removed cron, blocked ports via iptables.
- **nmcli config didn't save on first attempt**: SSH dropped before apply. Re-ran successfully.
- **DOS Error 53 (network path not found)**: Old NAS still broadcasting D2TESTNAS name. Fixed by killing nmbd and blocking NetBIOS ports.

---

## Credentials

### New D2TESTNAS VM (Debian 13)

- **IP**: 192.168.0.9 (static via NetworkManager)
- **SSH**: root / Paper123!@# (also localadmin / Paper123!@#)
- **SSH Key**: ed25519 generated on VM, public key installed on old NAS
- **Key fingerprint**: SHA256:S2Eom4RwHS/8YMu+ePnOmDOJxGhIkxJQ2ocR3WsH24o root@D2TESTNAS

### Mac SSH Key (add to AD2 and D2TESTNAS)

```
ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIDrGbr4EwvQ4P3ZtyZW3ZKkuDQOMbqyAQUul2+JE4K4S azcomputerguru@local
```

**Add to:**

- AD2: `C:\Users\sysadmin\.ssh\authorized_keys`
- D2TESTNAS: `/root/.ssh/authorized_keys`

### Rsync Daemon (new VM)

- **Port**: 873
- **Module**: test = /data/test
- **User**: rsync
- **Password**: IQ203s32119
- **Config**: /etc/rsyncd.conf
- **Secrets**: /etc/rsyncd.secrets

### Samba (new VM)

- **Shares**: test (/data/test), datasheets (/data/datasheets), snapshots (/data/test/.snapshots)
- **Protocol**: SMB1 (CORE) through SMB3
- **Auth**: Guest OK on all shares
- **Workgroup**: D2TESTING
- **NetBIOS name**: D2TESTNAS
- **WINS support**: yes

### Old NAS (ReadyNAS)

- **Current IP**: 192.168.0.117 (DHCP, was 192.168.0.9)
- **MAC**: 28:C6:8E:34:4B:5E
- **SSH**: root (key-based auth from new VM)
- **Status**: nmbd killed, cron cleared, NetBIOS ports blocked via iptables. Samba stopped. SSH still works for rsync transfers.

### AD2 (Windows Server)

- **IP**: 192.168.0.6
- **Sync script**: C:\Scripts\Sync-FromNAS-rsync.ps1 (deployed, dry-run validated)
- **Test data path**: C:\Shares\test\
- **cwRsync**: Installed via Chocolatey

### DF-SVR-D2-SYNC (unused physical server)

- **IP**: 192.168.0.93
- **Creds**: sysadmin / Paper123!@#
- **HP ProLiant ML350 G6, 64GB RAM, Server 2019**
- **SMB share**: NAS-BACKUP (was used temporarily for CIFS backup attempt)
- **SMB1 enabled** on this server

### UDM Network

- **WINS server**: 192.168.0.9 (configured in UDM DHCP option 44)

---

## Infrastructure

### New D2TESTNAS VM Configuration

- **Host**: DF-HYPERV-B (dedicated Hyper-V host)
- **OS**: Debian 13 (Trixie)
- **Network**: eth0, static 192.168.0.9/24, gateway 192.168.0.1
- **Disks**:
  - /dev/sda: OS disk
  - /dev/sdb: 512GB BTRFS data disk mounted at /data
- **BTRFS subvolumes**: test, datasheets (under /data)
- **Services**: smbd, nmbd, rsync (daemon), sshd, cron
- **Snapshot cron**:
  ```
  0 * * * * /usr/local/bin/btrfs-snapshot.sh test 48
  0 * * * * /usr/local/bin/btrfs-snapshot.sh datasheets 48
  0 0 * * * /usr/local/bin/btrfs-snapshot.sh test 30
  0 0 * * 0 /usr/local/bin/btrfs-snapshot.sh test 12
  ```

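
The numeric argument in the cron entries above (48, 30, 12) suggests that the script keeps only the newest N snapshots per schedule. The contents of `btrfs-snapshot.sh` are not recorded in this log, so the following is only a sketch of how such pruning could work. The function name `snapshots_to_prune` and the timestamp-sortable naming scheme are assumptions made for illustration.

```shell
# Hypothetical retention helper; NOT the real /usr/local/bin/btrfs-snapshot.sh.
# Reads snapshot names on stdin (one per line, assumed to sort by age, e.g.
# test-20260312-1400) and prints the names that fall outside the newest KEEP.
snapshots_to_prune() {
    keep="$1"
    # Newest first, then emit everything after the first KEEP lines.
    sort -r | tail -n +"$((keep + 1))"
}

# Possible wiring (commented out because it needs a real BTRFS mount):
#   ls /data/test/.snapshots | snapshots_to_prune 48 |
#   while read -r name; do
#       btrfs subvolume delete "/data/test/.snapshots/$name"
#   done
```

Keeping the selection logic separate from the delete call makes it possible to dry-run the retention rule against a plain directory listing before anything is removed.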
### Key Config Files on New VM

- `/etc/samba/smb.conf` - Samba config (SMB1/CORE, DOS charset CP437, WINS)
- `/etc/rsyncd.conf` - rsync daemon (module "test")
- `/etc/rsyncd.secrets` - rsync auth (rsync:IQ203s32119)
- `/usr/local/bin/btrfs-snapshot.sh` - BTRFS snapshot script

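
The real `/etc/rsyncd.conf` is not reproduced in this log. As a rough sketch of what a single-module daemon config matching the recorded settings (port 873, module `test` at `/data/test`, user `rsync`, secrets in `/etc/rsyncd.secrets`) might look like, the helper below writes one to a path of your choosing. `write_rsyncd_conf` is a hypothetical name, and the `use chroot` and `read only` values are assumptions.

```shell
# Writes a minimal rsync daemon config to the given path.
# Sketch only: everything except port, module name, path, user and
# secrets file location is assumed rather than taken from the deployed file.
write_rsyncd_conf() {
    target="$1"
    cat > "$target" <<'EOF'
port = 873
use chroot = yes

[test]
    path = /data/test
    read only = no
    auth users = rsync
    secrets file = /etc/rsyncd.secrets
EOF
}
```

With rsync's default `strict modes` setting the daemon rejects a secrets file that other users can read, so `chmod 600 /etc/rsyncd.secrets` is the usual companion step.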
### Data Transfer Status (as of ~18:30)

- **test/ data (excl snapshots)**: ~24 GB transferred, rsync still running (single stream from .117)
- **test/ snapshots**: ~43 GB logical transferred (82 snapshots), transfer was stopped to reduce NAS load - needs restart
- **datasheets/ + snapshots**: Complete (2.3 MB + 82 snapshot dirs)
- **home/**: Complete (612 KB)
- **Disk usage**: ~26 GB actual on BTRFS (CoW dedup), 486 GB free
- **Note**: ReadyNAS UI reported 5.26GB data + 16.28GB snapshots, but actual rsync transfer is MUCH larger due to BTRFS CoW flattening

---

## Files Created/Modified

### New Files

- `D:\ClaudeTools\projects\dataforth-dos\sync-fixes\Sync-FromNAS-rsync.ps1` - Complete rsync-based replacement sync script (deployed to AD2)
- `D:\ClaudeTools\projects\dataforth-dos\d2testnas-vm\setup-d2testnas.sh` - 522-line post-install setup script
- `D:\ClaudeTools\projects\dataforth-dos\d2testnas-vm\README.md` - Hyper-V creation commands, Debian install notes, cutover checklist

### Script Fixes Applied (Sync-FromNAS-rsync.ps1)

1. **Directory-only filter** for NAS station enumeration (line ~125)
2. **Station path guard** - detects stray files where directories expected
3. **Log type directory guard** - renames stray files in LOGS subdirs
4. **Write-Log retry** - 3 attempts with 100ms delay for AV file locking

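
The deployed fixes live in a PowerShell script whose source is not included in this log. To illustrate the two patterns (moving a stray file aside when a directory is expected, and retrying a log append a few times), here is a shell analogue. The function names and the `.stray-<timestamp>` suffix are assumptions; the log only states that the stray files were renamed.

```shell
# Shell analogue of the guards described above; not the PowerShell source.

# Make sure PATH is a directory. If a regular file occupies the name,
# move it aside (contents preserved) before creating the directory.
ensure_directory() {
    path="$1"
    if [ -e "$path" ] && [ ! -d "$path" ]; then
        mv -- "$path" "${path}.stray-$(date +%Y%m%d%H%M%S)"
    fi
    mkdir -p -- "$path"
}

# Append LINE to FILE, trying up to 3 times with a 100 ms pause between
# attempts, mirroring the retry used when AV briefly locks the log.
append_with_retry() {
    file="$1"
    line="$2"
    for attempt in 1 2 3; do
        if { printf '%s\n' "$line" >> "$file"; } 2>/dev/null; then
            return 0
        fi
        sleep 0.1
    done
    return 1
}
```

Renaming rather than deleting keeps the 2012 and 2013 stray files available for inspection later.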
### Deployed to New VM (via SSH)

- /etc/samba/smb.conf (full Samba config)
- /etc/rsyncd.conf + /etc/rsyncd.secrets
- /usr/local/bin/btrfs-snapshot.sh + cron entries
- SSH key pair generated, public key added to old NAS

---

## Pending/Incomplete Tasks

### Immediate (resume next session)

1. **Monitor test/ data rsync** - Single stream running from old NAS (.117) to new VM (.9). Check with:
   ```bash
   ssh root@192.168.0.9 "ps aux | grep 'rsync -av' | grep -v grep; du -sh /data/test/ --exclude=.snapshots"
   ```
2. **Restart snapshot transfer** after data transfer completes:
   ```bash
   ssh root@192.168.0.9 "nohup bash -c 'rsync -av root@192.168.0.117:/data/test/.snapshots/ /data/test/.snapshots/ 2>&1 | tail -5' &"
   ```
3. **Test DOS machine connectivity** - Error 53 was resolved (old NAS NetBIOS killed). Need to reboot DOS machine and test:
   - `NET USE T: \\D2TESTNAS\TEST`
   - Run CTONW.BAT (copy logs to NAS)
   - Run NWTOC.BAT (download updates from NAS)
   - Verify files appear in /data/test/TS-XX/LOGS/ on new VM

### After Data Transfer Complete

4. **Verify data integrity** - Compare file counts/sizes between old and new NAS
5. **Power off old NAS** once all data confirmed transferred
6. **Set up scheduled task on AD2** - Create 15-minute scheduled task for Sync-FromNAS-rsync.ps1
7. **Run real (non-dry) sync on AD2** - Execute Sync-FromNAS-rsync.ps1 without -DryRun flag
8. **AV exclusions on AD2** - Add exclusions for C:\Shares\test\ and rsync.exe

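
For the integrity check in item 4, one option is to build a path-and-size manifest on each side and diff the two. The sketch below assumes GNU `find` is available (its `-printf` is not POSIX) and skips `.snapshots`, since snapshots are transferred separately. `manifest` and `compare_trees` are hypothetical helper names.

```shell
# Print "relative/path<TAB>size-in-bytes" for every regular file under ROOT,
# skipping the .snapshots directory, sorted by path.
manifest() {
    root="$1"
    ( cd "$root" &&
      find . -path ./.snapshots -prune -o -type f -printf '%P\t%s\n' |
      LC_ALL=C sort )
}

# Diff the manifests of two trees. Returns 0 when file names and sizes match.
compare_trees() {
    left=$(mktemp) || return 2
    right=$(mktemp) || return 2
    manifest "$1" > "$left"
    manifest "$2" > "$right"
    if diff "$left" "$right"; then status=0; else status=1; fi
    rm -f "$left" "$right"
    return "$status"
}
```

Since the two trees live on different hosts, `manifest` would in practice be run once on the old NAS and once on the new VM, with the outputs copied to one place and compared there. Matching sizes do not prove identical contents; `rsync -avnc` (checksum dry run) is a stricter but much slower check on the ARM NAS.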
### Nice to Have

9. **Copy NAS config backup to new VM** (already backed up to DF-SVR-D2-SYNC)
10. **Datto Workplace SmartBadge research** - Found that no SmartBadge add-in for Excel exists; Workplace integrates via its sync client and web interface, not an Excel plugin

---

## DOS Machine Data Flow

```
DOS 6.22 (C:\ATE\) --COPY--> T:\MACHINE\LOGS\ (NAS via SMB1)
                                  |
                                  v (rsync daemon, port 873)
                             AD2 C:\Shares\test\
                                  |
                                  v (future: database ingestion)
                             MariaDB @ 172.16.3.30
```

### Batch Files (DOS -> NAS)

- **CTONW.BAT v3.2** - Uses COPY (not XCOPY) to upload log files from C:\ATE\ to T:\MACHINE\LOGS\
- **NWTOC.BAT v3.5** - Uses COPY to download updates from T:\COMMON\ProdSW\ to C:\BAT\ and C:\ATE\
- **UPDATE.BAT v2.1** - Uses XCOPY for full machine backup (had /D flag fix for DOS 6.22)

### Log Types

5BLOG, 7BLOG, 8BLOG, DSCLOG, SCTLOG, VASLOG, PWRLOG, HVLOG

### Active Stations

TS-3L (most recent activity), TS-4R, TS-3R, TS-11L, TS-GURU, plus many others

---

## Reference

### Key Commands

```bash
# SSH to new D2TESTNAS
ssh root@192.168.0.9

# SSH to old NAS (DHCP)
ssh root@192.168.0.117

# Check rsync transfers on new VM
ssh root@192.168.0.9 "ps aux | grep rsync | grep -v grep"

# Test Samba: net view runs from a Windows prompt, smbclient from Linux/macOS
net view \\192.168.0.9
smbclient -L //192.168.0.9 -N

# Test rsync daemon
rsync rsync://rsync@192.168.0.9/test/

# Restart services on new VM
ssh root@192.168.0.9 "systemctl restart smbd nmbd rsync"

# BTRFS snapshot status
ssh root@192.168.0.9 "ls /data/test/.snapshots/"
```

### Old NAS Lockdown Commands (already applied)

```bash
# Block NetBIOS (prevents name conflict)
ssh root@192.168.0.117 "iptables -A INPUT -p udp --dport 137 -j DROP; iptables -A INPUT -p udp --dport 138 -j DROP; iptables -A OUTPUT -p udp --sport 137 -j DROP; iptables -A OUTPUT -p udp --sport 138 -j DROP"

# Remove auto-restart cron (crontab -r deletes root's entire crontab)
ssh root@192.168.0.117 "crontab -r"
```

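
`crontab -r` removes every entry in the crontab, which is acceptable on a NAS that is being retired. Where other cron jobs need to survive, a narrower approach is to filter out only the nmbd watchdog line and reinstall the rest, for example with `crontab -l | strip_nmbd_jobs | crontab -`. The helper name `strip_nmbd_jobs` is made up for this sketch.

```shell
# Pass a crontab through on stdin/stdout, dropping any line that mentions nmbd.
# grep -v exits non-zero when nothing is left to print, so that case is
# treated as success (an empty crontab is a valid result).
strip_nmbd_jobs() {
    grep -v 'nmbd' || true
}
```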
---

## Session Timeline

- Started: ~14:00 (context recovery from previous session)
- Rsync script fixes and deployment to AD2
- Disabled old SCP scheduled tasks
- Investigated BTRFS snapshots (81 found)
- Built D2TESTNAS VM on DF-HYPERV-B (Debian 13)
- Configured all services (Samba, rsync, BTRFS, SSH)
- Started data transfer from old NAS
- Killed snapshot transfer to reduce NAS load (single stream)
- IP cutover: new VM .185 -> .9, old NAS .9 -> DHCP .117
- Resolved WINS conflict (killed old NAS nmbd, removed cron, blocked ports)
- DOS machine testing started - Error 53 resolved
- Data transfer ongoing (~24GB+ transferred, snapshots pending restart)
- Session saved: ~18:45

## Update: ~19:30 - Batch File Fix and DOS Machine Testing

### DOS Machine Testing Results

- All 4 tested machines (TS-3L, TS-3R, TS-4L, TS-4R) connected to new Linux NAS successfully
- T: drive mapped via NetBIOS name (after killing old NAS nmbd)
- Files successfully copied (3 .LOG files)
- BUT: "Bad command or file name" (5x) and "Too many parameters" (5x) errors from IF EXIST/IF NOT EXIST commands
- Confirmed CTONW.BAT v3.2 on machine, correct line endings (CR+LF verified via DEBUG)
- Suspected root cause at this point: DOS 6.22 IF EXIST failing on network paths, possibly an SMB1 compatibility issue with wildcard queries. The ~20:00 update traces the "Too many parameters" errors to a trailing space in the MACHINE variable instead.

### Fix Applied: Batch Files v4.0

Eliminated all IF EXIST/IF NOT EXIST checks from startup batch files. Directories pre-created on server.

- **CTONW.BAT v4.0** - Direct COPY commands, no IF EXIST guards. Target dirs pre-created on NAS.
- **NWTOC.BAT v4.0** - Direct MD and COPY commands, no IF EXIST guards. MD harmless if dir exists locally.
- **AUTOEXEC.BAT v4.0** - Removed IF EXIST around CALL commands, direct MD for local dirs.

All deployed to NAS at `/data/test/COMMON/ProdSW/`. Machines will pick up new versions on next boot via NWTOC download.

### Pre-created Directories on NAS

Ran script to create LOGS/5BLOG, LOGS/7BLOG, LOGS/8BLOG, LOGS/DSCLOG, LOGS/HVLOG, LOGS/PWRLOG, LOGS/SCTLOG, LOGS/VASLOG, and Reports for ALL TS-* station directories.

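
The script that was run is not captured in this log. A minimal sketch of the same idea, assuming station directories sit directly under the share root and are named `TS-*`; the function name `precreate_station_dirs` is hypothetical.

```shell
# Create the LOGS/<type> subdirectories and Reports under every TS-* station
# directory found in ROOT. The trailing slash in the glob restricts matches to
# directories, so a stray file named like a station is left untouched.
precreate_station_dirs() {
    root="$1"
    for station in "$root"/TS-*/; do
        [ -d "$station" ] || continue   # glob matched nothing
        for log_type in 5BLOG 7BLOG 8BLOG DSCLOG HVLOG PWRLOG SCTLOG VASLOG; do
            mkdir -p -- "${station}LOGS/${log_type}"
        done
        mkdir -p -- "${station}Reports"
    done
}
```

On the new VM this would be called as `precreate_station_dirs /data/test`. Because `mkdir -p` does nothing for directories that already exist, the function can be re-run whenever a new station is added.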
### Old NAS Status

- DHCP at 192.168.0.117
- nmbd killed, cron removed, NetBIOS ports 137/138 blocked via iptables
- rsync data transfer still running (single stream, ~24GB+ transferred)
- Snapshot transfer stopped (was at ~43GB logical), needs restart after data completes

### Pending

1. Reboot a DOS machine to test v4.0 batch files (second boot needed for NWTOC v4.0)
2. Monitor data transfer completion (rsync single stream still running as of ~20:00)
3. Restart snapshot transfer after data completes
4. Verify test data appears in correct LOGS subdirectories on NAS
5. Set up AD2 scheduled task for rsync sync
6. Run real (non-dry) Sync-FromNAS-rsync.ps1 on AD2

## Update: ~20:00 - DEPLOY Trailing Space Bug and Data Upload Success

### Critical Bug Found: DEPLOY.BAT Trailing Space

- **Root cause of ALL "Too many parameters" errors**: `ECHO SET MACHINE=%MACHINE% >> C:\AUTOEXEC.BAT` includes the space before `>>` in the output
- This sets `MACHINE=TS-3L ` (with trailing space) which causes `T:\TS-3L \LOGS\DSCLOG` to be parsed as two parameters
- **Fix**: DEPLOY v4.1 moves redirect before ECHO: `>>C:\AUTOEXEC.BAT ECHO SET MACHINE=%MACHINE%`
- First line uses `>` (overwrite), rest use `>>` (append)
- DEPLOY v4.1 deployed to NAS at `/data/test/COMMON/ProdSW/DEPLOY.BAT`

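
The trailing space is a plain-text property of the generated AUTOEXEC.BAT, so it can be checked from the Linux side wherever a copy of that file is available. A minimal sketch, assuming a CR+LF batch file readable on the NAS; the function name `find_trailing_space_sets` is made up for illustration and is not part of the deployed tooling.

```shell
# Print "line-number:content" for every SET line in a DOS batch file whose
# value ends in a space or tab. Carriage returns are stripped first so the
# CR of a CR+LF line ending is not mistaken for trailing whitespace.
# Exit status is 0 when at least one offending line is found, 1 otherwise.
find_trailing_space_sets() {
    tr -d '\r' < "$1" |
        grep -n -i -E '^[[:space:]]*SET[[:space:]]+[^=[:space:]]+=.*[[:blank:]]$'
}
```

An AUTOEXEC.BAT written by DEPLOY v4.0 would be expected to report its `SET MACHINE=` line here, while one written by v4.1 should produce no output.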
### Samba Case Sensitivity - Confirmed OK

- `smb.conf` has `case sensitive = no` and `default case = upper`
- No duplicate directories (only `TS-4L` exists, not `ts-4L`)

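
How the duplicate check was performed is not recorded. One way to repeat it on the Linux side, sketched here with a hypothetical helper name, is to upper-case every entry name in a directory and look for repeats.

```shell
# List entry names under DIR that collide once case is ignored.
# Each colliding name is printed once, in upper case; no output means
# no two entries differ only by case.
case_insensitive_duplicates() {
    ls -1 -- "$1" | tr '[:lower:]' '[:upper:]' | LC_ALL=C sort | uniq -d
}
```

Running `case_insensitive_duplicates /data/test` on the new VM would be expected to print nothing if the result above still holds.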
### TS-3L Deploy Test

- Ran `T:\UPDATE TS-3L` which calls DEPLOY v4.0 (before trailing space fix)
- DEPLOY completed, files confirmed v4.0 on machine via TYPE
- After reboot: NAS still showed old CTONW.LOG/NWTOC.LOG - MACHINE had trailing space
- Running CTONW manually showed 9x "Too many parameters" on all COPY-to-subdirectory lines
- `COPY C:\ATE\*.LOG T:\%MACHINE%` worked (no subdirectory in path) but `COPY ... T:\%MACHINE%\LOGS\DSCLOG` failed
- This confirmed the trailing space theory - space before `\LOGS\` splits the path

### TS-4L Data Upload - SUCCESS

- TS-4L uploaded data at 20:10 with clean MACHINE variable (no trailing space)
- **84 test data files uploaded to NAS:**
  - 5BLOG: 20 files
  - 7BLOG: 29 files (historical .SHT files)
  - 8BLOG: 10 files
  - DSCLOG: 21 files (including today's 38-02.DAT from 03-12-26)
  - SCTLOG: 2 files
  - VASLOG: 2 files
- **90+ work-order Reports** (.TXT files) uploaded to TS-4L/Reports/
- **3 LOG files** (NWTOC.LOG, CTONW.LOG, CTONWTXT.LOG)
- CTONW.LOG confirms: `CTONW.BAT v4.0 / Machine: TS-4L` (no trailing space)

### Original STARTNET.BAT Found (from TS-3L backup)

The actual STARTNET.BAT on DOS machines loads network drivers manually:

```
LH /L:0;1,45472 /S c:\net\smartdrv.exe /q
c:\net\net initialize
c:\net\netbind.com
lh c:\net\umb.com
c:\net\tcptsr.exe
c:\net\tinyrfc.exe
c:\net\nmtsr.exe
c:\net\emsbfr.exe
c:\net\net start
net use T: \\d2testnas\test
net use X: \\d2testnas\datasheets
```

- `net start` prompts for computer name (pre-populated from SYSTEM.INI)
- Could add `/y` flag to suppress prompt, or use MACHINE variable
- Our v2.0 STARTNET.BAT on ProdSW is a simplified rewrite that was never deployed to machines

### T:\UPDATE.BAT

- Tiny 4-line wrapper at root of test share: `CALL T:\COMMON\ProdSW\DEPLOY.BAT %1`
- Allows running `T:\UPDATE TS-3L` from DOS machines

### Rsync Transfer Status

- Single stream still running from old NAS (.117) to new VM (.9)
- Snapshot transfer still pending restart

### Files Modified This Update

- `D:\ClaudeTools\projects\dataforth-dos\batch-files\DEPLOY.BAT` - v4.1 (trailing space fix)
- Deployed to NAS at `/data/test/COMMON/ProdSW/DEPLOY.BAT`

---

## Pending/Incomplete Tasks (Updated)

### Immediate

1. **Re-deploy TS-3L with DEPLOY v4.1** - needs new deploy + reboot to fix trailing space
2. **Set up AD2 rsync scheduled task** - Sync-FromNAS-rsync.ps1 deployed but no task created (15-min interval planned)
3. **Run real (non-dry) sync** of Sync-FromNAS-rsync.ps1 on AD2
4. **Database ingestion pipeline** - No ingestion exists yet. Data flows: NAS -> AD2 -> MariaDB @ 172.16.3.30

### After Data Transfer Complete

5. **Monitor old NAS rsync completion** - single stream still running
6. **Restart snapshot transfer** after data completes
7. **Verify data integrity** - compare file counts between old and new NAS
8. **Power off old NAS** once confirmed

### Batch File Updates Needed

9. **UPDATE.BAT** (in ProdSW) - has IF EXIST checks, needs v4.0 treatment
10. **ATESYNC.BAT / ATESYNCD.BAT** - have IF EXIST checks, need v4.0 treatment (not currently called by AUTOEXEC)
11. **STARTNET.BAT** - consider deploying updated version or adding `/y` to suppress net start prompt
12. **AV exclusions on AD2** - add exclusions for C:\Shares\test\ and rsync.exe