Session log + DFWDS Node port + Hoffman API uploader pipeline

Built the missing piece between the test datasheet pipeline and Dataforth's
new product API. End-to-end:

- Pulled DFWDS (Dataforth Web Datasheet System) VB6 source from
  AD1\Engineering\ENGR\ATE\Test Datasheets\DFWDS to local for analysis
- Decoded its filename validation: A-J prefix decodes (A=10..J=19), all-
  numeric WO# valid (no leading 0), anything else bad
- Ported the validation + move logic to Node (dfwds-process.js)
- Built bulk uploader (upload-delta.js) for Hoffman's Swagger API
  (POST /api/v1/TestReportDataFiles/bulk with OAuth client_credentials)
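
The filename rule above could be sketched roughly as follows. This is an illustration only, not the code in dfwds-process.js: the function name `classifyStem`, the `valid`/`rename`/`bad` labels, and the assumption that a decoded A-J prefix results in a rename are all hypothetical, and extension handling is left out.

```javascript
// Hypothetical sketch of the DFWDS work-order rule; not the actual port.
const PREFIX_BASE = 10; // 'A' -> 10, 'B' -> 11, ... 'J' -> 19

function classifyStem(stem) {
  // All-numeric work order with no leading zero: already valid.
  if (/^[1-9][0-9]*$/.test(stem)) {
    return { status: 'valid', workOrder: stem };
  }
  // A-J prefix followed by digits: decode the letter to its two-digit value.
  // (Assumption: uppercase only, and a decoded name counts as a rename.)
  const m = /^([A-J])([0-9]+)$/.exec(stem);
  if (m) {
    const decoded = PREFIX_BASE + (m[1].charCodeAt(0) - 'A'.charCodeAt(0));
    return { status: 'rename', workOrder: String(decoded) + m[2] };
  }
  // Anything else (leading zero, other letters, punctuation) is bad.
  return { status: 'bad', workOrder: null };
}

console.log(classifyStem('A2345')); // { status: 'rename', workOrder: '102345' }
```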

Sanitized 3 prior reference scripts (fetch-server-inventory, test-scenarios,
test-upload-two) to read CF_* env vars instead of hardcoded creds.
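
A minimal sketch of the env-var pattern, assuming two variables named `CF_CLIENT_ID` and `CF_CLIENT_SECRET`. Both names and the `loadCredentials` helper are hypothetical; the scripts may use different CF_* names.

```javascript
// Hypothetical variable names; only the CF_ prefix comes from the commit.
const REQUIRED = ['CF_CLIENT_ID', 'CF_CLIENT_SECRET'];

function loadCredentials(env) {
  // Fail fast with a clear message instead of falling back to a hardcoded value.
  const missing = REQUIRED.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error('Missing environment variables: ' + missing.join(', '));
  }
  return { clientId: env.CF_CLIENT_ID, clientSecret: env.CF_CLIENT_SECRET };
}
```

In a script this would be called as `loadCredentials(process.env)`; passing the environment in as a parameter keeps the helper easy to test.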

Live drain results:
- 897 files moved Test_Datasheets -> For_Web (all valid, no renames, no
  bad), DFWDS port summary in 1.1s
- Pushed entire For_Web (7,061 files) to Hoffman API in 49.7s @ 142/s:
  Created=803 Updated=114 Unchanged=6,144 Errors=0
- Server count: 489,579 -> 490,382 (+803 net new)

Also:
- Added clients/dataforth/.gitignore to exclude plaintext Oauth.txt note
- Added clients/instrumental-music-center/docs/2026-04-13-ticket-notes.md
  (ticket write-up of 2026-04-11/12/13 IMC1 RDS removal/SQL migration work)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 21:06:20 -07:00
parent 72105233a2
commit dd5c5afd4b
80 changed files with 13466 additions and 0 deletions

@@ -0,0 +1,122 @@
# IMC1 Maintenance — Ticket Notes (2026-04-11 through 2026-04-13)
**Client:** Instrumental Music Center (IMC)
**Server:** IMC1 (Windows Server 2016, AD DC + AIMsi SQL host + RDS), 192.168.0.2
**Dates worked:** Saturday 2026-04-11, Sunday 2026-04-12, finalized Monday 2026-04-13
---
## Original request
Remove RDS role from IMC1 in preparation for a planned Server 2019 upgrade.
## Outcome (short version)
RDS role **could not be removed** — blocked by an underlying Windows component-store corruption that also affects Cumulative Update apply-on-boot. The 2019 upgrade path has to be reconsidered.
Parallel/opportunistic work completed while diagnosing: **716 GB recovered on E:** (stale SQL backup cleanup), **GFS retention automated**, **AIMsi databases moved to dedicated SSD (S:)**, **out-of-band SSH access set up** for future work.
---
## Work performed — billable
### 1. Remote-access groundwork
- Installed OpenSSH Server on IMC1 from official GitHub release (Windows built-in `Add-WindowsCapability` install was a ghost — binaries never landed, also a symptom of the component-store corruption)
- Registered `sshd` + `ssh-agent` services, opened TCP/22 in the firewall
- Configured key-based auth (ed25519) via `C:\ProgramData\ssh\administrators_authorized_keys` with proper ACLs
- Set PowerShell as default SSH shell
- Identified & documented a routing conflict: Tailscale's `pfsense-2` subnet-router was advertising `192.168.0.0/24` with lower metric than the OpenVPN tunnel, making IMC1 unreachable via VPN. Workaround: disconnect Tailscale when using IMC OpenVPN.
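
The routing conflict can be illustrated with a small route-selection sketch: among routes that match a destination, the longest prefix wins, and ties go to the lowest metric. The metric values and route names below are made up for illustration; only the `192.168.0.0/24` prefix and the "lower metric" relationship come from the notes.

```javascript
// Convert dotted-quad IPv4 to an unsigned 32-bit integer.
function ipToInt(ip) {
  return ip.split('.').reduce((acc, octet) => ((acc << 8) | Number(octet)) >>> 0, 0);
}

function prefixLen(route) {
  return Number(route.cidr.split('/')[1]);
}

// True when dest falls inside the route's CIDR block.
function matches(route, dest) {
  const net = route.cidr.split('/')[0];
  const len = prefixLen(route);
  const mask = len === 0 ? 0 : (0xffffffff << (32 - len)) >>> 0;
  return ((ipToInt(dest) & mask) >>> 0) === ((ipToInt(net) & mask) >>> 0);
}

// Longest prefix first; on equal prefixes, the lowest metric wins.
function pickRoute(routes, dest) {
  const candidates = routes.filter((r) => matches(r, dest));
  candidates.sort((a, b) => prefixLen(b) - prefixLen(a) || a.metric - b.metric);
  return candidates[0] || null;
}

// Hypothetical metrics: both interfaces offer the same /24.
const routes = [
  { name: 'openvpn', cidr: '192.168.0.0/24', metric: 25 },
  { name: 'tailscale', cidr: '192.168.0.0/24', metric: 5 },
];
console.log(pickRoute(routes, '192.168.0.2').name); // tailscale
```

Removing the Tailscale route from the list (the equivalent of disconnecting Tailscale) leaves the OpenVPN route as the only match, which is why the workaround restores reachability.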
### 2. SQL backup cleanup — 716 GB freed on E:
- Inventoried `E:\SQL\MSSQL14.SQLEXPRESS\MSSQL\Backup\`: **66 nightly full backups, 905 GB total**, covering 2026-02-01 through 2026-04-11
- Verified the off-site Cloudberry/MSP360 backup was intact before deleting locally
- Applied GFS retention manually: kept 14 dailies + 1st-of-month (16 files, 189 GB); deleted the other 50 (716 GB)
- Noted the per-backup size dropping from ~15 GB to ~11 GB around 2026-03-28 (someone archived or cleaned source data that day; unrelated to this work)
### 3. Automated backup retention — going forward
- Wrote `C:\Scripts\Clean-AimsiBackups.ps1` implementing the same GFS policy (14 dailies + monthlies on day-1)
- Hard safety guards: minimum of 3 newest files always kept, filename-pattern validation, per-run logs to `C:\Scripts\Logs\aimsi-retention-YYYYMM.log`
- Registered Windows Scheduled Task `IMC AIMsi Backup Retention`: runs daily 23:30 as SYSTEM, 1-hour execution limit
- Test run completed successfully; script will keep backup footprint stable automatically from now on
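
The retention policy can be sketched as below. The real script is PowerShell; this JavaScript version only illustrates the selection logic, and the filename pattern `AIM_YYYY-MM-DD.bak` and the function name `planRetention` are assumptions, not taken from `Clean-AimsiBackups.ps1`.

```javascript
// Hypothetical naming scheme; the real backup filenames may differ.
const NAME_PATTERN = /^AIM_(\d{4})-(\d{2})-(\d{2})\.bak$/;
const KEEP_DAILIES = 14; // newest N backups are always kept
const MIN_KEEP = 3;      // hard floor, independent of the daily count

function planRetention(fileNames) {
  // Files that do not match the pattern are ignored entirely (never deleted).
  const backups = fileNames
    .map((name) => ({ name, m: NAME_PATTERN.exec(name) }))
    .filter((b) => b.m !== null)
    .map((b) => ({ name: b.name, date: b.m.slice(1, 4).join('-'), day: b.m[3] }))
    .sort((a, b) => (a.date < b.date ? 1 : a.date > b.date ? -1 : 0)); // newest first

  const keep = [];
  const remove = [];
  backups.forEach((b, i) => {
    const isRecent = i < KEEP_DAILIES || i < MIN_KEEP;
    const isMonthly = b.day === '01'; // first-of-month backup
    (isRecent || isMonthly ? keep : remove).push(b.name);
  });
  return { keep, remove };
}
```

With one backup per day from 2026-02-01 through 2026-04-11, this keeps 16 files: the 14 newest plus the February and March first-of-month backups (April 1 already falls inside the 14-day window).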
### 4. AIMsi SQL database relocation (C: → S:)
- Elevated `IMC\guru` to sysadmin on the `AIMSQL` instance via single-user-mode recovery (only Windows admins had sysadmin by default; `IMC\guru` wasn't in that group historically)
- Moved the three user databases from `C:` to the dedicated Samsung 850 PRO SSD (`S:\SQL\Data\`):
- `AIM.mdf` — 8.6 GB — production AIMsi database
- `IMC.mdf` — 9.8 GB — legacy (usage unclear; kept for safety)
- `TestConv61223.mdf` — 8.8 GB — leftover from a 2023-06-12 migration test; candidate for drop
- Moved `tempdb` + cleaned up its orphaned files on C:
- Left system DBs (`master`, `model`, `msdb`) on C: — moving `master` requires modifying SQL Server startup parameters and the benefit is marginal vs risk
- Verified AIM client launch after the move; 4 typical concurrent users connected fine
- **Disk impact:** `C:` 322 → 278 GB used (-44 GB); `S:` 27 → 53 GB used (+26 GB)
### 5. RDS removal (parked — deeper than scoped)
Root error during `Uninstall-WindowsFeature RDS-RD-Server`: `0x80073701 ERROR_SXS_ASSEMBLY_MISSING`.
Things we tried (in order):
1. `DISM /Online /Cleanup-Image /RestoreHealth` → failed Error 14 (internally `E_OUTOFMEMORY 0x8007000e` from an oversized 168 MB `COMPONENTS` registry hive; normal is 30-50 MB)
2. Same with explicit `/ScratchDir` → failed `E_ACCESSDENIED` because BITS + `wuauserv` were stopped
3. Started BITS/wuauserv and retried → BITS idles-and-auto-stops on Server 2016; DISM still couldn't fetch payloads
4. `/Source:WIM:E:\W2016\sources\install.wim:2 /LimitAccess` (install media on E:) → failed `CBS_E_SOURCE_MISSING`. The E:\W2016 image is RTM (14393.0); the damaged assembly is from a post-RTM Cumulative Update, so the RTM source doesn't have the right version
5. Extracted KB5075999 (Feb 2026 CU) from a local MSU and ran `DISM /Online /Add-Package` → **staged successfully (S_OK)**, but on reboot the apply phase failed with `HRESULT_FROM_WIN32(15010) ERROR_EVT_INVALID_EVENT_DATA` at `onecore\admin\wmi\events\config\manproc.cpp:733`: the ETW event manifest for provider GUID `{9c2a37f3-e5fd-5cae-bcd1-43dafeee1ff0}` is malformed → `CBS_E_INSTALLERS_FAILED` → full rollback
**Decision:** parked. The component-store corruption is isolated (server otherwise healthy) but it prevents both RDS removal AND CU application. It's deeper than the originally scoped RDS-removal work, and needs a planning conversation before committing more time.
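
For the next tech reading these codes: the Win32-facility HRESULTs above wrap an ordinary Win32 error number in the low 16 bits. A small sketch (helper names are illustrative) of how `HRESULT_FROM_WIN32` composes a value and how to unpack one:

```javascript
const FACILITY_WIN32 = 7;

// Equivalent of the HRESULT_FROM_WIN32 macro for non-zero Win32 error codes.
function hresultFromWin32(code) {
  return ((code & 0xffff) | (FACILITY_WIN32 << 16) | 0x80000000) >>> 0;
}

// Split an HRESULT into its failure bit, facility, and 16-bit code.
function decodeHresult(hr) {
  return {
    failed: (hr >>> 31) === 1,
    facility: (hr >>> 16) & 0x1fff,
    code: hr & 0xffff,
  };
}

const hex = (n) => '0x' + n.toString(16).padStart(8, '0');

console.log(hex(hresultFromWin32(15010)));   // 0x80073aa2 (ERROR_EVT_INVALID_EVENT_DATA)
console.log(decodeHresult(0x80073701).code); // 14081 (ERROR_SXS_ASSEMBLY_MISSING)
console.log(decodeHresult(0x8007000e).code); // 14 (the "Error 14" DISM printed)
```

This is why DISM's "Error 14" and `0x8007000e` in step 1 are the same failure reported at two levels.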
### 6. Minor housekeeping
- Re-created the missing `C:\Users\guru\Downloads` folder (registry pointed there, the folder itself was gone)
---
## Current state of IMC1
| Area | State |
|---|---|
| AIMsi production | healthy, clients connecting |
| SQL instance | running; user DBs on S:, system DBs on C: |
| SQL backups | retained by scheduled script; 189 GB footprint (vs prior 905 GB) |
| C: drive | 278/419 GB used (~66%) — comfortable |
| S: drive | 53/238 GB used |
| RDS role | still installed (removal blocked) |
| OS updates | latest successful CU prior to this session; KB5075999 rolled back |
| SSH access | working, key-based, admin group |
---
## Open items / recommendations (for the next conversation)
**Decision needed on the Server 2019 upgrade path.** Three realistic options:
| Path | Rough effort | Notes |
|---|---|---|
| **A. Repair the component store, then retry RDS removal + in-place 2019 upgrade** | 2-4 hours, uncertain | Next step is to identify which KB owns the malformed provider GUID `{9c2a37f3-e5fd-5cae-bcd1-43dafeee1ff0}`, re-register its manifest via `wevtutil im`, retry. If the hive is just too damaged, this path fails the same way. |
| **B. In-place Server 2019 upgrade without fixing the corruption** | 2-3 hours attempt; high rollback risk | Microsoft's in-place upgrade rewrites system files wholesale; in practice it often resolves corruptions like this one, but if the upgrade itself encounters the same provider-manifest issue, we're worse off. |
| **C. Clean Server 2019 build + AD / SQL / file share / RDS migration** | 6-10 hours | Lowest risk of mid-flight breakage. Most predictable. Requires a cutover window. This is what I'd recommend if the upgrade is planned anyway. |
**Also pending (independent of the upgrade):**
- Verify the `IMC` database (9.8 GB) is actively being used; drop if not
- Drop `TestConv61223` (8.8 GB leftover from 2023-06-12 migration test) once confirmed unused
- Disable SMB1: `Set-SmbServerConfiguration -EnableSMB1Protocol $false` (currently enabled — security hygiene)
- `IMC2`, `IMC-VM` in AD — last-logon 2023 and 2021 respectively, likely decommissioned; clean up if confirmed
- `SERVERIMC` (192.168.0.63) — its role/status is unclear from AD, worth verifying what it's doing
---
## Key paths and references (for the next tech)
- SSH: `ssh IMC\guru@192.168.0.2` (ed25519 key, PowerShell shell)
- Authorized keys file: `C:\ProgramData\ssh\administrators_authorized_keys`
- Retention script: `C:\Scripts\Clean-AimsiBackups.ps1`
- Retention logs: `C:\Scripts\Logs\aimsi-retention-YYYYMM.log`
- Scheduled task: `IMC AIMsi Backup Retention` (daily 23:30, SYSTEM)
- SQL instance: `IMC1\AIMSQL` (MSSQL15 = SQL Server 2019 Express, despite folder name)
- SQL data: `S:\SQL\Data\` (user DBs); `C:\Program Files\Microsoft SQL Server\MSSQL15.AIMSQL\MSSQL\DATA\` (system DBs)
- SQL backups: `E:\SQL\MSSQL14.SQLEXPRESS\MSSQL\Backup\` (yes, MSSQL14 folder hosting a MSSQL15 instance's backups — legacy)
- DISM scratch + extracted KB payload: `C:\DISMScratch\`
- Server 2016 install media (RTM): `E:\W2016\sources\install.wim`
- Full session log: `clients/instrumental-music-center/session-logs/2026-04-12-imc1-cleanup-and-sql-move.md`
## Credentials
- Domain admin + AIMSQL sysadmin: `IMC\guru` (password handled verbally, not stored in this file)
- Local admin + SSH: same account
- `sa` on AIMSQL: exists and enabled; password unknown (tried one candidate, wrong, no lockout triggered)