claudetools/clients/instrumental-music-center/docs/2026-04-13-ticket-notes.md
Mike Swanson dd5c5afd4b Session log + DFWDS Node port + Hoffman API uploader pipeline
Built the missing piece between the test datasheet pipeline and Dataforth's
new product API. End-to-end:

- Pulled DFWDS (Dataforth Web Datasheet System) VB6 source from
  AD1\Engineering\ENGR\ATE\Test Datasheets\DFWDS to local for analysis
- Decoded its filename validation: A-J prefix decodes (A=10..J=19), all-
  numeric WO# valid (no leading 0), anything else bad
- Ported the validation + move logic to Node (dfwds-process.js)
- Built bulk uploader (upload-delta.js) for Hoffman's Swagger API
  (POST /api/v1/TestReportDataFiles/bulk with OAuth client_credentials)

Sanitized 3 prior reference scripts (fetch-server-inventory, test-scenarios,
test-upload-two) to read CF_* env vars instead of hardcoded creds.

Live drain results:
- 897 files moved Test_Datasheets -> For_Web (all valid, no renames, no
  bad), DFWDS port summary in 1.1s
- Pushed entire For_Web (7,061 files) to Hoffman API in 49.7s @ 142/s:
  Created=803 Updated=114 Unchanged=6,144 Errors=0
- Server count: 489,579 -> 490,382 (+803 net new)

Also:
- Added clients/dataforth/.gitignore to exclude plaintext Oauth.txt note
- Added clients/instrumental-music-center/docs/2026-04-13-ticket-notes.md
  (ticket write-up of 2026-04-11/12/13 IMC1 RDS removal/SQL migration work)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 21:06:50 -07:00

# IMC1 Maintenance — Ticket Notes (2026-04-11 through 2026-04-13)
**Client:** Instrumental Music Center (IMC)
**Server:** IMC1 (Windows Server 2016, AD DC + AIMsi SQL host + RDS), 192.168.0.2
**Dates worked:** Saturday 2026-04-11, Sunday 2026-04-12, finalized Monday 2026-04-13
---
## Original request
Remove RDS role from IMC1 in preparation for a planned Server 2019 upgrade.
## Outcome (short version)
RDS role **could not be removed** — blocked by an underlying Windows component-store corruption that also affects Cumulative Update apply-on-boot. The 2019 upgrade path has to be reconsidered.
Parallel/opportunistic work completed while diagnosing: **716 GB recovered on E:** (stale SQL backup cleanup), **GFS retention automated**, **AIMsi databases moved to dedicated SSD (S:)**, **out-of-band SSH access set up** for future work.
---
## Work performed — billable
### 1. Remote-access groundwork
- Installed OpenSSH Server on IMC1 from official GitHub release (Windows built-in `Add-WindowsCapability` install was a ghost — binaries never landed, also a symptom of the component-store corruption)
- Registered `sshd` + `ssh-agent` services, opened TCP/22 in the firewall
- Configured key-based auth (ed25519) via `C:\ProgramData\ssh\administrators_authorized_keys` with proper ACLs
- Set PowerShell as default SSH shell
- Identified & documented a routing conflict: Tailscale's `pfsense-2` subnet-router was advertising `192.168.0.0/24` with lower metric than the OpenVPN tunnel, making IMC1 unreachable via VPN. Workaround: disconnect Tailscale when using IMC OpenVPN.
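
The routing conflict above comes down to how a host chooses among overlapping routes: the longest matching prefix wins, and among equal prefixes the lowest metric wins. A minimal sketch of that rule, assuming both tunnels offer the same `/24`; the interface labels and metric values are hypothetical, since the real metrics were not recorded in these notes:

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    prefix: str     # destination network in CIDR form
    interface: str  # hypothetical interface label
    metric: int     # hypothetical value; lower is preferred


def pick_route(dest, routes):
    """Return the route used for dest: longest prefix first, then lowest metric."""
    addr = ipaddress.ip_address(dest)
    candidates = [r for r in routes if addr in ipaddress.ip_network(r.prefix)]
    if not candidates:
        return None
    return min(
        candidates,
        key=lambda r: (-ipaddress.ip_network(r.prefix).prefixlen, r.metric),
    )


routes = [
    Route("192.168.0.0/24", "tailscale", 5),  # advertised by the pfsense-2 subnet router
    Route("192.168.0.0/24", "openvpn", 25),   # assumed route pushed by the IMC OpenVPN tunnel
]

with_tailscale = pick_route("192.168.0.2", routes)
without_tailscale = pick_route("192.168.0.2", [r for r in routes if r.interface != "tailscale"])
print(with_tailscale.interface, without_tailscale.interface)
```

With both routes present the Tailscale route is selected, so traffic for IMC1 never enters the OpenVPN tunnel; dropping the Tailscale route (the documented workaround) leaves OpenVPN as the only match.
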
### 2. SQL backup cleanup — 716 GB freed on E:
- Inventoried `E:\SQL\MSSQL14.SQLEXPRESS\MSSQL\Backup\`: **66 nightly full backups, 905 GB total**, covering 2026-02-01 through 2026-04-11
- Verified the off-site Cloudberry/MSP360 backup was intact before deleting locally
- Applied GFS retention manually: kept 14 dailies + 1st-of-month (16 files, 189 GB); deleted the other 50 (716 GB)
- Noted the per-backup size dropping from ~15 GB to ~11 GB around 2026-03-28 (someone archived or cleaned source data that day; unrelated to this work)
### 3. Automated backup retention — going forward
- Wrote `C:\Scripts\Clean-AimsiBackups.ps1` implementing the same GFS policy (14 dailies + monthlies on day-1)
- Hard safety guards: minimum of 3 newest files always kept, filename-pattern validation, per-run logs to `C:\Scripts\Logs\aimsi-retention-YYYYMM.log`
- Registered Windows Scheduled Task `IMC AIMsi Backup Retention`: runs daily 23:30 as SYSTEM, 1-hour execution limit
- Test run completed successfully; script will keep backup footprint stable automatically from now on
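
The production script is PowerShell and is not reproduced here. As an illustration of the policy only, the following Python sketch models the selection logic described above: keep the 14 newest dailies, keep anything dated the 1st of a month, never keep fewer than 3 files, and ignore names that fail pattern validation. The `AIM_YYYYMMDD.bak` filename shape and the function name are assumptions made for the example.

```python
import re
from datetime import date, timedelta

# Hypothetical filename shape; the pattern the real script validates is not shown here.
BACKUP_NAME = re.compile(r"^AIM_(\d{4})(\d{2})(\d{2})\.bak$")


def plan_retention(filenames, dailies=14, min_keep=3):
    """Split backup names into (keep, delete): newest dailies plus 1st-of-month files."""
    dated = []
    for name in filenames:
        match = BACKUP_NAME.match(name)
        if match is None:
            continue  # guard: files failing pattern validation are never touched
        try:
            day = date(*map(int, match.groups()))
        except ValueError:
            continue  # guard: digits that do not form a real calendar date
        dated.append((day, name))
    dated.sort(reverse=True)  # newest first

    newest = max(dailies, min_keep)  # guard: never keep fewer than min_keep files
    keep = [n for rank, (d, n) in enumerate(dated) if rank < newest or d.day == 1]
    delete = [n for _, n in dated if n not in keep]
    return keep, delete


# Hypothetical folder: one backup per night for 2026-02-01 through 2026-04-11.
start = date(2026, 2, 1)
names = [f"AIM_{start + timedelta(days=i):%Y%m%d}.bak" for i in range(70)]
names.append("notes.txt")  # unrelated file, ignored by the pattern guard

keep, delete = plan_retention(names)
print(len(keep), len(delete))  # 16 54
```

For an unbroken 70-night run the plan keeps 16 files (14 dailies plus the 02-01 and 03-01 monthlies; 04-01 already falls inside the daily window), the same count as the manual cleanup in section 2. The real folder held 66 files, so 50 were deleted there.
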
### 4. AIMsi SQL database relocation (C: → S:)
- Elevated `IMC\guru` to sysadmin on the `AIMSQL` instance via single-user-mode recovery (only Windows admins had sysadmin by default; `IMC\guru` wasn't in that group historically)
- Moved the three user databases from `C:` to the dedicated Samsung 850 PRO SSD (`S:\SQL\Data\`):
  - `AIM.mdf`, 8.6 GB: production AIMsi database
  - `IMC.mdf`, 9.8 GB: legacy (usage unclear; kept for safety)
  - `TestConv61223.mdf`, 8.8 GB: leftover from a 2023-06-12 migration test; candidate for drop
- Moved `tempdb` + cleaned up its orphaned files on C:
- Left system DBs (`master`, `model`, `msdb`) on C: because moving `master` requires modifying SQL Server startup parameters and the benefit is marginal relative to the risk
- Verified AIM client launch after the move; 4 typical concurrent users connected fine
- **Disk impact:** `C:` 322 → 278 GB used (-44 GB); `S:` 27 → 53 GB used (+26 GB)
### 5. RDS removal (parked — deeper than scoped)
Root error during `Uninstall-WindowsFeature RDS-RD-Server`: `0x80073701 ERROR_SXS_ASSEMBLY_MISSING`.
Things we tried (in order):
1. `DISM /Online /Cleanup-Image /RestoreHealth` → failed Error 14 (internally `E_OUTOFMEMORY 0x8007000e` from an oversized 168 MB `COMPONENTS` registry hive; normal is 30-50 MB)
2. Same with explicit `/ScratchDir` → failed `E_ACCESSDENIED` because BITS + `wuauserv` were stopped
3. Started BITS/wuauserv and retried → BITS idles-and-auto-stops on Server 2016; DISM still couldn't fetch payloads
4. `/Source:WIM:E:\W2016\sources\install.wim:2 /LimitAccess` (install media on E:) → failed `CBS_E_SOURCE_MISSING`. The E:\W2016 image is RTM (14393.0); the damaged assembly is from a post-RTM Cumulative Update, so the RTM source doesn't have the right version
5. Extracted KB5075999 (Feb 2026 CU) from a local MSU and ran `DISM /Online /Add-Package` → **staged successfully (S_OK)**, but on reboot the apply phase failed with `HRESULT_FROM_WIN32(15010) ERROR_EVT_INVALID_EVENT_DATA` at `onecore\admin\wmi\events\config\manproc.cpp:733`: the ETW event manifest for provider GUID `{9c2a37f3-e5fd-5cae-bcd1-43dafeee1ff0}` is malformed → `CBS_E_INSTALLERS_FAILED` → full rollback
**Decision:** parked. The component-store corruption is isolated (server otherwise healthy) but it prevents both RDS removal AND CU application. It's deeper than the originally scoped RDS-removal work, and needs a planning conversation before committing more time.
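
The error codes in the attempts above mix raw Win32 codes ("Error 14", `15010`) with HRESULTs (`0x8007000e`, `0x80073701`). The two forms are related by the standard HRESULT layout: severity bit set, facility 7 (`FACILITY_WIN32`), and the Win32 code in the low 16 bits. A small sketch of that conversion, useful when matching DISM console output against CBS log entries; the function names are made up for the example:

```python
FACILITY_WIN32 = 7


def hresult_from_win32(code):
    """Wrap a Win32 error code as a failure HRESULT with facility 7."""
    if code == 0:
        return 0  # success stays S_OK
    return 0x80000000 | (FACILITY_WIN32 << 16) | (code & 0xFFFF)


def win32_from_hresult(hr):
    """Return the Win32 code inside an HRESULT, or None if the facility is not Win32."""
    facility = (hr >> 16) & 0x7FF  # facility occupies bits 16-26
    if facility != FACILITY_WIN32:
        return None
    return hr & 0xFFFF


print(win32_from_hresult(0x8007000E))        # 14, the "Error 14" DISM printed
print(win32_from_hresult(0x80073701))        # 14081
print(f"{hresult_from_win32(15010):#010x}")  # 0x80073aa2
print(win32_from_hresult(0x800F081F))        # None: CBS_E_SOURCE_MISSING uses facility 15
```

The last line shows why the CBS-specific codes (`CBS_E_SOURCE_MISSING`, `CBS_E_INSTALLERS_FAILED`) do not map back to a Win32 error number: they live in a different facility.
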
### 6. Minor housekeeping
- Re-created the missing `C:\Users\guru\Downloads` folder (registry pointed there, the folder itself was gone)
---
## Current state of IMC1
| Area | State |
|---|---|
| AIMsi production | healthy, clients connecting |
| SQL instance | running; user DBs on S:, system DBs on C: |
| SQL backups | retained by scheduled script; 189 GB footprint (vs prior 905 GB) |
| C: drive | 278/419 GB used (~66%) — comfortable |
| S: drive | 53/238 GB used |
| RDS role | still installed (removal blocked) |
| OS updates | last successfully applied CU predates this session; KB5075999 staged, then rolled back on reboot |
| SSH access | working, key-based, admin group |
---
## Open items / recommendations (for the next conversation)
**Decision needed on the Server 2019 upgrade path.** Three realistic options:
| Path | Rough effort | Notes |
|---|---|---|
| **A. Repair the component store, then retry RDS removal + in-place 2019 upgrade** | 2-4 hours, uncertain | Next step is to identify which KB owns the malformed provider GUID `{9c2a37f3-e5fd-5cae-bcd1-43dafeee1ff0}`, re-register its manifest via `wevtutil im`, retry. If the hive is just too damaged, this path fails the same way. |
| **B. In-place Server 2019 upgrade without fixing the corruption** | 2-3 hours attempt; high rollback risk | Microsoft's in-place upgrade rewrites system files wholesale; in practice it often resolves corruptions like this one, but if the upgrade itself encounters the same provider-manifest issue, we're worse off. |
| **C. Clean Server 2019 build + AD / SQL / file share / RDS migration** | 6-10 hours | Lowest risk of mid-flight breakage. Most predictable. Requires a cutover window. This is what I'd recommend if the upgrade is planned anyway. |
**Also pending (independent of the upgrade):**
- Verify the `IMC` database (9.8 GB) is actively being used; drop if not
- Drop `TestConv61223` (8.8 GB leftover from 2023-06-12 migration test) once confirmed unused
- Disable SMB1: `Set-SmbServerConfiguration -EnableSMB1Protocol $false` (currently enabled — security hygiene)
- `IMC2`, `IMC-VM` in AD — last-logon 2023 and 2021 respectively, likely decommissioned; clean up if confirmed
- `SERVERIMC` (192.168.0.63) — its role/status is unclear from AD, worth verifying what it's doing
---
## Key paths and references (for the next tech)
- SSH: `ssh IMC\guru@192.168.0.2` (ed25519 key, PowerShell shell)
- Authorized keys file: `C:\ProgramData\ssh\administrators_authorized_keys`
- Retention script: `C:\Scripts\Clean-AimsiBackups.ps1`
- Retention logs: `C:\Scripts\Logs\aimsi-retention-YYYYMM.log`
- Scheduled task: `IMC AIMsi Backup Retention` (daily 23:30, SYSTEM)
- SQL instance: `IMC1\AIMSQL` (MSSQL15 = SQL Server 2019 Express, despite folder name)
- SQL data: `S:\SQL\Data\` (user DBs); `C:\Program Files\Microsoft SQL Server\MSSQL15.AIMSQL\MSSQL\DATA\` (system DBs)
- SQL backups: `E:\SQL\MSSQL14.SQLEXPRESS\MSSQL\Backup\` (yes, MSSQL14 folder hosting a MSSQL15 instance's backups — legacy)
- DISM scratch + extracted KB payload: `C:\DISMScratch\`
- Server 2016 install media (RTM): `E:\W2016\sources\install.wim`
- Full session log: `clients/instrumental-music-center/session-logs/2026-04-12-imc1-cleanup-and-sql-move.md`
## Credentials
- Domain admin + AIMSQL sysadmin: `IMC\guru` (password handled verbally, not stored in this file)
- Local admin + SSH: same account
- `sa` on AIMSQL: exists and enabled; password unknown (tried one candidate, wrong, no lockout triggered)