claudetools/clients/instrumental-music-center/PROJECT_STATE.md
Howard Enos f8c6b4b9ca sync: auto-sync from HOWARD-HOME at 2026-05-06 13:46:20
Author: Howard Enos
Machine: HOWARD-HOME
Timestamp: 2026-05-06 13:46:20
2026-05-06 13:46:23 -07:00

# Instrumental Music Center (IMC) — Project State
> READ THIS before starting work on this client.
> UPDATE THIS when you begin work (claim a lock) and when you finish (release lock + log changes).
> Last updated: 2026-05-06
---
## Active Session Locks
| Session | Working On | Status | Started |
|---------|-----------|--------|---------|
| _(none active)_ | | | |
**How to claim a lock:** Add a row before starting work. Remove it when done. Locks older than 2 hours with no update are considered stale.
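
The two-hour staleness rule can be checked mechanically. The sketch below is a minimal illustration, assuming the `Started` column holds a timestamp in `YYYY-MM-DD HH:MM` form (this file does not prescribe a format) and that the start time stands in for "last update". `is_stale` is a hypothetical helper, not an existing tool.

```python
from datetime import datetime, timedelta

STALE_AFTER = timedelta(hours=2)  # threshold stated in this file


def is_stale(started: str, now: datetime, fmt: str = "%Y-%m-%d %H:%M") -> bool:
    """Return True if a lock started more than two hours before `now`.

    Assumes `started` follows `fmt`; the lock table itself does not
    prescribe a timestamp format, so adjust `fmt` as needed.
    """
    return now - datetime.strptime(started, fmt) > STALE_AFTER
```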
---
## Current State
**Status:** ACTIVE
**Last Activity:** 2026-05-06
Music retail and repair shop running AIMsi POS on-prem. Primary server IMC1 (Dell R720, Windows Server 2016, DC for IMC.local). **Production AIM database lives on `IMC1\SQLEXPRESS` (TCP 61151), DB `IMCAIM`, service account `IMC\AIM`. The instance name is misleading: `SERVERPROPERTY('Edition')` confirms SQL Server 2019 Standard, not Express.** A second instance, `IMC1\AIMSQL` (true SQL 2019 Express, TCP 63116), is an orphan from a 2023 conversion test: it hosts only the legacy `AIM`, `IMC`, and `TestConv61223` DBs and has zero active clients (verified 2026-05-06). AIMsi DBs are on a dedicated SSD (`S:\SQL\Data\`). Local SQL backups run nightly; Cloudberry off-site backups are configured.
**Personnel:** Manda is the new General Manager (replacing Michael Santander, already deactivated). Manda's new laptop `DESKTOP-KRHQ5TS` provisioned 2026-04-28 (AIMsi `USER#=4` per Leslie).
**Known issues:**
- `IMC1` component store corruption (0x80073701) blocking RDS role removal — Server 2019 migration decision pending.
- **`ServerIMC` (192.168.0.63): phantom/broken DC.** It is registered as a DC in DNS (A + SRV records) and responds to ICMP, but **TCP/389 (LDAP) and TCP/88 (Kerberos) refuse connections.** The DC locator round-robins between IMC1 and ServerIMC, and clients that pick ServerIMC time out. **This degrades authentication for every domain user at IMC**, causing intermittent slow logons and GPO failures; it was also the root cause of the 2026-04-22 remote domain-join failure for `DESKTOP-KRHQ5TS`. Needs investigation to determine whether it is a real-but-broken DC (repair AD services) or a ghost from a demoted DC (`ntdsutil` metadata cleanup). Flagged as "unclear" on 2026-04-13; promoted to confirmed issue on 2026-04-28.
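
The phantom-DC symptom (host answers ping, but the LDAP and Kerberos ports do not accept connections) can be re-checked from any machine on the LAN with a plain TCP connect test. A minimal Python sketch follows; `probe_tcp`, `check_dc`, and `DC_PORTS` are hypothetical names, and the 2-second timeout is an arbitrary choice. It only tests reachability of the ports and says nothing about whether AD services behind them are healthy.

```python
import socket

# Ports named in the issue above; a working DC should accept TCP on both.
DC_PORTS = {"LDAP": 389, "Kerberos": 88}


def probe_tcp(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # covers refused, timed out, and unreachable
        return False


def check_dc(host: str) -> dict[str, bool]:
    """Probe each DC port on `host` and report which ones accept connections."""
    return {name: probe_tcp(host, port) for name, port in DC_PORTS.items()}


# Example (addresses from this file): compare IMC1 against ServerIMC.
#   check_dc("192.168.0.2")
#   check_dc("192.168.0.63")
```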
---
## Infrastructure / Access
| Resource | Address | Notes |
|----------|---------|-------|
| IMC1 (primary server) | 192.168.0.2 | Windows Server 2016, DC (IMC.local), Dell R720 |
| SQL instance — PRODUCTION | IMC1\SQLEXPRESS (TCP 61151) | **SQL 2019 Standard** (misleading instance name). DB `IMCAIM`. Service account `IMC\AIM`. Workstations connect here. |
| SQL instance — orphan | IMC1\AIMSQL (TCP 63116) | SQL 2019 Express GDR 15.0.2165.1. Holds legacy 2023 conversion DBs (`AIM`, `IMC`, `TestConv61223`). No active clients. Consolidation candidate. |
| SQL instance — system | IMC1\MICROSOFT##WID | Windows Internal DB (WSUS / AD RMS). Pages out under host pressure (17890 events). Consolidation candidate if WSUS unused. |
**SSH:** `ssh IMC\guru@192.168.0.2` (ed25519 key, PowerShell default shell)
**VPN:** OpenVPN (.ovpn profile) — disconnect Tailscale first (192.168.0.0/24 conflict). **Avoid remote domain-join over OpenVPN if your local LAN is also `192.168.0.0/24`** — DNS multi-homing race + the ServerIMC phantom-DC make it unreliable. Go onsite instead.
**Domain admin:** `IMC\guru` — also SQL sysadmin (added via single-user recovery 2026-04-12)
**Syncro customer ID:** 7088508 (prepay block: 12.5 hrs as of 2026-04-28)
**Primary contact:** Leslie Stirm — leslie@imc-az.com (Syncro contact_id 731730)
**Credentials:** vault `clients/imc/imc1.sops.yaml`
**Disks:** C: (OS, 77% full), E: (SQL backups + installers), F: (Windows Image Backups), S: (SSD — AIMsi DBs)
**Backup:** Nightly SQL at 22:00 to `E:\SQL\...\Backup\`; retention script `C:\Scripts\Clean-AimsiBackups.ps1` (14 dailies + 1st-of-month); Cloudberry at `C:\ProgramData\Online Backup\`
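
The retention rule ("14 dailies + 1st-of-month") can be expressed as a small selection function. This is a sketch of one reading of that rule, assuming "14 dailies" means the 14 most recent backup dates; it is not a port of `Clean-AimsiBackups.ps1`, and `backups_to_keep` is a hypothetical name.

```python
from datetime import date


def backups_to_keep(backup_dates: list[date], dailies: int = 14) -> set[date]:
    """Return the backup dates that survive pruning.

    Keeps the `dailies` most recent backup dates plus every backup taken
    on the first day of a month. Assumption: "14 dailies" means the 14
    newest backups, not the last 14 calendar days.
    """
    newest_first = sorted(set(backup_dates), reverse=True)
    recent = set(newest_first[:dailies])
    monthly = {d for d in newest_first if d.day == 1}
    return recent | monthly
```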
**Known issues:**
- Component store corrupted (0x80073701) — blocks RDS role removal; ETW manifest error on reboot
- C: drive at ~77% full — monitor
- SMB1 still enabled — disable when time permits
- `TestConv61223` DB (on orphan `IMC1\AIMSQL`, leftover from 2023 conversion test) safe to drop — verify first
- **AIM "connection broken" recurrence under host memory pressure.** SQLEXPRESS has no `max server memory` cap; under pressure Windows trims working sets and reaps idle Telerik pool slots, causing user-facing `Telerik.OpenAccess` connection-broken errors. First seen 2026-05-05 (Station 1), recurred 2026-05-06 ~12:14 PM (Station 1 again). Recommended fix: cap `max server memory` on SQLEXPRESS (~12-14 GB), WID (~512 MB), and AIMSQL (~256 MB). Long-term: dedicated DB host or Server 2019 migration. Note: misleading instance name burned ~1 day of triage — do NOT stop SQLEXPRESS thinking it's idle.
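
The recommended caps correspond to the `max server memory (MB)` server option, which is set per instance with `sp_configure`. The sketch below only renders the T-SQL text for review; it does not connect to anything. The values are the figures suggested above, with 12288 MB picked arbitrarily as the low end of the 12-14 GB range. `render_memory_cap` and `PROPOSED_CAPS_MB` are hypothetical names, and the instance labels are copied from the Infrastructure table.

```python
# Proposed caps in MB, taken from the recommendation above.
# 12288 MB is the low end of the suggested 12-14 GB range (an arbitrary pick).
PROPOSED_CAPS_MB = {
    r"IMC1\SQLEXPRESS": 12288,
    r"IMC1\MICROSOFT##WID": 512,
    r"IMC1\AIMSQL": 256,
}


def render_memory_cap(cap_mb: int) -> str:
    """Return T-SQL that sets `max server memory (MB)` to cap_mb."""
    return "\n".join([
        "EXEC sys.sp_configure N'show advanced options', 1;",
        "RECONFIGURE;",
        f"EXEC sys.sp_configure N'max server memory (MB)', {cap_mb};",
        "RECONFIGURE;",
    ])


# Print one reviewable script fragment per instance.
for instance, cap in PROPOSED_CAPS_MB.items():
    print(f"-- {instance}")
    print(render_memory_cap(cap))
```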
---
## Pending / Next Up
**High priority — AD hygiene**
- [ ] **Open ticket for `ServerIMC` (192.168.0.63) phantom-DC investigation.** SRV/A records claim it's a DC; LDAP/Kerberos refuse connections. Either repair (if a real-but-broken DC) or `ntdsutil` metadata cleanup (if a ghost demoted DC). Degrades authentication for all domain clients. Confirmed root cause of 2026-04-22 remote join failure.
**Documentation cleanup (Manda rollout)**
- [ ] Update `docs/overview.md` Speedway workstation table with `DESKTOP-KRHQ5TS` / Manda / AIM `USER#=4` entry
- [ ] Confirm Manda's full name in AD; add to overview/contacts
**Pre-existing**
- [ ] Decide Server 2019 migration path (in-place upgrade vs. clean build + migrate)
- [ ] Verify Cloudberry backup continuity
- [ ] Drop `TestConv61223` DB after confirming nothing references it
- [ ] Disable SMB1 on IMC1
- [ ] Clean up stale AD computer objects `IMC2` (last logon 2023), `IMC-VM` (last logon 2021)
- [ ] Follow up on disk management items from 2026-04-13 session
---
## Recent Changes
| Date | By | Change | Status |
|------|-----|--------|--------|
| 2026-05-06 | Howard | AIM connection-broken recurrence (Station 1, ~12:14 PM). Re-enumerated all 3 SQL instances on IMC1; corrected wrong-instance diagnosis from 2026-05-05 (production AIM is on `IMC1\SQLEXPRESS` Standard, not `IMC1\AIMSQL` Express). Confirmed AIMSQL is an orphan (zero active clients, 2023-era DBs only). Unregistered the daily 02:30 AIMSQL restart task (it was restarting the wrong instance). Audit artifacts left on disk at `C:\Windows\Temp\aimsql-restart.{ps1,log}`. No services were touched. | DIAGNOSED |
| 2026-05-05 | Howard | Initial AIM "connection broken" diagnosis on Station 1. GuruRMM client/site provisioned, IMC1 enrolled (agent `fa99e913-1027-4e33-a928-7695e31068e7`). Scheduled `MSSQL$AIMSQL` restart for 02:30 — fired clean but had no effect (wrong instance, see 2026-05-06 entry). | SUPERSEDED |
| 2026-04-28 | Howard | Provisioned `DESKTOP-KRHQ5TS` for Manda (new GM): joined to imc.local onsite, AD account created, Outlook M365 configured, Office activated, AIMsi `USER#=4` per Leslie. Ticket #32218 invoiced, 1.5 hrs from prepay (14.0 → 12.5). Confirmed `ServerIMC` (192.168.0.63) is a real authentication-degrading phantom DC (SRV/A claim DC, LDAP/Kerberos refuse). | DEPLOYED |
| 2026-04-22 | Howard | Attempted remote domain-join of `DESKTOP-KRHQ5TS` over OpenVPN — abandoned after subnet overlap (home Wi-Fi same /24 as IMC) + phantom-DC SRV pollution defeated NRPT/hosts/locator workarounds. Went onsite instead (see 2026-04-28 entry). | ABANDONED |
| 2026-04-12 | Mike | IMC1 cleanup and SQL move: AIMsi DBs moved to S: SSD, backup script + retention task deployed, sysadmin access restored | DEPLOYED |
---
## How to Update
**When starting:** Add your session to Active Session Locks.
**When finishing:** Remove your lock row, add entries to Recent Changes, update Current State if needed.