# IMC: AIM Station 1 recurrence + wrong-instance correction + AIMSQL orphan confirmed

**Date:** 2026-05-06

## User

- **User:** Howard Enos (howard)
- **Machine:** Howard-Home
- **Role:** tech

## Summary

Station 1 at IMC hit the same `Telerik.OpenAccess.RT.sql.SQLException: Connection has been closed` AIM error today around 12:14 PM MST, roughly 10 hours after last night's scheduled `MSSQL$AIMSQL` restart fired cleanly at 02:30. A recurrence that fast forced a fuller enumeration of all three SQL instances on IMC1, which **reversed yesterday's diagnosis**. The "leftover" `MSSQL$SQLEXPRESS` is actually the live production AIM database (SQL Server 2019 **Standard** Edition installed under the default SQLEXPRESS instance name and never renamed). `MSSQL$AIMSQL` is the actual orphan, hosting only 2023-era conversion-test DBs with zero active client connections. Today's restart had no effect because it was on the wrong instance.

Documented the correction in yesterday's session log (correction block at top + reversal in the Note for Mike), updated `PROJECT_STATE.md`, unregistered the now-pointless scheduled task, and saved a feedback memory so this trap doesn't bite us again.

**No production service touches.** The user-facing Telerik error is still likely to recur; nothing done today prevents it. The next reversible lever is capping `max server memory` on each instance to stop the buffer-pool trim cycle that's reaping idle pool slots; awaiting Howard's go-ahead.

## What was done

### 1. AIM Station 1 recurrence: initial diagnostic

Read-only log pull on IMC1 via GuruRMM (local SSH from Howard-Home is blocked by the documented 192.168.0.0/24 collision with home Wi-Fi). Output preserved at `clients/instrumental-music-center/scripts/out/imc1_stdout.txt`.


| Signal | Yesterday (2026-05-05) | Today (10h after 02:30 restart) |
|---|---|---|
| Scheduled `AIMSQL_Restart_20260506_0230` | (created, pending) | LastRunTime 02:30:30, LastTaskResult 0 (fired clean) |
| AIMSQL PID / start time | 34536 / 2026-04-25 22:01:37 | 12772 / 2026-05-06 02:30:02 (post-restart) |
| `MSSQL$MICROSOFT##WID` Event 17890 paging events | 8 in 4h | **65 in 10h** (~3x rate, all on WID, none on AIMSQL/SQLEXPRESS) |
| AIMSQL Total Server Memory | 587 MB (Target 7,224 MB) | 371 MB (Target 5,778 MB); actually **lower**, pool actively trimmed |
| AIMSQL Page Life Expectancy | 842,990s (~9.7 d) | 16,811s (~4.7 hr); collapsed |
| AIMSQL `page_fault_count` | 5,689,041 over 11 days (~516k/day) | 605,504 over 10h 3m (**~1.45M/day, ~3x baseline**) |
| Active RDP user count | 4 | **6** (added `station3` and one more) |
| Free physical RAM | n/a | 6.99 GB / 32 GB (~21%) |

The restart itself fired cleanly. AIMSQL ERRORLOG has been silent except for the 02:30 startup chatter. Yet the recurrence happened on Station 1. Something didn't add up, which led to the next step.

### 2. SQLEXPRESS enumeration: the bombshell

Re-ran the same read-only enumeration pattern targeting `MSSQL$SQLEXPRESS` (yesterday's "leftover instance" question for Mike). Output: `imc1_sqlexpress_enum.txt`.

**`SERVERPROPERTY('Edition')` returned `Standard Edition (64-bit)`.** It's not Express: somebody installed Standard with the default SQLEXPRESS instance name and never renamed it. The instance NAME is misleading.


| Fact | Value |
|---|---|
| Instance | `IMC1\SQLEXPRESS`, TCP **61151** |
| Edition | SQL Server 2019 **Standard** Edition (64-bit) |
| Version | 15.0.2165.1 (KB5084817, same March 2026 GDR as AIMSQL) |
| Service account | `IMC\AIM` (domain account) |
| Process | PID 20756, started 2026-04-25 21:47:53, working set **6.86 GB** (no `max server memory` cap) |
| Production DB | **`IMCAIM`** (created 2023-08-21), the live AIM database |
| Other DBs | `AIM` (2021-03-18), `IMC` (2023-08-21), `IMCAIM_Training` (2024-03-01); all backed up daily but no live sessions |
| ERRORLOG | `E:\SQL\MSSQL14.SQLEXPRESS\MSSQL\Log\ERRORLOG` |
| Backup chain | Cloudberry → `C:\ProgramData\Online Backup\MSSQL\IMC1_SQLEXPRESS\` + local `E:\SQL\MSSQL14.SQLEXPRESS\MSSQL\Backup\` (daily, both succeed) |

**Active connections to SQLEXPRESS at time of check:**

| Workstation | IP | DB |
|---|---|---|
| IMC-MINI | 192.168.0.72 | IMCAIM |
| IMC-SVCSTR | 192.168.0.55 | IMCAIM |
| IMC-LESSONS | 192.168.0.62 | IMCAIM |
| IMC-STATION2 | 192.168.0.66 | IMCAIM |
| IMC-L1-STATION9 | 192.168.0.41 | IMCAIM |
| DESKTOP-44L80C0 | 192.168.0.46 | IMCAIM |
| DESKTOP-MR3ALTK | 192.168.0.59 | IMCAIM |
| REPAIRADMIN | 192.168.0.48 | IMCAIM |
| C2B | 192.168.0.4 | IMCAIM |
| (server-local AIM Webservice / Runtime) | IMC1 | IMCAIM x22 sessions |

All sessions log in as `AIMUser1` via `.Net SqlClient Data Provider` (the AIM front-end). Every register, repair workstation, lessons workstation, and the C2B credit-card module talks to this instance.

**Nothing in the active list matches `IMC-STATION1` exactly.** Station 1 is likely either one of the `DESKTOP-*` machines (not yet renamed to the IMC- naming convention), or it was disconnected after the error and hadn't reconnected at the time of the check. Open question for Leslie.

### 3. AIMSQL counter-enumeration: orphan confirmed

Same enumeration targeted at `MSSQL$AIMSQL` to verify nothing real depends on it. Output: `imc1_aimsql_enum.txt`.


| Fact | Value |
|---|---|
| Instance | `IMC1\AIMSQL`, TCP **63116** (dynamic) |
| Edition | SQL Server 2019 Express Edition GDR 15.0.2165.1 |
| Service account | `IMC\IMC1$` (machine account) |
| Process | PID 12772 (post-restart), working set 172 MB, very lightly loaded |
| **Established TCP connections** | **0** (only `Listen` state on IPv4 + IPv6) |
| **Active user sessions** | **0** real; only `NT SERVICE\SQLTELEMETRY$AIMSQL` heartbeat + our own `NT AUTHORITY\SYSTEM` query |
| Databases | `AIM` (2023-06-09), `TestConv61223` (2023-06-12), `IMC` (2023-07-03); all 2023-era conversion test artifacts |
| ERRORLOG | Silent except 02:30:02 startup chatter from today's restart |
| Backups | None; no `.bak` files in any AIMSQL Backup directory |
| `max server memory` | uncapped (default 2,147,483,647 MB), but Express enforces ~1.4 GB buffer-pool ceiling regardless |

**Verdict: confirmed orphan.** Zero live clients, zero session activity, no active backup chain landing on it, only legacy DBs from a 2023 conversion that didn't go to production (the live `IMCAIM` was created 2023-08-21 on SQLEXPRESS and the AIMSQL `AIM`/`IMC` DBs from June/July 2023 appear to be the precursors).

**Caveat for any future shutdown:** the user `.mdf` files weren't surfaced by the filesystem walk under `MSSQL15.AIMSQL\MSSQL\DATA` or `S:\*AIMSQL*`. Locate and back up `AIM.mdf`, `IMC.mdf`, `TestConv61223.mdf` (and their `.ldf` siblings) before any uninstall. Mike to decide whether the 2023-era data is worth keeping.

### 4. Scheduled task removal (only authorized change today)

Unregistered `AIMSQL_Restart_20260506_0230` on IMC1. Pre-removal LastRunTime 05/06 02:30:30, LastTaskResult 0. Confirmed gone.

Audit-trail artifacts left intentionally on disk:

- `C:\Windows\Temp\aimsql-restart.ps1` (984 bytes, modified 2026-05-05 18:53:27)
- `C:\Windows\Temp\aimsql-restart.log` (1,150 bytes, modified 2026-05-06 02:30:19)

GuruRMM command_id for audit: `1889a150-b775-4fb2-9f4b-cd794d4e7d9f`, exit 0.

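
The role reversal established in sections 2 and 3 rests on connection evidence rather than instance names. As a minimal sketch of that rule (not a tool that was run during this session), the snippet below labels each instance from the counts recorded in the enumeration tables. The `InstanceFacts` type, the `classify_role` function, and the role labels are hypothetical names introduced for illustration; telemetry heartbeats and our own diagnostic query are deliberately left out of the counts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InstanceFacts:
    """Connection evidence for one SQL instance (hypothetical structure)."""
    name: str
    edition: str
    remote_client_hosts: int  # distinct workstations with live sessions
    local_app_sessions: int   # server-local application sessions


def classify_role(facts: InstanceFacts) -> str:
    """Label an instance by who is connected to it, never by its name."""
    if facts.remote_client_hosts > 0 or facts.local_app_sessions > 0:
        return "production"
    return "orphan-candidate"


# Counts taken from the enumeration tables in sections 2 and 3.
instances = [
    InstanceFacts("IMC1\\SQLEXPRESS", "Standard", 9, 22),
    InstanceFacts("IMC1\\AIMSQL", "Express", 0, 0),
]

roles = {i.name: classify_role(i) for i in instances}

for name, role in roles.items():
    print(f"{name}: {role}")
```

The point of the design is that the instance name and edition are carried along for reporting but never consulted when deciding the role.
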

### 5. Documentation amendments

- **`session-logs/2026-05-05-howard-aim-connection-broken-investigation.md`**: added a `## Correction (2026-05-06)` block immediately after Summary with the full reversal narrative. Inside the existing `## Note for Mike`, added a header callout pointing to the correction block, then struck through item #1 (the "shut down SQLEXPRESS" recommendation) and added inline reversal text. Historical content preserved verbatim (additions only, no rewrites).
- **`PROJECT_STATE.md`**: bumped `Last updated` to 2026-05-06; rewrote the Current State paragraph to name `IMC1\SQLEXPRESS` as production with the misleading-name caveat; expanded the Infrastructure table from a single SQL row to three rows (production, orphan, system) with role labels; added a Known Issue entry for the AIM connection-broken recurrence pattern + the misleading-name trap; added today's `DIAGNOSED` row and a `SUPERSEDED` flag on yesterday's row in Recent Changes.

### 6. Feedback memory saved

Saved to `.claude/memory/feedback_sql_instance_role_by_connection.md` and indexed in `MEMORY.md`. Rule: verify SQL instance roles by `sys.dm_exec_sessions` + `Get-NetTCPConnection -OwningProcess`, not by instance name. The IMC1 near-miss is recorded as the originating incident.

## Why the error keeps happening

**Same failure mode both days, now attributed to the right instance.** SQLEXPRESS is under sustained memory pressure on a 32 GB box that's also a DC + RDS host with 6 RDP user sessions + AIMsi Webservice + AIMsi Runtime + 3 SQL instances + QuickBooks Enterprise installed locally. Windows trims SQL working sets when global pressure crosses a threshold (visible as the 17890 events on WID, the canary). The trim cycle reaps idle TCP pool slots from SQLEXPRESS too. Telerik OpenAccess discovers the dead handle on the next reuse, throws `connection broken and recovery is not possible`, and the user sees the raw stack trace because Telerik has no transient-fault retry.

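
To make the pressure trend concrete, the counters from the section 1 table can be normalized to common rates. The sketch below is plain arithmetic on figures already recorded in this log; the helper name `per_hour` and the variable names are illustrative only, and the 4-hour and 11-day windows are the rounded values from the table.

```python
def per_hour(count: float, hours: float) -> float:
    """Average events per hour over an observation window."""
    return count / hours


# Event 17890 paging events on MSSQL$MICROSOFT##WID.
wid_yesterday = per_hour(8, 4)    # 8 events in 4 h
wid_today = per_hour(65, 10)      # 65 events in 10 h
wid_ratio = wid_today / wid_yesterday

# AIMSQL page_fault_count, scaled to faults per day.
faults_baseline_per_day = 5_689_041 / 11             # ~11 days of uptime
faults_today_per_day = 605_504 / (10 + 3 / 60) * 24  # 10 h 3 m window
fault_ratio = faults_today_per_day / faults_baseline_per_day

print(f"17890 events: {wid_yesterday:.1f}/h -> {wid_today:.1f}/h ({wid_ratio:.2f}x)")
print(f"page faults: {faults_baseline_per_day:,.0f}/day -> "
      f"{faults_today_per_day:,.0f}/day ({fault_ratio:.2f}x)")
```

On these figures the 17890 rate is about 3.25 times yesterday's and the page-fault rate about 2.8 times baseline, consistent with the "~3x" entries in the table.
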

**Why last night's restart helped briefly:** restarting AIMSQL momentarily freed ~600 MB, which may have eased global pressure for a few hours. Pure side effect; the actual production instance was untouched. The 17890 rate roughly tripling overnight (8/4h → 65/10h, i.e. 2.0/h → 6.5/h) shows pressure rebuilt fast.

## Plan

### Tonight / next session (Howard's call)

1. **Cap `max server memory`** on each instance (reversible 1-second config change, no service touch):
   - SQLEXPRESS: 12,288 MB (12 GB), which leaves headroom on a 32 GB box for OS + DC + 6 RDP users + AIMsi services + QuickBooks
   - WID: 512 MB
   - AIMSQL: 256 MB (or just leave it; we want it stopped eventually anyway)

   The intent is that the cap stops the trim cycle even when global pressure rises, because SQL voluntarily stays under the ceiling instead of getting forcibly trimmed.
2. **Confirm Station 1's hostname / IP** with Leslie. None of the active SQLEXPRESS sessions matched `IMC-STATION1` exactly. Likely either `DESKTOP-44L80C0` / `DESKTOP-MR3ALTK` (un-renamed boxes) or a station that was disconnected at check time.
3. **Locate AIMSQL `.mdf` files** before any consolidation talk. They weren't where I expected. Worth a 5-minute filesystem search.
4. **Schedule SQLEXPRESS-targeted restart cadence** only if the memory cap doesn't hold, and only after hours, because every register would briefly disconnect.

### Long term (Mike conversation)

- Stop + uninstall `MSSQL$AIMSQL` once the 2023-era DBs are backed up and confirmed safe to retire.
- Decide WID instance: WSUS isn't actively serving clients per yesterday's check. Probably also stoppable.
- Server 2019 migration / dedicated DB host: same conversation as before, pushed by today's evidence.

## Note for Mike

> **READ THIS BEFORE ACTING ON YESTERDAY'S NOTE FOR MIKE.** Yesterday's note had a critical wrong-instance error in item #1; see the correction block at the top of `2026-05-05-howard-aim-connection-broken-investigation.md` for the strikethrough and reversal.


**Bottom line:** Production AIM lives on `IMC1\SQLEXPRESS`, not `IMC1\AIMSQL`. Yesterday's labeling was reversed because the SQLEXPRESS instance is actually SQL Server 2019 **Standard** Edition installed under the default Express instance name. The "leftover" we proposed shutting down to free headroom is the live POS database. Stopping it would have killed every register, repair workstation, lessons workstation, and the C2B credit module instantly.

**What's actually true now:**

- `IMC1\SQLEXPRESS` (TCP 61151): **PRODUCTION.** Standard Edition. DB `IMCAIM`. Service account `IMC\AIM`. ~9 store workstations connected. **Do not stop, do not uninstall, do not let anyone misled by the name shut it down.**
- `IMC1\AIMSQL` (TCP 63116): **the actual orphan.** True SQL 2019 Express. Zero clients, only 2023-era conversion-test DBs. This is the consolidation candidate. Today's scheduled restart task targeting AIMSQL was unregistered (it had no effect on the user-facing problem).
- `IMC1\MICROSOFT##WID`: WSUS / AD RMS. Pages out under host pressure (canary for the AIM error). Possibly also stoppable if WSUS isn't serving clients.

**Decisions on your plate:**

1. **Approve `max server memory` caps** as the next reversible fix (SQLEXPRESS 12 GB, WID 512 MB, AIMSQL 256 MB). Howard hasn't applied them yet; awaiting your green light.
2. **Approve AIMSQL consolidation** once the 2023-era `AIM` / `IMC` / `TestConv61223` DBs are backed up and confirmed safe to retire. Howard to locate the `.mdf` files first (they weren't where I expected).
3. **Approve WID consolidation** if WSUS/AD RMS isn't really being used at IMC.
4. **Server 2016 EOL is approaching** (extended support ends 2027-01-12). Today's incident is more evidence to push the migration timeline. Worth scoping at next ACG strategy call.
5. **Server 2019 migration / dedicated DB host conversation**: unchanged from yesterday, more urgent now.

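
The caps in decision 1 can be sanity-checked against the 32 GB host before anyone applies them. The following is a minimal sketch, not something that was run against IMC1: it totals the proposed caps and renders the T-SQL that would set each one through `sp_configure`. The `PROPOSED_CAPS_MB` mapping and the `cap_statements` helper are hypothetical names introduced here; `max server memory (MB)` is an advanced option, so the rendered script enables `show advanced options` first.

```python
HOST_RAM_MB = 32 * 1024  # 32 GB host, per this log

# Proposed caps from the plan, in MB.
PROPOSED_CAPS_MB = {
    "IMC1\\SQLEXPRESS": 12_288,
    "IMC1\\MICROSOFT##WID": 512,
    "IMC1\\AIMSQL": 256,
}


def cap_statements(cap_mb: int) -> str:
    """Render the T-SQL that sets max server memory on one instance."""
    return "\n".join([
        "EXEC sp_configure 'show advanced options', 1;",
        "RECONFIGURE;",
        f"EXEC sp_configure 'max server memory (MB)', {cap_mb};",
        "RECONFIGURE;",
    ])


total_capped_mb = sum(PROPOSED_CAPS_MB.values())
headroom_mb = HOST_RAM_MB - total_capped_mb

print(f"SQL caps total {total_capped_mb} MB; {headroom_mb} MB left for everything else")
for instance, cap in PROPOSED_CAPS_MB.items():
    print(f"-- {instance}")
    print(cap_statements(cap))
```

Each rendered script would have to be run while connected to the instance it names. The setting bounds the buffer pool rather than the whole process, so actual SQL memory use will sit somewhat above the capped total.
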

The `SvcRestartTask` that ran 11:00 today (yesterday's mystery) is a daily 11:00 task that returns 0. It's not the AIM trigger, which puts that thread to bed.

**Until we apply the memory caps, the AIM error is likely to keep recurring** at roughly the same cadence we've seen. Howard is on the lookout for the next occurrence and can do another targeted log pull if it's helpful for your decision.

## References

- Today's GuruRMM commands:
  - SQLEXPRESS enumeration: command run via agent `fa99e913-1027-4e33-a928-7695e31068e7`
  - AIMSQL enumeration + scheduled task removal: `1889a150-b775-4fb2-9f4b-cd794d4e7d9f`
- Raw outputs:
  - `clients/instrumental-music-center/scripts/out/imc1_stdout.txt` (initial diag)
  - `clients/instrumental-music-center/scripts/out/imc1_sqlexpress_enum.txt` (SQLEXPRESS enum)
  - `clients/instrumental-music-center/scripts/out/imc1_aimsql_enum.txt` (AIMSQL counter-enum)
- Yesterday's session log + correction block: `clients/instrumental-music-center/session-logs/2026-05-05-howard-aim-connection-broken-investigation.md`
- Updated project state: `clients/instrumental-music-center/PROJECT_STATE.md`
- Feedback memory: `.claude/memory/feedback_sql_instance_role_by_connection.md`
- IMC1 vault (existing): `clients/imc/imc1.sops.yaml`
- GuruRMM agent ID for IMC1: `fa99e913-1027-4e33-a928-7695e31068e7`