From b7752d3d7fe0b7ac96909c9201c26e62cce218c6 Mon Sep 17 00:00:00 2001
From: Mike Swanson
Date: Wed, 22 Apr 2026 10:23:23 -0700
Subject: [PATCH] docs: Valleywide XenServer OFFLINE - critical investigation

Updated emergency session log with XenServer offline status:
- XenServer (older Dell) offline - investigating
- Server3 VM unavailable
- Added to critical next steps

Server status summary:
- HP ProLiant (MXQ80400X4): NVRAM fixed, VMs running, iLO pending
- Dell VWP-QBS: Boot retry resolved, operational
- XenServer: OFFLINE (CRITICAL)
- 4th server: appears fine

Power outage impact assessment ongoing. Timer running.

Machine: Mikes-MacBook-Air.local
Timestamp: 2026-04-22 10:23:23

Co-Authored-By: Claude Sonnet 4.5
---
 clients/valleywide/README.md                  |  6 +-
 ...22-hp-server-nvram-corruption-emergency.md | 66 +++++++++++++++++--
 2 files changed, 63 insertions(+), 9 deletions(-)

diff --git a/clients/valleywide/README.md b/clients/valleywide/README.md
index 272b934..8d5f690 100644
--- a/clients/valleywide/README.md
+++ b/clients/valleywide/README.md
@@ -17,12 +17,14 @@
 - SSH enabled (OpenSSH Server), key auth working for `vwp\guru`
 - Likely runs as VM on HP ProLiant host
 
-**VWP-QBS (172.16.9.169)**
+**VWP-QBS (172.16.9.169) - Dell Server with DRAC**
 - Windows Server 2022 Standard
+- **Physical Dell server** (NOT a VM)
 - Internal network only (172.16.9.0/24 reachable via VWP site VPN)
 - Runs QuickBooks + **IIS with RD Gateway / RD Web Access** (`/RDWeb`, `/RDWeb/Pages`, `/RDWeb/Feed`, `/Rpc`, `/RpcWithCert`)
 - WinRM available on 5985 (used for remote admin via Invoke-Command)
-- Likely runs as VM on HP ProLiant host
+- DRAC available for remote management
+- [NOTE] 2026-04-22: Boot retry issue after power outage, resolved via DRAC manual boot to Windows Boot Manager
 
 ### Networks
 - Internal: `172.16.9.0/24`
diff --git a/clients/valleywide/session-logs/2026-04-22-hp-server-nvram-corruption-emergency.md b/clients/valleywide/session-logs/2026-04-22-hp-server-nvram-corruption-emergency.md
index 79b8952..43b5217 100644
--- a/clients/valleywide/session-logs/2026-04-22-hp-server-nvram-corruption-emergency.md
+++ b/clients/valleywide/session-logs/2026-04-22-hp-server-nvram-corruption-emergency.md
@@ -13,7 +13,10 @@
 
 ## Issue Summary
 
-HP ProLiant Server (SN: MXQ80400X4) experienced non-volatile memory corruption following power outage. BIOS/UEFI settings lost, requiring factory reset and full reconfiguration.
+Multiple server issues following power outage at Valleywide:
+- HP ProLiant Server (SN: MXQ80400X4): Non-volatile memory corruption, BIOS/iLO reset required
+- Dell Server (VWP-QBS): Boot retry loop, resolved via DRAC manual boot
+- **XenServer (Older Dell): OFFLINE - investigating (CRITICAL)**
 
 ## Timeline
 
@@ -25,6 +28,8 @@ HP ProLiant Server (SN: MXQ80400X4) experienced non-volatile memory corruption f
 
 ### 0935-[IN PROGRESS] - Recovery Actions
 
+**HP ProLiant Server (SN: MXQ80400X4):**
+
 **BIOS/UEFI Reconfiguration:**
 - Factory reset required due to NVRAM corruption
 - Reconfigured BIOS settings
@@ -42,29 +47,76 @@ HP ProLiant Server (SN: MXQ80400X4) experienced non-volatile memory corruption f
 - Hypervisor operational after BIOS reconfiguration
 - No VM data loss reported
 
+**Dell Server (VWP-QBS) - Separate Boot Issue:**
+
+**Boot Retry Loop:**
+- VWP-QBS (Dell physical server, 172.16.9.169) stuck at "Boot Retry" screen
+- Accessed via DRAC (Dell Remote Access Controller)
+- Forced manual boot device selection -> Windows Boot Manager
+- [OK] Server booted successfully
+- [OK] Server appears to be functioning normally now
+- Likely related to power outage affecting boot order/configuration
+- NOTE: VWP-QBS is NOT a VM - it's a separate physical Dell server
+
+**XenServer (Older Dell) - OFFLINE:**
+
+**Status:**
+- [CRITICAL] XenServer offline
+- Impact: Server3 VM unavailable
+- Investigating cause (likely power outage related)
+- Checking hardware status, boot sequence, and hypervisor state
+- Dell server - older hardware
+
 ## Next Steps
 
+**CRITICAL:**
+- [ ] **Restore XenServer** (currently investigating offline status)
+- [ ] **Verify Server3 VM status** once XenServer restored
+
+**High Priority:**
 - [ ] Complete onsite work (timer running)
-- [ ] Reconfigure iLO settings (credentials, network)
+- [ ] Reconfigure HP iLO settings (credentials, network)
 - [ ] Document iLO IP address and credentials
 - [ ] Verify all server settings match pre-incident configuration
-- [ ] Test remote management access
-- [ ] Update server documentation with serial number
-- [ ] Create follow-up preventive measures (UPS check, power protection)
+
+**Follow-up:**
+- [ ] Test remote management access (iLO, DRAC)
+- [ ] Update server documentation with serial numbers and DRAC IPs
+- [ ] Create follow-up preventive measures (UPS assessment critical)
 
 ## Server Information
 
 **HP ProLiant Server:**
 - Serial Number: MXQ80400X4
 - Model: [TO BE DOCUMENTED]
-- Role: VM Host
+- Role: VM Host (runs VWP_ADSRVR and other VMs)
 - Location: Valleywide onsite
+- Status: Reconfigured, operational
 
-**iLO Management:**
+**HP iLO Management:**
 - Status: Reset to factory defaults
 - IP: [TO BE RECONFIGURED]
 - Credentials: [TO BE RESET]
 
+**Dell Server (VWP-QBS):**
+- Model: Dell (with DRAC)
+- Role: QuickBooks Server, RDS Host (Windows Server 2022)
+- IP: 172.16.9.169
+- Location: Valleywide onsite
+- Status: Boot issue resolved, operational
+- NOTE: Physical server, NOT a VM
+
+**Dell DRAC Management:**
+- Status: Functional (used to force manual boot)
+- IP: [TO BE DOCUMENTED]
+
+**XenServer (Older Dell):**
+- Model: Dell (older hardware)
+- Role: VM Host for Server3
+- Location: Valleywide onsite
+- Status: **OFFLINE - INVESTIGATING**
+- Impact: Server3 VM unavailable
+
 ## Notes
 
 - Power outage caused NVRAM corruption - rare but critical failure