docs: Valleywide HP server NVRAM corruption emergency (ONGOING)

Emergency onsite work documentation:
- Arrival 0935 MST - HP ProLiant SN MXQ80400X4
- Non-volatile memory corruption from power outage
- BIOS/UEFI required a factory reset and was reconfigured
- iLO reset to factory (needs reconfiguration)
- All VMs confirmed running
- Work in progress - timer running

Updated:
- clients/valleywide/README.md: Added HP server, iLO reset warning, priority items
- clients/valleywide/session-logs/2026-04-22-hp-server-nvram-corruption-emergency.md: Created

Next: iLO reconfiguration, UPS assessment

Machine: Mikes-MacBook-Air.local
Timestamp: 2026-04-22 10:11:39

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-04-22 10:11:39 -07:00
parent e2028fe6f8
commit a186551ce3
2 changed files with 90 additions and 0 deletions

@@ -4,16 +4,25 @@
### Servers
**HP ProLiant Server (SN: MXQ80400X4)**
- VM Host / Hypervisor
- iLO IP: [TO BE DOCUMENTED]
- All VMs running on this host
- [WARNING] 2026-04-22: NVRAM corruption from power outage - BIOS and iLO reset to factory defaults; BIOS reconfigured
- [ACTION REQUIRED] iLO needs reconfiguration after factory reset
**VWP_ADSRVR (192.168.0.25)**
- Windows Server 2019 Standard (build 17763)
- Domain Controller for `vwp.local`
- SSH enabled (OpenSSH Server), key auth working for `vwp\guru`
- Likely runs as VM on HP ProLiant host
**VWP-QBS (172.16.9.169)**
- Windows Server 2022 Standard
- Internal network only (172.16.9.0/24 reachable via VWP site VPN)
- Runs QuickBooks + **IIS with RD Gateway / RD Web Access** (`/RDWeb`, `/RDWeb/Pages`, `/RDWeb/Feed`, `/Rpc`, `/RpcWithCert`)
- WinRM available on 5985 (used for remote admin via Invoke-Command)
- Likely runs as VM on HP ProLiant host
### Networks
- Internal: `172.16.9.0/24`
@@ -78,6 +87,9 @@ VWP-QBS's RDS License Server is activated and running, but **has no real CALs in
## Open items
- **[PRIORITY] HP ProLiant iLO reconfiguration** (2026-04-22 emergency: factory reset, needs credentials/network setup)
- **[PRIORITY] Verify all HP server BIOS settings** post-corruption recovery
- **[PRIORITY] UPS assessment** for HP server (power outage caused NVRAM corruption)
- Confirm UPnP state on UDM (2026-04-13 recommendation — still not verified)
- Document intended RDWeb access pattern (who connects from where) — superseded partially by 2026-04-16 VPN-only decision, but formalize
- Add Valleywide entry to SOPS vault (SOPS vault now has `clients/vwp/*` entries: adsrvr, dc1, udm, xenserver, quickbooks-server-idrac — superseded)

@@ -0,0 +1,78 @@
# 2026-04-22 — Valleywide HP Server NVRAM Corruption Emergency
## User
- **User:** Mike Swanson (mike)
- **Machine:** Mikes-MacBook-Air.local
- **Role:** admin
## Ticket Information
- **Type:** Emergency onsite
- **Priority:** Critical
- **Status:** In Progress
- **Arrival:** 0935 MST
## Issue Summary
HP ProLiant Server (SN: MXQ80400X4) experienced non-volatile memory (NVRAM) corruption following a power outage. BIOS/UEFI settings were lost, requiring a factory reset and full reconfiguration.
## Timeline
### 0935 - Arrival Onsite
- HP Server SN: MXQ80400X4
- Issue: Non-Volatile Memory Corruption
- Cause: Power outage
- Impact: BIOS/UEFI reset to factory defaults
### 0935-[IN PROGRESS] - Recovery Actions
**BIOS/UEFI Reconfiguration:**
- Factory reset required due to NVRAM corruption
- Reconfigured BIOS settings
- Restored boot order
- Re-enabled virtualization settings
**iLO (Integrated Lights-Out) Reset:**
- [WARNING] iLO was reset to factory defaults due to BIOS reset
- iLO credentials will need to be re-entered
- Network configuration may need restoration
- Remote management unavailable until iLO is reconfigured
**VM Status:**
- [OK] All VMs running
- Hypervisor operational after BIOS reconfiguration
- No VM data loss reported
## Next Steps
- [ ] Complete onsite work (timer running)
- [ ] Reconfigure iLO settings (credentials, network)
- [ ] Document iLO IP address and credentials
- [ ] Verify all server settings match pre-incident configuration
- [ ] Test remote management access
- [ ] Update server documentation with serial number
- [ ] Create follow-up preventive measures (UPS check, power protection)
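
One way to approach the "verify all server settings match pre-incident configuration" step is to compare a recorded baseline against the values read back after recovery. The sketch below is a minimal illustration in Python. The setting names and values (`boot_mode`, `virtualization`, `boot_order`) are hypothetical and not taken from this server, and how the current values are collected is left open.

```python
def diff_settings(baseline, current):
    """Return {name: (expected, actual)} for every setting that differs.

    A setting missing from either side is reported with None on that side.
    """
    mismatches = {}
    for name in sorted(set(baseline) | set(current)):
        expected = baseline.get(name)
        actual = current.get(name)
        if expected != actual:
            mismatches[name] = (expected, actual)
    return mismatches


# Hypothetical values for illustration only; not read from the HP server.
baseline = {
    "boot_mode": "UEFI",
    "virtualization": "Enabled",
    "boot_order": ["disk", "network"],
}
current = {
    "boot_mode": "UEFI",
    "virtualization": "Disabled",
    "boot_order": ["disk", "network"],
}

for name, (expected, actual) in diff_settings(baseline, current).items():
    print(f"{name}: expected {expected!r}, found {actual!r}")
```

An empty result would suggest the recorded settings match. It only covers the settings that were captured in the baseline, so anything never documented pre-incident still needs a manual review.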
## Server Information
**HP ProLiant Server:**
- Serial Number: MXQ80400X4
- Model: [TO BE DOCUMENTED]
- Role: VM Host
- Location: Valleywide onsite
**iLO Management:**
- Status: Reset to factory defaults
- IP: [TO BE RECONFIGURED]
- Credentials: [TO BE RESET]
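
Once an iLO address has been assigned, a basic TCP check can confirm that the management interface is reachable before testing a login. This is a minimal sketch, assuming the iLO web interface listens on HTTPS port 443 (the usual default, not confirmed for this unit). `tcp_port_open` is a hypothetical helper, and the address in the usage comment is a documentation placeholder, not the real iLO IP.

```python
import socket


def tcp_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


# Example usage (placeholder address from the documentation range):
#   tcp_port_open("192.0.2.10", 443)
```

A successful connection only shows that something is listening on that port. It does not verify credentials or confirm that the responder is the iLO.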
## Notes
- The power outage caused NVRAM corruption, a rare but critical failure
- Recovery was quick because all VMs remained intact
- iLO reconfiguration required for remote management
- Consider UPS assessment as preventive measure
---
**Work Status:** ONGOING - Timer running
**Next Update:** Upon completion of onsite work