docs: Valleywide HP server NVRAM corruption emergency (ONGOING)
Emergency onsite work documentation:
- Arrival 0935 MST
- HP ProLiant SN MXQ80400X4
- Non-volatile memory corruption from power outage
- BIOS/UEFI factory reset required and reconfigured
- iLO reset to factory (needs reconfiguration)
- All VMs confirmed running
- Work in progress - timer running

Updated:
- clients/valleywide/README.md: Added HP server, iLO reset warning, priority items
- clients/valleywide/session-logs/2026-04-22-hp-server-nvram-corruption-emergency.md: Created

Next: iLO reconfiguration, UPS assessment

Machine: Mikes-MacBook-Air.local
Timestamp: 2026-04-22 10:11:39

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
@@ -4,16 +4,25 @@
### Servers

**HP ProLiant Server (SN: MXQ80400X4)**
- VM Host / Hypervisor
- iLO IP: [TO BE DOCUMENTED]
- All VMs running on this host
- [WARNING] 2026-04-22: NVRAM corruption from power outage - BIOS/iLO reset to factory, reconfigured
- [ACTION REQUIRED] iLO needs reconfiguration after factory reset

**VWP_ADSRVR (192.168.0.25)**
- Windows Server 2019 Standard (build 17763)
- Domain Controller for `vwp.local`
- SSH enabled (OpenSSH Server), key auth working for `vwp\guru`
- Likely runs as VM on HP ProLiant host

**VWP-QBS (172.16.9.169)**
- Windows Server 2022 Standard
- Internal network only (172.16.9.0/24 reachable via VWP site VPN)
- Runs QuickBooks + **IIS with RD Gateway / RD Web Access** (`/RDWeb`, `/RDWeb/Pages`, `/RDWeb/Feed`, `/Rpc`, `/RpcWithCert`)
- WinRM available on 5985 (used for remote admin via Invoke-Command)
- Likely runs as VM on HP ProLiant host
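
The service ports recorded for these hosts can be spot-checked from a machine that has a route to both networks. The following is a minimal sketch and not part of the committed README: the helper name `port_open` is hypothetical, port 22 assumes OpenSSH is listening on its default port, and 5985 is the WinRM port noted for VWP-QBS. A successful TCP connect only shows that the port answers, not that authentication works.

```python
import socket

# Hosts come from the server notes; port 22 is an assumed OpenSSH default,
# 5985 is the WinRM port recorded for VWP-QBS.
CHECKS = [
    ("VWP_ADSRVR ssh", "192.168.0.25", 22),
    ("VWP-QBS winrm", "172.16.9.169", 5985),
]


def port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable networks.
        return False


if __name__ == "__main__":
    for label, host, port in CHECKS:
        state = "open" if port_open(host, port) else "unreachable"
        print(f"{label} ({host}:{port}): {state}")
```

Running this after the hypervisor comes back gives a quick signal that the guest VMs are serving again, which complements the "All VMs running" check done at the console.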

### Networks
- Internal: `172.16.9.0/24`
@@ -78,6 +87,9 @@ VWP-QBS's RDS License Server is activated and running, but **has no real CALs in

## Open items

- **[PRIORITY] HP ProLiant iLO reconfiguration** (2026-04-22 emergency: factory reset, needs credentials/network setup)
- **[PRIORITY] Verify all HP server BIOS settings** post-corruption recovery
- **[PRIORITY] UPS assessment** for HP server (power outage caused NVRAM corruption)
- Confirm UPnP state on UDM (2026-04-13 recommendation — still not verified)
- Document intended RDWeb access pattern (who connects from where) — superseded partially by 2026-04-16 VPN-only decision, but formalize
- Add Valleywide entry to SOPS vault (SOPS vault now has `clients/vwp/*` entries: adsrvr, dc1, udm, xenserver, quickbooks-server-idrac — superseded)
@@ -0,0 +1,78 @@
# 2026-04-22 — Valleywide HP Server NVRAM Corruption Emergency

## User
- **User:** Mike Swanson (mike)
- **Machine:** Mikes-MacBook-Air.local
- **Role:** admin

## Ticket Information
- **Type:** Emergency onsite
- **Priority:** Critical
- **Status:** In Progress
- **Arrival:** 0935 MST
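
Because the onsite timer is still running, elapsed time can be estimated from the arrival time and any later timestamp. This is a minimal sketch, assuming the commit timestamp (2026-04-22 10:11:39) is in the same local time zone as the 0935 MST arrival; the function name `elapsed_onsite` is hypothetical.

```python
from datetime import datetime, timedelta


def elapsed_onsite(arrival_hhmm: str, now: datetime) -> timedelta:
    """Elapsed time since a same-day arrival given as 'HHMM' (e.g. '0935')."""
    arrival = now.replace(
        hour=int(arrival_hhmm[:2]),
        minute=int(arrival_hhmm[2:]),
        second=0,
        microsecond=0,
    )
    return now - arrival


if __name__ == "__main__":
    # Assumption: the commit timestamp is local time on the same day as arrival.
    commit_time = datetime(2026, 4, 22, 10, 11, 39)
    print(elapsed_onsite("0935", commit_time))  # 0:36:39
```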

## Issue Summary

HP ProLiant Server (SN: MXQ80400X4) experienced non-volatile memory corruption following power outage. BIOS/UEFI settings lost, requiring factory reset and full reconfiguration.

## Timeline

### 0935 - Arrival Onsite
- HP Server SN: MXQ80400X4
- Issue: Non-Volatile Memory Corruption
- Cause: Power outage
- Impact: BIOS/UEFI reset to factory defaults

### 0935-[IN PROGRESS] - Recovery Actions

**BIOS/UEFI Reconfiguration:**
- Factory reset required due to NVRAM corruption
- Reconfigured BIOS settings
- Restored boot order
- Re-enabled virtualization settings

**iLO (Integrated Lights-Out) Reset:**
- [WARNING] iLO was reset to factory defaults due to BIOS reset
- iLO credentials will need to be re-entered
- Network configuration may need restoration
- Remote management temporarily unavailable until iLO reconfigured

**VM Status:**
- [OK] All VMs running
- Hypervisor operational after BIOS reconfiguration
- No VM data loss reported

## Next Steps

- [ ] Complete onsite work (timer running)
- [ ] Reconfigure iLO settings (credentials, network)
- [ ] Document iLO IP address and credentials
- [ ] Verify all server settings match pre-incident configuration
- [ ] Test remote management access
- [ ] Update server documentation with serial number
- [ ] Create follow-up preventive measures (UPS check, power protection)
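
The "verify all server settings match pre-incident configuration" step can be sketched as a comparison between a recorded baseline and the values read back after the reset. This is a minimal sketch only: the function name `diff_settings` and the setting names and values are hypothetical placeholders, not values read from this server, and this log does not record a pre-incident baseline.

```python
def diff_settings(baseline: dict, current: dict) -> dict:
    """Map each differing key to a (baseline_value, current_value) pair.

    Keys missing on one side are reported with None for that side.
    """
    return {
        key: (baseline.get(key), current.get(key))
        for key in sorted(baseline.keys() | current.keys())
        if baseline.get(key) != current.get(key)
    }


if __name__ == "__main__":
    # Hypothetical example values, for illustration only.
    baseline = {
        "boot_mode": "UEFI",
        "virtualization": "Enabled",
        "boot_order": "disk,usb",
    }
    current = {"boot_mode": "UEFI", "virtualization": "Disabled"}
    for key, (want, have) in diff_settings(baseline, current).items():
        print(f"{key}: expected {want!r}, found {have!r}")
```

An empty result means the two snapshots agree on every key; anything else is a list of settings to revisit before closing the ticket.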

## Server Information

**HP ProLiant Server:**
- Serial Number: MXQ80400X4
- Model: [TO BE DOCUMENTED]
- Role: VM Host
- Location: Valleywide onsite

**iLO Management:**
- Status: Reset to factory defaults
- IP: [TO BE RECONFIGURED]
- Credentials: [TO BE RESET]

## Notes

- Power outage caused NVRAM corruption - rare but critical failure
- Quick recovery due to all VMs remaining intact
- iLO reconfiguration required for remote management
- Consider UPS assessment as preventive measure

---

**Work Status:** ONGOING - Timer running
**Next Update:** Upon completion of onsite work