sync: auto-sync from HOWARD-HOME at 2026-06-17 15:47:50

Author: Howard Enos
Machine: HOWARD-HOME
Timestamp: 2026-06-17 15:47:50
2026-06-17 15:47:59 -07:00
parent 0166f1db64
commit cbe7175fbb

@@ -0,0 +1,94 @@
# Cascades — CS-SERVER failing-drive review + noon network-spike question
- **Date:** 2026-06-17
- **Machine:** Howard-Home
- **Client:** Cascades of Tucson
## User
- **User:** Howard Enos (howard)
- **Machine:** Howard-Home
- **Role:** tech
## Session Summary
Review/Q&A session on the CS-SERVER degraded RAID-1 (failing C: drive). Pulled up the
2026-06-15 CS-SERVER RAID/VPN session log and the client wiki hardware warning, and recapped
the state: C: (Virtual Disk2, RAID-1, ~297 GB usable) is DEGRADED on a single surviving 320 GB
5400 RPM laptop spindle after the mirror's other member failed. Also summarized: the healthy
D: array, the single-DC exposure, and the cloud backup that is installed but not yet verified.
No changes were made to the server; this was an advisory/planning conversation.
Answered three hardware questions for Howard ahead of a possible drive swap. (1) Hot-swap: yes,
the R610 hot-plug backplane + SAS 6/iR support hot-swapping the already-removed failed member
with the server running; gating items are a verified backup first and correct-drive
identification, not power state. (2) SSD limits: the SAS 6/iR (LSI 1068E, SATA II / 3 Gbps, no
TRIM, no Dell certified-drive lockout) sets the constraints — min size >= 320 GB raw (so 480/500 GB
class; 240/256 GB too small to rebuild), 2.5" SATA negotiating to 3 Gbps, enterprise SSD with
PLP (no TRIM => avoid consumer QLC/DRAM-less), <= 2 TB, 512e; buy two identical for the
rebuild-then-swap. (3) The two installed OS drives: surviving 0:0:2 Hitachi HTS545032B9A300 and
failed 0:0:3 WDC WD3200BEVT, both 320 GB SATA.
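The SSD constraints in answer (2) can be restated as a small checklist. The following is a minimal sketch, assuming a hypothetical `Candidate` record and `check_candidate` helper (neither exists in any tooling referenced here). It treats "<= 2 TB" as 2000 decimal GB and accepts both 512n and 512e sector formats, which are assumptions on top of the summary. The sample drives use illustrative values, not datasheet figures.

```python
from dataclasses import dataclass

MIN_RAW_GB = 320    # must not be smaller than the surviving 320 GB member
MAX_RAW_GB = 2000   # "<= 2 TB" from the summary, taken here as decimal GB (assumption)


@dataclass
class Candidate:
    """Hypothetical record for a drive under consideration."""
    model: str
    raw_gb: int
    interface: str       # "SATA", "SAS", ...
    form_factor: str     # '2.5"', '3.5"', ...
    has_plp: bool        # power-loss protection
    sector_format: str   # "512n", "512e" or "4Kn"


def check_candidate(c: Candidate) -> list[str]:
    """Return the reasons a drive fails the checklist; an empty list means it passes."""
    problems = []
    if c.raw_gb < MIN_RAW_GB:
        problems.append(f"{c.raw_gb} GB is below the {MIN_RAW_GB} GB rebuild minimum")
    if c.raw_gb > MAX_RAW_GB:
        problems.append(f"{c.raw_gb} GB exceeds the {MAX_RAW_GB} GB ceiling")
    if c.interface != "SATA":
        problems.append(f"interface is {c.interface}, recommendation is SATA")
    if c.form_factor != '2.5"':
        problems.append(f"form factor is {c.form_factor}, bays take 2.5\"")
    if not c.has_plp:
        problems.append("no power-loss protection")
    if c.sector_format == "4Kn":
        problems.append("4Kn sector format, expected 512e")
    return problems


# Illustrative values only, not taken from any datasheet.
ok = Candidate("example-480", 480, "SATA", '2.5"', True, "512e")
small = Candidate("example-240", 240, "SATA", '2.5"', False, "512e")
print(check_candidate(ok))     # []
print(check_candidate(small))  # two problems: size and missing PLP
```

Returning a list of reasons rather than a single boolean keeps the output usable as a note in a purchase ticket.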
Howard then asked what caused a network spike at Cascades today around noon. The session logs
record no spike (a spike would show up in live monitoring data, not in the logs), but the only
network change today was the VLAN 30 voice build (port-16 bounce + desktop re-DHCP onto
10.0.30.201, pfSense filter reload), which makes it the leading correlation candidate; the
larger phone cutover was deferred to tonight.
Per Howard's follow-up command, the next step is to pull the UniFi skill and investigate the
Cascades site (va6iba3v on UOS 172.16.3.29) live for the noon window — that investigation
begins after this save.
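Once throughput samples for the noon window are in hand, flagging the spike is a simple comparison against a baseline. The sketch below assumes the samples have already been exported as `(timestamp, Mbps)` pairs. It does not call the UniFi or pfSense APIs, and `find_spikes`, the 3x factor, and the sample values are all hypothetical and synthetic, not measurements from the site.

```python
from datetime import datetime
from statistics import median


def find_spikes(samples, start, end, factor=3.0):
    """Return in-window samples whose rate exceeds factor x the out-of-window median.

    samples: iterable of (datetime, rate_mbps) pairs.
    """
    samples = list(samples)
    baseline_rates = [rate for ts, rate in samples if not (start <= ts <= end)]
    if not baseline_rates:
        return []  # nothing to compare against
    baseline = median(baseline_rates)
    return [(ts, rate) for ts, rate in samples
            if start <= ts <= end and rate > factor * baseline]


# Synthetic example data, NOT real readings from Cascades.
day = (2026, 6, 17)
samples = [
    (datetime(*day, 10, 0), 5.0),
    (datetime(*day, 11, 0), 6.0),
    (datetime(*day, 12, 0), 40.0),
    (datetime(*day, 12, 15), 7.0),
    (datetime(*day, 13, 0), 5.0),
]
window = (datetime(*day, 11, 30), datetime(*day, 12, 30))
print(find_spikes(samples, *window))
```

With the synthetic data above, the out-of-window median is 5.0 Mbps, so only the 12:00 sample at 40.0 Mbps clears the 15.0 Mbps threshold. Any flagged timestamp can then be lined up against the port-16 bounce and pfSense filter reload times from the VLAN 30 build log.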
## Key Decisions
- Treated the failing-drive discussion as advisory only; no server-side commands run this
session (the standing rule is no drive work until the cloud backup completes and verifies).
- Recommended spec: two 480 GB enterprise 2.5" SATA SSDs with power-loss protection (e.g.
Solidigm D3-S4520 480 GB or Samsung PM893 480 GB) for the rebuild-then-swap.
- Offered to re-pull OMSA live before any physical swap to confirm the surviving Hitachi has not
also degraded since 2026-06-15.
## Problems Encountered
- None. Advisory session.
## Configuration Changes
- None on CS-SERVER or any infrastructure. Session log written only.
## Credentials & Secrets
- None created or discovered this session.
## Infrastructure & Servers
- **CS-SERVER** — Cascades DC/file/Hyper-V host, Dell PowerEdge R610 (~2009), Win Server 2019
Std, 48 GB RAM. GuruRMM agent `c39f1de7-d5b6-45ae-b132-e06977ab1713`. LAN 192.168.2.254.
- C: = Virtual Disk2, RAID-1, ~297 GB, DEGRADED. Surviving member `0:0:2` Hitachi
HTS545032B9A300 (320 GB SATA, 5400 RPM). Failed member `0:0:3` WDC WD3200BEVT (320 GB SATA,
Critical/Removed).
- D: = Virtual Disk0, RAID-1, two 1.2 TB SAS, OK. Spare `1:0:4` 1.2 TB SAS "Ready" (wrong size
to rebuild the 320 GB mirror).
- Controller: SAS 6/iR Integrated (LSI 1068E), SATA II 3 Gbps, no TRIM, no certified-drive
lockout.
- Cloud backup (MSP360/CloudBerry -> ACG-backup) installed/started 2026-06-15, not yet verified.
- **Cascades UniFi** — site `va6iba3v` on UOS controller `172.16.3.29` (for the spike
investigation).
- **Cascades pfSense** — `192.168.0.1`.
## Commands & Outputs
- No commands run against infrastructure this session.
## Pending / Incomplete Tasks
- **Investigate the noon network spike** via the UniFi skill (Cascades site va6iba3v on UOS
172.16.3.29) + optionally pfSense WAN throughput — begins after this save.
- **CS-SERVER drive remediation** still gated on: first full backup completed and verified,
image/bare-metal backup confirmed, and retention set. Then rebuild-then-swap to 2x 480 GB
enterprise SATA SSDs. Re-pull OMSA live before any physical action.
- DC migration off the EOL R610 remains the strategic fix.
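The gating conditions on the drive remediation can be tracked as an explicit checklist so nothing is skipped on the day of the swap. This is a minimal sketch; `DriveWorkGate` and `blockers` are hypothetical names, and every flag defaults to `False` until someone confirms it by hand.

```python
from dataclasses import dataclass, fields


@dataclass
class DriveWorkGate:
    """Preconditions listed in the pending task; all start unconfirmed."""
    first_full_backup_completed: bool = False
    backup_verified: bool = False
    image_or_bare_metal_confirmed: bool = False
    retention_set: bool = False
    omsa_repulled_live: bool = False


def blockers(gate: DriveWorkGate) -> list[str]:
    """Return the names of preconditions that are still unmet."""
    return [f.name for f in fields(gate) if not getattr(gate, f.name)]


gate = DriveWorkGate(first_full_backup_completed=True)
print(blockers(gate))
# ['backup_verified', 'image_or_bare_metal_confirmed', 'retention_set', 'omsa_repulled_live']
```

Physical drive work proceeds only when `blockers(gate)` returns an empty list.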
## Reference Information
- Prior drive session: `clients/cascades-tucson/session-logs/2026-06/2026-06-15-howard-cs-server-raid-vpn-reset.md`
- Today's VLAN build: `clients/cascades-tucson/session-logs/2026-06/2026-06-17-howard-voice-vlan30-build.md`
- Client wiki: `wiki/clients/cascades-tucson.md`
- GuruRMM agent (CS-SERVER): `c39f1de7-d5b6-45ae-b132-e06977ab1713`. RMM API http://172.16.3.30:3001.