sync: auto-sync from HOWARD-HOME at 2026-06-17 15:47:50
Author: Howard Enos Machine: HOWARD-HOME Timestamp: 2026-06-17 15:47:50
# Cascades — CS-SERVER failing-drive review + noon network-spike question

- **Date:** 2026-06-17
- **Machine:** Howard-Home
- **Client:** Cascades of Tucson

## User

- **User:** Howard Enos (howard)
- **Machine:** Howard-Home
- **Role:** tech

## Session Summary

Review/Q&A session on the CS-SERVER degraded RAID-1 (failing C: drive). Pulled up the
2026-06-15 CS-SERVER RAID/VPN session log and the client wiki hardware warning, and recapped
the state: C: (Virtual Disk2, RAID-1, ~297 GB usable) is DEGRADED, running on a single surviving
320 GB 5400 RPM laptop spindle after the mirror's other member failed. Also summarized the
healthy D: array and the current risk posture: a single DC, with a cloud backup that is newly
installed but not yet verified. No changes were made to the server; this was an
advisory/planning conversation.

Answered three hardware questions for Howard ahead of a possible drive swap. (1) Hot-swap: yes,
the R610 hot-plug backplane + SAS 6/iR support hot-swapping the already-removed failed member
with the server running; the gating items are a verified backup first and correct-drive
identification, not power state. (2) SSD limits: the SAS 6/iR (LSI 1068E, SATA II / 3 Gbps, no
TRIM, no Dell certified-drive lockout) sets the constraints: minimum size >= 320 GB raw (so
480/500 GB class; 240/256 GB is too small to rebuild), 2.5" SATA negotiating to 3 Gbps,
enterprise SSD with PLP (no TRIM => avoid consumer QLC/DRAM-less), <= 2 TB, 512e; buy two
identical drives for the rebuild-then-swap. (3) The two installed OS drives: surviving 0:0:2
Hitachi HTS545032B9A300 and failed 0:0:3 WDC WD3200BEVT, both 320 GB SATA.
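
The SSD limits in (2) amount to a short checklist, which could be sketched in Python roughly as follows. This is only an illustration: the `Candidate` fields and the two example drives are hypothetical, not real part numbers, and treating 512n as acceptable alongside 512e is an assumption, since the session only names 512e.

```python
from dataclasses import dataclass

# Limits taken from the summary above; field names are illustrative.
MIN_RAW_GB = 320   # must be at least as large as the surviving 320 GB member
MAX_RAW_GB = 2000  # "<= 2 TB"


@dataclass
class Candidate:
    model: str          # hypothetical label, not a real part number
    raw_gb: int         # advertised (decimal) capacity
    interface: str      # "SATA" or "SAS"
    form_factor: str    # e.g. '2.5"'
    has_plp: bool       # power-loss protection
    sector_format: str  # "512n", "512e" or "4Kn"


def problems(c: Candidate) -> list[str]:
    """Return the reasons a candidate fails the checklist (empty list = passes)."""
    found = []
    if c.raw_gb < MIN_RAW_GB:
        found.append(f"too small: {c.raw_gb} GB < {MIN_RAW_GB} GB")
    if c.raw_gb > MAX_RAW_GB:
        found.append(f"too large: {c.raw_gb} GB > {MAX_RAW_GB} GB")
    if c.interface != "SATA":
        found.append(f"interface is {c.interface}, expected SATA")
    if c.form_factor != '2.5"':
        found.append(f'form factor is {c.form_factor}, expected 2.5"')
    if not c.has_plp:
        found.append("no power-loss protection")
    # Assumption: 512n is accepted alongside 512e; only 4Kn is rejected.
    if c.sector_format not in ("512e", "512n"):
        found.append(f"sector format {c.sector_format} is not 512-byte addressable")
    return found


# Hypothetical candidates for illustration only.
good = Candidate("example-enterprise-480", 480, "SATA", '2.5"', True, "512e")
small = Candidate("example-consumer-240", 240, "SATA", '2.5"', False, "512e")

for c in (good, small):
    issues = problems(c)
    print(c.model, "->", "OK" if not issues else "; ".join(issues))
```

Returning the list of failures, not just a pass/fail flag, keeps the reason visible when a drive is rejected.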

Howard then asked what caused a network spike at Cascades today around noon. The session logs do
not record a spike (that is live monitoring data), but the only network change today was the
VLAN 30 voice build (port-16 bounce + desktop re-DHCP onto 10.0.30.201, pfSense filter reload),
which is the leading correlation candidate; the larger phone cutover was deferred to tonight.
Per Howard's follow-up command, the next step is to pull the UniFi skill and investigate the
Cascades site (va6iba3v on UOS 172.16.3.29) live for the noon window. That investigation
begins after this save.

## Key Decisions

- Treated the failing-drive discussion as advisory only; no server-side commands were run this
  session (the standing rule is no drive work until the cloud backup completes and verifies).
- Recommended spec: two 480 GB enterprise 2.5" SATA SSDs with power-loss protection (e.g.
  Solidigm D3-S4520 480 GB or Samsung PM893 480 GB) for the rebuild-then-swap.
- Offered to re-pull OMSA live before any physical swap to confirm the surviving Hitachi has not
  also degraded since 2026-06-15.

## Problems Encountered

- None. Advisory session.

## Configuration Changes

- None on CS-SERVER or any infrastructure. Session log written only.

## Credentials & Secrets

- None created or discovered this session.

## Infrastructure & Servers

- **CS-SERVER**: Cascades DC/file/Hyper-V host, Dell PowerEdge R610 (~2009), Win Server 2019
  Std, 48 GB RAM. GuruRMM agent `c39f1de7-d5b6-45ae-b132-e06977ab1713`. LAN 192.168.2.254.
  - C: = Virtual Disk2, RAID-1, ~297 GB, DEGRADED. Surviving member `0:0:2` Hitachi
    HTS545032B9A300 (320 GB SATA, 5400 RPM). Failed member `0:0:3` WDC WD3200BEVT (320 GB SATA,
    Critical/Removed).
  - D: = Virtual Disk0, RAID-1, two 1.2 TB SAS, OK. Spare `1:0:4` 1.2 TB SAS "Ready" (wrong size
    to rebuild the 320 GB mirror).
  - Controller: SAS 6/iR Integrated (LSI 1068E), SATA II 3 Gbps, no TRIM, no certified-drive
    lockout.
  - Cloud backup (MSP360/CloudBerry -> ACG-backup) installed/started 2026-06-15, not yet verified.
- **Cascades UniFi**: site `va6iba3v` on UOS controller `172.16.3.29` (for the spike
  investigation).
- **Cascades pfSense**: `192.168.0.1`.
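
The capacity figures above mix decimal drive labels (320 GB) with the ~297 GB the virtual disk reports. A quick conversion shows the two are roughly consistent if the reported figure is in binary units; that reading, and putting the remaining gap of about 1 GB down to controller metadata, are assumptions rather than anything confirmed in the session.

```python
def decimal_gb_to_gib(gb: float) -> float:
    """Convert an advertised decimal capacity (10^9 bytes per GB) to GiB (2^30 bytes)."""
    return gb * 1000**3 / 1024**3


# Nominal sizes discussed in this log: too-small class, current members, recommended class.
for label_gb in (240, 320, 480):
    print(f"{label_gb} GB = {decimal_gb_to_gib(label_gb):.1f} GiB")
# prints:
# 240 GB = 223.5 GiB
# 320 GB = 298.0 GiB
# 480 GB = 447.0 GiB
```

On the same scale a 240 GB drive falls well short of the existing mirror, while a 480 GB drive clears it with room to spare.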

## Commands & Outputs

- No commands run against infrastructure this session.

## Pending / Incomplete Tasks

- **Investigate the noon network spike** via the UniFi skill (Cascades site va6iba3v on UOS
  172.16.3.29) + optionally pfSense WAN throughput; begins after this save.
- **CS-SERVER drive remediation** still gated on: first full backup completes + verifies +
  confirmed image/bare-metal + retention set. Then rebuild-then-swap to 2x 480 GB enterprise
  SATA SSDs. Re-pull OMSA live before any physical action.
- DC migration off the EOL R610 remains the strategic fix.
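
The gating conditions on the drive remediation can be written as an explicit pre-flight check, sketched below. The `BackupStatus` fields are hypothetical names for the four gates listed above, and the values would have to be filled in by hand from the backup console; nothing here queries MSP360.

```python
from dataclasses import dataclass


@dataclass
class BackupStatus:
    # Hypothetical flags, one per gate named in the pending task.
    first_full_completed: bool
    restore_verified: bool
    image_backup_confirmed: bool  # image / bare-metal backup present
    retention_configured: bool


def unmet_gates(status: BackupStatus) -> list[str]:
    """Return the names of gates that still block drive work (empty = clear)."""
    gates = {
        "first full backup completed": status.first_full_completed,
        "backup verified": status.restore_verified,
        "image/bare-metal backup confirmed": status.image_backup_confirmed,
        "retention set": status.retention_configured,
    }
    return [name for name, ok in gates.items() if not ok]


# Illustrative values only: the backup was installed and started, but nothing is verified yet.
today = BackupStatus(False, False, False, False)
blocking = unmet_gates(today)
if blocking:
    print("Drive work blocked by:", ", ".join(blocking))
else:
    print("All gates met; re-pull OMSA before any physical action.")
```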

## Reference Information

- Prior drive session: `clients/cascades-tucson/session-logs/2026-06/2026-06-15-howard-cs-server-raid-vpn-reset.md`
- Today's VLAN build: `clients/cascades-tucson/session-logs/2026-06/2026-06-17-howard-voice-vlan30-build.md`
- Client wiki: `wiki/clients/cascades-tucson.md`
- GuruRMM agent (CS-SERVER): `c39f1de7-d5b6-45ae-b132-e06977ab1713`. RMM API http://172.16.3.30:3001.