sync: auto-sync from GURU-5070 at 2026-06-14 06:29:50

Author: Mike Swanson
Machine: GURU-5070
Timestamp: 2026-06-14 06:29:50
2026-06-14 06:30:07 -07:00
parent 956a39b631
commit 65f2045385
2 changed files with 161 additions and 0 deletions

@@ -0,0 +1,145 @@
## User
- **User:** Mike Swanson (mike)
- **Machine:** GURU-5070
- **Role:** admin
## Session Summary
Continuation of the Peaceful Spirit AD+DFS rebuild (follows
`2026-06-13-mike-pst-server2-dc-rebuild-and-g-cleanup.md`). Completed two follow-ups and the
core of Gate 4 (data DFS-R rebuild). All work via GuruRMM against PST-SERVER (192.168.0.2) and
PST-SERVER2.
First, the SERVER2 static-IP follow-up: confirmed `192.168.1.5` was SERVER2's intended address
(the lingering A record was from its prior life, not stale junk). Verified `.5` was free; Mike then set the
static IP manually (192.168.1.5/24, GW 192.168.1.1, DNS 192.168.0.2+127.0.0.1) after my scripted
attempt failed (the agent dropped mid-change and the NIC reverted to DHCP). Cleaned up 4 stale `.127` DNS A records (`@`, DomainDnsZones,
ForestDnsZones, and the old `WIN-MUM1QS0V4LN` default-hostname record); dcdiag Connectivity +
Advertising then passed. Also fixed SERVER2 timezone (was Pacific -> set to US Mountain Standard,
i.e. Arizona).
Then Gate 4. Chose `C:\Shares` on SERVER2 as the DFS-R target (only C: exists, 634 GB free, no
unallocated space; the original D: was a separate physical disk now gone). Confirmed the live data
on PST-SERVER `G:\Shares` is fully intact (~265 GB: Private 154, Scanned 105, ITServices 5, qbooks
2). The recurring "G:\Shares 0 GB" reading is an ACL artifact (SYSTEM can't enumerate the share root),
not data loss (direct subpaths show the real sizes). Removed the stale `\\PST-SERVER2\Shares`
namespace folder target so NW clients keep using PST-SERVER's full data. Created empty `C:\Shares`
+ matching `Shares` SMB share (Everyone=Full, like PST-SERVER's). Tore down the broken PST-DFS
group (orphaned member + State 5) and rebuilt it clean with **PST-SERVER as PRIMARY** (G:\Shares,
authoritative) and SERVER2 non-primary (C:\Shares, empty) — the data-safety-critical step,
triple-checked. Replication started and SERVER2 pulled **~221 / 265 GB (~84%)** into C:\Shares
before the session paused.
Session ended blocked: **PST-SERVER2 began flapping** (online ~1 min, offline several min,
repeating) — a NW-site stability problem (reboot loop / power / network), NOT caused by the DFS
changes. PST-SERVER (the source) stayed rock-solid the whole time; no data risk. The final Gate 4
steps (re-add SERVER2 namespace folder target, add 2nd namespace root target, drain remaining
~44 GB backlog) are deferred until SERVER2 is stably online.
## Key Decisions
- **SERVER2 DFS-R target = `C:\Shares`** (folder on existing volume), not a new D: partition —
634 GB free, no remote-partitioning risk; original D: physical disk is gone.
- **PST-SERVER = PRIMARY member** for the rebuilt PST-DFS. Primary must be the member that HAS the
data; SERVER2 (empty) is non-primary and receives. Getting this backwards would have pushed
SERVER2's empty folder as authoritative and moved PST-SERVER's 265 GB to ConflictAndDeleted.
- **Removed (not just disabled) the stale SERVER2 namespace folder target.** `Set-DfsnFolderTarget
-State Offline` failed ("object could not be found") on this 2016 namespace; `Remove-DfsnFolderTarget`
worked. Will re-add it Online after initial sync completes.
- **Clean teardown+rebuild of PST-DFS** rather than repair — removed the orphaned blank member
(leftover from SERVER2's metadata cleanup). `Remove-DfsReplicationGroup` needs
`-RemoveReplicatedFolders`; config removal never touches the `G:\Shares` content.
- **Stopped pushing commands once SERVER2 began flapping** — surfaced it as a NW-health issue for
Mike rather than fighting an unstable box.
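
The primary-member decision above can be guarded with a small pre-flight check before replication is allowed to start. This is a minimal sketch, not what was run in the session: `Assert-DfsrPrimaryPlan` is a hypothetical helper, and it assumes the membership objects expose `ComputerName` and `PrimaryMember` properties, as the output of `Get-DfsrMembership` does.

```powershell
# Sketch only: Assert-DfsrPrimaryPlan is a hypothetical helper, not a DFSR cmdlet.
function Assert-DfsrPrimaryPlan {
    param(
        [Parameter(Mandatory)][object[]]$Membership,    # objects with ComputerName / PrimaryMember
        [Parameter(Mandatory)][string]$ExpectedPrimary  # the member that actually holds the data
    )
    # Collect every membership flagged as primary for the replicated folder.
    $primaries = @($Membership | Where-Object { $_.PrimaryMember })
    if ($primaries.Count -ne 1) {
        throw "Expected exactly one primary member, found $($primaries.Count)."
    }
    if ($primaries[0].ComputerName -ne $ExpectedPrimary) {
        throw "Primary is $($primaries[0].ComputerName), expected $ExpectedPrimary."
    }
    return $true
}

# Assumed usage against the live group (needs the DFSR module and Domain Admin rights):
# $m = Get-DfsrMembership -GroupName 'PST-DFS' -ComputerName 'PST-SERVER','PST-SERVER2'
# Assert-DfsrPrimaryPlan -Membership $m -ExpectedPrimary 'PST-SERVER'
```

Throwing instead of returning `$false` means a wrapped RMM script stops before any later step can act on a reversed configuration.
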
## Problems Encountered
- **PST-SERVER2 flapping (UNRESOLVED):** offline 04:37-04:46 UTC (~9 min), online ~1 min, offline
again 04:47+. Looks like a reboot loop / power / NW network instability. Needs investigation when
it next holds a stable window — pull System log (kernel-power 41, unexpected-shutdown 6008,
1074) to find the cause. Possibly on-site (power/UPS/hardware).
- **DFSR initial state confusion:** after the clean rebuild, PST-SERVER Shares RF briefly showed
State 5 and SERVER2 had no Shares RF — resolved by restarting the DFSR service on both + pollad;
PST-SERVER then logged event 4112 (designated primary) and reached State 4. SERVER2 already had
221 GB from the first (messy) membership config, kept as pre-existing data (hash-matched).
- **Apparent time skew was a red herring** — both DCs UTC-synced; SERVER2 was just on the wrong
timezone (Pacific). Fixed to Mountain.
- **DA creds over RMM:** AD/DFS/DFSN/DFSR config writes need Domain Admin (agent SYSTEM can't).
Working pattern: `Invoke-Command -ComputerName PST-SERVER.PEACEFULSPIRIT.local -Credential
$cred -ScriptBlock {...}` (FQDN; loopback works now with the CORRECT password). `-Server
localhost -Credential` fails with a Kerberos SPN error.
- **PowerShell-over-RMM quoting (recurring):** scripts wrapped in bash single-quotes can't contain
single quotes; `\"` over-escaping breaks `-join`/`-eq`; use a `$var` for literal strings instead.
Several commands no-op'd on parse errors until fixed.
- **IP-change-over-RMM:** a direct New-NetIPAddress command was killed mid-NIC-blip (agent dropped),
leaving config unapplied (reverted to DHCP). The robust pattern is a one-shot scheduled task that
applies the change a few seconds AFTER the command returns — but Mike set the IP manually instead.
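
The one-shot scheduled-task pattern from the last bullet, combined with a way around the quoting problem, might look like the sketch below. It was not used in this session (Mike set the IP by hand). The function names, the task name, the `Ethernet` interface alias, and the 30-second delay are all assumptions. The address values are the ones recorded in this log.

```powershell
# Sketch only: function names, task name, 'Ethernet' alias, and the delay are assumptions.

# Encode a script for powershell.exe -EncodedCommand (Base64 of UTF-16LE text),
# which avoids quote-escaping through bash and the RMM command wrapper.
function ConvertTo-EncodedCommand {
    param([Parameter(Mandatory)][string]$Script)
    [Convert]::ToBase64String([Text.Encoding]::Unicode.GetBytes($Script))
}

# Register a task that applies the static IP shortly AFTER this command has returned,
# so the NIC blip cannot kill the command that makes the change.
function Register-DeferredIpChange {
    param(
        [string]$InterfaceAlias = 'Ethernet',
        [int]$DelaySeconds = 30
    )
    $inner = @(
        "New-NetIPAddress -InterfaceAlias '$InterfaceAlias' -IPAddress 192.168.1.5 -PrefixLength 24 -DefaultGateway 192.168.1.1"
        "Set-DnsClientServerAddress -InterfaceAlias '$InterfaceAlias' -ServerAddresses 192.168.0.2,127.0.0.1"
    ) -join "`n"
    $encoded   = ConvertTo-EncodedCommand -Script $inner
    $action    = New-ScheduledTaskAction -Execute 'powershell.exe' -Argument "-NoProfile -EncodedCommand $encoded"
    $trigger   = New-ScheduledTaskTrigger -Once -At (Get-Date).AddSeconds($DelaySeconds)
    $principal = New-ScheduledTaskPrincipal -UserId 'SYSTEM' -LogonType ServiceAccount -RunLevel Highest
    Register-ScheduledTask -TaskName 'Apply-StaticIp-Once' -Action $action -Trigger $trigger -Principal $principal
}

# Assumed usage on the target server:
# Register-DeferredIpChange -InterfaceAlias 'Ethernet' -DelaySeconds 30
```

The task stays registered after it fires, so it would need an `Unregister-ScheduledTask` once the new address is confirmed.
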
## Configuration Changes
- PST-SERVER2: static IP 192.168.1.5/24, GW 192.168.1.1, DNS 192.168.0.2+127.0.0.1 (manual by
Mike); timezone -> US Mountain Standard Time.
- DNS (PEACEFULSPIRIT.local): removed stale A records `@ -> .127`, `DomainDnsZones -> .127`,
`ForestDnsZones -> .127`, `WIN-MUM1QS0V4LN -> .127`. Remaining A records all point to .5.
- PST-SERVER2: created `C:\Shares` + SMB share `Shares` (Everyone=Full, Unrestricted enum, Manual
caching).
- DFS namespace `\\PEACEFULSPIRIT.local\PST-Files\Shares`: removed the `\\PST-SERVER2\Shares`
folder target (only `\\PST-SERVER\Shares` remains, Online). **TO RE-ADD after sync.**
- DFS-R group **PST-DFS** rebuilt: RF `Shares`; member PST-SERVER `G:\Shares` PrimaryMember=TRUE,
staging 20 GB; member PST-SERVER2 `C:\Shares` non-primary, staging 20 GB; bidirectional
connection. Initial replication ~221/265 GB done.
- No repo files changed (this save creates the session log).
## Credentials & Secrets
- No new credentials. Used `PEACEFULSPIRIT\sysadmin` / `r3tr0gradE99!` (vault
`clients/peaceful-spirit/server`, field path `credentials.password` — read via full `vault.sh
get`, since `get-field credentials.password` returns the literal "null" — known bug). DA password
was passed base64-wrapped in RMM command_text again (recoverable from RMM DB; rotation optional,
internal).
- SERVER2 local-admin + DSRM passwords remain in vault `clients/peaceful-spirit/server2` (created
prior session, already pushed).
## Infrastructure & Servers
- **PST-SERVER** 192.168.0.2, site CC, all FSMO, GC, DNS, Server 2016 Essentials. RMM
`87293069-33b6-45e8-a68f-6811216cdb96`. Data on `G:\Shares` ~265 GB. G: now 182 GB free.
- **PST-SERVER2** **192.168.1.5**/24 (static), site NW, GC, DNS, Server 2019 Standard. RMM
`5d2d7ba0-3903-4aa3-9e97-6ca4424ffe65`. Data replica at `C:\Shares` (~221 GB so far). **Flapping
as of session end.**
- DFS namespace `\\PEACEFULSPIRIT.local\PST-Files` -> folder `Shares`. Root target: PST-SERVER only
(SERVER2 root target still TO ADD for HA). Folder targets: PST-SERVER only currently.
- S2S VPN CC<->NW confirmed up earlier (389/445/88 reachable) — but flaps when SERVER2 drops.
## Commands & Outputs
- DNS cleanup: `Get-DnsServerResourceRecord -ZoneName PEACEFULSPIRIT.local -RRType A | ? { $_.RecordData.IPv4Address.IPAddressToString -eq '192.168.1.127' } | Remove-DnsServerResourceRecord -ZoneName PEACEFULSPIRIT.local -Force`.
- Namespace target remove (DA Invoke-Command): `Remove-DfsnFolderTarget -Path "\\PEACEFULSPIRIT.local\PST-Files\Shares" -TargetPath "\\PST-SERVER2.PEACEFULSPIRIT.local\Shares" -Force`.
- DFS-R rebuild (DA Invoke-Command on PST-SERVER FQDN): `Remove-DfsReplicationGroup -GroupName PST-DFS -RemoveReplicatedFolders -Force`;
then `New-DfsReplicationGroup`, `New-DfsReplicatedFolder`, `Add-DfsrMember`, `Add-DfsrConnection`;
`Set-DfsrMembership -GroupName PST-DFS -FolderName Shares -ComputerName PST-SERVER -ContentPath G:\Shares -PrimaryMember $true -StagingPathQuotaInMB 20480 -Force`;
`Set-DfsrMembership -GroupName PST-DFS -FolderName Shares -ComputerName PST-SERVER2 -ContentPath C:\Shares -StagingPathQuotaInMB 20480 -Force`.
- Kickoff: `Restart-Service DFSR -Force; dfsrdiag pollad` on both. PST-SERVER event **4112** (designated
primary), Shares RF State 4. SERVER2 `C:\Shares` reached 221.78 GB.
- Data integrity proof: direct `Get-ChildItem G:\Shares\Private|Scanned|ITServices` => 153.77 / 104.83 / 4.73 GB.
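
The data-integrity check in the last bullet is shorthand for sizing each subfolder directly. A minimal sketch of that measurement follows. `Get-FolderSize` is a hypothetical helper, not the exact command that was run. It takes explicit subfolder paths because enumerating the share root as SYSTEM is what produces the misleading 0 GB reading.

```powershell
# Sketch only: Get-FolderSize is a hypothetical helper that sums file lengths under each path.
function Get-FolderSize {
    param([Parameter(Mandatory)][string[]]$Path)
    foreach ($p in $Path) {
        # -Force includes hidden/system files; unreadable items are skipped rather than aborting.
        $sum = (Get-ChildItem -LiteralPath $p -Recurse -File -Force -ErrorAction SilentlyContinue |
                Measure-Object -Property Length -Sum).Sum
        [pscustomobject]@{
            Path  = $p
            Bytes = [long]$sum                         # $null (no files) becomes 0
            GB    = [math]::Round([long]$sum / 1GB, 2)
        }
    }
}

# Assumed usage: query the subfolders directly rather than the share root.
# Get-FolderSize -Path 'G:\Shares\Private','G:\Shares\Scanned','G:\Shares\ITServices'
```

Because access errors are silenced, a low figure can mean unreadable files, not missing ones, so it is worth comparing the result against the figures recorded above.
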
## Pending / Incomplete Tasks
1. **PST-SERVER2 stability (BLOCKER):** diagnose why it's flapping (System log 41/6008/1074); likely
NW power/UPS/hardware/network. Possibly on-site.
2. **Finish Gate 4 once SERVER2 is stable:**
- Confirm DFS-R backlog drains to 0 (`Get-DfsrBacklog` via DA; ~44 GB remained).
- Re-add SERVER2 folder target: `New-DfsnFolderTarget -Path "\\PEACEFULSPIRIT.local\PST-Files\Shares" -TargetPath "\\PST-SERVER2.PEACEFULSPIRIT.local\Shares"` (Online).
- Add SERVER2 as 2nd namespace ROOT target: `New-DfsnRootTarget -Path "\\PEACEFULSPIRIT.local\PST-Files" -TargetPath "\\PST-SERVER2.PEACEFULSPIRIT.local\PST-Files"` (needs a PST-Files root share/DFSRoots on SERVER2) — for VPN-outage namespace HA.
- Verify both RFs State 4, dcdiag clean, referral ordering site-aware.
3. **Lower priority:** D: backup-junk cleanup on PST-SERVER (~700 GB); remove `C:\PST-Backup\*`
once rebuild confirmed stable; wiki update (`wiki/clients/peaceful-spirit.md` — 2 DCs, SERVER2
192.168.1.5, DFS-R rebuilt); optional sysadmin rotation.
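
The first two pending items could be started with something like the sketch below once SERVER2 holds a stable window. Both function names and the two-day look-back are assumptions. The event IDs, group name, and folder name are the ones recorded in this log.

```powershell
# Sketch only: function names and the look-back window are assumptions.

# Pull restart/shutdown evidence from the System log (run on PST-SERVER2).
function Get-RestartEvidence {
    param([int]$Days = 2)
    $filter = @{ LogName = 'System'; Id = 41, 6008, 1074; StartTime = (Get-Date).AddDays(-$Days) }
    Get-WinEvent -FilterHashtable $filter -ErrorAction SilentlyContinue |
        Sort-Object TimeCreated |
        Select-Object TimeCreated, Id, ProviderName, Message
}

# List pending files from source to replica (needs the DFSR module and Domain Admin rights).
function Get-SharesBacklog {
    @(Get-DfsrBacklog -GroupName 'PST-DFS' -FolderName 'Shares' `
        -SourceComputerName 'PST-SERVER' -DestinationComputerName 'PST-SERVER2')
}

# Assumed usage:
# Get-RestartEvidence -Days 2 | Format-List
# "Backlog entries returned: $((Get-SharesBacklog).Count)"
```

As far as I know, `Get-DfsrBacklog` returns a capped list of files by default, so a count sitting at the cap should be read as "still draining", not as the true total.
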
## Reference Information
- Prior log: `clients/peaceful-spirit/session-logs/2026-06/2026-06-13-mike-pst-server2-dc-rebuild-and-g-cleanup.md`.
- Runbook: `clients/peaceful-spirit/AD-DC2-REBUILD-RUNBOOK.md`.
- RMM API `http://172.16.3.30:3001`. Agents: PST-SERVER `87293069-...`, PST-SERVER2 `5d2d7ba0-...`.
- DFS-R group `PST-DFS`, RF `Shares`. Namespace `\\PEACEFULSPIRIT.local\PST-Files`.
- Vault: `clients/peaceful-spirit/server` (DA), `clients/peaceful-spirit/server2` (SERVER2 local/DSRM).

@@ -173,3 +173,19 @@ Orders project tree. Staged 12.2 MB of pure source (147 .frm, 4 .bas, 5 .vbp; ne
recovered-source status + locations, modernization assessment + first concrete steps, sources).
- Linked from `wiki/index.md` (Projects table) and `wiki/clients/valleywide.md` (app-mod section
pointer + "source RECOVERED" note superseding the old "lost" text).
## Update: 06:16 PT — VWP-FILES RMM enrollment, fleet check, MSP360 backup verified
- **VWP-FILES enrolled in GuruRMM** (Mike pushed the agent via ScreenConnect after the per-site MSI
endpoint 500'd). Agent id `8e02fbbc-0db1-4044-b4c2-b0732d64f029`, online, site "Main Office", v0.6.66.
- **VWP server fleet check (RMM):** all 11 enrolled Windows servers online/healthy (VWP-HYPERV1 2025,
VWP-QBS/VWP-SERVER 2022, VWP-DC1/VWP_ADSRVR/VWP-FIN/WIN-Server97/WINFileSvr/WIN-AD-SRVR-2/
WIN-Backup-SRV 2019, SERVER19 2012 R2). VWP-FILES had been the only unmonitored server until enrolled.
- **MSP360 backup verified** on VWP-FILES via `GET /api/agents/<id>/backup-status`:
provider msp360, plan "Files Backup Plan VWP-FILES", **status=success**, 103,849 files / 0 failed,
~63.7 GB of ~91 GB copied to Cloud (B2), build 8.6.0.338, last 2026-06-13 18:18 PT, next 2026-06-15 07:00 UTC.
(Only message is Info-level "987 system/hidden files skipped" — normal.)
- **Tagged Winter** (`<@624666486362996755>`) in Discord **#bot-alerts** re: the new VWP-FILES
backup, for her backup monitoring/billing.
- New RMM API note: **`GET /api/agents/<id>/backup-status`** returns the MSP360/CloudBerry backup
result (provider, plan_name, status, files_copied/failed, data_copied/total_bytes, last/next_backup_at).
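
A check against the `backup-status` endpoint could be scripted roughly as follows. This is a sketch under assumptions: `Test-BackupHealthy` is a hypothetical helper, and the base URL variable, agent id variable, and any auth header are placeholders for whatever the RMM deployment requires. Only the `status`, `provider`, and `plan_name` fields named above are used.

```powershell
# Sketch only: Test-BackupHealthy is a hypothetical helper built on the `status` field noted above.
function Test-BackupHealthy {
    param([Parameter(Mandatory)]$BackupStatus)
    # Treat anything other than an explicit "success" as unhealthy.
    return ($BackupStatus.status -eq 'success')
}

# Assumed usage; $BaseUrl, $AgentId, and $Headers are placeholders, not confirmed values.
# $r = Invoke-RestMethod -Method Get -Uri "$BaseUrl/api/agents/$AgentId/backup-status" -Headers $Headers
# if (-not (Test-BackupHealthy $r)) {
#     Write-Warning "Backup not successful: $($r.provider) / $($r.plan_name) -> $($r.status)"
# }
```
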