claudetools/clients/peaceful-spirit/AD-DC2-REBUILD-RUNBOOK.md
Mike Swanson 93eb2fb9bb sync: auto-sync from GURU-5070 at 2026-06-13 20:21:10
2026-06-13 20:21:37 -07:00

Peaceful Spirit — PST-SERVER2 evict + re-promote runbook

Created: 2026-06-13 by Mike Swanson (GURU-5070)

Why: PST-SERVER2 is a DC that has been out of replication for longer than the tombstone lifetime (TSL). AD replication is dead in both directions (error 8614, "exceeded tombstone lifetime"; error 0x8009030C, broken secure channel). SYSVOL and data DFS-R are in State 5 (In Error), stale by 200-224 days. A past-TSL DC must NOT be allowed to resume replication, because it can reintroduce lingering objects. So: evict SERVER2, clean up its metadata, and re-promote it fresh.

Authoritative/healthy DC: PST-SERVER (192.168.0.2). Holds ALL 5 FSMO roles. Server 2016 Essentials. Domain PEACEFULSPIRIT.local (Win2016 functional level).

DC to rebuild: PST-SERVER2 (192.168.1.127, NW site). Server 2019 Standard, additional DC only.

Execution channel: GuruRMM (SYSTEM context). PST-SERVER 87293069-33b6-45e8-a68f-6811216cdb96, PST-SERVER2 5d2d7ba0-3903-4aa3-9e97-6ca4424ffe65. Domain admin = sysadmin (vault: clients/peaceful-spirit/server.sops.yaml).

NOTE: promotion needs Domain Admin credentials passed in the RMM command, so that password lands in RMM command_text/history (internal). Consider rotating it afterwards if RMM DB exposure is a concern.
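
The diagnosis above can be re-checked read-only from PST-SERVER before any gate is started. The following is a minimal sketch, assuming the ActiveDirectory module and the DFSR WMI provider are present on PST-SERVER; it only reads state and changes nothing.

```powershell
# Run on PST-SERVER (healthy DC). Read-only checks.
Import-Module ActiveDirectory

# Tombstone lifetime (in days) is stored on the Directory Service object in the
# configuration partition. An empty value means the built-in default applies.
$configNC = (Get-ADRootDSE).configurationNamingContext
$dsObject = Get-ADObject -Identity "CN=Directory Service,CN=Windows NT,CN=Services,$configNC" `
                         -Properties tombstoneLifetime
$dsObject.tombstoneLifetime

# Per-partner replication status. Look for error 8614 and the last success time.
# The query against PST-SERVER2 may itself fail because of the broken secure channel.
repadmin /showrepl PST-SERVER
repadmin /showrepl PST-SERVER2

# DFSR replicated-folder state on the local server (4 = Normal, 5 = In Error).
Get-CimInstance -Namespace 'root\MicrosoftDFS' -ClassName DfsrReplicatedFolderInfo |
    Select-Object ReplicationGroupName, ReplicatedFolderName, State
```

If the last successful replication is older than the tombstone lifetime reported here, the evict-and-re-promote approach in this runbook applies.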


Gates (confirm with Mike before each)

Gate 0 — Pre-flight + safety backup (SAFE: read-only + backup)

  • Confirm PST-SERVER is a Global Catalog. (If SERVER2 were the only GC, must GC-flag SERVER first.)
  • Confirm all 5 FSMO on PST-SERVER (done: yes).
  • dcdiag focused (Advertising/FSMOCheck/Services) on PST-SERVER — must be clean.
  • Enable Strict Replication Consistency on PST-SERVER (protective registry setting). This is a change, but a safe and recommended one.
  • BACK UP authoritative SYSVOL: robocopy C:\Windows\SYSVOL\domain\Policies to C:\PST-Backup\SYSVOL-Policies, plus Backup-GPO -All. This is insurance before any AD change.
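
The Gate 0 checklist could be run roughly as follows. This is a sketch, not a tested script: it assumes the ActiveDirectory and GroupPolicy modules are available on PST-SERVER, and the `C:\PST-Backup\GPO` folder for `Backup-GPO` is a hypothetical location (the runbook only names `C:\PST-Backup\SYSVOL-Policies`).

```powershell
# Run on PST-SERVER. Steps 1-3 are read-only; step 4 changes one registry value;
# step 5 only writes to the backup folder.
Import-Module ActiveDirectory, GroupPolicy

# 1. Global Catalog flag and the FSMO roles this DC reports holding.
Get-ADDomainController -Identity PST-SERVER |
    Select-Object Name, IsGlobalCatalog, OperationMasterRoles

# 2. Cross-check FSMO holders as seen by the domain.
netdom query fsmo

# 3. Focused dcdiag: only the three tests named in the checklist.
dcdiag /s:PST-SERVER /test:Advertising /test:FsmoCheck /test:Services

# 4. Enable Strict Replication Consistency (sets the NTDS\Parameters registry value).
repadmin /regkey PST-SERVER +strict

# 5. Back up SYSVOL policies and all GPOs.
$backupRoot = 'C:\PST-Backup'
$gpoBackup  = Join-Path $backupRoot 'GPO'   # hypothetical subfolder, not from the runbook
New-Item -ItemType Directory -Path $gpoBackup -Force | Out-Null
robocopy C:\Windows\SYSVOL\domain\Policies "$backupRoot\SYSVOL-Policies" /E /COPYALL /R:1 /W:1
Backup-GPO -All -Path $gpoBackup
```

Keeping the file-level copy and the `Backup-GPO` output in separate folders makes it easier to restore either one on its own.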

Gate 1 — Force-demote PST-SERVER2 (DESTRUCTIVE to SERVER2; reboots SERVER2)

  • On SERVER2: Uninstall-ADDSDomainController -ForceRemoval -DemoteOperationMasterRole -Force -LocalAdministratorPassword <new, vaulted> (a graceful demotion is impossible because replication is dead).
  • After a forced removal SERVER2 comes back as a standalone (workgroup) server and reboots. Blast radius = SERVER2 only.
  • Risk: AD changes made ONLY on SERVER2 during isolation are lost (they are already stranded; PST-SERVER's copy is authoritative).
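
A sketch of the forced demotion as it might be sent through the RMM channel. The password literal is a placeholder for the new vaulted value, and the `Test-ADDSDomainControllerUninstallation` dry run is an optional addition that is not in the runbook; its result may be of limited value on a DC whose replication is already dead.

```powershell
# Run on PST-SERVER2. DESTRUCTIVE: removes AD DS from this server and reboots it.
Import-Module ADDSDeployment

# Placeholder: substitute the new local Administrator password from the vault.
$localAdminPw = ConvertTo-SecureString '<new-local-admin-password>' -AsPlainText -Force

# Optional prerequisite check; makes no changes.
Test-ADDSDomainControllerUninstallation -ForceRemoval -DemoteOperationMasterRole `
    -LocalAdministratorPassword $localAdminPw

# The actual forced demotion. -Force suppresses the confirmation prompts,
# which is needed because the RMM session is non-interactive.
Uninstall-ADDSDomainController -ForceRemoval -DemoteOperationMasterRole `
    -LocalAdministratorPassword $localAdminPw -Force
```

The server reboots on completion, so the RMM command will likely report a dropped session rather than a clean exit.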

Gate 2 — Metadata cleanup on PST-SERVER (DESTRUCTIVE to AD metadata for SERVER2)

  • Remove SERVER2 NTDS Settings / server object (ntdsutil metadata cleanup, or Remove-ADObject of the NTDS Settings object with -Credential domain admin).
  • Remove SERVER2 from AD Sites & Services (NW site server object).
  • DNS cleanup: SERVER2 host A, _msdcs CNAME/GUID, NS records, SRV records.
  • DFSR cleanup: remove SERVER2 member from "Domain System Volume" (SYSVOL) and "PST-DFS" groups.
  • Verify: repadmin /viewlist * shows only PST-SERVER; dcdiag clean.
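
Before and after the cleanup, it can help to list what still references SERVER2. The sketch below is read-only and assumes the ActiveDirectory, DnsServer, and DFSR modules are installed on PST-SERVER, and that `_msdcs.PEACEFULSPIRIT.local` exists as its own zone (if it is a subdomain of the main zone, drop it from the list).

```powershell
# Run on PST-SERVER. Read-only: lists leftovers, deletes nothing.
Import-Module ActiveDirectory, DnsServer, DFSR

$configNC = (Get-ADRootDSE).configurationNamingContext

# Every NTDS Settings object under Sites. After cleanup only PST-SERVER's should remain.
Get-ADObject -SearchBase "CN=Sites,$configNC" -LDAPFilter '(objectClass=nTDSDSA)' |
    Select-Object DistinguishedName

# Any server object still named PST-SERVER2 in Sites & Services.
Get-ADObject -SearchBase "CN=Sites,$configNC" -LDAPFilter '(&(objectClass=server)(cn=PST-SERVER2))' |
    Select-Object DistinguishedName

# DNS records that mention SERVER2 by name or by IP (A, CNAME, NS, SRV).
$zones = 'PEACEFULSPIRIT.local', '_msdcs.PEACEFULSPIRIT.local'   # second zone is an assumption
foreach ($zone in $zones) {
    Get-DnsServerResourceRecord -ZoneName $zone -ComputerName PST-SERVER |
        Where-Object {
            $_.HostName -like 'PST-SERVER2*' -or
            "$($_.RecordData.IPv4Address)"   -eq   '192.168.1.127' -or
            "$($_.RecordData.HostNameAlias)" -like 'PST-SERVER2*' -or
            "$($_.RecordData.NameServer)"    -like 'PST-SERVER2*' -or
            "$($_.RecordData.DomainName)"    -like 'PST-SERVER2*'
        } |
        Select-Object @{ Name = 'Zone'; Expression = { $zone } }, HostName, RecordType
}

# DFSR membership of the two replication groups named in the runbook.
Get-DfsrMember -GroupName 'Domain System Volume', 'PST-DFS' |
    Select-Object GroupName, ComputerName

# Final verification from the checklist.
repadmin /viewlist *
dcdiag /s:PST-SERVER
```

Running this once before Gate 2 gives a list of what to remove; running it again afterwards should return nothing for SERVER2.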

Gate 3 — Re-promote PST-SERVER2 (re-introduces a DC)

  • Ensure SERVER2's primary DNS server is PST-SERVER (192.168.0.2). (Current list: 192.168.0.2, 192.168.1.5, 8.8.8.8, 1.1.1.1.)
  • Install-ADDSDomainController -DomainName PEACEFULSPIRIT.local -Credential <DA> -InstallDns -SiteName NW -SafeModeAdministratorPassword <new, vaulted> — fresh promotion.
  • SYSVOL initializes clean via DFSR initial sync from PST-SERVER (no D2/D4-style authoritative or non-authoritative restore needed).
  • Verify: repadmin /replsummary shows 0 failures; SYSVOL and NETLOGON are shared on SERVER2; dcdiag is clean; GPO count matches PST-SERVER (11).
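
A sketch of the re-promotion and the follow-up checks. Several values are assumptions: the interface alias `Ethernet`, the UPN form of the sysadmin account, and the choice to list only 192.168.0.2 as DNS server during promotion. Password literals are placeholders for vaulted values. The promotion reboots the server, so the verification part would be sent as a separate command afterwards.

```powershell
# --- Part 1: run on PST-SERVER2 before promotion ---------------------------
Import-Module ADDSDeployment

# Point the DNS client at PST-SERVER. 'Ethernet' is a hypothetical alias;
# check the real one with Get-NetAdapter first.
Set-DnsClientServerAddress -InterfaceAlias 'Ethernet' -ServerAddresses '192.168.0.2'

# Placeholders: substitute the vaulted Domain Admin and new DSRM passwords.
$daPw   = ConvertTo-SecureString '<domain-admin-password>' -AsPlainText -Force
$daCred = New-Object System.Management.Automation.PSCredential('sysadmin@PEACEFULSPIRIT.local', $daPw)
$dsrmPw = ConvertTo-SecureString '<new-dsrm-password>' -AsPlainText -Force

# Fresh promotion into the NW site with DNS. Reboots on completion.
Install-ADDSDomainController -DomainName 'PEACEFULSPIRIT.local' -Credential $daCred `
    -InstallDns -SiteName 'NW' -SafeModeAdministratorPassword $dsrmPw -Force

# --- Part 2: run after the reboot -------------------------------------------
# Replication summary: the fails column should show 0 for both DCs.
repadmin /replsummary

# SYSVOL and NETLOGON shares exist once the initial DFSR sync has finished.
Get-SmbShare -Name SYSVOL, NETLOGON

# GPO count as seen by each DC; both should report the same number.
(Get-GPO -All -Server 'PST-SERVER').Count
(Get-GPO -All -Server 'PST-SERVER2').Count

dcdiag /s:PST-SERVER2
```

If the shares are missing right after the reboot, the initial SYSVOL sync may simply not have completed yet; rechecking after a few minutes is reasonable before treating it as a failure.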

Gate 4 — Rebuild data DFS-R (deferred — separate decision)

  • Provision SERVER2 data volume (shrink C: / add disk / folder-on-C: — TBD after G: cleanup + sizing).
  • Recreate Shares folder target on SERVER2 + re-establish PST-DFS replication.
  • Add PST-SERVER2 as 2nd namespace ROOT target (namespace HA for VPN-outage resilience).
  • Confirm backlog drains to 0.
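
The final backlog check could look like the sketch below. The replicated folder name `Shares` is an assumption taken from the bullet above; the real folder name in the PST-DFS group may differ.

```powershell
# Run on either DFSR member once PST-DFS replication is re-established.
Import-Module DFSR

$group  = 'PST-DFS'
$folder = 'Shares'   # assumed replicated folder name; confirm with Get-DfsReplicatedFolder

# Get-DfsrBacklog returns the pending files (a capped list), so an empty
# result in both directions means the backlog has drained.
$toServer2 = @(Get-DfsrBacklog -GroupName $group -FolderName $folder `
                 -SourceComputerName 'PST-SERVER' -DestinationComputerName 'PST-SERVER2')
$toServer  = @(Get-DfsrBacklog -GroupName $group -FolderName $folder `
                 -SourceComputerName 'PST-SERVER2' -DestinationComputerName 'PST-SERVER')

[pscustomobject]@{
    'PST-SERVER -> PST-SERVER2' = $toServer2.Count
    'PST-SERVER2 -> PST-SERVER' = $toServer.Count
}
```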

Gate 5 — G: cleanup on PST-SERVER (separate)

  • ~160 GB of candidates (sizes in GB): G:\Windows (32), G:\Program Files (x86) (13), G:\ProgramData (10), G:\Users (51), G:\$Recycle.Bin (5.6), VSS in System Volume Information (~46). Confirm each is junk before deleting.
  • D: recovery junk (~700 GB): Recovery-EXT, Recovery2019, "Unknown folder" — confirm before delete.
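
The size estimates above can be re-measured before anything is deleted. This is a read-only sketch that walks each candidate folder and sums file sizes; folders the SYSTEM account cannot read are silently skipped, so the totals may come out slightly low.

```powershell
# Run on PST-SERVER. Read-only sizing of the G: cleanup candidates.
$candidates = 'G:\Windows', 'G:\Program Files (x86)', 'G:\ProgramData',
              'G:\Users', 'G:\$Recycle.Bin'

foreach ($path in $candidates) {
    # -Force includes hidden and system files; errors on unreadable items are ignored.
    $bytes = (Get-ChildItem -LiteralPath $path -Recurse -Force -File -ErrorAction SilentlyContinue |
                  Measure-Object -Property Length -Sum).Sum
    [pscustomobject]@{
        Path   = $path
        SizeGB = [math]::Round($bytes / 1GB, 1)
    }
}

# Shadow copy (VSS) storage used on G:, reported by the built-in tool.
vssadmin list shadowstorage /for=G:
```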

Rollback notes

  • Gate 0 changes (strict consistency reg) are trivially reversible.
  • After the Gate 1 demotion, SERVER2 is a plain standalone server, and re-promotion (Gate 3) restores it as a DC. No rollback is needed for the eviction itself; the domain runs fine on PST-SERVER alone in the meantime.
  • The SYSVOL/GPO backup from Gate 0 is the restore point if PST-SERVER's SYSVOL were ever harmed (it should not be touched by this procedure).
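
If a rollback is ever needed, the two reversible pieces might be undone as sketched here. The GPO backup path is a placeholder for whatever folder was passed to `Backup-GPO -Path` in Gate 0. `Restore-GPO -All` overwrites live GPOs with the backed-up versions, so it belongs only in a real recovery.

```powershell
# Run on PST-SERVER.
Import-Module GroupPolicy

# Revert Strict Replication Consistency (normally left enabled; shown for completeness).
repadmin /regkey PST-SERVER -strict

# Restore every GPO from the Gate 0 backup. Placeholder path.
Restore-GPO -All -Path 'C:\PST-Backup\GPO'
```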