claudetools/monitor_zombies.ps1
Mike Swanson 359c2cf1b4 Fix zombie process accumulation and broken context recall (Phase 1 - Emergency Fixes)
CRITICAL: This commit fixes both the zombie process issue AND the broken
context recall system that was failing silently due to encoding errors.

ROOT CAUSES FIXED:
1. Periodic save running every 1 minute (540 processes/hour)
2. Missing timeouts on subprocess calls (hung processes)
3. Background spawning with & (orphaned processes)
4. No mutex lock (overlapping executions)
5. Missing UTF-8 encoding in log functions (BREAKING context saves)

FIXES IMPLEMENTED:

Fix 1.1 - Reduce Periodic Save Frequency (80% reduction)
  - File: .claude/hooks/setup_periodic_save.ps1
  - Change: RepetitionInterval 1min -> 5min
  - Impact: 540 -> 108 processes/hour from periodic saves

Fix 1.2 - Add Subprocess Timeouts (prevent hangs)
  - Files: periodic_save_check.py (3 calls), periodic_context_save.py (4 calls)
  - Change: Added timeout=5 to all subprocess.run() calls
  - Impact: Prevents indefinitely hung git/ssh processes
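
The timeout pattern from Fix 1.2 can be sketched roughly as follows. This is a minimal illustration, not the repository's code: the helper name `run_with_timeout` is hypothetical, and the real call sites in periodic_save_check.py and periodic_context_save.py may handle errors differently.

```python
import subprocess
import sys


def run_with_timeout(cmd, timeout=5):
    """Run cmd and return its stripped stdout, or None on failure or timeout.

    Hypothetical helper for illustration; not taken from the repository.
    """
    try:
        result = subprocess.run(
            cmd,
            capture_output=True,
            text=True,
            timeout=timeout,  # subprocess.run kills the child when this expires
        )
    except (subprocess.TimeoutExpired, OSError):
        # Timed out, or the executable could not be started at all.
        return None
    if result.returncode != 0:
        return None
    return result.stdout.strip()


if __name__ == "__main__":
    # A fast command returns its output; a slow one is cut off and yields None.
    print(run_with_timeout([sys.executable, "-c", "print('ok')"]))
    print(run_with_timeout([sys.executable, "-c", "import time; time.sleep(10)"], timeout=1))
```

Returning None instead of raising keeps a hung git or ssh call from aborting the whole hook, at the cost of hiding the reason for the failure; a real script would probably log it.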

Fix 1.3 - Remove Background Spawning (eliminate orphans)
  - Files: user-prompt-submit (line 68), task-complete (lines 171, 178)
  - Change: Removed & from sync-contexts spawning, made synchronous
  - Impact: Eliminates 290 orphaned processes/hour

Fix 1.4 - Add Mutex Lock (prevent overlaps)
  - File: periodic_save_check.py
  - Change: Added acquire_lock()/release_lock() with try/finally
  - Impact: Prevents Task Scheduler from spawning overlapping instances
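
One way the acquire_lock()/release_lock() pair from Fix 1.4 could look is a lock file created atomically. This is a sketch under assumptions: the lock path, the `main` wrapper, and the return values are hypothetical, and the actual implementation in periodic_save_check.py is not shown in this commit message.

```python
import os
import tempfile

# Hypothetical lock path for illustration; the real script's path is not shown.
LOCK_FILE = os.path.join(tempfile.gettempdir(), "periodic_save_check.lock")


def acquire_lock(path=LOCK_FILE):
    """Try to create the lock file atomically; return True on success."""
    try:
        # O_EXCL makes creation fail if the file already exists.
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False
    with os.fdopen(fd, "w") as f:
        f.write(str(os.getpid()))
    return True


def release_lock(path=LOCK_FILE):
    """Remove the lock file, ignoring the case where it is already gone."""
    try:
        os.remove(path)
    except FileNotFoundError:
        pass


def main():
    if not acquire_lock():
        return "skipped"  # another instance still holds the lock
    try:
        # ... periodic save work would go here ...
        return "ran"
    finally:
        # Runs even if the work above raises, so the lock is not leaked.
        release_lock()
```

A lock file of this kind stays behind if the process is killed before the finally block runs, so a production version would likely also need a staleness check (for example, on the file's age or the recorded PID).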

Fix 1.5 - Add UTF-8 Encoding (CRITICAL - enables context saves)
  - Files: periodic_context_save.py, periodic_save_check.py
  - Change: Added encoding="utf-8" to all log file opens
  - Impact: FIXES silent failure preventing ALL context saves since deployment
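
The encoding change in Fix 1.5 amounts to passing encoding="utf-8" whenever a log file is opened. A minimal sketch, assuming a hypothetical `log` function and timestamp format (the real log functions are not shown here):

```python
from datetime import datetime


def log(message, log_path):
    """Append a timestamped line to log_path using UTF-8.

    Without an explicit encoding, open() falls back to the locale's
    preferred encoding (often cp1252 on Windows), which raises
    UnicodeEncodeError for characters outside that code page.
    """
    timestamp = datetime.now().strftime("%Y-%m-%d %H:%M:%S")
    with open(log_path, "a", encoding="utf-8") as f:
        f.write(f"[{timestamp}] {message}\n")
```

If the caller wraps logging in a broad try/except, an encoding error raised here would be swallowed, which is consistent with the silent failure described above.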

TOOLS ADDED:
  - monitor_zombies.ps1: PowerShell script to track process counts and memory

EXPECTED RESULTS:
  - Before: 1,010 processes/hour, 3-7 GB RAM/hour
  - After: ~151 processes/hour (85% reduction), minimal RAM growth
  - Context recall: NOW WORKING (was completely broken)
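
The percentages quoted above can be checked from the figures in this message. The sketch below only redoes the arithmetic; the helper name is made up for illustration, and the process counts themselves are estimates from the commit, not measurements.

```python
def reduction_percent(before, after):
    """Percentage drop from before to after."""
    return (before - after) / before * 100


periodic = reduction_percent(540, 108)   # Fix 1.1: periodic saves per hour
overall = reduction_percent(1010, 151)   # EXPECTED RESULTS: all processes per hour

print(f"periodic saves: {periodic:.0f}% fewer processes/hour")
print(f"overall: {overall:.0f}% fewer processes/hour")

# Half-hour equivalents, as used in the monitor script's testing instructions.
print(f"30 min before: ~{1010 / 2:.0f}, after: ~{151 // 2}")
```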

TESTING:
  - Run monitor_zombies.ps1 before and after 30min work session
  - Verify context auto-injection on Claude Code restart
  - Check .claude/periodic-save.log for successful saves (no encoding errors)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-01-17 13:51:22 -07:00

79 lines
2.7 KiB
PowerShell

# Zombie Process Monitor - Test Phase 1 Fixes
# Run this before and after 30-minute test period
$Timestamp = Get-Date -Format "yyyy-MM-dd HH:mm:ss"
$OutputFile = "D:\ClaudeTools\zombie_test_results.txt"
Write-Host "[OK] Zombie Process Monitor - $Timestamp" -ForegroundColor Green
Write-Host ""
# Count target processes (one snapshot, so all counts describe the same moment)
$AllProcesses = Get-Process
$GitProcesses = @($AllProcesses | Where-Object { $_.ProcessName -like "*git*" })
$BashProcesses = @($AllProcesses | Where-Object { $_.ProcessName -like "*bash*" })
$SSHProcesses = @($AllProcesses | Where-Object { $_.ProcessName -like "*ssh*" })
$ConhostProcesses = @($AllProcesses | Where-Object { $_.ProcessName -like "*conhost*" })
$PythonProcesses = @($AllProcesses | Where-Object { $_.ProcessName -like "*python*" })
$GitCount = $GitProcesses.Count
$BashCount = $BashProcesses.Count
$SSHCount = $SSHProcesses.Count
$ConhostCount = $ConhostProcesses.Count
$PythonCount = $PythonProcesses.Count
$TotalCount = $GitCount + $BashCount + $SSHCount + $ConhostCount + $PythonCount
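# Memory info (Win32_OperatingSystem reports sizes in KB, so dividing by 1MB yields GB)
# Get-CimInstance replaces Get-WmiObject, which is not available in PowerShell 6+
$OS = Get-CimInstance -ClassName Win32_OperatingSystem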
$TotalMemoryGB = [math]::Round($OS.TotalVisibleMemorySize / 1MB, 2)
$FreeMemoryGB = [math]::Round($OS.FreePhysicalMemory / 1MB, 2)
$UsedMemoryGB = [math]::Round($TotalMemoryGB - $FreeMemoryGB, 2)
$MemoryUsagePercent = [math]::Round(($UsedMemoryGB / $TotalMemoryGB) * 100, 1)
# Display results
Write-Host "Process Counts:" -ForegroundColor Cyan
Write-Host " Git: $GitCount"
Write-Host " Bash: $BashCount"
Write-Host " SSH: $SSHCount"
Write-Host " Conhost: $ConhostCount"
Write-Host " Python: $PythonCount"
Write-Host " ---"
Write-Host " TOTAL: $TotalCount" -ForegroundColor Yellow
Write-Host ""
Write-Host "Memory Usage:" -ForegroundColor Cyan
Write-Host " Total: ${TotalMemoryGB} GB"
Write-Host " Used: ${UsedMemoryGB} GB (${MemoryUsagePercent}%)"
Write-Host " Free: ${FreeMemoryGB} GB"
Write-Host ""
# Save to file
$LogEntry = @"
========================================
Timestamp: $Timestamp
========================================
Process Counts:
Git: $GitCount
Bash: $BashCount
SSH: $SSHCount
Conhost: $ConhostCount
Python: $PythonCount
TOTAL: $TotalCount
Memory Usage:
Total: ${TotalMemoryGB} GB
Used: ${UsedMemoryGB} GB (${MemoryUsagePercent}%)
Free: ${FreeMemoryGB} GB
"@
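# Write the log as UTF-8 explicitly rather than relying on the host's default encoding
Add-Content -Path $OutputFile -Value $LogEntry -Encoding UTF8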
Write-Host "[OK] Results logged to: $OutputFile" -ForegroundColor Green
Write-Host ""
Write-Host "TESTING INSTRUCTIONS:" -ForegroundColor Yellow
Write-Host "1. Note the TOTAL count above (baseline)"
Write-Host "2. Work normally for 30 minutes"
Write-Host "3. Run this script again"
Write-Host "4. Compare TOTAL counts:"
Write-Host " - Old behavior: ~505 new processes in 30min"
Write-Host " - Fixed behavior: ~75 new processes in 30min"
Write-Host ""