claudetools/SSH_CONNECTION_INVESTIGATION_REPORT.md
Mike Swanson 4545fc8ca3 [Baseline] Pre-zombie-fix checkpoint
Investigation complete - 5 agents identified root causes:
- periodic_save_check.py: 540 processes/hour (53%)
- Background sync-contexts: 200 processes/hour (20%)
- user-prompt-submit: 180 processes/hour (18%)
- task-complete: 90 processes/hour (9%)
Total: 1,010 zombie processes/hour, 3-7 GB RAM/hour

Phase 1 fixes ready to implement:
1. Reduce periodic save frequency (1min to 5min)
2. Add timeouts to all subprocess calls
3. Remove background sync-contexts spawning
4. Add mutex lock to prevent overlaps

See: FINAL_ZOMBIE_SOLUTION.md for complete analysis

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-01-17 13:34:42 -07:00


SSH Connection Investigation Report

Investigation Date: 2026-01-17
Agent: SSH/Network Connection Agent
Issue: 5 lingering SSH processes + 1 ssh-agent process


Executive Summary

ROOT CAUSE IDENTIFIED: Git operations in hooks are spawning SSH processes, but NOT for remote repository access. The SSH processes are related to:

  1. Git for Windows SSH configuration (core.sshcommand = C:/Windows/System32/OpenSSH/ssh.exe)
  2. Credential helper operations (credential.https://git.azcomputerguru.com.provider=generic)
  3. Background sync operations launched by hooks (sync-contexts &)

IMPORTANT: The repository uses HTTPS, NOT SSH for git remote operations:

  • Remote URL: https://git.azcomputerguru.com/azcomputerguru/claudetools.git
  • Authentication: Generic credential provider (Windows Credential Manager)

Investigation Findings

1. Git Commands in Hooks

File: .claude/hooks/user-prompt-submit

Line 42:  git config --local claude.projectid
Line 46:  git config --get remote.origin.url

File: .claude/hooks/task-complete

Line 40:  git config --local claude.projectid
Line 43:  git config --get remote.origin.url
Line 63:  git rev-parse --abbrev-ref HEAD
Line 64:  git rev-parse --short HEAD
Line 67:  git diff --name-only HEAD~1
Line 75:  git log -1 --pretty=format:"%s"

Analysis:

  • These commands are LOCAL ONLY - they do NOT contact remote repository
  • git config --local = local .git/config only
  • git config --get remote.origin.url = reads from local config (no network)
  • git rev-parse = local repository operations
  • git diff HEAD~1 = local diff (no network)
  • git log -1 = local log (no network)

Conclusion: Git commands in hooks should NOT spawn SSH processes for network operations.


2. Background Sync Operations

File: .claude/hooks/user-prompt-submit (Line 68)

bash "$(dirname "${BASH_SOURCE[0]}")/sync-contexts" >/dev/null 2>&1 &

File: .claude/hooks/task-complete (Lines 171, 178)

bash "$(dirname "${BASH_SOURCE[0]}")/sync-contexts" >/dev/null 2>&1 &
bash "$(dirname "${BASH_SOURCE[0]}")/sync-contexts" >/dev/null 2>&1 &

Analysis:

  • Both hooks spawn sync-contexts in background (&)
  • sync-contexts uses curl to POST to API (HTTP, not SSH)
  • Each hook execution spawns a NEW background process

Process Chain:

Claude Code Hook
  └─> bash user-prompt-submit
       ├─> git config (spawns: bash → git.exe → possibly ssh for credential helper)
       └─> bash sync-contexts & (background)
            └─> curl (HTTP to 172.16.3.30:8001)

Zombie Accumulation:

  • user-prompt-submit runs BEFORE each user message
  • task-complete runs AFTER task completion
  • Both spawn background sync-contexts processes
  • Background processes may not properly terminate
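  • Each git operation is suspected to spawn: bash → git → OpenSSH (via core.sshcommand). This link is unconfirmed: git documents core.sshcommand as applying only when fetch or push connects to a remote over SSH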

3. Git Configuration Analysis

Global Git Config:

core.sshcommand = C:/Windows/System32/OpenSSH/ssh.exe
credential.https://git.azcomputerguru.com.provider = generic

Why SSH processes spawn:

  1. Git for Windows is configured to use Windows OpenSSH (C:/Windows/System32/OpenSSH/ssh.exe)
  2. Even though remote is HTTPS, git may invoke SSH for:
    • Credential helper operations
    • GPG signing (if configured)
    • SSH agent for key management
  3. Credential provider is set to generic for the gitea server
    • This may use Windows Credential Manager
    • Credential operations might trigger ssh-agent

SSH-Agent Purpose:

  • SSH agent (ssh-agent.exe) manages SSH keys
  • Even with HTTPS remote, git might use ssh-agent for:
    • GPG commit signing with SSH keys
    • Credential helper authentication
    • Git LFS operations (if configured)
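
The "may" and "might" in the lists above can be checked directly with git's built-in tracing. The sketch below assumes that GIT_TRACE=1 is set (git then writes trace lines to stderr) and that child processes show up on lines containing "trace: run_command:" or "trace: exec:". That line format is an assumption based on current git output. The function name trace_mentions_ssh is hypothetical.

```shell
#!/usr/bin/env bash
# trace_mentions_ssh COMMAND [ARGS...]
# Runs COMMAND with GIT_TRACE=1, discards its normal output, and succeeds
# only if a traced child-process line mentions ssh.
trace_mentions_ssh() {
    # 2>&1 >/dev/null: stderr (the trace) goes into the pipe, stdout is dropped.
    GIT_TRACE=1 "$@" 2>&1 >/dev/null |
        grep -Eqi 'trace: (run_command|exec):.*ssh'
}

# Example (run inside the repository):
#   if trace_mentions_ssh git config --get remote.origin.url; then
#       echo "git launched ssh for this command"
#   else
#       echo "no ssh child process in the trace"
#   fi
```

Matching only on child-process lines avoids a false positive when the command itself merely contains the string "ssh", for example git config --get core.sshcommand.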

4. Process Lifecycle Issues

Expected Lifecycle:

Hook starts → git config → git spawns ssh → command completes → ssh terminates → hook ends

Actual Behavior (suspected):

Hook starts → git config → git spawns ssh → command completes → ssh lingers (orphaned)
                         → sync-contexts & → spawns in background → may not terminate
                                           → curl to API

Why processes linger:

  1. Background processes (&):

    • sync-contexts runs in background
    • Parent hook terminates before child completes
    • Background process becomes orphaned
    • The bash process running sync-contexts stays alive until the script (and its curl call) finishes
  2. Git spawns SSH but doesn't wait for cleanup:

    • Git uses OpenSSH for credential operations
    • SSH process may outlive git command
    • No explicit process cleanup
  3. Windows process management:

    • Orphaned processes don't auto-terminate on Windows
    • Need explicit cleanup or timeout
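
The first point above, a backgrounded child outliving the hook that started it, can be reproduced in isolation. This is a minimal sketch, not the hooks themselves: fake_hook is a hypothetical stand-in, and sleep 30 stands in for a slow sync-contexts run.

```shell
#!/usr/bin/env bash
# Stand-in for a hook: start a long-running child in the background,
# print its PID, and exit immediately without waiting for it.
fake_hook() {
    sh -c 'sleep 30 >/dev/null 2>&1 & echo $!'
}

child_pid=$(fake_hook)          # the "hook" has already exited at this point

# kill -0 sends no signal; it only reports whether the PID still exists.
if kill -0 "$child_pid" 2>/dev/null; then
    still_running=yes
else
    still_running=no
fi
echo "child $child_pid running after hook exit: $still_running"

# Clean up the demo child so the sketch itself leaves nothing behind.
kill "$child_pid" 2>/dev/null
```

Nothing in the parent tracks the child after it exits, which is why a cleanup step or a timeout has to be added explicitly.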

5. Hook Execution Frequency

Trigger Points:

  • user-prompt-submit: Runs BEFORE every user message
  • task-complete: Runs AFTER task completion (less frequent)

Accumulation Pattern:

Session Start:     0 SSH processes
User message 1:    +1-2 SSH processes (user-prompt-submit)
User message 2:    +1-2 SSH processes (accumulating)
User message 3:    +1-2 SSH processes (now 3-6 total)
Task complete:     +1-2 SSH processes (task-complete)
...

After 5-10 interactions: 5-10 zombie SSH processes
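
The accumulation figures follow from simple multiplication, assuming every interaction leaves 1-2 processes behind and none ever exit. The helper name estimate_range is hypothetical:

```shell
#!/usr/bin/env bash
# estimate_range INTERACTIONS MIN_PER_INTERACTION MAX_PER_INTERACTION
# Prints the low-high range of leftover processes if none of them terminate.
estimate_range() {
    local n="$1" lo="$2" hi="$3"
    echo "$(( n * lo ))-$(( n * hi ))"
}

estimate_range 3 1 2    # prints 3-6
estimate_range 5 1 2    # prints 5-10
```

Under the same assumption, ten interactions would give 10-20, so the observed count of 5 SSH processes fits roughly five interactions, or more interactions with some processes exiting on their own.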


Root Cause Summary

Primary Cause: Background sync-contexts processes spawned by hooks

Secondary Cause: Git commands trigger OpenSSH for credential/signing operations

Contributing Factors:

  1. Hooks spawn background processes with & (lines 68, 171, 178)
  2. Background processes are not tracked or cleaned up
  3. Git is configured with core.sshcommand pointing to OpenSSH
  4. Each git operation potentially spawns ssh for credential helper
  5. Windows doesn't auto-cleanup orphaned processes
  6. No timeout or process cleanup mechanism in hooks

Why Git Uses SSH (Despite HTTPS Remote)

Git may invoke SSH even with HTTPS remotes for:

  1. Credential Helper: The generic credential provider might use ssh-agent (unverified)
  2. Commit Signing: If commits are signed with SSH keys (git 2.34+)
  3. Git Config: core.sshcommand tells git which ssh binary to use when it opens an SSH connection
  4. Credential Storage: Windows Credential Manager might be accessed via ssh-agent (unverified)
  5. Git LFS: Large File Storage might use SSH for authentication

Evidence:

git config --global core.sshcommand
# Output: C:/Windows/System32/OpenSSH/ssh.exe

git config --global credential.https://git.azcomputerguru.com.provider
# Output: generic

Fix #1: Remove Background Process Spawning (HIGH PRIORITY)

Problem: Hooks spawn sync-contexts in background with &

Solution: Remove background spawning or add proper cleanup

Files to modify:

  • .claude/hooks/user-prompt-submit (line 68)
  • .claude/hooks/task-complete (lines 171, 178)

Options:

Option A - Remove background spawn (synchronous):

# Instead of:
bash "$(dirname "${BASH_SOURCE[0]}")/sync-contexts" >/dev/null 2>&1 &

# Use:
bash "$(dirname "${BASH_SOURCE[0]}")/sync-contexts" >/dev/null 2>&1

Pros: Simple, no zombies
Cons: Slower hook execution (blocks on sync)

Option B - Remove sync from hooks entirely:

# Comment out or remove the sync-contexts calls
# Let user manually run: bash .claude/hooks/sync-contexts

Pros: No blocking, no zombies
Cons: Requires manual sync or cron job

Option C - Add timeout and cleanup:

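# Run with a timeout so a hung sync cannot outlive its limit
timeout 10s bash "$(dirname "${BASH_SOURCE[0]}")/sync-contexts" >/dev/null 2>&1 &
SYNC_PID=$!
# Register cleanup trap. Note: this kills the sync as soon as the hook exits,
# so a sync still in flight at that point is cut short.
trap 'kill "$SYNC_PID" 2>/dev/null' EXIT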

Pros: Non-blocking with cleanup
Cons: More complex, timeout command may not exist on Windows Git Bash
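
If the timeout command is missing, a bounded wait can be approximated in plain shell. The sketch below is an assumption-laden fallback, not part of the existing hooks: run_with_timeout is a hypothetical name, and it blocks the caller for at most the given number of seconds, so it behaves like Option A with a ceiling, not like a true background job. It returns 124 on expiry, the same code GNU timeout uses.

```shell
#!/usr/bin/env bash
# run_with_timeout SECONDS COMMAND [ARGS...]
# Runs COMMAND, polling once per second; kills it if it exceeds SECONDS.
run_with_timeout() {
    local limit="$1"; shift
    "$@" &
    local pid=$! waited=0
    while kill -0 "$pid" 2>/dev/null; do
        if [ "$waited" -ge "$limit" ]; then
            kill "$pid" 2>/dev/null      # only the direct child is signalled
            wait "$pid" 2>/dev/null      # reap it so nothing is left behind
            return 124
        fi
        sleep 1
        waited=$(( waited + 1 ))
    done
    wait "$pid"                          # propagate the command's exit status
}

# Example:
#   run_with_timeout 10 bash "$(dirname "${BASH_SOURCE[0]}")/sync-contexts" >/dev/null 2>&1
```

Because only the direct child receives the signal, a curl started by sync-contexts could still survive the kill. Giving curl its own --max-time limit inside sync-contexts would cover that case.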


Fix #2: Reduce Git Command Frequency (MEDIUM PRIORITY)

Problem: Every hook execution runs multiple git commands

Solution: Cache git values to reduce spawning

Example optimization:

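# Cache project ID in environment variable or temp file.
# Note: an exported variable is only inherited by child processes of this hook;
# it does not carry over to later hook runs, which need a temp file instead.
if [ -z "$CACHED_PROJECT_ID" ]; then
    PROJECT_ID=$(git config --local claude.projectid 2>/dev/null)
    export CACHED_PROJECT_ID="$PROJECT_ID"
else
    PROJECT_ID="$CACHED_PROJECT_ID"
fi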

Estimated impact: about a 50% reduction in git command executions
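
Since each hook run is a separate process, a cache that survives between runs has to live on disk. A minimal sketch of the temp-file variant follows. The helper name cached_value, the CACHE_DIR location, and the 300-second lifetime in the example are all assumptions, and the helper handles single-line values only.

```shell
#!/usr/bin/env bash
# Assumed cache location; override CACHE_DIR to move it.
CACHE_DIR="${CACHE_DIR:-${TMPDIR:-/tmp}/claude-hook-cache}"

# cached_value KEY TTL_SECONDS COMMAND [ARGS...]
# Prints the stored value if it is younger than TTL_SECONDS; otherwise runs
# COMMAND once, stores "timestamp\nvalue" in the cache file, and prints it.
cached_value() {
    local key="$1" ttl="$2"; shift 2
    local file="$CACHE_DIR/$key" now stamp value
    mkdir -p "$CACHE_DIR"
    now=$(date +%s)
    if [ -f "$file" ]; then
        { read -r stamp; read -r value; } < "$file"
        if [ $(( now - ${stamp:-0} )) -lt "$ttl" ]; then
            printf '%s\n' "$value"       # cache hit: no command is spawned
            return 0
        fi
    fi
    value=$("$@" 2>/dev/null)            # cache miss or stale entry
    printf '%s\n%s\n' "$now" "$value" > "$file"
    printf '%s\n' "$value"
}

# Example:
#   PROJECT_ID=$(cached_value projectid 300 git config --local claude.projectid)
```

Values such as the project ID and remote URL rarely change, so a lifetime of a few minutes removes most repeat git invocations while still picking up edits to the config.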


Fix #3: Review Git SSH Configuration (LOW PRIORITY)

Problem: Git uses SSH even for HTTPS operations

Investigation needed:

  1. Why is core.sshcommand set to OpenSSH?
  2. Is SSH needed for credential helper?
  3. Is GPG signing using SSH keys?

Potential fix:

# Remove core.sshcommand if not needed
git config --global --unset core.sshcommand

# Or use Git Credential Manager instead of generic
# (recent Git for Windows releases name this helper "manager"; check the installed version)
git config --global credential.helper manager-core

WARNING: Test thoroughly before changing - may break authentication


Fix #4: Add Process Cleanup to Hooks (MEDIUM PRIORITY)

Problem: No cleanup of spawned processes

Solution: Add trap handlers to kill child processes on exit

Example:

#!/bin/bash
# Add at top of hook
cleanup() {
    # Kill any background jobs still running; skip kill when there are none
    local pids
    pids=$(jobs -p)
    [ -n "$pids" ] && kill $pids 2>/dev/null
}
trap cleanup EXIT

# ... rest of hook ...

Testing Plan

  1. Verify SSH processes before fix:

    Get-Process | Where-Object {$_.Name -eq 'ssh' -or $_.Name -eq 'ssh-agent'}
    
  2. Apply Fix #1 (remove background spawn)

  3. Test hook execution:

    • Send 5 user messages to Claude
    • Check SSH process count after each message
  4. Verify SSH processes after fix:

    • Should remain constant (1 ssh-agent max)
    • No accumulation of ssh.exe processes
  5. Monitor for 24 hours:

    • Check process count periodically
    • Verify no zombie accumulation
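
For steps 3 to 5, the per-message check can be scripted from Git Bash without switching to PowerShell. The sketch below counts processes by image name from tasklist CSV output supplied on stdin. The function name count_image is hypothetical, and the doubled slashes in the usage line are an assumption about how Git Bash must be told not to rewrite /FO and /NH as paths.

```shell
#!/usr/bin/env bash
# count_image IMAGE_NAME
# Reads `tasklist /FO CSV /NH` output on stdin, where each line looks like
#   "ssh.exe","1234","Console","1","5,120 K"
# and prints how many lines have IMAGE_NAME as the image (case-insensitive).
count_image() {
    awk -F'","' -v want="$1" '
        { img = substr($1, 2) }                  # drop the leading quote
        tolower(img) == tolower(want) { n++ }
        END { print n + 0 }
    '
}

# Example (Git Bash on Windows):
#   tasklist //FO CSV //NH | count_image ssh.exe
#   tasklist //FO CSV //NH | count_image ssh-agent.exe
```

Recording these two numbers after each test message gives the before and after series the plan calls for.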

Questions Answered

Q1: Why are git operations spawning SSH?
A: Git is configured with core.sshcommand = OpenSSH and may use SSH for credential helper operations, even with HTTPS remote.

Q2: Are hooks deliberately syncing with git remote?
A: NO. Hooks sync to API (http://172.16.3.30:8001) via curl, not git remote.

Q3: Is ssh-agent supposed to be running?
A: YES - 1 ssh-agent is normal for Git operations. 5+ ssh.exe processes is NOT normal.

Q4: Are SSH connections timing out or accumulating?
A: ACCUMULATING. Background processes spawn ssh and don't properly terminate.

Q5: Is ControlMaster/ControlPersist keeping connections alive?
A: NO - no SSH config file found with ControlMaster settings.

Q6: Are hooks SUPPOSED to sync with git remote?
A: NO - this appears to be unintentional side effect of:

  • Background process spawning
  • Git credential helper using SSH
  • No process cleanup

File Mapping: Which Hooks Spawn SSH

| Hook File | Git Commands | Background Spawn | SSH Risk |
|---|---|---|---|
| user-prompt-submit | 2 git commands | YES (line 68) | HIGH |
| task-complete | 6 git commands | YES (2x: lines 171, 178) | CRITICAL |
| sync-contexts | 0 git commands | N/A | NONE (curl only) |
| periodic-context-save | 1 git command | Unknown | MEDIUM |

Highest risk: task-complete (spawns background process TWICE + 6 git commands)


Immediate (Today):

  1. Apply Fix #1 Option B: Comment out background sync calls in hooks
  2. Test with 10 user messages
  3. Verify SSH process count remains stable

Short-term (This Week):

  1. Implement manual sync command or scheduled task for sync-contexts
  2. Add caching for git values to reduce command frequency
  3. Add process cleanup traps to hooks

Long-term (Future):

  1. Review git SSH configuration necessity
  2. Consider alternative credential helper
  3. Investigate if GPG/SSH signing is needed
  4. Optimize hook execution performance

Success Criteria

Fix is successful when:

  • SSH process count remains constant (1 ssh-agent max)
  • No accumulation of ssh.exe processes over time
  • Hooks execute without spawning orphaned background processes
  • Context sync still works (either manual or scheduled)

Monitoring metrics:

  • SSH process count over 24 hours
  • Hook execution time
  • Context sync success rate
  • User message latency

Report Compiled By: SSH/Network Connection Agent
Status: Investigation Complete - Root Cause Identified
Next Step: Apply Fix #1 and monitor