Files
claudetools/.claude/agents/CODE_REVIEW_ST_TESTING.md
Mike Swanson 6c316aa701 Add VPN configuration tools and agent documentation
Created comprehensive VPN setup tooling for Peaceful Spirit L2TP/IPsec connection
and enhanced agent documentation framework.

VPN Configuration (PST-NW-VPN):
- Setup-PST-L2TP-VPN.ps1: Automated L2TP/IPsec setup with split-tunnel and DNS
- Connect-PST-VPN.ps1: Connection helper with PPP adapter detection, DNS (192.168.0.2), and route config (192.168.0.0/24)
- Connect-PST-VPN-Standalone.ps1: Self-contained connection script for remote deployment
- Fix-PST-VPN-Auth.ps1: Authentication troubleshooting for CHAP/MSChapv2
- Diagnose-VPN-Interface.ps1: Comprehensive VPN interface and routing diagnostic
- Quick-Test-VPN.ps1: Fast connectivity verification (DNS/router/routes)
- Add-PST-VPN-Route-Manual.ps1: Manual route configuration helper
- vpn-connect.bat, vpn-disconnect.bat: Simple batch file shortcuts
- OpenVPN config files (Windows-compatible, abandoned for L2TP)

Key VPN Implementation Details:
- L2TP creates PPP adapter with connection name as interface description
- UniFi auto-configures DNS (192.168.0.2) but requires manual route to 192.168.0.0/24
- Split-tunnel enabled (only remote traffic through VPN)
- All-user connection for pre-login auto-connect via scheduled task
- Authentication: CHAP + MSChapv2 for UniFi compatibility

Agent Documentation:
- AGENT_QUICK_REFERENCE.md: Quick reference for all specialized agents
- documentation-squire.md: Documentation and task management specialist agent
- Updated all agent markdown files with standardized formatting

Project Organization:
- Moved conversation logs to dedicated directories (guru-connect-conversation-logs, guru-rmm-conversation-logs)
- Cleaned up old session JSONL files from projects/msp-tools/
- Added guru-connect infrastructure (agent, dashboard, proto, scripts, .gitea workflows)
- Added guru-rmm server components and deployment configs

Technical Notes:
- VPN IP pool: 192.168.4.x (client gets 192.168.4.6)
- Remote network: 192.168.0.0/24 (router at 192.168.0.10)
- PSK: rrClvnmUeXEFo90Ol+z7tfsAZHeSK6w7
- Credentials: pst-admin / 24Hearts$

Files: 15 VPN scripts, 2 agent docs, conversation log reorganization,
guru-connect/guru-rmm infrastructure additions

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-01-18 11:51:47 -07:00

395 lines
12 KiB
Markdown

---
name: "Code Review Sequential Thinking Testing"
description: "Test scenarios for Code Review Agent with Sequential Thinking MCP"
---
# Code Review Agent - Sequential Thinking Testing
This document demonstrates the enhanced Code Review Agent with Sequential Thinking MCP integration.
**Last Updated:** 2026-01-17
---
## Test Scenarios
### Scenario 1: Simple Issue (No ST Required)
**Code Submitted:**
```python
def calculate_tax(amount):
return amount * 0.08
```
**Issues:**
- Missing type hints
- No docstring
- Magic number (0.08)
**Expected Behavior:** Code reviewer fixes directly, approves
**ST Triggered:** NO (simple, first review)
**Review Result:**
```python
def calculate_tax(amount: Decimal) -> Decimal:
"""Calculate sales tax on given amount.
Args:
amount: Pre-tax amount
Returns:
Tax amount (8% rate)
"""
TAX_RATE = Decimal("0.08")
return amount * TAX_RATE
```
**Status:** APPROVED (after minor fixes)
---
### Scenario 2: First Rejection (Standard Escalation)
**Code Submitted:**
```python
def login(username, password):
user = db.query(f"SELECT * FROM users WHERE name='{username}'")
if user and user.password == password:
return True
return False
```
**Issues:**
- SQL injection vulnerability
- Plaintext password comparison
**Expected Behavior:** Standard escalation format
**ST Triggered:** NO (only 1 critical issue per category, first review)
**Review Result:**
```markdown
## Code Review - Requires Revision
**Specification Compliance:** FAIL
**Security:** CRITICAL ISSUES
### CRITICAL: SQL Injection
- **Location:** login function, line 2
- **Problem:** String interpolation in SQL query
- **Impact:** Database compromise possible
- **Required Fix:** Use parameterized queries
### CRITICAL: Plaintext Password
- **Location:** login function, line 3
- **Problem:** Direct password comparison
- **Impact:** Passwords should be hashed (bcrypt)
- **Required Fix:** Hash passwords, use secure comparison
**Checklist:**
- [ ] Parameterized SQL queries
- [ ] Bcrypt password hashing
```
**Status:** ESCALATED (standard format)
---
### Scenario 3: Second Rejection (ST TRIGGERED)
**Code Resubmitted After Fix:**
```python
def login(username, password):
user = db.query("SELECT * FROM users WHERE name=?", username)
if user and bcrypt.checkpw(password, user.password):
session['user'] = username
return True
return False
```
**Issues:**
- Session management insecure (no token, no expiry)
- No rate limiting (brute force possible)
- Username might not exist (error handling)
**Expected Behavior:** ST TRIGGERED (2nd rejection)
**ST Analysis:**
```
Thought 1: This is the second rejection. First was SQL injection + plaintext
passwords. Now session management issues. Pattern: treating authentication
as simple data retrieval rather than security-critical component.
Thought 2: Root cause is mental model problem. Coding Agent fixing surface
issues but missing the security-first mindset required for authentication.
Thought 3: Need comprehensive authentication pattern, not incremental fixes.
Thought 4: Should provide complete example with security threat model
explanation to shift mental model.
Thought 5: Include: parameterized queries (done), bcrypt (done), session
tokens (missing), rate limiting (missing), input validation (missing),
timing-safe comparison (missing).
Thought 6: Enhanced escalation format with educational context needed.
```
**Review Result:**
```markdown
## Code Review - Requires Revision (Complex Issues Analyzed)
**Review Iteration:** 2 (USING SEQUENTIAL THINKING ANALYSIS)
**Reason:** 2nd rejection - pattern indicates mental model mismatch
## Root Cause Analysis
**Surface Issues:**
- Insecure session management
- No brute force protection
- Missing input validation
**Root Cause:**
Authentication is being treated as a CRUD operation (create/read user)
rather than a security-critical component with an active threat model.
**Why Previous Attempts Failed:**
Incremental security fixes without understanding the holistic security
requirements for authentication systems.
## Comprehensive Solution Strategy
**Recommended Approach:**
Implement complete authentication system with defense-in-depth:
- JWT tokens (not simple sessions)
- Rate limiting (prevent brute force)
- Input validation (prevent injection beyond SQL)
- Timing-safe comparison (prevent timing attacks)
- Audit logging (security monitoring)
**Why This Approach:**
- Addresses all OWASP authentication risks
- Provides defense-in-depth
- Industry standard pattern
**Complete Example:**
```python
from datetime import datetime, timedelta
import jwt
import bcrypt
from functools import wraps
from typing import Optional
# Rate limiting decorator
def rate_limit(max_attempts=5, window_minutes=15):
attempts = {}
def decorator(func):
@wraps(func)
def wrapper(username, *args, **kwargs):
now = datetime.now()
if username in attempts:
count, first = attempts[username]
if (now - first) < timedelta(minutes=window_minutes):
if count >= max_attempts:
raise Exception("Rate limit exceeded")
attempts[username] = (count + 1, first)
else:
attempts[username] = (1, now)
else:
attempts[username] = (1, now)
return func(username, *args, **kwargs)
return wrapper
return decorator
@rate_limit(max_attempts=5, window_minutes=15)
def login(username: str, password: str) -> Optional[str]:
"""Authenticate user and return JWT token.
Security measures:
- Parameterized queries (SQL injection prevention)
- Bcrypt password hashing (credential theft protection)
- Rate limiting (brute force prevention)
- Input validation (injection prevention)
- JWT tokens (secure session management)
- Timing-safe comparison (timing attack prevention)
Args:
username: User's username (validated)
password: User's plaintext password
Returns:
JWT token string if authenticated, None otherwise
Raises:
ValueError: If inputs are invalid
Exception: If rate limit exceeded
"""
# Input validation
if not username or not isinstance(username, str):
raise ValueError("Invalid username")
if len(username) > 50:
raise ValueError("Username too long")
if not password or len(password) < 8:
raise ValueError("Invalid password")
# Parameterized query (SQL injection prevention)
user = db.query(
"SELECT id, username, password_hash FROM users WHERE username = ?",
params=(username,)
)
# Timing-safe comparison to prevent timing attacks
if not user:
# Run bcrypt anyway to prevent timing leak
bcrypt.checkpw(b"dummy", bcrypt.hashpw(b"dummy", bcrypt.gensalt()))
return None
# Bcrypt comparison (secure password verification)
if not bcrypt.checkpw(password.encode(), user.password_hash):
return None
# Generate JWT token (secure session management)
token = jwt.encode({
'user_id': user.id,
'username': user.username,
'exp': datetime.utcnow() + timedelta(hours=24),
'iat': datetime.utcnow()
}, SECRET_KEY, algorithm='HS256')
# Audit logging (security monitoring)
audit_log.info(f"User {username} logged in", extra={
'user_id': user.id,
'ip': request.remote_addr,
'timestamp': datetime.utcnow()
})
return token
```
## Educational Context
**Key Concept: Defense in Depth**
Authentication is an attack surface. Every decision has security implications.
Multiple layers of defense ensure that if one fails, others protect the system.
**Security Threat Model for Authentication:**
1. SQL Injection → Parameterized queries
2. Credential theft → Password hashing
3. Brute force → Rate limiting
4. Session hijacking → JWT with expiry
5. Timing attacks → Constant-time comparison
6. Injection attacks → Input validation
7. Forensics → Audit logging
**Why It Matters:**
Authentication failures lead to complete system compromise. A single
vulnerability can bypass all other security measures.
**Reference Patterns:**
- OWASP Authentication Cheatsheet
- JWT Best Practices
- Bcrypt/Argon2 for password hashing
```
**Status:** ESCALATED (enhanced format with ST analysis)
---
### Scenario 4: Complex Multiple Issues (ST TRIGGERED)
**Code Submitted:**
```python
def process_orders(user_id):
orders = []
user = db.query(f"SELECT * FROM users WHERE id={user_id}")
for order in db.query("SELECT * FROM orders"):
if order.user_id == user_id:
items = []
for item in db.query("SELECT * FROM items"):
if item.order_id == order.id:
items.append(item)
orders.append({'order': order, 'items': items})
return orders
```
**Issues:**
- SQL injection (user_id)
- N+1 query problem (orders loop)
- N+1 query problem (items loop)
- Missing indexes implied
- Inefficient filtering in Python
- No error handling
- No type hints
**Expected Behavior:** ST TRIGGERED (3+ critical issues, interrelated)
**ST Analysis:**
```
Thought 1: Multiple issues: SQL injection + N+1 queries (2 levels) +
inefficient filtering. These are interrelated - fixing one affects others.
Thought 2: Root cause: fundamental misunderstanding of database query
optimization. Treating database like in-memory collection.
Thought 3: Can't fix incrementally. Need to teach JOIN queries and
query optimization principles.
Thought 4: If I just say "use JOIN", they might not understand why or
how to structure it properly.
Thought 5: Need to show the performance impact (N+1 = 1000 orders * 100 items
= 100,000 queries vs 1 JOIN query), explain query planning, and provide
complete rewrite.
Thought 6: Also opportunity to teach about ORM vs raw SQL trade-offs.
```
**Review Result:** [Enhanced escalation format with performance analysis,
query optimization education, and complete JOIN example]
**Status:** ESCALATED (enhanced format with ST analysis)
---
## Testing Checklist
When testing the enhanced code reviewer:
- [ ] Test simple issues (no ST, direct fix)
- [ ] Test first rejection (standard escalation)
- [ ] Test second rejection (ST triggered, enhanced format)
- [ ] Test 3+ critical issues (ST triggered, complexity)
- [ ] Test architectural issues (ST for trade-off analysis)
- [ ] Verify enhanced format includes root cause analysis
- [ ] Verify comprehensive examples in feedback
- [ ] Verify educational context in complex cases
---
## Expected Behavior Summary
| Scenario | Rejection Count | Issue Complexity | ST Triggered? | Format Used |
|----------|----------------|------------------|---------------|-------------|
| Simple formatting | 0 | Low | NO | Direct fix |
| First security issue | 0 | Medium | NO | Standard escalation |
| Second rejection | 2 | Medium | YES | Enhanced escalation |
| 3+ critical issues | 0-1 | High | YES | Enhanced escalation |
| Architectural trade-offs | 0-1 | High | YES | Enhanced escalation |
| Complex interrelated | 0-1 | Very High | YES | Enhanced escalation |
---
## Success Metrics
Enhanced code reviewer should:
1. **Reduce rejection cycles** - ST analysis breaks patterns faster
2. **Provide better education** - Comprehensive examples teach patterns
3. **Identify root causes** - Not just symptoms
4. **Make better architectural decisions** - Trade-off analysis with ST
5. **Save tokens overall** - Fewer rejections = less total token usage
---
**Last Updated:** 2026-01-17
**Status:** Ready for Testing