feat: Major directory reorganization and cleanup
Reorganized project structure for better maintainability and reduced disk usage by 95.9% (11 GB -> 451 MB).

Directory Reorganization (85% reduction in root files):
- Created docs/ with subdirectories (deployment, testing, database, etc.)
- Created infrastructure/vpn-configs/ for VPN scripts
- Moved 90+ files from root to organized locations
- Archived obsolete documentation (context system, offline mode, zombie debugging)
- Moved all test files to tests/ directory
- Root directory: 119 files -> 18 files

Disk Cleanup (10.55 GB recovered):
- Deleted Rust build artifacts: 9.6 GB (target/ directories)
- Deleted Python virtual environments: 161 MB (venv/ directories)
- Deleted Python cache: 50 KB (__pycache__/)

New Structure:
- docs/ - All documentation organized by category
- docs/archives/ - Obsolete but preserved documentation
- infrastructure/ - VPN configs and SSH setup
- tests/ - All test files consolidated
- logs/ - Ready for future logs

Benefits:
- Cleaner root directory (18 vs 119 files)
- Logical organization of documentation
- 95.9% disk space reduction
- Faster navigation and discovery
- Better portability (build artifacts excluded)

Build artifacts can be regenerated:
- Rust: cargo build --release (5-15 min per project)
- Python: pip install -r requirements.txt (2-3 min)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
docs/deployment/DEPLOYMENT_SAFEGUARDS_README.md
@@ -0,0 +1,212 @@
# Deployment Safeguards - Never Waste 4 Hours Again

## What Happened (2026-01-18)

Spent 4 hours debugging why the Context Recall API wasn't working:
- **Root cause:** Production code was outdated (from Jan 16), local code was current
- **Why it happened:** No version checking, manual file copying, missed dependent files
- **Impact:** Couldn't test system, wasted development time

## What We Built to Prevent This

### 1. Version Endpoint (`/api/version`)

**What it does:**
- Returns git commit hash of running code
- Shows file checksums of critical files
- Displays last commit date and branch

**How to use:**
```bash
# Check what's running in production
curl http://172.16.3.30:8001/api/version

# Compare with local
git rev-parse --short HEAD
```

**Example response:**
```json
{
  "api_version": "1.0.0",
  "git_commit": "a6eedc1...",
  "git_commit_short": "a6eedc1",
  "git_branch": "main",
  "last_commit_date": "2026-01-18 22:15:00",
  "file_checksums": {
    "api/routers/conversation_contexts.py": "abc12345",
    "api/services/conversation_context_service.py": "def67890"
  }
}
```
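
For orientation, here is a minimal sketch of how a payload like the one above could be assembled. It is not the code in `api/routers/version.py`: the helper names, the choice of SHA-256, the 8-character truncation, and the `%ci` date format are all assumptions made for illustration, and `api_version` is left out.

```python
import hashlib
import subprocess
from pathlib import Path

# Files named in the example response; the real list may differ (assumption).
CRITICAL_FILES = [
    "api/routers/conversation_contexts.py",
    "api/services/conversation_context_service.py",
]


def short_checksum(path: Path, length: int = 8) -> str:
    """Return the first `length` hex characters of the file's SHA-256 digest."""
    return hashlib.sha256(path.read_bytes()).hexdigest()[:length]


def git_output(repo: Path, *args: str) -> str:
    """Run a git command in `repo`; return "unknown" if git or the repo is unavailable."""
    try:
        result = subprocess.run(
            ["git", *args], cwd=repo, capture_output=True, text=True, check=True
        )
    except (OSError, subprocess.CalledProcessError):
        return "unknown"
    return result.stdout.strip()


def build_version_info(repo: Path) -> dict:
    """Assemble a dictionary shaped like the example response (hypothetical helper)."""
    return {
        "git_commit": git_output(repo, "rev-parse", "HEAD"),
        "git_commit_short": git_output(repo, "rev-parse", "--short", "HEAD"),
        "git_branch": git_output(repo, "rev-parse", "--abbrev-ref", "HEAD"),
        "last_commit_date": git_output(repo, "log", "-1", "--format=%ci"),
        # Only checksum files that actually exist, so a partial deploy is visible.
        "file_checksums": {
            name: short_checksum(repo / name)
            for name in CRITICAL_FILES
            if (repo / name).is_file()
        },
    }
```

A file missing from `file_checksums`, or a checksum that differs from the local one, is a direct sign that production and local code have drifted apart.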

### 2. Automated Deployment Script (`deploy.ps1`)

**What it does:**
- Checks local vs production version automatically
- Copies ALL dependent files together (no more missing files!)
- Verifies deployment succeeded
- Tests the recall endpoint
- Fails fast with clear error messages

**How to use:**
```powershell
# Standard deployment
.\deploy.ps1

# Force deployment even if versions match
.\deploy.ps1 -Force

# Skip tests (faster)
.\deploy.ps1 -SkipTests
```

**What it checks:**
1. Local git status (uncommitted changes)
2. Production API version
3. Files to deploy
4. Local tests
5. File copy success
6. Service restart
7. New version verification
8. Recall endpoint functionality
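
The fail-fast behaviour of the numbered checks can be sketched as follows. This is an illustration in Python rather than an excerpt from `deploy.ps1`; `run_checks` and `DeploymentError` are hypothetical names, and each check is assumed to be a function that returns `True` on success.

```python
from typing import Callable, List, Tuple

# A check is a display name plus a function returning True on success.
Check = Tuple[str, Callable[[], bool]]


class DeploymentError(RuntimeError):
    """Raised when a check fails, so that later steps never run."""


def run_checks(checks: List[Check]) -> List[str]:
    """Run checks in order; stop at the first failure and name it in the error."""
    passed: List[str] = []
    for name, check in checks:
        if not check():
            raise DeploymentError(f"Check failed: {name}")
        passed.append(name)
    return passed
```

Because the loop stops at the first failing check, a failed file copy (step 5) can never be followed by a service restart (step 6) against half-deployed code.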

### 3. File Dependency Map (`FILE_DEPENDENCIES.md`)

**What it does:**
- Documents which files must deploy together
- Explains WHY they're coupled
- Shows symptoms of mismatched deployments

**Critical dependencies:**
- Router ↔ Service (parameter mismatches)
- Service ↔ Models (schema mismatches)
- Main App ↔ Router (import failures)
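
One way to make such a map machine-usable is to expand a set of changed files into everything that must ship with them. The sketch below is an assumption about how that could look, not the contents of `FILE_DEPENDENCIES.md`: `DEPLOY_TOGETHER` and `expand_deploy_set` are hypothetical, and only pairings mentioned in this document are listed (the models file is omitted because its path is not given here).

```python
from typing import Dict, Iterable, Set

# Hypothetical map: each file lists the files that must be deployed with it.
DEPLOY_TOGETHER: Dict[str, Set[str]] = {
    "api/routers/conversation_contexts.py": {
        "api/services/conversation_context_service.py",
    },
    "api/services/conversation_context_service.py": {
        "api/routers/conversation_contexts.py",
    },
    "api/routers/version.py": {"api/main.py"},
}


def expand_deploy_set(
    changed: Iterable[str], deps: Dict[str, Set[str]] = DEPLOY_TOGETHER
) -> Set[str]:
    """Return `changed` plus every file transitively coupled to it."""
    to_visit = list(changed)
    result: Set[str] = set()
    while to_visit:
        path = to_visit.pop()
        if path in result:
            continue  # already handled; also protects against cycles in the map
        result.add(path)
        to_visit.extend(deps.get(path, ()))
    return result
```

For example, passing only the router file returns both the router and the service, which is exactly the pairing that was missed in the incident described above.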

### 4. Deployment Checklist

**For every deployment:**
- [ ] Run `.\deploy.ps1` (not manual file copying!)
- [ ] Check output for any warnings
- [ ] Verify "DEPLOYMENT SUCCESSFUL" message
- [ ] Test recall endpoint manually if critical

## Usage Examples

### Standard Deployment Workflow

```powershell
# 1. Make your code changes
# 2. Test locally
# 3. Commit to git
git add .
git commit -m "Your changes"

# 4. Deploy to production (ONE command!)
.\deploy.ps1

# 5. Verify
curl http://172.16.3.30:8001/api/version
```

### Check if Production is Out of Date

```powershell
# Quick check
$local = git rev-parse --short HEAD
$prod = (Invoke-RestMethod http://172.16.3.30:8001/api/version).git_commit_short

if ($local -ne $prod) {
    Write-Host "Production is OUTDATED!" -ForegroundColor Red
    Write-Host "Local: $local, Production: $prod"
} else {
    Write-Host "Production is up to date" -ForegroundColor Green
}
```
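
The quick check above compares abbreviated hashes for exact equality. Git can choose different abbreviation lengths in different clones (for example when `core.abbrev` is set or the repository grows), so a prefix comparison is slightly more forgiving. The helper below is a hypothetical sketch of that idea, not part of the deployment tooling; the 7-character minimum is an assumption.

```python
def same_commit(local: str, remote: str, min_length: int = 7) -> bool:
    """Return True when one hash is a prefix of the other.

    Both values must have at least `min_length` characters, so that very
    short or empty strings never count as a match.
    """
    local = local.strip().lower()
    remote = remote.strip().lower()
    if min(len(local), len(remote)) < min_length:
        return False
    return local.startswith(remote) or remote.startswith(local)
```

With this helper, a 7-character local hash still matches a longer hash reported by the server for the same commit, while an empty response from the server is treated as a mismatch.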

### Emergency: Verify What's Running

```bash
# On RMM server
cd /opt/claudetools
git log -1  # Shows last deployed commit
grep -c "search_term" api/services/conversation_context_service.py  # Count lines containing a string that only the new code has
```

## What to Do If Deploy Fails

### Symptom: "get_recall_context() got an unexpected keyword argument"

**Cause:** Service file not deployed with router file

**Fix:**
```powershell
# Deploy BOTH files together
.\deploy.ps1 -Force
```

### Symptom: "Module 'version' has no attribute 'router'"

**Cause:** main.py not deployed with version.py

**Fix:**
```powershell
# deploy.ps1 handles this automatically
.\deploy.ps1 -Force
```

### Symptom: API won't start after deployment

**Fix:**
```bash
# Check logs on server
ssh guru@172.16.3.30
journalctl -u claudetools-api -n 50

# Common causes:
# - Syntax error in Python file
# - Missing import
# - File permission issue
```
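
The first common cause, a syntax error, can be caught before anything is copied to the server. The following is a minimal sketch of such a pre-deploy check; `find_syntax_errors` is a hypothetical helper, not something `deploy.ps1` is known to run. It uses Python's built-in `compile()`, which parses a file without executing it.

```python
from pathlib import Path
from typing import Iterable, List


def find_syntax_errors(paths: Iterable[Path]) -> List[str]:
    """Parse each file without running it and report any syntax errors."""
    errors: List[str] = []
    for path in paths:
        try:
            compile(path.read_text(encoding="utf-8"), str(path), "exec")
        except SyntaxError as exc:
            errors.append(f"{path}:{exc.lineno}: {exc.msg}")
    return errors
```

This only covers syntax. A missing import or a file permission problem still shows up only when the service starts, so the `journalctl` check above remains the place to look for those.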

## Rules Going Forward

### ✅ DO:
- Use `.\deploy.ps1` for ALL deployments
- Commit changes before deploying
- Check version endpoint before and after
- Test recall endpoint after deployment

### ❌ DON'T:
- Manually copy files with pscp
- Deploy only router without service
- Deploy only service without router
- Skip version verification
- Assume deployment worked without testing

## Files Created

1. `api/routers/version.py` - Version endpoint
2. `api/main.py` - Updated to include version router
3. `deploy.ps1` - Automated deployment script
4. `FILE_DEPENDENCIES.md` - Dependency documentation
5. `DEPLOYMENT_SAFEGUARDS_README.md` - This file

## Time Saved

**Before:** 4 hours debugging code mismatches
**After:** 2 minutes automated deployment with verification
**ROI:** 120x time savings (240 minutes vs. 2 minutes)

## Next Steps

1. Deploy these safeguards to production
2. Test deployment script end-to-end
3. Update .claude/CLAUDE.md with deployment instructions
4. Create pre-commit hook to warn about dependencies (optional)

---

**Generated:** 2026-01-18
**Motivation:** Never waste 4 hours on code mismatches again
**Status:** Ready for production deployment