Adds a transcript-driven bumper filter to the diarization pipeline. When
a transcript segment matches qa_extractor's promo/bumper signatures, the
overlapping audio windows are labeled BUMPER and the WavLM cosine match
is skipped. Prevents music/promo from being matched against speaker
profiles (the failure mode Mike caught in 2018-s10e18 @ 09:20-10:05).
Code changes:
- src/voice_profiler.py: identify_speakers() takes optional skip_ranges
parameter; windows whose midpoint falls in a skip range get labeled
"[bumper]" and skip cosine match
- src/diarizer.py: diarize() takes optional transcript_path; pre-computes
bumper time ranges via qa_extractor._is_promo_or_bumper, passes to
identify_speakers; adds BUMPER speaker label
- benchmark.py: passes transcript_path to diarize()
Aggregate impact across 9-episode test set:
Tara attribution: 4880s -> 3680s (-1200s / -25%)
Q&A pairs: 17 -> 19 (+2)
(bumper-flagged segments had been disrupting conversation detection
in 2017-s9e30 and 2018-s10e18)
CALLER total: 1320s -> 1190s (bumpers previously labeled CALLER moved)
Per-episode bumpers caught: 1-8, total ~165 bumper segments across set
Remaining Tara false positives are real callers acoustically similar to
Tara (Christopher in 2018, Kay in 2012, William and Charles in 2015) and
guest Clay in 2015-s7e19 — those need profile rebuild + Clay profile,
not bumper filtering.
Adds download_full_archive.py — resumable mirror-style downloader that
walks IX server's /home/gurushow/public_html/archive/{year}/ and copies
all MP3s to archive-data/episodes/. Run is in progress (~589 files,
~10-15GB). Used to source clean profile windows for the remaining
co-hosts (Tara rebuild, Clay, Tony, Rob, Randall, producers).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
ClaudeTools Active Projects
Directory: D:\ClaudeTools\projects\
Purpose: Active development projects and related conversation archives
Last Updated: 2026-01-17
Overview
This directory contains active projects being developed or maintained as part of the ClaudeTools ecosystem. Unlike the imported-conversations/ directory which serves as an archive, projects here are actively worked on and may include both source code and conversation history.
Current Projects
MSP Tools (94 files, 20.1 MB)
Moved From: D:\ClaudeTools\imported-conversations\msp-tools/
Move Date: 2026-01-17
Status: Active development
Managed Service Provider (MSP) tooling and infrastructure projects, including conversation history and development artifacts.
Structure
msp-tools/
├── guru-rmm/ # 54 files, 14 MB
│ └── [JSONL conversation files] # RMM system development history
└── guru-connect/ # 40 files, 6.1 MB
└── [JSONL conversation files] # MSP integration development history
guru-rmm (54 files, 14 MB)
Description: Remote Monitoring and Management (RMM) system development conversations
Source: C:\Users\MikeSwanson\.claude\projects\C--Users-MikeSwanson-claude-projects-gururmm-guru-rmm
Key Topics:
- RMM system architecture
- Monitoring solutions
- Agent deployment
- Infrastructure management
- MSP automation
Project Type: MSP Infrastructure
guru-connect (40 files, 6.1 MB)
Description: MSP connectivity and integration tooling conversations
Source: C:\Users\MikeSwanson\.claude\projects\C--Users-MikeSwanson-claude-projects-guru-connect
Key Topics:
- Integration patterns
- API connectivity
- Service orchestration
- Client management
- Cross-platform integration
Project Type: MSP Integration
File Format
All conversation files are in JSONL (JSON Lines) format:
- Extension:
.jsonl - Format: Each line is a valid JSON object
- Content: Individual conversation messages from Claude
- Encoding: UTF-8
- Can be processed line-by-line for analysis
Usage
Accessing Project Files
# List all projects
ls -lh D:\ClaudeTools\projects\
# Browse MSP tools conversations
ls -lh D:\ClaudeTools\projects\msp-tools\guru-rmm\
# Count conversation files
find D:\ClaudeTools\projects\ -name "*.jsonl" | wc -l
# Search for specific topics
grep -r "FastAPI" D:\ClaudeTools\projects\
Integration with ClaudeTools
These conversations can be:
- Analyzed and indexed into context recall system
- Used to extract reusable code snippets
- Mined for technical decisions and patterns
- Converted into knowledge base entries
- Referenced for similar future projects
Adding New Projects
When adding new active projects to this directory:
- Create a descriptive folder name (e.g.,
project-name/) - Include conversation history if available
- Update this README with project details
- Consider creating a project-specific README
- Tag appropriately for context recall
Related Documentation
- imported-conversations/INDEX.md - Archive of all imported conversations
- imported-conversations/IMPORT_MANIFEST.json - Detailed import metadata
- .claude/CLAUDE.md - Main ClaudeTools project documentation
- SESSION_STATE.md - Current project state and development history
Project Organization
Active Projects (this directory):
- Currently under development
- May include both code and conversation history
- Subject to frequent updates
- Integrated with ClaudeTools development
Archived Conversations (imported-conversations/):
- Historical reference only
- Read-only archive
- Organized by project type
- Preserved for knowledge extraction
Future Projects
This directory will grow as new projects are added. Potential additions:
- guru-backup - Backup and recovery tooling
- guru-dashboard - MSP management dashboard
- integration-tools - Third-party integration utilities
- automation-scripts - MSP automation workflows
Statistics
Current Totals:
- Projects: 1 (msp-tools)
- Conversation Files: 94 JSONL files
- Total Size: 20.1 MB
- Subcategories: 2 (guru-rmm, guru-connect)
Breakdown:
- guru-rmm: 54 files (57.4%), 14 MB (69.7%)
- guru-connect: 40 files (42.6%), 6.1 MB (30.3%)
Notes
- This directory was created on 2026-01-17
- First project (msp-tools) moved from imported-conversations archive
- All conversation files preserved with original timestamps
- Original source paths documented in IMPORT_MANIFEST.json
Maintained By: ClaudeTools Project
Location: D:\ClaudeTools\projects
Documentation Status: Active