claudetools/projects/radio-show/audio-processor/training-data/transcripts/2014-s6e05/diarization.json
Mike Swanson 79abef9dc9 radio: diarization pipeline fixes, benchmark setup, test episode set
- Fix voice_profiler threshold bug (HOST label overwrote Unknown unconditionally)
- Audio preload optimization: single ffmpeg per episode, 149.5x realtime on 5070 Ti
- WavLM threshold raised to 0.85 (Mike 0.90-0.99, callers 0.46-0.83)
- Promo/bumper filter: weighted signature scoring, 42->27 clean Q&A pairs
- Text-only Q&A fallback for episodes with no CALLER diarization labels
- TRANSFORMERS_OFFLINE=1 to skip HuggingFace freshness checks
- Add diarize_2018.py for targeted re-run + FTS5 rebuild
- Add benchmark.py + BENCH_SETUP.md for GURU-BEAST-ROG (RTX 4090) comparison
- Commit 9-episode training diarization.json outputs
- Session log: 2026-04-27-diarization-pipeline.md

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-27 13:20:40 -07:00

189 lines
3.4 KiB
JSON

{
  "num_speakers": 2,
  "speaker_map": {
    "CALLER": "CALLER",
    "HOST": "HOST"
  },
  "turns": [
    {
      "speaker": "CALLER",
      "start": 0.0,
      "end": 40.0,
      "confidence": 0.61
    },
    {
      "speaker": "HOST",
      "start": 35.0,
      "end": 530.0,
      "confidence": 0.96
    },
    {
      "speaker": "CALLER",
      "start": 525.0,
      "end": 540.0,
      "confidence": 0.66
    },
    {
      "speaker": "HOST",
      "start": 535.0,
      "end": 595.0,
      "confidence": 0.87
    },
    {
      "speaker": "CALLER",
      "start": 590.0,
      "end": 630.0,
      "confidence": 0.64
    },
    {
      "speaker": "HOST",
      "start": 625.0,
      "end": 1620.0,
      "confidence": 0.98
    },
    {
      "speaker": "CALLER",
      "start": 1615.0,
      "end": 1640.0,
      "confidence": 0.78
    },
    {
      "speaker": "HOST",
      "start": 1635.0,
      "end": 1715.0,
      "confidence": 0.95
    },
    {
      "speaker": "CALLER",
      "start": 1710.0,
      "end": 1730.0,
      "confidence": 0.74
    },
    {
      "speaker": "HOST",
      "start": 1725.0,
      "end": 1790.0,
      "confidence": 0.89
    },
    {
      "speaker": "CALLER",
      "start": 1785.0,
      "end": 1815.0,
      "confidence": 0.66
    },
    {
      "speaker": "HOST",
      "start": 1810.0,
      "end": 1820.0,
      "confidence": 0.97
    },
    {
      "speaker": "CALLER",
      "start": 1815.0,
      "end": 1835.0,
      "confidence": 0.65
    },
    {
      "speaker": "HOST",
      "start": 1830.0,
      "end": 1845.0,
      "confidence": 0.94
    },
    {
      "speaker": "CALLER",
      "start": 1840.0,
      "end": 1850.0,
      "confidence": 0.67
    },
    {
      "speaker": "HOST",
      "start": 1845.0,
      "end": 1910.0,
      "confidence": 0.97
    },
    {
      "speaker": "CALLER",
      "start": 1905.0,
      "end": 1925.0,
      "confidence": 0.72
    },
    {
      "speaker": "HOST",
      "start": 1920.0,
      "end": 1940.0,
      "confidence": 0.89
    },
    {
      "speaker": "CALLER",
      "start": 1935.0,
      "end": 1960.0,
      "confidence": 0.66
    },
    {
      "speaker": "HOST",
      "start": 1955.0,
      "end": 1985.0,
      "confidence": 0.98
    },
    {
      "speaker": "CALLER",
      "start": 1980.0,
      "end": 2000.0,
      "confidence": 0.81
    },
    {
      "speaker": "HOST",
      "start": 1995.0,
      "end": 2065.0,
      "confidence": 0.91
    },
    {
      "speaker": "CALLER",
      "start": 2060.0,
      "end": 2070.0,
      "confidence": 0.74
    },
    {
      "speaker": "HOST",
      "start": 2065.0,
      "end": 2190.0,
      "confidence": 0.97
    },
    {
      "speaker": "CALLER",
      "start": 2185.0,
      "end": 2220.0,
      "confidence": 0.63
    },
    {
      "speaker": "HOST",
      "start": 2215.0,
      "end": 2370.0,
      "confidence": 0.97
    },
    {
      "speaker": "CALLER",
      "start": 2365.0,
      "end": 2375.0,
      "confidence": 0.61
    },
    {
      "speaker": "HOST",
      "start": 2370.0,
      "end": 2770.0,
      "confidence": 0.94
    },
    {
      "speaker": "CALLER",
      "start": 2765.0,
      "end": 2780.0,
      "confidence": 0.76
    },
    {
      "speaker": "HOST",
      "start": 2775.0,
      "end": 2845.0,
      "confidence": 0.96
    }
  ]
}
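
A minimal Python sketch of how a consumer might read this file and sanity-check it. It is not part of the repository: the function names `summarize` and `overlaps` are hypothetical, and the embedded sample holds only the first three turns from the file above. In this file the HOST confidences run 0.87-0.98 and the CALLER confidences 0.61-0.81, and every turn starts 5 seconds before the previous one ends, so the sketch reports the per-speaker confidence range and the overlap between consecutive turns. Whether the "confidence" field is the same score that the commit message's 0.85 WavLM threshold applies to is an assumption, not something the file states.

```python
import json
from collections import defaultdict

# First three turns copied from the file above. To process the real file,
# replace json.loads(SAMPLE) with json.load() on an opened diarization.json.
SAMPLE = """
{
  "num_speakers": 2,
  "speaker_map": {"CALLER": "CALLER", "HOST": "HOST"},
  "turns": [
    {"speaker": "CALLER", "start": 0.0, "end": 40.0, "confidence": 0.61},
    {"speaker": "HOST", "start": 35.0, "end": 530.0, "confidence": 0.96},
    {"speaker": "CALLER", "start": 525.0, "end": 540.0, "confidence": 0.66}
  ]
}
"""


def summarize(doc):
    """Per speaker: turn count, summed turn length in seconds, confidence range."""
    stats = defaultdict(
        lambda: {"turns": 0, "seconds": 0.0, "min_conf": 1.0, "max_conf": 0.0}
    )
    for turn in doc["turns"]:
        entry = stats[turn["speaker"]]
        entry["turns"] += 1
        # Overlapping regions are counted once for each speaker involved.
        entry["seconds"] += turn["end"] - turn["start"]
        entry["min_conf"] = min(entry["min_conf"], turn["confidence"])
        entry["max_conf"] = max(entry["max_conf"], turn["confidence"])
    return dict(stats)


def overlaps(doc):
    """Seconds of overlap between each pair of consecutive turns that overlap."""
    turns = sorted(doc["turns"], key=lambda t: t["start"])
    return [
        prev["end"] - nxt["start"]
        for prev, nxt in zip(turns, turns[1:])
        if nxt["start"] < prev["end"]
    ]


doc = json.loads(SAMPLE)
summary = summarize(doc)
overlap_seconds = overlaps(doc)

for speaker, entry in sorted(summary.items()):
    print(speaker, entry)
print("overlaps:", overlap_seconds)
```

Because consecutive turns overlap, the per-speaker seconds add up to more than the episode length; a consumer that needs exclusive talk time would have to decide how to split each overlap.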