Mike Swanson 79abef9dc9 radio: diarization pipeline fixes, benchmark setup, test episode set

- Fix voice_profiler threshold bug (HOST label overwrote Unknown unconditionally)
- Audio preload optimization: single ffmpeg per episode, 149.5x realtime on 5070 Ti
- WavLM threshold raised to 0.85 (Mike 0.90-0.99, callers 0.46-0.83)
- Promo/bumper filter: weighted signature scoring, 42->27 clean Q&A pairs
- Text-only Q&A fallback for episodes with no CALLER diarization labels
- TRANSFORMERS_OFFLINE=1 to skip HuggingFace freshness checks
- Add diarize_2018.py for targeted re-run + FTS5 rebuild
- Add benchmark.py + BENCH_SETUP.md for GURU-BEAST-ROG (RTX 4090) comparison
- Commit 9-episode training diarization.json outputs
- Session log: 2026-04-27-diarization-pipeline.md

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

2026-04-27 13:20:40 -07:00

4.0 KiB


GURU-BEAST-ROG Benchmark Setup

RTX 4090 performance comparison against DESKTOP-0O8A1RL (RTX 5070 Ti baseline: 149.5x realtime).

Step 1 — Sync repo

The audio-processor lives inside the claudetools repo. Pull latest on main.

cd D:\claudetools   # or wherever claudetools is cloned on this machine
git pull

If not yet cloned:

git clone https://azcomputerguru@git.azcomputerguru.com/azcomputerguru/claudetools.git D:\claudetools
cd D:\claudetools\projects\radio-show\audio-processor

Step 2 — Python environment

Requires Python 3.11+. Use the py launcher on Windows.

cd D:\claudetools\projects\radio-show\audio-processor

py -m venv .venv
.venv\Scripts\activate

# PyTorch with CUDA 12.8 (matches RTX 4090 driver)
pip install torch==2.11.0+cu128 --index-url https://download.pytorch.org/whl/cu128

# Core deps
pip install faster-whisper==1.2.1 transformers==5.6.2 soundfile==0.13.1
pip install numpy==2.4.4 rich==15.0.0 ollama==0.6.1 pyyaml scikit-learn

# Install project in editable mode
pip install -e . --no-deps

Verify GPU is visible:

.venv\Scripts\python -c "import torch; print(torch.cuda.get_device_name(0))"
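The one-liner above raises an error if no CUDA device is visible, which does not say much about why. A slightly longer check, sketched here as a hypothetical helper (`gpu_report` is not part of the project), reports the installed PyTorch version and whether CUDA is available before asking for the device name:

```python
def gpu_report():
    """Return a small dict describing the PyTorch/CUDA setup without raising."""
    report = {"torch_version": None, "cuda_available": False, "device_name": None}
    try:
        import torch
    except ImportError:
        # torch is not installed in this interpreter (wrong venv?)
        return report
    report["torch_version"] = torch.__version__
    report["cuda_available"] = torch.cuda.is_available()
    if report["cuda_available"]:
        report["device_name"] = torch.cuda.get_device_name(0)
    return report


if __name__ == "__main__":
    for key, value in gpu_report().items():
        print(f"{key}: {value}")
```

If `cuda_available` comes back `False` while `torch_version` is set, a CPU-only wheel or a driver mismatch is the likely cause.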

Step 3 — Copy voice profiles from DESKTOP-0O8A1RL

Voice profiles are binary NumPy files and are not in git. Copy them from the 5070 Ti machine (DESKTOP-0O8A1RL, Tailscale IP 100.92.127.64) over Tailscale.

# From GURU-BEAST-ROG — pulls the voice-profiles directory over Tailscale
robocopy "\\100.92.127.64\claudetools\projects\radio-show\audio-processor\voice-profiles" `
         "D:\claudetools\projects\radio-show\audio-processor\voice-profiles" /E /COPYALL

If the network share isn't available, copy manually or use scp:

scp -r mike@100.92.127.64:"D:/claudetools/projects/radio-show/audio-processor/voice-profiles" .

Expected contents after copy:

voice-profiles/
  profiles.json
  mike-swanson/
    composite.npy
    embedding_0000.npy ... embedding_0179.npy   (180 files)
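
The expected layout can be checked mechanically after the copy. The following is a minimal sketch using only the file names and the count of 180 listed above; `check_voice_profiles` is a hypothetical helper, not part of the project:

```python
from pathlib import Path


def check_voice_profiles(root, speaker="mike-swanson", expected_embeddings=180):
    """Return a list of problems found under the voice-profiles directory."""
    root = Path(root)
    problems = []
    if not (root / "profiles.json").is_file():
        problems.append("missing profiles.json")
    speaker_dir = root / speaker
    if not (speaker_dir / "composite.npy").is_file():
        problems.append(f"missing {speaker}/composite.npy")
    # Count embedding_0000.npy ... embedding_0179.npy
    found = len(list(speaker_dir.glob("embedding_*.npy"))) if speaker_dir.is_dir() else 0
    if found != expected_embeddings:
        problems.append(f"expected {expected_embeddings} embeddings, found {found}")
    return problems


if __name__ == "__main__":
    issues = check_voice_profiles("voice-profiles")
    print("voice profiles OK" if not issues else "\n".join(issues))
```

An empty list means the copy matches the layout above; it does not validate the contents of the `.npy` files themselves.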

Step 4 — Download test episodes from IX server

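Tailscale must be running. IX server: 172.16.3.10. Use Python paramiko for the transfer, because raw SSH runs into key-agent interference. paramiko is not among the Step 2 dependencies, so install it into the venv first (pip install paramiko).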

# PowerShell: pipe the script to Python's stdin via a here-string
@'
import os
import paramiko

# Read the password from the environment instead of hardcoding it in a
# committed file. IX_SSH_PASSWORD is an assumed variable name; set it in the
# shell before running.
password = os.environ['IX_SSH_PASSWORD']

client = paramiko.SSHClient()
client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
# Bypass local keys and the SSH agent; authenticate by password only
client.connect('172.16.3.10', username='root', password=password,
               look_for_keys=False, allow_agent=False, timeout=30)
sftp = client.open_sftp()

os.makedirs('test-data/episodes', exist_ok=True)

# (remote path on the IX server, local path under test-data/episodes)
downloads = [
    ('/home/gurushow/public_html/archive/2011/3-12-11 HR 1.mp3',        'test-data/episodes/2011-03-12-hr1.mp3'),
    ('/home/gurushow/public_html/archive/2012/3 - March/3-10-12HR1.mp3','test-data/episodes/2012-03-10-hr1.mp3'),
    ('/home/gurushow/public_html/archive/2012/6 - June/6-9-12-HR1.mp3', 'test-data/episodes/2012-06-09-hr1.mp3'),
    ('/home/gurushow/public_html/archive/2014/06/s6e19.mp3',            'test-data/episodes/2014-s6e19.mp3'),
    ('/home/gurushow/public_html/archive/2016/06/s8e43.mp3',            'test-data/episodes/2016-s8e43.mp3'),
    ('/home/gurushow/public_html/archive/2017/04/s9e30.mp3',            'test-data/episodes/2017-s9e30.mp3'),
]

for remote, local in downloads:
    size_mb = sftp.stat(remote).st_size / 1024 / 1024
    print(f'Downloading {local} ({size_mb:.1f} MB)...', flush=True)
    sftp.get(remote, local)
    print('  done', flush=True)

sftp.close()
client.close()
print('All downloads complete.')
'@ | .venv\Scripts\python -
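
Before running the benchmark it may be worth confirming that all six files arrived and are non-empty. A minimal sketch, using the local file names from the download list above (`missing_episodes` is a hypothetical helper, not part of the project):

```python
from pathlib import Path

# Local file names taken from the download list in this step
EXPECTED_EPISODES = [
    "2011-03-12-hr1.mp3",
    "2012-03-10-hr1.mp3",
    "2012-06-09-hr1.mp3",
    "2014-s6e19.mp3",
    "2016-s8e43.mp3",
    "2017-s9e30.mp3",
]


def missing_episodes(episode_dir="test-data/episodes"):
    """Return the expected episode files that are absent or zero bytes."""
    directory = Path(episode_dir)
    missing = []
    for name in EXPECTED_EPISODES:
        path = directory / name
        if not path.is_file() or path.stat().st_size == 0:
            missing.append(name)
    return missing


if __name__ == "__main__":
    gaps = missing_episodes()
    print("all episodes present" if not gaps else f"missing: {gaps}")
```

This only catches absent or empty files; a transfer that was cut off partway would need a size comparison against the server.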

Step 5 — Run benchmark

.venv\Scripts\python benchmark.py

This diarizes all 6 test episodes, prints per-episode timing, and compares to the 5070 Ti baseline.

Step 6 — Report results

Post the benchmark output in the session log or share back to DESKTOP-0O8A1RL.

The key number to compare: total realtime factor (5070 Ti got 149.5x).
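
For reference, here is a sketch of how a total realtime factor is typically computed, assuming it means total audio duration divided by total processing time (which is how "149.5x realtime" reads). benchmark.py may define it differently, and the function names here are hypothetical:

```python
def realtime_factor(audio_seconds, processing_seconds):
    """Audio duration divided by wall-clock processing time."""
    if processing_seconds <= 0:
        raise ValueError("processing_seconds must be positive")
    return audio_seconds / processing_seconds


def total_realtime_factor(episodes):
    """episodes: iterable of (audio_seconds, processing_seconds) pairs."""
    episodes = list(episodes)
    total_audio = sum(audio for audio, _ in episodes)
    total_processing = sum(proc for _, proc in episodes)
    return realtime_factor(total_audio, total_processing)


if __name__ == "__main__":
    # Illustrative numbers only: a one-hour episode processed in 24 seconds
    print(f"{realtime_factor(3600, 24):.1f}x realtime")
```

Under this definition the total is a ratio of sums, so long episodes weigh more than short ones; averaging the per-episode factors would give a different number.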

Also note any differences in Q&A pair counts: the same episodes should produce the same pairs on both machines, since results are deterministic given the same voice profiles.
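
One way to spot such differences is to compare per-episode counts from the two machines side by side. The sketch below assumes the counts have already been collected into plain dictionaries keyed by episode name. How benchmark.py reports them is not specified here, so both `diff_pair_counts` and the sample data are hypothetical:

```python
def diff_pair_counts(baseline, candidate):
    """Map episode -> (baseline_count, candidate_count) for every mismatch.

    An episode present on only one machine shows None for the other side.
    """
    diffs = {}
    for episode in sorted(set(baseline) | set(candidate)):
        base_count = baseline.get(episode)
        cand_count = candidate.get(episode)
        if base_count != cand_count:
            diffs[episode] = (base_count, cand_count)
    return diffs


if __name__ == "__main__":
    # Placeholder counts for illustration, not real benchmark results
    desktop = {"2014-s6e19": 5, "2016-s8e43": 4}
    beast = {"2014-s6e19": 5, "2016-s8e43": 3}
    print(diff_pair_counts(desktop, beast))
```

An empty result means the counts agree; it does not prove the pairs themselves are identical.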
