claudetools/projects/radio-show/audio-processor/test-data/transcripts/2014-s6e19/diarization.json
Mike Swanson a4f527f31e radio: per-year test set (one episode per year, 2010-2018)
Added 2010, 2015, 2018 test episodes to round out the test set to one
per available year:
- 2010-05-08-hr1 (May 2010, earliest available; pre-Tara era)
- 2015-s7e19 (Jan 2015, avoids training's s7e30)
- 2018-s10e18 (only 3 non-training 2018 episodes exist)

The archive has no 2019 directory, so Rob's "2018/2019 appearances"
can only be checked against the 5 available 2018 episodes.

Per-year diarization summary (Tara speaking time and share of
episode audio, post-rename):
  2010-05-08    30s   1.2%   likely false positive (pre-Tara)
  2011-03-12   140s   5.6%   likely false positive (call-in only)
  2012-03-10    30s   1.1%   likely false positive (call-in only)
  2012-06-09   340s  12.8%   suspicious — Mike to confirm
  2014-s6e19   680s  23.3%   confirmed
  2015-s7e19   280s   9.9%   plausible — Mike to confirm
  2016-s8e43  1890s  35.5%   confirmed
  2017-s9e30   610s  11.4%   plausible
  2018-s10e18  880s  17.1%   COULD BE ROB: Mike flagged Rob for
                              2018/2019 appearances, and the cosine
                              threshold may be matching Rob because he
                              is acoustically similar to Tara
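
The Rob/Tara concern above can be illustrated with a minimal sketch of
threshold-based speaker matching. Everything here is an assumption made
for illustration: the 3-dimensional vectors, the 0.85 threshold, and the
label_segment helper are hypothetical and are not taken from the actual
pipeline, which is not shown in this commit.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy 3-dimensional "embeddings" (hypothetical); real speaker embeddings
# typically have far more dimensions.
TARA_REF = [0.9, 0.1, 0.4]
THRESHOLD = 0.85  # hypothetical value, not the pipeline's setting

def label_segment(embedding, reference=TARA_REF, threshold=THRESHOLD):
    """Label a segment CO-HOST if it is close enough to the reference."""
    if cosine_similarity(embedding, reference) >= threshold:
        return "CO-HOST"
    return "CALLER"

rob_like = [0.8, 0.3, 0.5]  # a different voice that sits near the reference
caller = [0.1, 0.9, 0.2]    # a clearly different voice

print(label_segment(rob_like))  # passes the threshold despite not being Tara
print(label_segment(caller))
```

A single threshold against one reference cannot tell "is Tara" apart
from "sounds like Tara", which is why the 2018 number needs a manual
check.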

Total Tara across 9 episodes: 1h 21m / 8h 52m audio (15.3%).
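
As a cross-check, the total can be reproduced from the per-year table.
This is a small sketch, not part of the pipeline: the per-episode
seconds are copied from the table above, the total audio length is the
31928s figure from the perf section, and format_hm is a hypothetical
helper.

```python
# Tara seconds per episode, copied from the per-year summary.
tara_seconds = {
    "2010-05-08": 30,
    "2011-03-12": 140,
    "2012-03-10": 30,
    "2012-06-09": 340,
    "2014-s6e19": 680,
    "2015-s7e19": 280,
    "2016-s8e43": 1890,
    "2017-s9e30": 610,
    "2018-s10e18": 880,
}
TOTAL_AUDIO_SECONDS = 31928  # diarized audio across the 9 episodes

def format_hm(seconds):
    """Format a duration as 'Xh Ym', dropping leftover seconds."""
    hours, rem = divmod(int(seconds), 3600)
    return f"{hours}h {rem // 60}m"

total_tara = sum(tara_seconds.values())
share = 100 * total_tara / TOTAL_AUDIO_SECONDS
summary = f"{format_hm(total_tara)} / {format_hm(TOTAL_AUDIO_SECONDS)} ({share:.1f}%)"
print(summary)  # 1h 21m / 8h 52m (15.3%)
```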

Q&A counts (still suspect — every voice that isn't Mike-or-Tara is
labeled CALLER, so Randall/Rob/producers inflate the bucket):
  2010=4, 2011=1, 2012a=2, 2012b=0, 2014=0, 2015=1, 2016=2, 2017=4, 2018=3
  Total: 17 pairs across 9 episodes

4090 perf on the expanded set:
- Diarization: 31928s in 121.5s = 262.7x realtime (vs 209.7x on 5070 Ti, +25.3%)
- Transcription (3 new episodes only): 10554s in 112.4s = 93.9x
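
For reference, the realtime factors above are audio duration divided by
wall-clock time, and the percentage is the relative gain between the two
reported factors. A minimal sketch of that arithmetic follows (the
function names are illustrative). Recomputing from the rounded inputs
can differ from the reported figures in the last digit.

```python
def realtime_factor(audio_seconds, wall_seconds):
    """How many seconds of audio are processed per second of wall time."""
    return audio_seconds / wall_seconds

def relative_gain(new, old):
    """Percentage improvement of `new` over `old`."""
    return 100 * (new - old) / old

diarization_rtf = realtime_factor(31928, 121.5)
transcription_rtf = realtime_factor(10554, 112.4)
gain_vs_5070_ti = relative_gain(262.7, 209.7)

print(f"diarization:   {diarization_rtf:.2f}x realtime")
print(f"transcription: {transcription_rtf:.2f}x realtime")
print(f"4090 vs 5070 Ti: +{gain_vs_5070_ti:.1f}%")
```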

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 15:20:09 -07:00


{
  "num_speakers": 3,
  "speaker_map": {
    "CALLER": "CALLER",
    "HOST": "HOST",
    "CO-HOST": "CO-HOST"
  },
  "turns": [
    {
      "speaker": "CO-HOST",
      "start": 0.0,
      "end": 40.0,
      "confidence": 0.96
    },
    {
      "speaker": "HOST",
      "start": 35.0,
      "end": 200.0,
      "confidence": 0.98
    },
    {
      "speaker": "CO-HOST",
      "start": 195.0,
      "end": 260.0,
      "confidence": 0.96
    },
    {
      "speaker": "HOST",
      "start": 255.0,
      "end": 325.0,
      "confidence": 0.98
    },
    {
      "speaker": "CO-HOST",
      "start": 320.0,
      "end": 425.0,
      "confidence": 0.98
    },
    {
      "speaker": "HOST",
      "start": 420.0,
      "end": 605.0,
      "confidence": 0.9
    },
    {
      "speaker": "CO-HOST",
      "start": 600.0,
      "end": 650.0,
      "confidence": 0.97
    },
    {
      "speaker": "HOST",
      "start": 645.0,
      "end": 655.0,
      "confidence": 0.98
    },
    {
      "speaker": "CO-HOST",
      "start": 650.0,
      "end": 665.0,
      "confidence": 0.96
    },
    {
      "speaker": "HOST",
      "start": 660.0,
      "end": 680.0,
      "confidence": 0.98
    },
    {
      "speaker": "CO-HOST",
      "start": 675.0,
      "end": 710.0,
      "confidence": 0.94
    },
    {
      "speaker": "HOST",
      "start": 705.0,
      "end": 985.0,
      "confidence": 0.9
    },
    {
      "speaker": "CO-HOST",
      "start": 980.0,
      "end": 990.0,
      "confidence": 0.96
    },
    {
      "speaker": "HOST",
      "start": 985.0,
      "end": 1320.0,
      "confidence": 0.98
    },
    {
      "speaker": "CO-HOST",
      "start": 1315.0,
      "end": 1330.0,
      "confidence": 0.92
    },
    {
      "speaker": "HOST",
      "start": 1325.0,
      "end": 1505.0,
      "confidence": 0.96
    },
    {
      "speaker": "CALLER",
      "start": 1500.0,
      "end": 1510.0,
      "confidence": 0.8
    },
    {
      "speaker": "HOST",
      "start": 1505.0,
      "end": 1515.0,
      "confidence": 0.85
    },
    {
      "speaker": "CALLER",
      "start": 1510.0,
      "end": 1520.0,
      "confidence": 0.81
    },
    {
      "speaker": "HOST",
      "start": 1515.0,
      "end": 1550.0,
      "confidence": 0.96
    },
    {
      "speaker": "CO-HOST",
      "start": 1545.0,
      "end": 1555.0,
      "confidence": 0.97
    },
    {
      "speaker": "HOST",
      "start": 1550.0,
      "end": 1810.0,
      "confidence": 0.91
    },
    {
      "speaker": "CO-HOST",
      "start": 1805.0,
      "end": 1825.0,
      "confidence": 0.93
    },
    {
      "speaker": "HOST",
      "start": 1820.0,
      "end": 2055.0,
      "confidence": 0.97
    },
    {
      "speaker": "CO-HOST",
      "start": 2050.0,
      "end": 2060.0,
      "confidence": 0.87
    },
    {
      "speaker": "HOST",
      "start": 2055.0,
      "end": 2155.0,
      "confidence": 0.94
    },
    {
      "speaker": "CALLER",
      "start": 2150.0,
      "end": 2160.0,
      "confidence": 0.83
    },
    {
      "speaker": "CO-HOST",
      "start": 2155.0,
      "end": 2170.0,
      "confidence": 0.97
    },
    {
      "speaker": "HOST",
      "start": 2165.0,
      "end": 2700.0,
      "confidence": 0.97
    },
    {
      "speaker": "CO-HOST",
      "start": 2695.0,
      "end": 2710.0,
      "confidence": 0.98
    },
    {
      "speaker": "HOST",
      "start": 2705.0,
      "end": 2910.0,
      "confidence": 0.98
    }
  ]
}
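
A file in this shape can be summarized per speaker with a few lines of
Python. The sketch below is an assumption about how a consumer might
read it, not the project's own code: speaker_totals and
consecutive_overlap are hypothetical helpers, and the inline sample is
just the first three turns of the file above. Each turn here starts 5
seconds before the previous one ends, so raw per-speaker sums count the
overlapped audio twice.

```python
from collections import defaultdict

def speaker_totals(turns):
    """Sum raw turn durations per speaker (overlaps counted for both)."""
    totals = defaultdict(float)
    for turn in turns:
        totals[turn["speaker"]] += turn["end"] - turn["start"]
    return dict(totals)

def consecutive_overlap(turns):
    """Total seconds where a turn starts before the previous turn ends."""
    ordered = sorted(turns, key=lambda t: t["start"])
    overlap = 0.0
    for prev, cur in zip(ordered, ordered[1:]):
        overlap += max(0.0, prev["end"] - cur["start"])
    return overlap

# First three turns of the file, inlined so the example is self-contained.
# For the full file: turns = json.load(open("diarization.json"))["turns"]
sample_turns = [
    {"speaker": "CO-HOST", "start": 0.0, "end": 40.0, "confidence": 0.96},
    {"speaker": "HOST", "start": 35.0, "end": 200.0, "confidence": 0.98},
    {"speaker": "CO-HOST", "start": 195.0, "end": 260.0, "confidence": 0.96},
]

totals = speaker_totals(sample_turns)
overlap = consecutive_overlap(sample_turns)
print(totals)   # {'CO-HOST': 105.0, 'HOST': 165.0}
print(overlap)  # 10.0
```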