Audio processor: validated voice profiling accuracy, tuned threshold

- Fine-grained speaker analysis (3s windows, 1s hop) across a 42-minute episode
- Host voice: 0.90-0.98 similarity (clear positive match)
- Callers: 0.65-0.68 (correctly below threshold)
- Produced audio/clips: 0.53-0.65 (correctly identified as non-host)
- Co-host/other speakers: 0.56-0.62 (correctly identified)
- Tuned host_match_threshold from 0.75 to 0.83 based on empirical data
- Cross-referenced similarity dips with the transcript: the dips line up
  with callers, show intros, played audio clips, and station breaks
- Batch transcription of 7 additional training episodes in progress

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-21 12:48:25 -07:00
parent 826141a319
commit 6cc9043b8e
228 changed files with 137641 additions and 1 deletions

@@ -39,7 +39,7 @@ diarization:
 min_speakers: 1
 max_speakers: 6
 voice_profiles_dir: "voice-profiles/"
-host_match_threshold: 0.75
+host_match_threshold: 0.83
 llm:
 model: "qwen3:14b"
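
As a rough illustration of how the 3s/1s windowing and the new threshold might fit together, here is a minimal Python sketch. The function names, the `>=` comparison, and the handling of a trailing partial window are assumptions made for illustration; none of them are taken from the repository. The similarity values in the usage note are the range endpoints reported in the commit message.

```python
# Hypothetical sketch: names and comparison rule are assumptions, not repository code.
HOST_MATCH_THRESHOLD = 0.83  # value set by this commit (previously 0.75)


def window_starts(duration_s, window_s=3.0, hop_s=1.0):
    """Return start times of fixed-length windows that fit entirely in the audio."""
    if duration_s < window_s:
        return []
    # Number of full windows when stepping by hop_s from t = 0.
    count = int((duration_s - window_s) // hop_s) + 1
    return [i * hop_s for i in range(count)]


def is_host(similarity, threshold=HOST_MATCH_THRESHOLD):
    """Treat a window as the host when its similarity meets the threshold."""
    return similarity >= threshold


# Range endpoints reported in the commit message.
reported = {
    "host": (0.90, 0.98),
    "callers": (0.65, 0.68),
    "produced audio/clips": (0.53, 0.65),
    "co-host/other": (0.56, 0.62),
}
labels = {name: all(is_host(s) for s in scores) for name, scores in reported.items()}
```

With these reported ranges, only the host entry in `labels` comes out as a match. The highest non-host score (0.68) and the lowest host score (0.90) leave a wide gap, and 0.83 sits inside it.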

@@ -0,0 +1,45 @@
{
"total_show_time": 2249.61,
"total_commercial_time": 296.19000000000005,
"segments": [
{
"start": 0.0,
"end": 2078.81,
"type": "show_content",
"confidence": 0.8,
"label": "",
"signals": {
"fingerprint": 0.5,
"speaker": 0.5,
"audio_chars": 0.5,
"structural": 0.8
}
},
{
"start": 2078.81,
"end": 2375.0,
"type": "commercial",
"confidence": 0.8,
"label": "",
"signals": {
"fingerprint": 0.5,
"speaker": 0.5,
"audio_chars": 0.5,
"structural": 0.5
}
},
{
"start": 2375.0,
"end": 2545.8,
"type": "show_content",
"confidence": 0.8,
"label": "",
"signals": {
"fingerprint": 0.5,
"speaker": 0.5,
"audio_chars": 0.5,
"structural": 0.8
}
}
]
}
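
The two totals in this file can be cross-checked against its segment list. The Python sketch below is a hypothetical helper, not code from the repository; it assumes `total_show_time` and `total_commercial_time` are simply the summed durations of segments of each `type`. The segment boundaries are copied from the file above.

```python
import math

# Segment boundaries and types copied from the JSON above (other fields omitted).
segments = [
    {"start": 0.0, "end": 2078.81, "type": "show_content"},
    {"start": 2078.81, "end": 2375.0, "type": "commercial"},
    {"start": 2375.0, "end": 2545.8, "type": "show_content"},
]


def total_by_type(segments):
    """Sum segment durations (end - start) per segment type."""
    totals = {}
    for seg in segments:
        duration = seg["end"] - seg["start"]
        totals[seg["type"]] = totals.get(seg["type"], 0.0) + duration
    return totals


totals = total_by_type(segments)

# Compare with a tolerance: floating-point sums rarely match to the last digit,
# which is also why the file stores 296.19000000000005 rather than 296.19.
show_ok = math.isclose(totals["show_content"], 2249.61, abs_tol=1e-6)
commercial_ok = math.isclose(totals["commercial"], 296.19, abs_tol=1e-6)
```

Under that assumption both checks pass for the values shown here, so the totals agree with the segment list.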

@@ -0,0 +1,19 @@
{
"total_show_time": 2721.33225,
"total_commercial_time": 0,
"segments": [
{
"start": 0.0,
"end": 2721.33225,
"type": "show_content",
"confidence": 0.4250000000000017,
"label": "",
"signals": {
"fingerprint": 0.5,
"speaker": 0.5,
"audio_chars": 0.65,
"structural": 0.5
}
}
]
}

Some files were not shown because too many files have changed in this diff.