Mike Swanson
82940d96d7
radio: utf-8 transcript writes + sqlite archive importer + session log
- src/transcriber.py: open transcript.{json,txt,srt} with encoding="utf-8".
Windows cp1252 default crashed on Whisper output containing U+2044.
- import_to_sqlite.py: new. Walks archive-data/transcripts, builds
archive.db (5 tables + 2 FTS5 virtual tables, sha256-keyed idempotency).
20.5 MB / 208 episodes at smoke-test time, 1.9s rebuild.
- batch_process.py: tracked from prior session — full-archive batch with
resumable transcribe/diarize/intros/qa pipeline.
- .gitignore: archive-data/ and logs/.
Session log: 2026-04-27-archive-batch-and-sqlite-import.md.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
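
The encoding fix in src/transcriber.py can be illustrated with a minimal sketch. It assumes only what the message states: the text contains U+2044 (FRACTION SLASH) and the files are opened with encoding="utf-8". The helper names `can_encode` and `write_transcript` and the sample string are hypothetical and are not taken from the repository.

```python
import tempfile
from pathlib import Path

# Sample text containing U+2044 (FRACTION SLASH); the string itself is made up.
text = "add 1\u20442 cup of water"


def can_encode(s, encoding):
    """Return True if every character of s is representable in encoding."""
    try:
        s.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


def write_transcript(path, s):
    """Write s with an explicit encoding instead of the platform default."""
    with open(path, "w", encoding="utf-8") as f:
        f.write(s)


# cp1252 has no mapping for U+2044, so an implicit cp1252 write would fail.
cp1252_ok = can_encode(text, "cp1252")
utf8_ok = can_encode(text, "utf-8")

with tempfile.TemporaryDirectory() as tmp:
    out = Path(tmp) / "transcript.txt"
    write_transcript(out, text)
    round_trip = out.read_text(encoding="utf-8")
```

Passing the encoding explicitly makes the write behave the same on every platform, regardless of the locale's preferred encoding.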
2026-04-27 19:38:02 -07:00
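
The sha256-keyed idempotency described for import_to_sqlite.py might look roughly like the sketch below. It is an illustration under stated assumptions, not the importer's actual schema: the table names `transcripts` and `transcripts_fts`, the columns, and the functions `open_archive` and `import_transcript` are all hypothetical, and it assumes the local SQLite build includes FTS5.

```python
import hashlib
import sqlite3


def open_archive(path=":memory:"):
    """Open the database and create one plain table plus one FTS5 index."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS transcripts ("
        " sha256 TEXT PRIMARY KEY,"
        " episode TEXT NOT NULL,"
        " body TEXT NOT NULL)"
    )
    # Requires an SQLite build with FTS5 enabled (assumed here).
    conn.execute(
        "CREATE VIRTUAL TABLE IF NOT EXISTS transcripts_fts"
        " USING fts5(episode, body)"
    )
    return conn


def import_transcript(conn, episode, body):
    """Insert a transcript once; return False if its hash is already stored."""
    digest = hashlib.sha256(body.encode("utf-8")).hexdigest()
    cur = conn.execute(
        "INSERT OR IGNORE INTO transcripts (sha256, episode, body)"
        " VALUES (?, ?, ?)",
        (digest, episode, body),
    )
    if cur.rowcount == 0:
        # Primary-key conflict: same content was imported before, skip it.
        return False
    conn.execute(
        "INSERT INTO transcripts_fts (episode, body) VALUES (?, ?)",
        (episode, body),
    )
    conn.commit()
    return True


conn = open_archive()
first = import_transcript(conn, "ep-001", "hello from the archive")
second = import_transcript(conn, "ep-001", "hello from the archive")
row_count = conn.execute("SELECT COUNT(*) FROM transcripts").fetchone()[0]
hits = conn.execute(
    "SELECT episode FROM transcripts_fts WHERE transcripts_fts MATCH ?",
    ("archive",),
).fetchall()
```

Keying rows on the content hash means a second run over the same files leaves the database unchanged, which is what makes a full rebuild safe to repeat.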