Commit Graph

27 Commits

Author SHA1 Message Date
4c89402df8 radio: skip Clay profile build (failed) — accept 2015-s7e19 Q&A as noisy
First attempt at Clay's voice profile from 2015-s7e19 produced
Clay-vs-Mike cosine similarity of 0.994 — essentially a Mike clone.
Root cause: 10s WavLM x-vector chunks averaged Mike's frequent
interjections together with Clay's dialogue, and Mike's well-trained
profile dominated the resulting embedding signal.

Mike's call: skip Clay, accept the 2015-s7e19 Q&A as noisy. Clay rarely
appears in other episodes, so the cost of not having his profile is
bounded to this one episode plus any rare future appearances.

Cleanup:
- voice-profiles/clay/ removed
- voice-profiles/profiles.json: Clay entry removed
- Memory updated to record the decision and the failure mode

Kept build_clay_profile.py in-repo as documentation of the attempt and
the Mike-similarity-filter pattern. Useful starting point if a future
attempt provides cleaner pure-Clay timestamps.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 16:36:46 -07:00
c760e430c0 radio: bumper detection in diarizer + full archive download script
Adds a transcript-driven bumper filter to the diarization pipeline. When
a transcript segment matches qa_extractor's promo/bumper signatures, the
overlapping audio windows are labeled BUMPER and the WavLM cosine match
is skipped. Prevents music/promo from being matched against speaker
profiles (the failure mode Mike caught in 2018-s10e18 @ 09:20-10:05).

Code changes:
- src/voice_profiler.py: identify_speakers() takes optional skip_ranges
  parameter; windows whose midpoint falls in a skip range get labeled
  "[bumper]" and skip cosine match
- src/diarizer.py: diarize() takes optional transcript_path; pre-computes
  bumper time ranges via qa_extractor._is_promo_or_bumper, passes to
  identify_speakers; adds BUMPER speaker label
- benchmark.py: passes transcript_path to diarize()

Aggregate impact across 9-episode test set:
  Tara attribution: 4880s -> 3680s  (-1200s / -25%)
  Q&A pairs: 17 -> 19 (+2)
    (bumper-flagged segments had been disrupting conversation detection
     in 2017-s9e30 and 2018-s10e18)
  CALLER total: 1320s -> 1190s  (bumpers previously labeled CALLER moved)
  Per-episode bumpers caught: 1-8, total ~165 bumper segments across set

Remaining Tara false positives are real callers acoustically similar to
Tara (Christopher in 2018, Kay in 2012, William and Charles in 2015) and
guest Clay in 2015-s7e19 — those need profile rebuild + Clay profile,
not bumper filtering.

Adds download_full_archive.py — resumable mirror-style downloader that
walks IX server's /home/gurushow/public_html/archive/{year}/ and copies
all MP3s to archive-data/episodes/. Run is in progress (~589 files,
~10-15GB). Used to source clean profile windows for the remaining
co-hosts (Tara rebuild, Clay, Tony, Rob, Randall, producers).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 16:17:50 -07:00
fb683d6a05 radio: rename Tom -> Tara, expand speaker roster
Mike confirmed there is no co-host named "Tom" — the voice in 2014-s6e19
and 2016-s8e43 is Tara. The 5070 Ti session fabricated the Tom identity.
The voice profile itself (44 embeddings, 0.698 cosine vs Mike) is correct;
only the human label was wrong.

Rename swept:
- voice-profiles/tom/ -> voice-profiles/tara/ (git mv preserves all .npy)
- voice-profiles/profiles.json: "Tom" key -> "Tara"
- build_cohost_profile.py: TOM_WINDOWS -> TARA_WINDOWS, COHOST_NAME, comments
- 2026-04-27-qa-extraction-cohost-indexing.md: correction header + body sweep
- 2026-04-27-4090-benchmark-and-test-set.md: closure note
- .claude/memory/radio_show_no_cohost_named_tom.md: resolution + speaker roster

Diarization re-run after rename so speaker_map emits "Cohost: Tara".
Q&A counts unchanged (rename is label-only): 9 pairs across 6 test episodes.

Tara distribution from the post-rename diarization (per-episode % of audio):
  2011-03-12-hr1   140s   5.6%   likely false positive (call-in only)
  2012-03-10-hr1    30s   1.1%   likely false positive (call-in only)
  2012-06-09-hr1   340s  12.8%   suspicious — pending Mike confirm
  2014-s6e19       680s  23.3%   confirmed
  2016-s8e43      1890s  35.5%   confirmed
  2017-s9e30       610s  11.4%   plausible — pending Mike confirm

Broader speaker-roster context Mike provided this session (saved to
memory): the show has had multiple co-hosts (Tara, Randall, Rob) plus
producers/board ops (Andrew, Shannon, Ken, others) who would sometimes
go on-air. Only Tara has a profile so far. Every other speaker is
currently labeled CALLER, which means small CO-HOST attributions in
unexpected episodes (e.g. 2011/2012) may actually be a producer rather
than a false positive — Mike to spot-check.

Action item before full-archive run: build profiles for Randall, Rob,
and the named producers to avoid systematic Q&A false positives in
early-years and 2018/2019 episodes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 15:11:03 -07:00
b9a4bb8807 scc: 4090 benchmark with new code state — 338.1x diarize, 94.8x transcribe
Re-ran benchmark.py on GURU-BEAST-ROG against the post-overhaul code
(co-host profile, batched Whisper int8_float16, revised Q&A extractor).

Results vs 5070 Ti baseline:
- Diarization: 209.7x -> 338.1x (+61.2%)
- Transcription: 63.8x -> 94.8x (+48.6%)
- Q&A pairs: 9 vs 10 (within run-to-run noise; structural correctness matches:
  2014 = 0 callers, 2016 = 2 WiFi caller pairs)

Setup change: BENCH_SETUP.md now lists ffmpeg as a Step-2 prereq
(winget install Gyan.FFmpeg). Was missing on this machine and the pipeline
fails silently at the first diarize call without ffprobe.

Code change: benchmark.py BASELINE_RTF updated 149.5 -> 209.7 to reflect
the 5070 Ti's post-overhaul measurement (e9ac607).

Data: 6 test episode transcripts and diarizations regenerated under the
new code path (batched Whisper output + co-host-aware speaker_map).

Correction memory: voice-profiles/tom/ directory + 5070 Ti session log
fabricated a co-host named "Tom" — Mike confirms no such person exists on
the show. The audio profile is real and the diarization separation is
sound, but the human identity attached to it is wrong. Saved under
.claude/memory/radio_show_no_cohost_named_tom.md pending Mike providing
the correct name for rename.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 14:54:07 -07:00
7bb683a3ed sync: auto-sync from GURU-BEAST-ROG at 2026-04-27 14:42:18
Author: Mike Swanson
Machine: GURU-BEAST-ROG
Timestamp: 2026-04-27 14:42:18
2026-04-27 14:42:25 -07:00
206cd2f929 sync: auto-sync from GURU-BEAST-ROG at 2026-04-27 13:15:49
Author: Mike Swanson
Machine: GURU-BEAST-ROG
Timestamp: 2026-04-27 13:15:49
2026-04-27 13:15:52 -07:00
6e2d99bd23 sync: auto-sync from HOWARD-HOME at 2026-04-23 21:12:42
Author: Howard Enos
Machine: HOWARD-HOME
Timestamp: 2026-04-23 21:12:42
2026-04-23 21:12:43 -07:00
34aad7639f sync: auto-sync from HOWARD-HOME at 2026-04-23 13:34:46
Author: Howard Enos
Machine: HOWARD-HOME
Timestamp: 2026-04-23 13:34:46
2026-04-23 13:34:48 -07:00
7e2e3a5882 sync: auto-sync from HOWARD-HOME at 2026-04-23 06:21:23
Author: Howard Enos
Machine: HOWARD-HOME
Timestamp: 2026-04-23 06:21:23
2026-04-23 06:21:24 -07:00
6bd416657c sync: auto-sync from HOWARD-HOME at 2026-04-22 17:39:56
Author: Howard Enos
Machine: HOWARD-HOME
Timestamp: 2026-04-22 17:39:56
2026-04-22 17:39:57 -07:00
e226d2857e sync: auto-sync from DESKTOP-0O8A1RL at 2026-04-19 12:55:40
Author: unknown
Machine: DESKTOP-0O8A1RL
Timestamp: 2026-04-19 12:55:40
2026-04-19 12:55:42 -07:00
dfcc3cefef chore: leave setup note for Mac Claude session (gururmm hooks)
Memory entry prompts Mac session to run scripts/install-hooks.sh
before any GuruRMM work. Syncs via Gitea on next pull.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-19 08:27:00 -07:00
b99f8512e4 sync: auto-sync from ACG-TECH03L at 2026-04-17 13:02:04
Author: Howard Enos
Machine: ACG-TECH03L
Timestamp: 2026-04-17 13:02:04
2026-04-17 13:02:09 -07:00
32888ea9d4 sync: auto-sync from ACG-TECH03L at 2026-04-17 11:26:41
Author: Howard Enos
Machine: ACG-TECH03L
Timestamp: 2026-04-17 11:26:41
2026-04-17 11:26:46 -07:00
8d975c1b44 import: ingested 160 files from C:\Users\howar\Clients
Howard's personal MSP client documentation folder imported into shared
ClaudeTools repo via /import command. Scope:

Clients (structured MSP docs under clients/<name>/docs/):
- anaise       (NEW)  - 13 files
- cascades-tucson     - 47 files merged (existing had only reports/)
- dataforth           - 18 files merged (alongside incident reports)
- instrumental-music-center - 14 files merged
- khalsa       (NEW)  - 22 files, multi-site (camden, river)
- kittle       (NEW)  - 16 files incl. fix-pdf-preview, gpo-intranet-zone
- lens-auto-brokerage (NEW) - 3 files (name matches SOPS vault)
- _client_template    - 13-file scaffold for new clients

MSP tooling (projects/msp-tools/):
- msp-audit-scripts/ - server_audit.ps1, workstation_audit.ps1, README
- utilities/         - clean_printer_ports, win11_upgrade,
                       screenconnect-toolbox-commands

Credential handling:
- Extracted 1 inline password (Anaise DESKTOP-O8GF4SD / david)
  to SOPS vault: clients/anaise/desktop-o8gf4sd.sops.yaml
- Redacted overview.md with vault reference pattern
- Scanned all 160 files for keys/tokens/connection strings -
  no other credentials found

Skipped:
- Cascades/.claude/settings.local.json (per-machine config)
- Source-root CLAUDE.md (personal, claudetools has its own)
- scripts/server_audit.ps1 and workstation_audit.ps1 at source root
  (identical duplicates of msp-audit-scripts versions)

Memory updates:
- reference_client_docs_structure.md (layout, conventions, active list)
- reference_msp_audit_scripts.md (locations, ScreenConnect 80-char rule)

Session log: session-logs/2026-04-16-howard-client-docs-import.md

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-16 19:43:58 -07:00
100a491ac6 Session log: multi-user setup, audit + gap fixes, Howard onboarding package
Two session logs:
- session-logs/2026-04-16-session.md: cross-cutting (multi-user, audit, infrastructure)
- guru-rmm session log appended: MSI installer, Len's Auto Brokerage, Uranus, migration drift

Gap fixes: GrepAI initialized + MCP server added, Ollama models pulling,
settings.json created (bypassPermissions), MCP_SERVERS.md written.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 18:56:26 -07:00
ea48061389 Multi-user support: identity tracking for Mike + Howard
- .claude/identity.json (gitignored, per-machine) identifies who's at the keyboard
- .claude/users.json (tracked) registers known team members + roles + machines
- CLAUDE.md: on first sync, Claude asks "Mike or Howard?" and creates identity.json
- Session logs must include User section for attribution
- Git commits use per-user name/email (shared Gitea push account)
- Howard Enos (tech, full trust) added as second team member
- Memory entry created for Howard

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 18:11:14 -07:00
c73dcfd9a8 Session log: M365 remediation tool upgrades, multi-client password resets, transport rule fix
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-01 14:43:04 -07:00
af71d317b0 Session log: GuruRMM audit, installer system, infrastructure fixes
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-01 13:58:45 -07:00
a47a97219c Session log: M365 remediation (MVAN, grabblaw, cascades), data recovery discussion
- MVAN: investigated credential stuffing on Mitch VanDeveer, enforced MFA CA policy
- Grabblaw: consent flow failed, needs alternative approach
- Cascades Tucson: onboarded to remediation tool successfully
- Memory: "365 remediation tool" = Graph API app fabb3421
- Data recovery: Hitachi Deskstar firmware/service area diagnosis

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 10:23:35 -07:00
b26e185a80 Add TickTick integration, MCP server, and dev project tracking
New integration with TickTick API for project/task management:
- OAuth 2.0 auth flow (mcp-servers/ticktick/ticktick_auth.py)
- MCP server with 9 tools for Claude Code (ticktick_mcp.py)
- FastAPI service with SOPS vault credentials (api/services/ticktick_service.py)
- JWT-protected REST router at /api/ticktick/ (api/routers/ticktick.py)
- Credentials stored in SOPS vault (services/ticktick.sops.yaml)

Dev project tracking (hybrid TickTick + DB):
- New dev_projects table migration (14 columns, status index)
- TickTick "Dev Projects" list for mobile visibility
- First project seeded: TickTick Integration (linked both sides)

Security: .tokens.json gitignored, token file permissions restricted,
HTML-escaped OAuth callback, SOPS vault (not env vars) for secrets.

Also: Installed Tailscale on ACG-5070 for office network access.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 10:08:53 -07:00
e34f51fe5d Session 2026-03-30: SOPS vault, SC-Syncro sync, Syncro scripts
- SOPS+age credential vault created (59 encrypted files, separate repo)
- Updated CLAUDE.md credential access to reference SOPS vault
- Updated memory for ACG-5070 (Windows 11, replaces CachyOS)
- SC-Syncro sync script: enriched 410 SC sessions with company/device data
- Syncro scripts: SC property updater, SC deployer, rogue SC killer
- Session log with full details

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-30 19:38:38 -07:00
OC-5070
ece3222d3a Add AD1 session data, memory entries for datasheet pipeline and security incident
- Imported AD1 Claude session files to clients/dataforth/session-logs/
- Created memory: project_datasheet_pipeline.md (full pipeline architecture)
- Created memory: project_dataforth_incident_2026-03-27.md (security incident + MFA)
- Updated MEMORY.md index
- Updated session log with AD1 pipeline rebuild findings

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-27 20:07:20 -07:00
9011670fce sync: Auto-sync from GURU-BEAST-ROG at 2026-03-25 03:45:04
Synced files:
- Session logs updated
- Latest context and credentials
- Command/directive updates

Machine: GURU-BEAST-ROG
Timestamp: 2026-03-25 03:45:04

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-03-25 03:46:07 -07:00
9288f3ba93 Session log: Windows setup continuation, bypass permissions fix, machine registration
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-25 03:46:07 -07:00
5a73b18409 Memory: Windows guru workstation setup status
Documented software verification results:
- Installed: Python 3.12.10, Git 2.52.0, Windows OpenSSH, credentials.md
- Missing: Node.js, Ollama, GrepAI, .mcp.json

Next session should continue with installing missing components.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-03-25 03:45:30 -07:00
ad88fc31f0 sync: Auto-sync from acg-guru-5070 at 2026-03-22 22:31:46
Synced files:
- Session logs updated
- Latest context and credentials
- Command/directive updates

Machine: acg-guru-5070
Timestamp: 2026-03-22 22:31:46

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-03-22 22:31:46 -07:00