diff --git a/.claude/machines/acg-guru-5070.md b/.claude/machines/acg-guru-5070.md new file mode 100644 index 0000000..eaf18b9 --- /dev/null +++ b/.claude/machines/acg-guru-5070.md @@ -0,0 +1,91 @@ +# Machine: acg-guru-5070 + +**Hostname:** acg-guru-5070 +**Last Updated:** 2026-03-21 + +--- + +## Hardware Specs + +| Spec | Value | +|------|-------| +| Model | Lenovo Legion Pro 7 16IAX10H (DMI: 83F5) | +| CPU | Intel Core Ultra 9 275HX (24 cores, up to 5.4 GHz) | +| Memory | 32 GB DDR5 | +| GPU | NVIDIA GeForce RTX 5070 Ti Laptop GPU (12 GB VRAM) | +| Storage 1 | 954 GB NVMe (SK Hynix) - CachyOS root, btrfs | +| Storage 2 | 954 GB NVMe (SK Hynix) - /home, ext4 | + +--- + +## Software + +| Spec | Value | +|------|-------| +| OS | CachyOS Linux (Arch-based) | +| Kernel | 6.19.9-1-cachyos | +| DE | KDE Plasma 6.6.3 (Wayland) | +| NVIDIA Driver | 595.45.04 (open kernel module) | +| CUDA | 13.2 | +| Python | 3.14 | + +--- + +## Claude Code Environment + +- **Working Directory:** /home/guru/ClaudeTools +- **User:** guru +- **Shell:** fish +- **Git:** Configured for Gitea (git.azcomputerguru.com) + +--- + +## Network + +| Interface | Address | +|-----------|---------| +| WiFi (wlan0) | 10.3.36.218 | +| Tailscale | 100.95.216.79 | + +--- + +## Capabilities + +- [x] Git operations +- [x] SSH access to infrastructure +- [x] GrepAI semantic search (watcher running) +- [x] Ollama local AI (qwen3:14b, codestral:22b, nomic-embed-text) +- [x] MCP servers available +- [x] NVIDIA GPU (CUDA compute) +- [x] Claude Code CLI + +--- + +## Known Issues + +### GPU Firmware Bug (RTX 5070 Ti) + +The RTX 5070 Ti enters an error state (NVRM rpcSendMessage 0x00000062) after ~3-5 minutes of sustained GPU compute. This is a known Blackwell/RTX 50-series GSP firmware bug on Linux (NVIDIA bug #5953411). Affects all tested drivers (580.x, 590.x, 595.x). + +**Impact:** GPU-accelerated ML workloads (Whisper transcription, etc.) cannot complete. GPU enters full ERR! state requiring hard power-off (warm reboot hangs with spinning symbol). + +**Workarounds tried (none effective):** +- Disable Runtime D3 power management +- Enable persistence mode +- Lock GPU clocks +- Power cap reduction + +**Status:** Waiting for NVIDIA driver fix. Heavy GPU compute delegated to Mac (M4). + +### Custom Kernel for Audio + +Running a custom-patched CachyOS kernel with the `nadimkobeissi/16iax10h-linux-sound-saga` patch for Awinic AW88399 smart amplifier support. Stock kernel has terrible speaker output. Patch is not upstreamed. + +--- + +## Notes + +- Primary development workstation +- GPU works fine for display, light compute, Ollama inference — only fails under sustained heavy compute (Whisper, training) +- Sudo: NOPASSWD configured for guru user +- Old btrfs @home subvolume on nvme0n1 (from initial install before /home was moved to nvme1n1)