Files
claudetools/.claude/memory/project_guruconnect_deploy.md
Mike Swanson 3895aa363c sync: auto-sync from GURU-5070 at 2026-05-30 15:26:54
Author: Mike Swanson
Machine: GURU-5070
Timestamp: 2026-05-30 15:26:54
2026-05-30 15:27:00 -07:00

55 lines
4.1 KiB
Markdown

---
name: project_guruconnect_deploy
description: How to deploy GuruConnect (v2+) to production — the server (172.16.3.30) builds its own Linux binary; gotchas with the systemd watchdog, trusted-proxy env, and auto-run migrations
metadata:
type: project
---
GuruConnect v2 went live in production on 2026-05-30 (server + dashboard at v0.2.0,
public at connect.azcomputerguru.com via NPM -> localhost:3002). The deploy is **manual**
(the `.gitea/workflows/deploy.yml` "deploy to server" step is a stub that only builds a
package artifact). Repo on the box: `/home/guru/guru-connect` (separate repo
`azcomputerguru/guru-connect`, NOT a submodule there).
**Build host = the server itself.** 172.16.3.30 has rust (rustup, cargo 1.94, the
`x86_64-unknown-linux-gnu` target), node 20 + npm 10, and protoc (~/.local/bin, libprotoc 28.3)
— all on PATH only in a **login shell** (`ssh guru@172.16.3.30 'bash -lc "..."'`; a
non-interactive shell does NOT source ~/.profile so cargo/protoc look "missing"). GURU-5070
builds the *Windows* agent + a windows-target server, NOT the Linux release — so build the
Linux server ON the box. See [[reference_guru5070_rust_toolchain]].
Deploy sequence (build while v1 runs, then a quick cutover restart):
1. **Backup first:** `pg_dump "$DATABASE_URL" | gzip > ~/backups/guruconnect/pre-deploy-*.sql.gz`;
save the current commit + copy the running binary to `~/guruconnect-server.vN.bak`.
2. Get the code: the server's local `main` may have **diverged** from origin (the v2 greenfield
respec rewrote history — `git pull --ff-only` will refuse). Tree is clean, so
`git fetch origin && git reset --hard origin/main` (rollback SHA is saved). `.env` is
gitignored, untouched.
3. SPA: `cd dashboard && npm ci && npm run build` -> emits to `../server/static/app/` (gitignored).
4. Binary (from repo ROOT, login shell, PROTOC set): `cargo build --release -p guruconnect-server
--target x86_64-unknown-linux-gnu`. `-p` scopes to the server so the Windows-only agent crate
isn't compiled; explicit `--target` overrides `.cargo/config.toml`'s windows-msvc default.
Output lands at `target/x86_64-unknown-linux-gnu/release/guruconnect-server` = the unit's ExecStart.
~3 min. sqlx uses RUNTIME queries (no `query!` macros, no `.sqlx` cache) so the build needs no DB.
5. **Cutover:** `sudo systemctl restart guruconnect`. Migrations are sqlx-embedded in the binary and
**auto-run on startup** (`db.migrate()`), so no manual `psql`. Watch
`journalctl -u guruconnect` for "Migrations complete" + "Server listening".
GOTCHAS (all hit on the 2026-05-30 deploy):
- **systemd unit:** the INSTALLED `/etc/systemd/system/guruconnect.service` has **no `WatchdogSec`**
(correct for v2, which sends no `sd_notify`). The repo's `server/guruconnect.service` DOES set
`WatchdogSec=30s` — so do NOT run `setup-systemd.sh` / copy the repo unit, or v2 restart-loops
every 30s. Unit: User=guru, EnvironmentFile=server/.env, WorkingDirectory=server/, ProtectSystem=strict.
- **`CONNECT_TRUSTED_PROXIES`** is a v2 env var (comma-separated IPs; defaults to loopback fail-closed).
Public `connect.azcomputerguru.com` ingresses through **NPM on Jupiter (172.16.3.20)**, which forwards to
the relay on 172.16.3.30:3002. So set `CONNECT_TRUSTED_PROXIES=127.0.0.1,::1,172.16.3.20` in `server/.env`
(the Jupiter NPM hop, NOT the relay host .30 — that was a wrong first guess). Without trusting 172.16.3.20
the relay logs every public agent as 172.16.3.20 instead of reading X-Forwarded-For; with it, the real client
IP shows (verified: a Pavon agent logged its true public IP 98.172.64.243). Only `JWT_SECRET` is hard-required.
- **NULL tags bug:** `connect_machines.tags` is `text[]` nullable with no default; v2 decodes it as
non-`Option`, so rows with NULL tags throw "unexpected null" at reconcile (and likely the Machines
list). Mitigated with `UPDATE connect_machines SET tags='{}' WHERE tags IS NULL`. Real fix is a
todo (decode Option + migration default).
- DB is Postgres 14 `guruconnect` on localhost; existing users (admin, howard, both role admin)
survive migration. Rollback: `git reset --hard <saved-sha>`, rebuild, restart, `psql < backup`.