55 lines
4.1 KiB
Markdown
55 lines
4.1 KiB
Markdown
---
|
|
name: project_guruconnect_deploy
|
|
description: How to deploy GuruConnect (v2+) to production — the server (172.16.3.30) builds its own Linux binary; gotchas with the systemd watchdog, trusted-proxy env, and auto-run migrations
|
|
metadata:
|
|
type: project
|
|
---
|
|
|
|
GuruConnect v2 went live in production on 2026-05-30 (server + dashboard at v0.2.0,
|
|
public at connect.azcomputerguru.com via NPM -> localhost:3002). The deploy is **manual**
|
|
(the `.gitea/workflows/deploy.yml` "deploy to server" step is a stub that only builds a
|
|
package artifact). Repo on the box: `/home/guru/guru-connect` (separate repo
|
|
`azcomputerguru/guru-connect`, NOT a submodule there).
|
|
|
|
**Build host = the server itself.** 172.16.3.30 has rust (rustup, cargo 1.94, the
|
|
`x86_64-unknown-linux-gnu` target), node 20 + npm 10, and protoc (~/.local/bin, libprotoc 28.3)
|
|
— all on PATH only in a **login shell** (`ssh guru@172.16.3.30 'bash -lc "..."'`; a
|
|
non-interactive shell does NOT source ~/.profile so cargo/protoc look "missing"). GURU-5070
|
|
builds the *Windows* agent + a windows-target server, NOT the Linux release — so build the
|
|
Linux server ON the box. See [[reference_guru5070_rust_toolchain]].
|
|
|
|
Deploy sequence (build while v1 runs, then a quick cutover restart):
|
|
1. **Backup first:** `pg_dump "$DATABASE_URL" | gzip > ~/backups/guruconnect/pre-deploy-*.sql.gz`;
|
|
save the current commit + copy the running binary to `~/guruconnect-server.vN.bak`.
|
|
2. Get the code: the server's local `main` may have **diverged** from origin (the v2 greenfield
|
|
respec rewrote history — `git pull --ff-only` will refuse). Tree is clean, so
|
|
`git fetch origin && git reset --hard origin/main` (rollback SHA is saved). `.env` is
|
|
gitignored, untouched.
|
|
3. SPA: `cd dashboard && npm ci && npm run build` -> emits to `../server/static/app/` (gitignored).
|
|
4. Binary (from repo ROOT, login shell, PROTOC set): `cargo build --release -p guruconnect-server
|
|
--target x86_64-unknown-linux-gnu`. `-p` scopes to the server so the Windows-only agent crate
|
|
isn't compiled; explicit `--target` overrides `.cargo/config.toml`'s windows-msvc default.
|
|
Output lands at `target/x86_64-unknown-linux-gnu/release/guruconnect-server` = the unit's ExecStart.
|
|
~3 min. sqlx uses RUNTIME queries (no `query!` macros, no `.sqlx` cache) so the build needs no DB.
|
|
5. **Cutover:** `sudo systemctl restart guruconnect`. Migrations are sqlx-embedded in the binary and
|
|
**auto-run on startup** (`db.migrate()`), so no manual `psql`. Watch
|
|
`journalctl -u guruconnect` for "Migrations complete" + "Server listening".
|
|
|
|
GOTCHAS (all hit on the 2026-05-30 deploy):
|
|
- **systemd unit:** the INSTALLED `/etc/systemd/system/guruconnect.service` has **no `WatchdogSec`**
|
|
(correct for v2, which sends no `sd_notify`). The repo's `server/guruconnect.service` DOES set
|
|
`WatchdogSec=30s` — so do NOT run `setup-systemd.sh` / copy the repo unit, or v2 restart-loops
|
|
every 30s. Unit: User=guru, EnvironmentFile=server/.env, WorkingDirectory=server/, ProtectSystem=strict.
|
|
- **`CONNECT_TRUSTED_PROXIES`** is a v2 env var (comma-separated IPs; defaults to loopback fail-closed).
|
|
Public `connect.azcomputerguru.com` ingresses through **NPM on Jupiter (172.16.3.20)**, which forwards to
|
|
the relay on 172.16.3.30:3002. So set `CONNECT_TRUSTED_PROXIES=127.0.0.1,::1,172.16.3.20` in `server/.env`
|
|
(the Jupiter NPM hop, NOT the relay host .30 — that was a wrong first guess). Without trusting 172.16.3.20
|
|
the relay logs every public agent as 172.16.3.20 instead of reading X-Forwarded-For; with it, the real client
|
|
IP shows (verified: a Pavon agent logged its true public IP 98.172.64.243). Only `JWT_SECRET` is hard-required.
|
|
- **NULL tags bug:** `connect_machines.tags` is `text[]` nullable with no default; v2 decodes it as
|
|
non-`Option`, so rows with NULL tags throw "unexpected null" at reconcile (and likely the Machines
|
|
list). Mitigated with `UPDATE connect_machines SET tags='{}' WHERE tags IS NULL`. Real fix is a
|
|
todo (decode Option + migration default).
|
|
- DB is Postgres 14 `guruconnect` on localhost; existing users (admin, howard, both role admin)
|
|
survive migration. Rollback: `git reset --hard <saved-sha>`, rebuild, restart, `psql < backup`.
|