Commit Graph

9 Commits

Author SHA1 Message Date
5ee6675337 feat(server): operator removal of stale sessions/machines (SPEC-004 Task 5, server)
All checks were successful
Build and Test / Build Agent (Windows) (push) Successful in 7m29s
Build and Test / Build Server (Linux) (push) Successful in 10m58s
Build and Test / Security Audit (push) Successful in 4m4s
Build and Test / Build Summary (push) Successful in 8s
Admin-gated soft-delete + purge so operators can clear ghost machines/sessions
(the ~15-rows-for-one-host accumulation) from the console.

- migration 009: deleted_at on connect_sessions + connect_machines, with partial
  indexes WHERE deleted_at IS NULL.
- DELETE /api/machines/:agent_id?purge=true and DELETE /api/sessions/:id?purge=true
  soft-delete the row and purge the in-memory session (remove_session); the
  non-purge path keeps the legacy hard-delete / live-only disconnect. POST
  /api/machines/bulk-remove handles multi-select (batch cap 500). All admin-gated
  (AdminUser -> 403; tightens the prior any-user delete) and audited to
  connect_session_events (actor + target + trusted client IP).
- list/get queries filter deleted_at IS NULL so removed units leave the console;
  upsert revives (deleted_at = NULL) a genuinely-reconnecting machine. The
  keyed-reattach identity resolver (get_machine_by_id) is intentionally unfiltered.

Dashboard removal UI is the A3b follow-up. 86 server tests pass; fmt/clippy/test
clean. Implements specs/v2-stable-identity/plan.md Task 5 (server portion).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-31 13:52:36 -07:00
ffca7f0cee feat(server): dedup machines on machine_uid (SPEC-004 Task 2)
Some checks failed
Build and Test / Build Server (Linux) (push) Failing after 3m48s
Build and Test / Build Agent (Windows) (push) Successful in 7m34s
Build and Test / Security Audit (push) Successful in 4m44s
Build and Test / Build Summary (push) Has been skipped
Persist the agent-reported machine_uid and dedup connect_machines on it so a
single physical machine can't register duplicate rows when its config-file
agent_id regenerates (the ghost-session root cause).

- migration 008: nullable connect_machines.machine_uid + partial unique index
  (WHERE machine_uid IS NOT NULL); idempotent, startup-applied.
- upsert_machine: two-path dedup (ON CONFLICT machine_uid when present, else
  the legacy ON CONFLICT agent_id path, unchanged).
- session reattach: a machine_uid index consulted before agent_id, with all
  removal paths purging it.
- security: keyed (cak_) agents stay authoritative — their claimed machine_uid
  is dropped (effective_machine_uid=None); uid is dedup-only for un-keyed /
  support-code agents. Startup restore skips uid-indexing keyed machines and
  fails closed if the keyed-set query errors.

74 server tests pass; clippy clean. Implements specs/v2-stable-identity/plan.md Task 2.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-31 12:06:50 -07:00
abc55abb0b fix(server): tolerate NULL connect_machines columns (tags decode bug)
Some checks failed
Build and Test / Build Server (Linux) (push) Failing after 3m37s
Build and Test / Build Agent (Windows) (push) Successful in 7m30s
Build and Test / Security Audit (push) Successful in 4m49s
Build and Test / Build Summary (push) Has been skipped
connect_machines.tags is text[] nullable with no default; the derived
FromRow decoded it as non-Option Vec<String>, so rows with NULL tags
threw "unexpected null" - breaking managed-session reconcile on startup
and the authed Machines list. Hit in production on the v2 cutover.

- Replace the derived FromRow on Machine with a manual impl that decodes
  every nullable-non-Option column as Option<T> with unwrap_or_default
  (tags, is_elevated, is_persistent, status, timestamps), fixing all six
  read sites at once. Public field types unchanged.
- migrations/007: backfill NULL tags to empty array, set DEFAULT '{}',
  set NOT NULL (no writer inserts NULL: upsert omits tags, metadata
  update binds a non-null array). Idempotent with the prod hot-patch.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-30 15:17:12 -07:00
bfcdbb5379 feat(server): v2 secure-session-core Task 4 - rate limit + single-use codes
Some checks failed
Build and Test / Build Server (Linux) (push) Failing after 6m12s
Build and Test / Build Agent (Windows) (push) Successful in 6m43s
Build and Test / Security Audit (push) Successful in 4m23s
Build and Test / Build Summary (push) Has been skipped
SPEC-002 Phase 1 Task 4 (the final keystone task), code-reviewed APPROVED.
Closes the audit's reusable-code HIGH and rate-limiting-disabled HIGH.

- Rebuilt rate limiting as a self-contained in-memory per-IP limiter (replaces
  the non-compiling tower_governor; removed that dep). Fixed-window caps wired
  to login (8/min), change-password (5/min), code-validate (15/min) -> 429;
  per-IP lockout after 10 consecutive failed code validations (15-min cooldown).
- Single-use support codes: atomic consume on first agent bind (in-memory
  Pending->Connected under write lock + DB conditional UPDATE), rejecting a
  second presenter; validate/preview does not consume.
- Widened code format: XXX-XXX-XXX, 31-char unambiguous alphabet (no 0/O/1/I/L),
  CSPRNG + rejection sampling, ~44.6 bits (replaces 6-digit numeric); migration
  006 widens the code columns to TEXT.

Completes the keystone (Tasks 1-4): every audit CRITICAL + HIGH in the secure
auth/session core is now addressed. Known follow-up todos (not blocking): (1)
trusted-proxy client-IP extraction (NPM-on-loopback collapses clients to
127.0.0.1); (2) multi-instance fail-closed DB single-use gate. Not
cargo-check-verified locally - build-host/CI verification follows this commit.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-29 21:04:54 -07:00
41691bfb2c feat(server): v2 secure-session-core Task 2 - auth rebuild
Some checks failed
Build and Test / Build Server (Linux) (push) Failing after 3m37s
Build and Test / Build Agent (Windows) (push) Successful in 6m37s
Build and Test / Security Audit (push) Successful in 4m10s
Build and Test / Build Summary (push) Has been skipped
SPEC-002 Phase 1 Task 2 (specs/v2-secure-session-core), code-reviewed APPROVED.

- DELETE the JWT-as-agent-key branch in relay validate_agent_api_key (audit
  CRITICAL): agent auth now = per-agent cak_ key (SHA-256 -> connect_agent_keys,
  revoked filtered) OR support code OR deprecated shared AGENT_API_KEY (warned).
  A user JWT can no longer authenticate an agent.
- auth/agent_keys.rs: cak_ gen (OsRng 256-bit) + SHA-256 hash + verify.
- auth/jwt.rs: ViewerClaims + create/validate_viewer_token (5-min TTL,
  purpose=viewer, session_id+tenant_id claims; non-interchangeable with login).
- Admin key issuance: POST/GET/DELETE /api/machines/:agent_id/keys.
- POST /api/sessions/:id/viewer-token mints a session-bound short-lived token.
- Migration 005: organization/site/tags on connect_machines (fixes the silent
  update_machine_metadata write, coord todo faf39fe0).

NOTE: viewer-token minting is gated by AuthenticatedUser only; the AUTHORIZATION
check (admin/permission gate) that closes audit CRITICAL #1 lands in Task 3 (the
viewer WS verification). The viewer WS path (relay/mod.rs:285) is untouched here.
Not cargo-check-verified (no toolchain on the authoring host) - self-reviewed.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-29 18:57:12 -07:00
fef8111ff3 feat(server): v2 secure-session-core Task 1 - schema + per-agent keys
All checks were successful
Build and Test / Build Agent (Windows) (push) Successful in 6m7s
Build and Test / Build Server (Linux) (push) Successful in 10m15s
Build and Test / Security Audit (push) Successful in 4m24s
Build and Test / Build Summary (push) Successful in 12s
SPEC-002 Phase 1 Task 1 (specs/v2-secure-session-core), code-reviewed APPROVED.

Migration 004 (idempotent, server-applied): tenants + seeded default tenant,
connect_agent_keys (hash-only, revocable, FK->connect_machines), nullable
tenant_id on all scoped tables (tenancy-ready, not tenant-yet), connect_sessions
is_managed/source/consent_state, connect_support_codes consumed_at. New db
modules agent_keys.rs (stores only key_hash) + tenancy.rs (DEFAULT_TENANT_ID,
Phase-4 switch point). Struct/query updates across machines/sessions/
support_codes/events/users. Runtime sqlx throughout (GC db layer already uses
it - no compile-time macros).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-29 18:33:26 -07:00
4e5328fe4a Implement robust auto-update system for GuruConnect agent
Features:
- Agent checks for updates periodically (hourly) during idle
- Admin can trigger immediate updates via dashboard "Update Agent" button
- Silent updates with in-place binary replacement (no reboot required)
- SHA-256 checksum verification before installation
- Semantic version comparison

Server changes:
- New releases table for tracking available versions
- GET /api/version endpoint for agent polling (unauthenticated)
- POST /api/machines/:id/update endpoint for admin push updates
- Release management API (/api/releases CRUD)
- Track agent_version in machine status

Agent changes:
- New update.rs module with download/verify/install/restart logic
- Handle ADMIN_UPDATE WebSocket command for push updates
- --post-update flag for cleanup after successful update
- Periodic update check in idle loop (persistent agents only)
- agent_version included in AgentStatus messages

Dashboard changes:
- Version display in machine detail panel
- "Update Agent" button for each connected machine

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-30 09:31:23 -07:00
3fc4e1f96a Add user management system with JWT authentication
- Database schema: users, permissions, client_access tables
- Auth: JWT tokens with Argon2 password hashing
- API: login, user CRUD, permission management
- Dashboard: login required, admin Users tab
- Auto-creates initial admin user on first run
2025-12-29 21:00:20 -07:00
f6bf0cfd26 Add PostgreSQL database persistence
- Add connect_machines, connect_sessions, connect_session_events, connect_support_codes tables
- Implement db module with connection pooling (sqlx)
- Add machine persistence across server restarts
- Add audit logging for session/viewer events
- Support codes now persisted to database

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-28 19:51:01 -07:00