spec: add SPEC-017 end-user (sub-user) remote access

2026-06-02 12:56:15 -07:00
parent 367906bd54
commit 4c49b73a71
2 changed files with 181 additions and 0 deletions


@@ -95,6 +95,7 @@ Bringing GC to parity with GuruRMM's release engineering. Full plan: [SPEC-001](
- [ ] **Valuable error messages (structured errors + no silent swallows)** — P2 — one structured API error envelope with stable codes + a correlation id that also lands in the logs; contextual tracing on server/agent; sweep the 37 `let _ =` swallows (the pattern that hid the migration-005 bug); dashboard surfaces the real cause + id instead of a generic line. **[→ v2 Phase 0/1 conventions]** ([SPEC-008](specs/SPEC-008-valuable-error-messages.md))
- [ ] **Feature-rich, fully-documented management API** — P2 — everything the console can do, callable by API: OpenAPI 3.x generated from code (utoipa) + browsable docs at `/api/docs`, long-lived revocable scoped API tokens (PAT-style, distinct from the 24h JWT + agent keys), an API-completeness gap audit, and consistent pagination/error conventions. Distinct from the ADR-001 RMM integration contract. **[→ v2 Phase 3]** ([SPEC-009](specs/SPEC-009-feature-rich-documented-api.md))
- [ ] **Branding and white-label configuration** — P2 — Allow MSPs to customize logo, colors, and product name for white-labeled remote support. Dashboard admin settings page with logo upload (PNG/SVG, max 2MB), brand hue slider (OKLCH 0-360°, default 184=cyan), product name override, company name, and favicon. Agent tray tooltip uses custom product name from registry. Singleton database table with public GET endpoint for unauthenticated rendering. CSS variables (`--brand-hue`, `--accent`, `--panel`) for dynamic theming. **[→ v2 Phase 2]** ([SPEC-014](specs/SPEC-014-branding-whitelabel.md))
- [ ] **End-user (sub-user) remote access** — P2 (may be P3) — let a client pay for their employees to reach their *own* machines from home: a deny-by-default `end_user` login role, a locked-down end-user portal listing only granted machines, and Connect reusing the existing session-scoped viewer-token + relay path. Grant primitive already exists (`user_client_access`, migration 002); directory sync (AD/Entra/Google) is a separate future spec. **[→ new capability, post v2-console]** ([SPEC-017](specs/SPEC-017-end-user-remote-access.md))
- [ ] Programmatic session pre-create + viewer-token (integration contract) — P2
## Security & Infrastructure


@@ -0,0 +1,180 @@
# SPEC-017: End-User (Sub-User) Remote Access
**Status:** Proposed
**Priority:** P2 (may settle to P3 depending on client demand)
**Requested By:** Mike (2026-06-02)
**Estimated Effort:** Large
## Overview
Let a client pay for their own employees to remotely reach **their own work machines** from home
through GuruConnect — the Splashtop-Business / unattended-end-user-access model, layered on top of the
MSP-technician console GuruConnect ships today. An MSP admin (or, later, a delegated client-company
admin) provisions a list of **end-users** and grants each one access to specific managed machines. The
end-user signs into a locked-down **end-user portal**, sees only the machines granted to them, and
connects — reusing the existing persistent-agent + session-scoped-viewer-token + relay path.
Success criteria: an `end_user`-role account can log in at a separate portal, see exactly the machines
in its grant set (no others, and none from any other tenant), launch a control session to an online granted
machine, and is hard-denied from every technician/admin API, the agent plane, and any machine it was
not granted — with each login and machine access written to the audit log.
This is a net-new **sellable capability**, not a console-MVP blocker. It is sequenced after the v2
console foundations it depends on (tenancy, machine identity, persistent enrollment), which is why it is
P2 rather than P1.
## Scope
### Included in v1
- A new **`end_user`** value for `users.role`, provisioned by an MSP admin, with **deny-by-default**
authority: no console permissions, no agent-plane access, machine reach limited strictly to its
`user_client_access` grant set within its own tenant.
- A **separate end-user login + portal** route (locked-down): lists only granted machines with
online/offline state and a Connect action. No admin nav, no other users/machines/companies.
- **Admin UI + API** to create/disable end-users and assign/revoke per-machine grants, reusing the
existing `user_client_access` table.
- **Connect flow** that reuses the landed session-scoped viewer-token mechanism (`ViewerClaims`,
`jwt.rs:114`) and the relay enforcement path — no new transport.
- A new `connect_sessions.source` value **`end_user`** (migration widening the existing CHECK).
- **Audit**: end-user login success/failure and each machine-access grant-check written to
`connect_session_events`.
- Rate limiting + lockout on the public end-user login.
### Explicitly out of scope (v1)
- **Directory sync (AD / Entra-365 / Google) → end-user list** — its own future spec; v1 is manual
list management only.
- **Self-service seat purchasing / billing automation.** v1 records/counts seats per tenant; real
metering and Syncro/billing wiring are deferred.
- **Delegated client-company-admin role** (a client managing its own end-users/grants) — noted as a
fast-follow; v1 grants are MSP-admin-managed.
- Per-session view-only-vs-control *policy* per end-user (v1 = Control of one's own machine; the
`ViewerAccess` split still exists at the token layer).
- File transfer, session recording (already out of scope for the broader product v1).
## Architecture
### Principal model — `end_user` is a constrained variant of the login plane
GuruConnect already has three credential planes that must stay separate (audit-hardened in v2 Phase 1):
1. **Login `Claims`** (`jwt.rs:11`) — dashboard users; `role ∈ {admin, operator, viewer}` today.
2. **Session-scoped `ViewerClaims`** (`jwt.rs:114`) — 5-min, one session, `purpose=viewer`.
3. **Agent `cak_` keys** (`connect_agent_keys`, migration 004) — agents only.
`end_user` is added as a **fourth role on the login plane** — it issues a normal login JWT
(`create_token`, `jwt.rs:161`) carrying `role: "end_user"` and an **empty permission list**. The
separation guarantees the v2 audit established are preserved: an `end_user` JWT still cannot be used as
a viewer token (it lacks `purpose`) or as an agent key (the agent plane rejects user JWTs).
**Critical authz inversion:** `user_client_access` today documents "no entries = access to all (for
admins)" (migration 002, line 25-26). The grant check **must branch on role** — for `end_user`, an
empty grant set means **zero** machines, never all. Authz is deny-by-default and grant-scoped; the
admin-bypass in `Claims::has_permission` (`jwt.rs:28-33`) must never fire for `end_user`.
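
The guard described above can be sketched as follows. This is a minimal illustration, not the code in `jwt.rs`: the `Claims` struct here is a simplified stand-in, the real body of `has_permission` is not shown in this spec, and the permission string `machines:read` used in the usage note is hypothetical.

```rust
/// Simplified, hypothetical stand-in for the login `Claims` in `jwt.rs`.
#[derive(Debug, Clone)]
pub struct Claims {
    pub role: String,
    pub permissions: Vec<String>,
}

impl Claims {
    /// True only for the constrained end-user principal.
    pub fn is_end_user(&self) -> bool {
        self.role == "end_user"
    }

    /// Permission check where the admin short-circuit is an explicit match arm,
    /// so no other role (including `end_user`) can fall into it by accident.
    pub fn has_permission(&self, permission: &str) -> bool {
        let explicit = self.permissions.iter().any(|p| p == permission);
        match self.role.as_str() {
            // Admin bypass: applies to the admin role only.
            "admin" => true,
            // Known non-admin roles get explicit permissions only.
            // For `end_user` the spec issues an empty list, so this is always false.
            "operator" | "viewer" | "end_user" => explicit,
            // Unknown roles are denied outright (deny-by-default).
            _ => false,
        }
    }
}
```

With an empty permission list, `Claims { role: "end_user".into(), permissions: vec![] }.has_permission("machines:read")` evaluates to `false`, while the same call on an `admin` claim still returns `true`. Matching on an explicit role list rather than "anything that is not X" keeps a future fifth role from inheriting authority silently.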
### Agent / Relay-server / Viewer / Dashboard responsibilities
- **Agent:** no changes. End-users connect to existing **persistent/unattended** managed agents
(consent `not_required` — it is the user's own machine). Optionally honors the SPEC-015 notification
overlay if a per-machine policy requires it.
- **Relay-server:** no transport change. New end-user auth + portal + connect endpoints; the
grant-check + viewer-token mint is the only new server logic on the hot path.
- **Viewer:** reuse the React/TS web viewer (`dashboard/src/components/RemoteViewer.tsx`) — the
end-user portal embeds the same component with a Control-mode viewer token.
- **Dashboard:** new **role-gated end-user portal** route (recommended separate from the technician
console — see Open Questions), plus admin screens for end-user + grant management.
### Database (migrations)
- **`user_client_access`** — reused as the grant table; no schema change (already
`user_id UUID × client_id UUID → connect_machines(id)`, unique pair, migration 002).
- New migration `011_end_user_access.sql`:
  - Widen `connect_sessions.source` CHECK to `('standalone','gururmm','end_user')` (currently
    `('standalone','gururmm')`, migration 004 line 99-102).
  - Optional `users` columns for the external principal: `mfa_secret TEXT NULL`,
    `must_change_password BOOLEAN NOT NULL DEFAULT false`, and a partial index for fast
    `role='end_user'` listing per `tenant_id`.
  - (Seat tracking, if landed in v1: a lightweight per-tenant `end_user` count view or a
    `tenant_seats` row — kept minimal.)
- Grants are tenant-contained: insert path validates `machine.tenant_id == end_user.tenant_id`.
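
On the server side, the widened CHECK could be mirrored by a closed enum so that an unsupported `source` value is rejected before it reaches the database. The sketch below is an assumption about how that might look; the `SessionSource` type and its method names are hypothetical and do not come from the existing codebase.

```rust
use std::str::FromStr;

/// Hypothetical Rust mirror of the `connect_sessions.source` CHECK constraint.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SessionSource {
    Standalone,
    GuruRmm,
    EndUser,
}

impl SessionSource {
    /// The exact string stored in the `source` column.
    pub fn as_str(self) -> &'static str {
        match self {
            SessionSource::Standalone => "standalone",
            SessionSource::GuruRmm => "gururmm",
            SessionSource::EndUser => "end_user",
        }
    }
}

impl FromStr for SessionSource {
    type Err = String;

    /// Accepts only the three values the widened CHECK allows.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "standalone" => Ok(SessionSource::Standalone),
            "gururmm" => Ok(SessionSource::GuruRmm),
            "end_user" => Ok(SessionSource::EndUser),
            other => Err(format!("unknown session source: {}", other)),
        }
    }
}
```

Keeping the column strings in one place means the migration and the server agree on the vocabulary; `"end_user".parse::<SessionSource>()` succeeds, and any other new value fails until both the enum and the CHECK are extended.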
### API endpoints / WS messages
- `POST /api/enduser/auth/login` — public, rate-limited; returns an `end_user` login JWT.
- `GET /api/enduser/machines` — lists only the caller's granted, in-tenant machines + presence.
- `POST /api/enduser/machines/:id/connect` — grant-checked; creates a `source=end_user` session and
mints a Control `ViewerClaims` token (`create_viewer_token`, `jwt.rs:233`) for that session.
- Admin: `POST /api/users` (role=end_user), `POST /api/users/:id/grants`,
`DELETE /api/users/:id/grants/:machine_id`, `GET /api/users?role=end_user`.
- No new protobuf messages — the WS viewer path and `guruconnect.proto` are unchanged.
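
The split between the `/api/enduser/*` namespace and the console routes can be expressed as a small, pure decision function that a middleware or extractor would call. This is a sketch under assumptions: the `Role`, `Decision`, and `authorize_route` names are hypothetical, and only the public end-user login route is modelled as unauthenticated (the console's own public routes are out of scope here).

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Role {
    Admin,
    Operator,
    Viewer,
    EndUser,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Decision {
    Allow,
    /// Authenticated, but the role may not use this namespace (HTTP 403).
    Forbidden,
    /// No valid login JWT presented (HTTP 401).
    Unauthenticated,
}

const ENDUSER_PREFIX: &str = "/api/enduser/";
const ENDUSER_LOGIN: &str = "/api/enduser/auth/login";

/// Namespace gate: end-users may only use `/api/enduser/*`, and every other
/// role is kept out of that namespace. `role` is `None` when no JWT validated.
pub fn authorize_route(role: Option<Role>, path: &str) -> Decision {
    // The end-user login endpoint is public (rate limiting happens elsewhere).
    if path == ENDUSER_LOGIN {
        return Decision::Allow;
    }
    let Some(role) = role else {
        return Decision::Unauthenticated;
    };
    let in_enduser_namespace = path.starts_with(ENDUSER_PREFIX);
    match (role, in_enduser_namespace) {
        (Role::EndUser, true) => Decision::Allow,
        (Role::EndUser, false) => Decision::Forbidden,
        (_, true) => Decision::Forbidden,
        // Console routes still apply their own per-permission checks afterwards.
        (_, false) => Decision::Allow,
    }
}
```

For example, `authorize_route(Some(Role::EndUser), "/api/users")` and `authorize_route(Some(Role::Operator), "/api/enduser/machines")` both yield `Decision::Forbidden`. An `Allow` here only means the namespace matches; the per-machine grant check still runs inside each end-user handler.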
## Implementation details
- `server/src/auth/jwt.rs` — extend the role vocabulary doc (`Claims.role`, line 16-17); add an
`is_end_user()` helper and ensure `has_permission` cannot grant `end_user` anything beyond explicit
permissions (the admin short-circuit at line 30 must be guarded).
- `server/src/auth/mod.rs` — `AuthenticatedUser` (line 29+) gains role-aware helpers; add an extractor
/ middleware that rejects non-`end_user` on the `/api/enduser/*` namespace and rejects `end_user` on
every console/admin route (deny-by-default allowlist).
- `server/src/api/` — new `enduser` handler module (login, machines, connect); admin user+grant
handlers extended for `role=end_user` and `user_client_access` writes.
- Grant check (shared fn): `machine_id ∈ user_client_access[user] AND machine.tenant_id == user.tenant_id`;
used by both `GET /machines` and `connect`.
- Session create stamps `source='end_user'`, `is_managed=true`/unattended, `consent_state='not_required'`,
then mints the viewer token via the existing path so relay enforcement is unchanged.
- `dashboard/src/` — end-user portal route (role-gated), reusing `RemoteViewer.tsx`; admin grant-matrix
UI. White-label (SPEC-014) applies to the portal as the most client-facing surface.
- Migration `server/migrations/011_end_user_access.sql` as above (idempotent; applied by
`sqlx::migrate!` per the migration standard).
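
The shared grant check listed above might look like the following. It is a sketch only: ids are `u64` stand-ins for the real UUIDs, and the `EndUser`, `Machine`, `may_access`, and `visible_machines` names are hypothetical. In the real server the grant set would be loaded from `user_client_access` rather than held in memory.

```rust
use std::collections::HashSet;

/// Stand-in for the UUID ids in the real schema (assumption, keeps the sketch std-only).
pub type Id = u64;

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Machine {
    pub id: Id,
    pub tenant_id: Id,
}

/// Hypothetical view of an end-user plus the machine ids granted to them.
pub struct EndUser {
    pub tenant_id: Id,
    pub granted_machine_ids: HashSet<Id>,
}

/// Shared grant check: the machine must be explicitly granted AND live in the
/// user's tenant. An empty grant set therefore denies every machine.
pub fn may_access(user: &EndUser, machine: &Machine) -> bool {
    user.granted_machine_ids.contains(&machine.id) && machine.tenant_id == user.tenant_id
}

/// Listing path: filters with the same predicate the connect path uses,
/// so the portal can never show a machine that connect would refuse.
pub fn visible_machines<'a>(user: &EndUser, machines: &'a [Machine]) -> Vec<&'a Machine> {
    machines.iter().filter(|m| may_access(user, m)).collect()
}
```

Routing both `GET /machines` and `connect` through one predicate is the point of the design: a grant row that points at another tenant's machine (for example, left over from a bad write) is still refused at read time because the tenant equality is checked again.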
## Security considerations
- **Preserve the plane separation** audited in v2 Phase 1 — `end_user` is login-plane only; it can
never satisfy `validate_viewer_token` or the agent `cak_` path.
- **Deny-by-default, grant-scoped:** empty `user_client_access` for an `end_user` = no access; the
admin-bypass must not apply. Every `/api/enduser/*` call re-checks the grant + tenant server-side
(never trust a machine id from the client).
- **Tenant containment:** an `end_user` and its grants live in one tenant; cross-tenant grants are
rejected at write and re-validated at connect. (Full tenant isolation lands with Phase 4; v1 enforces
via explicit `tenant_id` equality checks.)
- **External-user trust:** these accounts are public-internet-facing from home. Require
rate-limiting + lockout on `/api/enduser/auth/login`; support, and preferably require, **TOTP MFA** for
`end_user`. The schema column is included so MFA can ship in v1 or as an immediate fast-follow without a
second migration. Passwords use Argon2id (existing standard).
- **Audit:** log each end-user login (success/failure, source IP) and each machine access to
`connect_session_events`; the unattended access is to the user's *own* machine but must be fully
traceable. Optionally enforce the SPEC-015 overlay per machine policy.
- **Threat model:** stolen end-user creds reach only that user's granted machines (blast radius =
grant set), never the console, never the agent plane, never another tenant. Disabling the account
(`users.enabled=false`) immediately revokes portal + future tokens; the 5-min viewer-token TTL bounds
any in-flight session.
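
One way to picture the login lockout requirement is a per-key failure counter with a timed lock. The sketch below is illustrative only: the `LoginLockout` type, its thresholds, and the choice of key (username, source IP, or both) are assumptions, and a production version would need shared or persistent state rather than an in-process map.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

struct Entry {
    failures: u32,
    locked_until: Option<Instant>,
}

/// Hypothetical in-memory lockout tracker for the public end-user login.
pub struct LoginLockout {
    max_failures: u32,
    lockout: Duration,
    state: HashMap<String, Entry>,
}

impl LoginLockout {
    pub fn new(max_failures: u32, lockout: Duration) -> Self {
        Self { max_failures, lockout, state: HashMap::new() }
    }

    /// True while the key is inside an active lockout window.
    pub fn is_locked(&self, key: &str, now: Instant) -> bool {
        self.state
            .get(key)
            .and_then(|e| e.locked_until)
            .map_or(false, |until| now < until)
    }

    /// Record a failed attempt; locks the key once the threshold is reached.
    pub fn record_failure(&mut self, key: &str, now: Instant) {
        let entry = self
            .state
            .entry(key.to_string())
            .or_insert(Entry { failures: 0, locked_until: None });
        // An expired lock starts a fresh counting window.
        if let Some(until) = entry.locked_until {
            if now >= until {
                entry.failures = 0;
                entry.locked_until = None;
            }
        }
        entry.failures += 1;
        if entry.failures >= self.max_failures {
            entry.locked_until = Some(now + self.lockout);
        }
    }

    /// A successful login clears the key's failure history.
    pub fn record_success(&mut self, key: &str) {
        self.state.remove(key);
    }
}
```

The login handler would call `is_locked` before verifying the password and reject early if it returns `true`, then call `record_failure` or `record_success` depending on the outcome. Passing `now` in explicitly keeps the logic testable without sleeping.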
## Testing strategy
- **Unit:** grant-check fn (granted / not-granted / cross-tenant / empty-set-for-end_user = deny);
`has_permission` never elevates `end_user`; role-namespace middleware (end_user→console = 403,
technician→/api/enduser = 403).
- **Integration:** end-user login → list shows only granted machines → connect mints a Control viewer
token for a `source=end_user` session → relay admits; connect to a non-granted / other-tenant machine
→ 403; disabled account → login + token use rejected.
- **Manual:** full portal walkthrough from an off-network browser; MFA enrol + challenge; audit rows
present for login and access; white-label branding renders on the portal.
## Effort estimate & dependencies
- **Size:** Large (new principal + portal + admin grant UI + auth namespace; transport/agent untouched
and the grant table already exists, which holds it below X-Large).
- **Depends on (must precede / strongly preferred):**
  - **Tenancy** (`tenants` + `tenant_id`, migration 004) — needed for containment; full isolation is
    Phase 4 but v1 uses explicit tenant checks.
  - **Stable machine identity + persistent enrollment** (SPEC-004 / 008 `machine_uid`, SPEC-016
    zero-touch `cak_`) — end-users reach persistent managed agents.
  - **Session-scoped viewer tokens** (v2 Phase 1, landed) — reused directly.
- **Pairs with:** SPEC-014 (white-label — the portal is the client-facing surface), SPEC-003/005
(machine inventory/list — portal machine rows), SPEC-015 (optional connect-notification overlay).
- **Unblocks:** the directory-sync spec (AD/Entra/Google → end-user list), delegated client-admin role,
and per-seat billing — all of which build on the `end_user` principal defined here.
## Open questions
1. **Same console vs separate end-user portal?** Recommendation: **separate, role-gated route**, for a
smaller attack surface, no risk of leaking technician controls, and cleaner white-label. Confirm before
build.
2. **End-users in the existing `users` table (role=end_user) vs a dedicated `end_users` table?**
Recommendation: reuse `users` (the grant FK `user_client_access.user_id` already points there) with
hard role guardrails. Revisit if mixing external + internal principals in one table proves risky.
3. **MFA in v1 or immediate fast-follow?** Schema is included either way; decide enforcement timing.
4. **Who administers grants in v1** — MSP admin only (assumed), or ship the delegated client-company
admin role together? (Affects scope/effort materially.)
5. **Seat/licensing enforcement depth for v1** — count-and-display vs hard-cap vs billing-integrated.
6. **Default access mode** — Control assumed (own machine); should an admin be able to pin a machine to
view-only for a given end-user? (Token layer already supports it.)