guru-connect/docs/specs/SPEC-017-end-user-remote-access.md

# SPEC-017: End-User (Sub-User) Remote Access

**Status:** Proposed
**Priority:** P2 (may settle to P3 depending on client demand)
**Requested By:** Mike (2026-06-02)
**Estimated Effort:** Large

## Overview

Let a client pay for their own employees to remotely reach **their own work machines** from home
through GuruConnect — the Splashtop-Business / unattended-end-user-access model, layered on top of the
MSP-technician console GuruConnect ships today. An MSP admin (or, later, a delegated client-company
admin) provisions a list of **end-users** and grants each one access to specific managed machines. The
end-user signs into a locked-down **end-user portal**, sees only the machines granted to them, and
connects — reusing the existing persistent-agent + session-scoped-viewer-token + relay path.

Success criteria: an `end_user`-role account can log in at a separate portal, see exactly the machines
in its grant set (no others, and none from another tenant), and launch a control session to an online
granted machine. It is hard-denied from every technician/admin API, the agent plane, and any machine it
was not granted, and each login and machine access is written to the audit log.

This is a net-new **sellable capability**, not a console-MVP blocker. It is sequenced after the v2
console foundations it depends on (tenancy, machine identity, persistent enrollment), which is why it is
P2 rather than P1.

## Scope

### Included in v1
- A new **`end_user`** value for `users.role`, provisioned by an MSP admin, with **deny-by-default**
  authority: no console permissions, no agent-plane access, machine reach limited strictly to its
  `user_client_access` grant set within its own tenant.
- A **separate end-user login + portal** route (locked-down): lists only granted machines with
  online/offline state and a Connect action. No admin nav, no other users/machines/companies.
- **Admin UI + API** to create/disable end-users and assign/revoke per-machine grants, reusing the
  existing `user_client_access` table.
- **Connect flow** that reuses the landed session-scoped viewer-token mechanism (`ViewerClaims`,
  `jwt.rs:114`) and the relay enforcement path — no new transport.
- A new `connect_sessions.source` value **`end_user`** (migration widening the existing CHECK).
- **Audit**: end-user login success/failure and each machine-access grant-check written to
  `connect_session_events`.
- Rate limiting + lockout on the public end-user login.

### Explicitly out of scope (v1)
- **Directory sync (AD / Entra-365 / Google) → end-user list** — its own future spec; v1 is manual
  list management only.
- **Self-service seat purchasing / billing automation.** v1 records/counts seats per tenant; real
  metering and Syncro/billing wiring is deferred.
- **Delegated client-company-admin role** (a client managing its own end-users/grants) — noted as a
  fast-follow; v1 grants are MSP-admin-managed.
- Per-session view-only-vs-control *policy* per end-user (v1 = Control of one's own machine; the
  `ViewerAccess` split still exists at the token layer).
- File transfer, session recording (already out of scope for the broader product v1).

## Architecture

### Principal model — `end_user` is a constrained variant of the login plane
GuruConnect already has three credential planes that must stay separate (audit-hardened in v2 Phase 1):
1. **Login `Claims`** (`jwt.rs:11`) — dashboard users; `role ∈ {admin, operator, viewer}` today.
2. **Session-scoped `ViewerClaims`** (`jwt.rs:114`) — 5-min, one session, `purpose=viewer`.
3. **Agent `cak_` keys** (`connect_agent_keys`, migration 004) — agents only.

`end_user` is added as a **fourth role on the login plane**: a successful login yields a normal login JWT
(`create_token`, `jwt.rs:161`) carrying `role: "end_user"` and an **empty permission list**. The
separation guarantees established by the v2 audit are preserved: an `end_user` JWT can be used neither as
a viewer token (it lacks `purpose`) nor as an agent key (the agent plane rejects user JWTs).

**Critical authz inversion:** `user_client_access` today documents "no entries = access to all (for
admins)" (migration 002, lines 25-26). The grant check **must branch on role**: for `end_user`, an
empty grant set means **zero** machines, never all. Authz is deny-by-default and grant-scoped; the
admin-bypass in `Claims::has_permission` (`jwt.rs:28-33`) must never fire for `end_user`.
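
The role branch described above can be sketched as follows. This is a minimal illustration, not the code in `jwt.rs`: the `Role` enum, the field names, and `may_reach_machine` are hypothetical, `u64` stands in for the UUID machine id, and treating `operator`/`viewer` as grant-scoped is an assumption the spec does not state.

```rust
use std::collections::HashSet;

/// Roles on the login plane. `EndUser` is the value this spec adds.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Role {
    Admin,
    Operator,
    Viewer,
    EndUser,
}

/// Hypothetical stand-in for the login `Claims`; field names are assumed.
pub struct Claims {
    pub role: Role,
    pub permissions: Vec<String>,
}

impl Claims {
    pub fn is_end_user(&self) -> bool {
        self.role == Role::EndUser
    }

    /// The admin short-circuit fires only for `Role::Admin`. Every other
    /// role, including `EndUser`, gets nothing beyond its explicit list.
    pub fn has_permission(&self, permission: &str) -> bool {
        if self.role == Role::Admin {
            return true;
        }
        self.permissions.iter().any(|p| p == permission)
    }
}

/// Role-aware reading of a user's `user_client_access` entries.
pub fn may_reach_machine(role: Role, grants: &HashSet<u64>, machine_id: u64) -> bool {
    match role {
        // Documented existing behaviour: an admin with no entries reaches all machines.
        Role::Admin if grants.is_empty() => true,
        // Assumption for operator/viewer; required for `EndUser`:
        // an explicit grant is needed, so an empty set means no machines.
        _ => grants.contains(&machine_id),
    }
}
```

With an empty grant set, `may_reach_machine(Role::Admin, ..)` is true while `may_reach_machine(Role::EndUser, ..)` is false, which is the inversion the paragraph calls out.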

### Agent / Relay-server / Viewer / Dashboard responsibilities
- **Agent:** no changes. End-users connect to existing **persistent/unattended** managed agents
  (consent `not_required` — it is the user's own machine). Optionally honors the SPEC-015 notification
  overlay if a per-machine policy requires it.
- **Relay-server:** no transport change. New end-user auth + portal + connect endpoints; the
  grant-check + viewer-token mint is the only new server logic on the hot path.
- **Viewer:** reuse the React/TS web viewer (`dashboard/src/components/RemoteViewer.tsx`) — the
  end-user portal embeds the same component with a Control-mode viewer token.
- **Dashboard:** new **role-gated end-user portal** route (recommended separate from the technician
  console — see Open Questions), plus admin screens for end-user + grant management.

### Database (migrations)
- **`user_client_access`** — reused as the grant table; no schema change (already
  `user_id UUID × client_id UUID → connect_machines(id)`, unique pair, migration 002).
- New migration `011_end_user_access.sql`:
  - Widen `connect_sessions.source` CHECK to `('standalone','gururmm','end_user')` (currently
    `('standalone','gururmm')`, migration 004 lines 99-102).
  - Optional `users` columns for the external principal: `mfa_secret TEXT NULL`,
    `must_change_password BOOLEAN NOT NULL DEFAULT false`, and a partial index for fast
    `role='end_user'` listing per `tenant_id`.
  - (Seat tracking, if landed in v1: a lightweight per-tenant `end_user` count view or a
    `tenant_seats` row — kept minimal.)
- Grants are tenant-contained: insert path validates `machine.tenant_id == end_user.tenant_id`.
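
The tenant check on the grant insert path might look like the sketch below. The row structs, `GrantError`, and `validate_grant` are hypothetical names introduced for illustration; only the `tenant_id` equality rule and the `(user_id, client_id)` pair come from the spec, and `u64` stands in for UUID.

```rust
/// Hypothetical row shapes carrying only the fields the check needs.
#[derive(Debug, Clone, Copy)]
pub struct UserRow {
    pub id: u64,
    pub tenant_id: u64,
}

#[derive(Debug, Clone, Copy)]
pub struct MachineRow {
    pub id: u64,
    pub tenant_id: u64,
}

#[derive(Debug, PartialEq, Eq)]
pub enum GrantError {
    /// The machine belongs to a different tenant than the end-user.
    CrossTenant { user_tenant: u64, machine_tenant: u64 },
}

/// A validated pair, ready to insert into `user_client_access`.
#[derive(Debug, PartialEq, Eq)]
pub struct GrantRow {
    pub user_id: u64,
    pub client_id: u64,
}

/// Rejects cross-tenant grants before anything is written.
pub fn validate_grant(user: UserRow, machine: MachineRow) -> Result<GrantRow, GrantError> {
    if user.tenant_id != machine.tenant_id {
        return Err(GrantError::CrossTenant {
            user_tenant: user.tenant_id,
            machine_tenant: machine.tenant_id,
        });
    }
    Ok(GrantRow { user_id: user.id, client_id: machine.id })
}
```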

### API endpoints / WS messages
- `POST /api/enduser/auth/login` — public, rate-limited; returns an `end_user` login JWT.
- `GET  /api/enduser/machines` — lists only the caller's granted, in-tenant machines + presence.
- `POST /api/enduser/machines/:id/connect` — grant-checked; creates a `source=end_user` session and
  mints a Control `ViewerClaims` token (`create_viewer_token`, `jwt.rs:233`) for that session.
- Admin: `POST /api/users` (role=end_user), `POST /api/users/:id/grants`,
  `DELETE /api/users/:id/grants/:machine_id`, `GET /api/users?role=end_user`.
- No new protobuf messages — the WS viewer path and `guruconnect.proto` are unchanged.
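
As a rough sketch of how the list and connect endpoints could share one grant check: all type and function names below are hypothetical, `u64` stands in for UUID, the plain `ViewerToken` struct stands in for a signed `ViewerClaims` JWT, and returning a distinct error for an offline machine is an assumption. The `end_user` source value, `not_required` consent, `purpose=viewer`, Control access, and the 5-minute TTL are taken from this spec.

```rust
use std::collections::HashSet;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SessionSource {
    Standalone,
    GuruRmm,
    EndUser,
}

impl SessionSource {
    /// String form matching the widened `connect_sessions.source` CHECK.
    pub fn as_str(self) -> &'static str {
        match self {
            SessionSource::Standalone => "standalone",
            SessionSource::GuruRmm => "gururmm",
            SessionSource::EndUser => "end_user",
        }
    }
}

pub struct EndUser {
    pub tenant_id: u64,
    pub enabled: bool,
    pub grants: HashSet<u64>,
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Machine {
    pub id: u64,
    pub tenant_id: u64,
    pub online: bool,
}

#[derive(Debug, PartialEq, Eq)]
pub struct Session {
    pub id: u64,
    pub machine_id: u64,
    pub source: SessionSource,
    pub is_managed: bool,
    pub consent_state: &'static str,
}

#[derive(Debug, PartialEq, Eq)]
pub struct ViewerToken {
    pub session_id: u64,
    pub purpose: &'static str,
    pub access: &'static str,
    pub ttl_secs: u64,
}

#[derive(Debug, PartialEq, Eq)]
pub enum ConnectError {
    Forbidden,
    Offline,
}

/// Shared check: account enabled, machine in the grant set, same tenant.
pub fn grant_allows(user: &EndUser, machine: &Machine) -> bool {
    user.enabled && user.grants.contains(&machine.id) && machine.tenant_id == user.tenant_id
}

/// `GET /api/enduser/machines`: only granted, in-tenant machines.
pub fn list_machines<'a>(user: &EndUser, all: &'a [Machine]) -> Vec<&'a Machine> {
    all.iter().filter(|m| grant_allows(user, m)).collect()
}

/// `POST /api/enduser/machines/:id/connect`: re-check, create session, mint token.
pub fn connect(
    user: &EndUser,
    machine: &Machine,
    session_id: u64,
) -> Result<(Session, ViewerToken), ConnectError> {
    if !grant_allows(user, machine) {
        return Err(ConnectError::Forbidden);
    }
    if !machine.online {
        return Err(ConnectError::Offline);
    }
    let session = Session {
        id: session_id,
        machine_id: machine.id,
        source: SessionSource::EndUser,
        is_managed: true,
        consent_state: "not_required",
    };
    let token = ViewerToken { session_id, purpose: "viewer", access: "control", ttl_secs: 300 };
    Ok((session, token))
}
```

Running the same `grant_allows` in both handlers means a machine id supplied by the client is never trusted on its own: a granted id in another tenant fails exactly like an ungranted one.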

## Implementation details
- `server/src/auth/jwt.rs` — extend the role vocabulary doc (`Claims.role`, line 16-17); add an
  `is_end_user()` helper and ensure `has_permission` cannot grant `end_user` anything beyond explicit
  permissions (the admin short-circuit at line 30 must be guarded).
- `server/src/auth/mod.rs` — `AuthenticatedUser` (line 29+) gains role-aware helpers; add an extractor
  / middleware that rejects non-`end_user` on the `/api/enduser/*` namespace and rejects `end_user` on
  every console/admin route (deny-by-default allowlist).
- `server/src/api/` — new `enduser` handler module (login, machines, connect); admin user+grant
  handlers extended for `role=end_user` and `user_client_access` writes.
- Grant check (shared fn): `machine_id ∈ user_client_access[user] AND machine.tenant_id == user.tenant_id`;
  used by both `GET /machines` and `connect`.
- Session create stamps `source='end_user'`, `is_managed=true`/unattended, `consent_state='not_required'`,
  then mints the viewer token via the existing path so relay enforcement is unchanged.
- `dashboard/src/` — end-user portal route (role-gated), reusing `RemoteViewer.tsx`; admin grant-matrix
  UI. White-label (SPEC-014) applies to the portal as the most client-facing surface.
- Migration `server/migrations/011_end_user_access.sql` as above (idempotent; applied by
  `sqlx::migrate!` per the migration standard).
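
The role-namespace rule from the `auth/mod.rs` item can be illustrated with a small pure function. This is a sketch under assumptions: `Role`, `Decision`, and `gate` are hypothetical names, the real extractor/middleware would wrap something like this in the server's routing layer, and the public login route is assumed to be routed outside the gate because it runs before any role is known.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Role {
    Admin,
    Operator,
    Viewer,
    EndUser,
}

#[derive(Debug, PartialEq, Eq)]
pub enum Decision {
    Allow,
    /// Maps to HTTP 403.
    Forbidden,
}

/// Namespace gate for an already-authenticated caller.
pub fn gate(role: Role, path: &str) -> Decision {
    // Match the namespace exactly so a look-alike prefix such as
    // "/api/enduserx" is not treated as part of it.
    let enduser_ns = path == "/api/enduser" || path.starts_with("/api/enduser/");
    match (role, enduser_ns) {
        // End-users may touch only their own namespace.
        (Role::EndUser, true) => Decision::Allow,
        (Role::EndUser, false) => Decision::Forbidden,
        // Technician roles are rejected on the end-user namespace.
        (_, true) => Decision::Forbidden,
        // Console routes still run their own permission checks afterwards.
        (_, false) => Decision::Allow,
    }
}
```

Because every path outside `/api/enduser/` is forbidden to `end_user` by default, a console route added later is denied without anyone having to remember to list it.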

## Security considerations
- **Preserve the plane separation** audited in v2 Phase 1 — `end_user` is login-plane only; it can
  never satisfy `validate_viewer_token` or the agent `cak_` path.
- **Deny-by-default, grant-scoped:** empty `user_client_access` for an `end_user` = no access; the
  admin-bypass must not apply. Every `/api/enduser/*` call re-checks the grant + tenant server-side
  (never trust a machine id from the client).
- **Tenant containment:** an `end_user` and its grants live in one tenant; cross-tenant grants are
  rejected at write and re-validated at connect. (Full tenant isolation lands with Phase 4; v1 enforces
  via explicit `tenant_id` equality checks.)
- **External-user trust:** these accounts are exposed to the public internet, since end-users sign in
  from home. Require rate-limiting + lockout on `/api/enduser/auth/login`; support, and preferably
  require, **TOTP MFA** for `end_user`. The schema column is included so MFA can ship in v1 or as an
  immediate fast-follow without a second migration. Passwords use Argon2id (existing standard).
- **Audit:** log each end-user login (success/failure, source IP) and each machine access to
  `connect_session_events`; the unattended access is to the user's *own* machine but must be fully
  traceable. Optionally enforce the SPEC-015 overlay per machine policy.
- **Threat model:** stolen end-user creds reach only that user's granted machines (blast radius =
  grant set), never the console, never the agent plane, never another tenant. Disabling the account
  (`users.enabled=false`) immediately revokes portal + future tokens; the 5-min viewer-token TTL bounds
  any in-flight session.
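
One way the login lockout could work is sketched below. The spec does not fix thresholds or the lockout key, so `LoginLimiter`, its parameters, and keying by an opaque string (for example username, or username plus source IP) are all assumptions; a production version would also need shared state across server instances.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

struct Entry {
    failures: u32,
    locked_until: Option<Instant>,
}

/// In-memory failed-login counter with a fixed lockout window.
pub struct LoginLimiter {
    max_failures: u32,
    lockout: Duration,
    state: HashMap<String, Entry>,
}

impl LoginLimiter {
    pub fn new(max_failures: u32, lockout: Duration) -> Self {
        LoginLimiter { max_failures, lockout, state: HashMap::new() }
    }

    /// True while the key is inside an active lockout window.
    pub fn is_locked(&self, key: &str, now: Instant) -> bool {
        self.state
            .get(key)
            .and_then(|e| e.locked_until)
            .map_or(false, |until| now < until)
    }

    /// Count a failed login; lock the key once the threshold is reached.
    pub fn record_failure(&mut self, key: &str, now: Instant) {
        let entry = self
            .state
            .entry(key.to_string())
            .or_insert(Entry { failures: 0, locked_until: None });
        // An expired lockout starts a fresh count.
        if let Some(until) = entry.locked_until {
            if now >= until {
                entry.failures = 0;
                entry.locked_until = None;
            }
        }
        entry.failures += 1;
        if entry.failures >= self.max_failures {
            entry.locked_until = Some(now + self.lockout);
        }
    }

    /// A successful login clears the counter for that key.
    pub fn record_success(&mut self, key: &str) {
        self.state.remove(key);
    }
}
```

The caller passes `now` explicitly, which keeps the limiter deterministic in unit tests. The login handler would check `is_locked` before verifying the password and write the audit row in either case.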

## Testing strategy
- **Unit:** grant-check fn (granted / not-granted / cross-tenant / empty-set-for-end_user = deny);
  `has_permission` never elevates `end_user`; role-namespace middleware (end_user→console = 403,
  technician→/api/enduser = 403).
- **Integration:** end-user login → list shows only granted machines → connect mints a Control viewer
  token for a `source=end_user` session → relay admits; connect to a non-granted / other-tenant machine
  → 403; disabled account → login + token use rejected.
- **Manual:** full portal walkthrough from an off-network browser; MFA enrol + challenge; audit rows
  present for login and access; white-label branding renders on the portal.

## Effort estimate & dependencies
- **Size:** Large (new principal + portal + admin grant UI + auth namespace; transport/agent untouched
  and the grant table already exists, which holds it below X-Large).
- **Depends on (must precede / strongly preferred):**
  - **Tenancy** (`tenants` + `tenant_id`, migration 004) — needed for containment; full isolation is
    Phase 4 but v1 uses explicit tenant checks.
  - **Stable machine identity + persistent enrollment** (SPEC-004 / 008 `machine_uid`, SPEC-016
    zero-touch `cak_`) — end-users reach persistent managed agents.
  - **Session-scoped viewer tokens** (v2 Phase 1, landed) — reused directly.
- **Pairs with:** SPEC-014 (white-label — the portal is the client-facing surface), SPEC-003/005
  (machine inventory/list — portal machine rows), SPEC-015 (optional connect-notification overlay).
- **Unblocks:** the directory-sync spec (AD/Entra/Google → end-user list), delegated client-admin role,
  and per-seat billing — all of which build on the `end_user` principal defined here.

## Open questions
1. **Same console vs separate end-user portal?** Recommendation: **separate, role-gated route** —
   smaller attack surface, no risk of leaking technician controls, cleaner white-label. Confirm before
   build.
2. **End-users in the existing `users` table (role=end_user) vs a dedicated `end_users` table?**
   Recommendation: reuse `users` (the grant FK `user_client_access.user_id` already points there) with
   hard role guardrails. Revisit if mixing external + internal principals in one table proves risky.
3. **MFA in v1 or immediate fast-follow?** Schema is included either way; decide enforcement timing.
4. **Who administers grants in v1** — MSP admin only (assumed), or ship the delegated client-company
   admin role together? (Affects scope/effort materially.)
5. **Seat/licensing enforcement depth for v1** — count-and-display vs hard-cap vs billing-integrated.
6. **Default access mode** — Control assumed (own machine); should an admin be able to pin a machine to
   view-only for a given end-user? (Token layer already supports it.)