guru-connect/docs/specs/SPEC-017-end-user-remote-access.md
Mike Swanson 4c49b73a71
spec: add SPEC-017 end-user (sub-user) remote access
2026-06-02 12:56:15 -07:00


SPEC-017: End-User (Sub-User) Remote Access

Status: Proposed
Priority: P2 (may settle to P3 depending on client demand)
Requested By: Mike (2026-06-02)
Estimated Effort: Large

Overview

Let a client pay for their own employees to remotely reach their own work machines from home through GuruConnect — the Splashtop-Business / unattended-end-user-access model, layered on top of the MSP-technician console GuruConnect ships today. An MSP admin (or, later, a delegated client-company admin) provisions a list of end-users and grants each one access to specific managed machines. The end-user signs into a locked-down end-user portal, sees only the machines granted to them, and connects — reusing the existing persistent-agent + session-scoped-viewer-token + relay path.

Success criteria: an end_user-role account can log in at a separate portal, see exactly the machines in its grant set (none outside it, and none from any other tenant), launch a control session to an online granted machine, and is hard-denied from every technician/admin API, the agent plane, and any machine it was not granted. Each login and machine access is written to the audit log.

This is a net-new sellable capability, not a console-MVP blocker. It is sequenced after the v2 console foundations it depends on (tenancy, machine identity, persistent enrollment), which is why it is P2 rather than P1.

Scope

Included in v1

  • A new end_user value for users.role, provisioned by an MSP admin, with deny-by-default authority: no console permissions, no agent-plane access, machine reach limited strictly to its user_client_access grant set within its own tenant.
  • A separate end-user login + portal route (locked-down): lists only granted machines with online/offline state and a Connect action. No admin nav, no other users/machines/companies.
  • Admin UI + API to create/disable end-users and assign/revoke per-machine grants, reusing the existing user_client_access table.
  • Connect flow that reuses the landed session-scoped viewer-token mechanism (ViewerClaims, jwt.rs:114) and the relay enforcement path — no new transport.
  • A new connect_sessions.source value end_user (migration widening the existing CHECK).
  • Audit: end-user login success/failure and each machine-access grant-check written to connect_session_events.
  • Rate limiting + lockout on the public end-user login.

Explicitly out of scope (v1)

  • Directory sync (AD / Entra-365 / Google) → end-user list — its own future spec; v1 is manual list management only.
  • Self-service seat purchasing / billing automation. v1 records/counts seats per tenant; real metering and Syncro/billing wiring is deferred.
  • Delegated client-company-admin role (a client managing its own end-users/grants) — noted as a fast-follow; v1 grants are MSP-admin-managed.
  • Per-session view-only-vs-control policy per end-user (v1 = Control of one's own machine; the ViewerAccess split still exists at the token layer).
  • File transfer, session recording (already out of scope for the broader product v1).

Architecture

Principal model — end_user is a constrained variant of the login plane

GuruConnect already has three credential planes that must stay separate (audit-hardened in v2 Phase 1):

  1. Login Claims (jwt.rs:11) — dashboard users; role ∈ {admin, operator, viewer} today.
  2. Session-scoped ViewerClaims (jwt.rs:114) — 5-min, one session, purpose=viewer.
  3. Agent cak_ keys (connect_agent_keys, migration 004) — agents only.

end_user is added as a fourth role on the login plane — it issues a normal login JWT (create_token, jwt.rs:161) carrying role: "end_user" and an empty permission list. The separation guarantees the v2 audit established are preserved: an end_user JWT still cannot be used as a viewer token (lacks purpose) nor as an agent key (agent plane rejects user JWTs).

Critical authz inversion: user_client_access today documents "no entries = access to all (for admins)" (migration 002, lines 25-26). The grant check must branch on role: for end_user, an empty grant set means zero machines, never all. Authz is deny-by-default and grant-scoped; the admin-bypass in Claims::has_permission (jwt.rs:28-33) must never fire for end_user.
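
The inversion described above can be sketched roughly as follows. This is a minimal illustration, not the code in jwt.rs: the Claims struct is a simplified stand-in, MachineScope and machine_scope are hypothetical names, machine ids are plain u64 values instead of UUIDs, and treating non-admin console roles with no grants as "no machines" is an assumption the spec does not state.

```rust
use std::collections::HashSet;

/// Simplified stand-in for the login Claims (field names are assumptions).
pub struct Claims {
    pub role: String,
    pub permissions: Vec<String>,
}

impl Claims {
    pub fn is_end_user(&self) -> bool {
        self.role == "end_user"
    }

    /// end_user is decided on explicit permissions only, so the admin
    /// short-circuit below can never fire for it.
    pub fn has_permission(&self, permission: &str) -> bool {
        let explicit = self.permissions.iter().any(|p| p == permission);
        if self.is_end_user() {
            return explicit;
        }
        self.role == "admin" || explicit
    }
}

/// Hypothetical result of interpreting a user's user_client_access rows.
#[derive(Debug, PartialEq, Eq)]
pub enum MachineScope {
    All,
    Only(HashSet<u64>),
}

/// "No entries = all" applies to admins only. For end_user (and, by
/// assumption here, every other role) an empty set means zero machines.
pub fn machine_scope(role: &str, grants: HashSet<u64>) -> MachineScope {
    match role {
        "admin" if grants.is_empty() => MachineScope::All,
        _ => MachineScope::Only(grants),
    }
}
```

With this shape, an end_user holding an empty permission list and an empty grant set is denied on both axes, while the existing admin behavior is left as documented.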

Agent / Relay-server / Viewer / Dashboard responsibilities

  • Agent: no changes. End-users connect to existing persistent/unattended managed agents (consent not_required — it is the user's own machine). Optionally honors the SPEC-015 notification overlay if a per-machine policy requires it.
  • Relay-server: no transport change. New end-user auth + portal + connect endpoints; the grant-check + viewer-token mint is the only new server logic on the hot path.
  • Viewer: reuse the React/TS web viewer (dashboard/src/components/RemoteViewer.tsx) — the end-user portal embeds the same component with a Control-mode viewer token.
  • Dashboard: new role-gated end-user portal route (recommended separate from the technician console — see Open Questions), plus admin screens for end-user + grant management.

Database (migrations)

  • user_client_access — reused as the grant table; no schema change (already user_id UUID × client_id UUID → connect_machines(id), unique pair, migration 002).
  • New migration 011_end_user_access.sql:
    • Widen connect_sessions.source CHECK to ('standalone','gururmm','end_user') (currently ('standalone','gururmm'), migration 004 lines 99-102).
    • Optional users columns for the external principal: mfa_secret TEXT NULL, must_change_password BOOLEAN NOT NULL DEFAULT false, and a partial index for fast role='end_user' listing per tenant_id.
    • (Seat tracking, if landed in v1: a lightweight per-tenant end_user count view or a tenant_seats row — kept minimal.)
  • Grants are tenant-contained: insert path validates machine.tenant_id == end_user.tenant_id.
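
A minimal sketch of the tenant-containment check on the grant insert path, under stated assumptions: the User, Machine, GrantRow and GrantError types and the validate_grant function are hypothetical, ids are u64 placeholders for the real UUID columns, and the real handler would run this before writing to user_client_access.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct TenantId(pub u32);

/// Hypothetical minimal views of the users and connect_machines rows.
pub struct User {
    pub id: u64,
    pub tenant_id: TenantId,
}

pub struct Machine {
    pub id: u64,
    pub tenant_id: TenantId,
}

/// The (user_id, client_id) pair that would be written to user_client_access.
#[derive(Debug, PartialEq, Eq)]
pub struct GrantRow {
    pub user_id: u64,
    pub client_id: u64,
}

#[derive(Debug, PartialEq, Eq)]
pub enum GrantError {
    CrossTenant,
}

/// Reject the write unless machine.tenant_id == user.tenant_id.
pub fn validate_grant(user: &User, machine: &Machine) -> Result<GrantRow, GrantError> {
    if user.tenant_id != machine.tenant_id {
        return Err(GrantError::CrossTenant);
    }
    Ok(GrantRow {
        user_id: user.id,
        client_id: machine.id,
    })
}
```

Validating at write time does not replace the connect-time check; the same equality is evaluated again when a session is requested, so a grant row that becomes stale cannot be used across tenants.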

API endpoints / WS messages

  • POST /api/enduser/auth/login — public, rate-limited; returns an end_user login JWT.
  • GET /api/enduser/machines — lists only the caller's granted, in-tenant machines + presence.
  • POST /api/enduser/machines/:id/connect — grant-checked; creates a source=end_user session and mints a Control ViewerClaims token (create_viewer_token, jwt.rs:233) for that session.
  • Admin: POST /api/users (role=end_user), POST /api/users/:id/grants, DELETE /api/users/:id/grants/:machine_id, GET /api/users?role=end_user.
  • No new protobuf messages — the WS viewer path and guruconnect.proto are unchanged.
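
The connect endpoint's decision order might look like the sketch below. It is illustrative only: EndUser, Machine, Session, ConnectError and the connect function are hypothetical, the ViewerClaims struct is a simplified stand-in for the one in jwt.rs, and the real path would mint the token through create_viewer_token instead of building a struct by hand. The 300-second TTL mirrors the 5-minute viewer-token lifetime stated in this spec.

```rust
use std::collections::HashSet;

pub const VIEWER_TOKEN_TTL_SECS: u64 = 300; // 5-minute viewer token

pub struct EndUser {
    pub tenant_id: u32,
    pub enabled: bool,
    pub grants: HashSet<u64>, // machine ids from user_client_access
}

pub struct Machine {
    pub id: u64,
    pub tenant_id: u32,
    pub online: bool,
}

#[derive(Debug, PartialEq, Eq)]
pub struct Session {
    pub id: u64,
    pub machine_id: u64,
    pub source: &'static str,
    pub consent_state: &'static str,
}

/// Simplified stand-in for the session-scoped viewer claims.
#[derive(Debug, PartialEq, Eq)]
pub struct ViewerClaims {
    pub session_id: u64,
    pub purpose: &'static str,
    pub access: &'static str,
    pub exp: u64,
}

#[derive(Debug, PartialEq, Eq)]
pub enum ConnectError {
    Forbidden,
    MachineOffline,
}

pub fn connect(
    user: &EndUser,
    machine: &Machine,
    session_id: u64,
    now: u64,
) -> Result<(Session, ViewerClaims), ConnectError> {
    // Grant check first, so a non-granted machine never leaks its presence.
    let granted = user.enabled
        && user.grants.contains(&machine.id)
        && user.tenant_id == machine.tenant_id;
    if !granted {
        return Err(ConnectError::Forbidden);
    }
    if !machine.online {
        return Err(ConnectError::MachineOffline);
    }
    let session = Session {
        id: session_id,
        machine_id: machine.id,
        source: "end_user",
        consent_state: "not_required",
    };
    let claims = ViewerClaims {
        session_id,
        purpose: "viewer",
        access: "control",
        exp: now + VIEWER_TOKEN_TTL_SECS,
    };
    Ok((session, claims))
}
```

Because the output is an ordinary viewer token bound to one session, the relay sees nothing new, which is what keeps the transport unchanged.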

Implementation details

  • server/src/auth/jwt.rs: extend the role vocabulary doc (Claims.role, lines 16-17); add an is_end_user() helper and ensure has_permission cannot grant end_user anything beyond explicit permissions (the admin short-circuit at line 30 must be guarded).
  • server/src/auth/mod.rs: AuthenticatedUser (line 29+) gains role-aware helpers; add an extractor / middleware that rejects non-end_user on the /api/enduser/* namespace and rejects end_user on every console/admin route (deny-by-default allowlist).
  • server/src/api/ — new enduser handler module (login, machines, connect); admin user+grant handlers extended for role=end_user and user_client_access writes.
  • Grant check (shared fn): machine_id ∈ user_client_access[user] AND machine.tenant_id == user.tenant_id; used by both GET /machines and connect.
  • Session create stamps source='end_user', is_managed=true/unattended, consent_state='not_required', then mints the viewer token via the existing path so relay enforcement is unchanged.
  • dashboard/src/ — end-user portal route (role-gated), reusing RemoteViewer.tsx; admin grant-matrix UI. White-label (SPEC-014) applies to the portal as the most client-facing surface.
  • Migration server/migrations/011_end_user_access.sql as above (idempotent; applied by sqlx::migrate! per the migration standard).
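
The role-namespace rule from the list above can be sketched as a pure decision function. Assumptions: Decision, authorize_route and in_enduser_namespace are hypothetical names, the role is passed as an already-validated string (None when no login JWT is present), and only authenticated routes plus the public end-user login are modeled; other public routes, such as the technician login, are out of scope for the sketch.

```rust
#[derive(Debug, PartialEq, Eq)]
pub enum Decision {
    Allow,
    Unauthorized, // no valid login JWT (HTTP 401)
    Forbidden,    // valid JWT, wrong namespace for the role (HTTP 403)
}

const ENDUSER_LOGIN: &str = "/api/enduser/auth/login";

fn in_enduser_namespace(path: &str) -> bool {
    path == "/api/enduser" || path.starts_with("/api/enduser/")
}

/// Deny-by-default: a role is admitted only to the namespace that
/// explicitly lists it; unknown roles are admitted nowhere.
pub fn authorize_route(role: Option<&str>, path: &str) -> Decision {
    if path == ENDUSER_LOGIN {
        return Decision::Allow; // public; rate limiting is handled separately
    }
    let role = match role {
        Some(r) => r,
        None => return Decision::Unauthorized,
    };
    let allowed = if in_enduser_namespace(path) {
        role == "end_user"
    } else {
        matches!(role, "admin" | "operator" | "viewer")
    };
    if allowed {
        Decision::Allow
    } else {
        Decision::Forbidden
    }
}
```

Writing the console side as an allowlist of known roles, instead of "anything except end_user", means a role added later is locked out of the console until someone opts it in.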

Security considerations

  • Preserve the plane separation audited in v2 Phase 1 — end_user is login-plane only; it can never satisfy validate_viewer_token or the agent cak_ path.
  • Deny-by-default, grant-scoped: empty user_client_access for an end_user = no access; the admin-bypass must not apply. Every /api/enduser/* call re-checks the grant + tenant server-side (never trust a machine id from the client).
  • Tenant containment: an end_user and its grants live in one tenant; cross-tenant grants are rejected at write and re-validated at connect. (Full tenant isolation lands with Phase 4; v1 enforces via explicit tenant_id equality checks.)
  • External-user trust: these accounts are public-internet-facing from home. Require rate-limiting + lockout on /api/enduser/auth/login; support, and preferably require, TOTP MFA for end_user. The schema column is included so MFA can be v1 or an immediate fast-follow without a second migration. Passwords use Argon2id (existing standard).
  • Audit: log each end-user login (success/failure, source IP) and each machine access to connect_session_events; the unattended access is to the user's own machine but must be fully traceable. Optionally enforce the SPEC-015 overlay per machine policy.
  • Threat model: stolen end-user creds reach only that user's granted machines (blast radius = grant set), never the console, never the agent plane, never another tenant. Disabling the account (users.enabled=false) immediately revokes portal + future tokens; the 5-min viewer-token TTL bounds any in-flight session.
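
One possible shape for the lockout on the public end-user login is sketched below. Everything here is an assumption for illustration: the LoginThrottle type, the threshold of 5 failures, the 15-minute lockout, and the in-memory map keyed by account name. The spec does not fix these values, and a real deployment would likely also key on source IP and persist state across restarts.

```rust
use std::collections::HashMap;

pub const MAX_FAILURES: u32 = 5; // assumed threshold
pub const LOCKOUT_SECS: u64 = 15 * 60; // assumed lockout window

#[derive(Default)]
struct Attempts {
    failures: u32,
    locked_until: u64, // unix seconds; 0 means not locked
}

/// In-memory failed-login tracker keyed by account name.
#[derive(Default)]
pub struct LoginThrottle {
    entries: HashMap<String, Attempts>,
}

impl LoginThrottle {
    /// Callers check this before verifying the password at all.
    pub fn is_locked(&self, key: &str, now: u64) -> bool {
        self.entries
            .get(key)
            .map_or(false, |a| now < a.locked_until)
    }

    /// Count a failure; reaching the threshold starts a lockout window.
    pub fn record_failure(&mut self, key: &str, now: u64) {
        let a = self.entries.entry(key.to_string()).or_default();
        a.failures += 1;
        if a.failures >= MAX_FAILURES {
            a.locked_until = now + LOCKOUT_SECS;
            a.failures = 0;
        }
    }

    /// A successful login clears the failure history.
    pub fn record_success(&mut self, key: &str) {
        self.entries.remove(key);
    }
}
```

The handler would call is_locked first and return early while it is true, so a correct password submitted during the lockout window never reaches record_success and cannot clear the lock. Each rejected attempt would still be written to the audit log with its source IP.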

Testing strategy

  • Unit: grant-check fn (granted / not-granted / cross-tenant / empty-set-for-end_user = deny); has_permission never elevates end_user; role-namespace middleware (end_user→console = 403, technician→/api/enduser = 403).
  • Integration: end-user login → list shows only granted machines → connect mints a Control viewer token for a source=end_user session → relay admits; connect to a non-granted / other-tenant machine → 403; disabled account → login + token use rejected.
  • Manual: full portal walkthrough from an off-network browser; MFA enrol + challenge; audit rows present for login and access; white-label branding renders on the portal.

Effort estimate & dependencies

  • Size: Large (new principal + portal + admin grant UI + auth namespace; transport/agent untouched and the grant table already exists, which holds it below X-Large).
  • Depends on (must precede / strongly preferred):
    • Tenancy (tenants + tenant_id, migration 004) — needed for containment; full isolation is Phase 4 but v1 uses explicit tenant checks.
    • Stable machine identity + persistent enrollment (SPEC-004 / 008 machine_uid, SPEC-016 zero-touch cak_) — end-users reach persistent managed agents.
    • Session-scoped viewer tokens (v2 Phase 1, landed) — reused directly.
  • Pairs with: SPEC-014 (white-label — the portal is the client-facing surface), SPEC-003/005 (machine inventory/list — portal machine rows), SPEC-015 (optional connect-notification overlay).
  • Unblocks: the directory-sync spec (AD/Entra/Google → end-user list), delegated client-admin role, and per-seat billing — all of which build on the end_user principal defined here.

Open questions

  1. Same console vs separate end-user portal? Recommendation: separate, role-gated route — smaller attack surface, no risk of leaking technician controls, cleaner white-label. Confirm before build.
  2. End-users in the existing users table (role=end_user) vs a dedicated end_users table? Recommendation: reuse users (the grant FK user_client_access.user_id already points there) with hard role guardrails. Revisit if mixing external + internal principals in one table proves risky.
  3. MFA in v1 or immediate fast-follow? Schema is included either way; decide enforcement timing.
  4. Who administers grants in v1 — MSP admin only (assumed), or ship the delegated client-company admin role together? (Affects scope/effort materially.)
  5. Seat/licensing enforcement depth for v1 — count-and-display vs hard-cap vs billing-integrated.
  6. Default access mode — Control assumed (own machine); should an admin be able to pin a machine to view-only for a given end-user? (Token layer already supports it.)