Single search box matching case-insensitive substring across ALL machine attributes (OS, logged-on user, external/private IP, company, site, tag, serial, MAC, client version, ...) server-side, ScreenConnect-style. Replaces the dashboard's hostname/agent_id-only client filter (inadequate at ~900+ machines). pg_trgm GIN index over a concatenated searchable-text expression (INET cast to text, tags via array_to_string); multi-term AND; optional field-scoped syntax (os:/user:/ip:). Parameterized + fixed column allowlist (no injection), admin-guarded, DoS-capped. Depends on SPEC-003 (attrs must be persisted to be searchable); reuses SPEC-005 enriched payload. Requested by Mike 2026-05-30. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
SPEC-006: Universal Machine Search ("Everything Is Searchable")
Status: Proposed | Priority: P2 | Requested By: Mike (2026-05-30) | Estimated Effort: Medium
Overview
Give the Operator Console a single search box that matches a query against any
attribute of any guest machine and returns all matches — OS, logged-on user, IP, company,
site, tag, serial, MAC, version, anything — exactly like ScreenConnect. Today the
dashboard filters only hostname/agent_id client-side (MachinesPage.tsx:46), which
is useless across a ~900-machine fleet. Success = typing "windows 11" finds every Win11
box, "fred" finds every machine where Fred is/was logged on, and "98.97" finds the
machine at that IP — server-side, fast, case-insensitive substring, across the whole
fleet.
Examples (user-stated): search an OS string → all machines with that OS; a username → all machines with that logged-on user; an IP (external or private) → the matching machine(s).
Scope
Included in v1
- Server-side multi-attribute search on GET /api/machines?q=<text> (extend list_machines). Default is case-insensitive substring, match-any-field across: hostname, agent_id / machine_uid, organization (company), site, department, device_type, tags, logged_on_user, os_name / os_version (+ locale), agent_version (client version), manufacturer, model, serial_number, machine_description, external_ip, private_ip, mac_address, status.
- Multi-term AND: space-separated terms must all match (each term may match a different field). Example: "windows fred" = a Windows machine where Fred is logged on.
- Performance: a Postgres index that supports substring search across all columns (see Architecture) so search stays fast at fleet scale.
- IP/array handling: external_ip / private_ip (INET, from SPEC-003) matched as text for partial-IP queries ("98.97"); tags (TEXT[]) matched per-element.
- Wire the dashboard search box to the server query (debounced), replacing the hostname-only client filter; render results through the SPEC-005 list view.
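The match-any-field, multi-term AND semantics listed above can be pinned down with a small in-memory reference matcher, for example as an oracle in tests. This is only a sketch in TypeScript (the dashboard's language), not the server implementation, which would run in SQL. The MachineRow shape, the sample fleet, and the function names are assumptions made for illustration.

```typescript
// Hypothetical row shape: a small subset of the fields listed in Scope.
interface MachineRow {
  hostname: string;
  organization?: string;
  os_name?: string;
  logged_on_user?: string;
  external_ip?: string;
  private_ip?: string;
  tags?: string[];
}

// Flatten every searchable attribute into one lowercase string,
// mirroring the idea of a concatenated searchable-text expression.
function searchableText(row: MachineRow): string {
  const parts: string[] = [
    row.hostname,
    row.organization ?? "",
    row.os_name ?? "",
    row.logged_on_user ?? "",
    row.external_ip ?? "",
    row.private_ip ?? "",
    ...(row.tags ?? []),
  ];
  return parts.join(" ").toLowerCase();
}

// A row matches when every whitespace-separated term is a substring of
// the flattened text; each term may hit a different field.
function matchesAllTerms(row: MachineRow, query: string): boolean {
  const text = searchableText(row);
  const terms = query.toLowerCase().split(/\s+/).filter((t) => t.length > 0);
  return terms.every((t) => text.includes(t));
}

// Made-up sample data.
const fleet: MachineRow[] = [
  { hostname: "ACG-PC01", os_name: "Windows 11 Pro", logged_on_user: "fred", external_ip: "98.97.10.4" },
  { hostname: "ACG-PC02", os_name: "Windows 10 Pro", logged_on_user: "mary", external_ip: "72.14.1.9" },
  { hostname: "ACG-MAC1", os_name: "macOS 14", logged_on_user: "fred", tags: ["winter"] },
];

const hits = fleet.filter((m) => matchesAllTerms(m, "windows fred")).map((m) => m.hostname);
console.log(hits); // ["ACG-PC01"]
```

An empty query yields no terms, so every row matches, which lines up with the requirement that an empty q returns the full list.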
Nice-to-have (v1 if cheap, else fast-follow)
- Field-scoped syntax like ScreenConnect: os:windows, user:fred, ip:98.97, company:"arizona computer", tag:winter, version:23.9. A known prefix restricts the match to that column; unprefixed terms stay match-any-field; quotes group a phrase.
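One way to read the field-scoped syntax is as a small tokenizer that recognizes an optional prefix followed by either a quoted phrase or a bare word. The sketch below is in TypeScript for illustration only; the server-side parser would presumably live in the Rust relay server. The prefix list is taken from the examples above, and the Term shape and parseQuery name are hypothetical.

```typescript
// Hypothetical prefix allowlist, taken from the examples in this spec.
const KNOWN_PREFIXES = ["os", "user", "ip", "company", "tag", "version"] as const;
type Prefix = (typeof KNOWN_PREFIXES)[number];

interface Term {
  field: Prefix | null; // null = match any field
  value: string;
}

function parseQuery(input: string): Term[] {
  const terms: Term[] = [];
  // One token = optional `word:` prefix, then a quoted phrase or a bare word.
  const tokenRe = /(?:([A-Za-z_]+):)?(?:"([^"]*)"|(\S+))/g;
  let m: RegExpExecArray | null;
  while ((m = tokenRe.exec(input)) !== null) {
    const prefix = m[1]?.toLowerCase();
    const value = m[2] ?? m[3] ?? "";
    if (prefix !== undefined && (KNOWN_PREFIXES as readonly string[]).includes(prefix)) {
      // Known prefix: restrict the term to that field.
      if (value.length > 0) terms.push({ field: prefix as Prefix, value });
    } else {
      // Unknown prefix: keep the whole token as literal, unscoped text.
      const literal = prefix !== undefined ? `${m[1]}:${value}` : value;
      if (literal.length > 0) terms.push({ field: null, value: literal });
    }
  }
  return terms;
}

console.log(parseQuery('os:windows company:"arizona computer" fred foo:bar'));
// [ { field: "os", value: "windows" },
//   { field: "company", value: "arizona computer" },
//   { field: null, value: "fred" },
//   { field: null, value: "foo:bar" } ]
```

Treating an unknown prefix as literal text means a query such as a URL or a MAC address containing colons still searches as typed instead of failing.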
Explicitly out of scope
- Saved searches / smart groups / search-driven "session groups" (ScreenConnect's + Create Session Group): separate feature.
- Boolean OR / NOT / regex / fuzzy ranking: v1 is AND-of-substring; relevance ranking deferred.
- Searching live-only state not persisted to the DB (current viewer/host connection): exposed as a secondary structured filter (online/offline), not free-text, since it lives in the in-memory SessionManager, not connect_machines.
- Full event/session-history search: this spec is machine search.
Architecture
- Relay-server / DB (the core): add search to db::machines and list_machines (main.rs:636). Query shape:
  - Maintain a searchable-text representation of each connect_machines row and index it for substring search. Two viable approaches (decide in planning):
    - pg_trgm GIN index over a concatenated expression (lower(hostname||' '||coalesce(organization,'')||' '||…||' '||host(external_ip)||…)). Supports ILIKE '%term%' on arbitrary substrings with index acceleration. Simple, matches ScreenConnect's substring semantics best.
    - Generated search_text column (maintained on write / via the same path as update_machine_metadata / update_machine_inventory) + pg_trgm GIN on it. Slightly more plumbing, cleaner query, easier to weight later.
  - tags TEXT[] folded into the text via array_to_string(tags,' '); INET via host(external_ip) / host(private_ip). Every nullable piece needs a coalesce, because concatenating NULL with || yields NULL for the whole expression. To verify in planning: array_to_string appears to be STABLE rather than IMMUTABLE in Postgres, so an expression index may need an immutable wrapper function or the maintained search_text column.
  - Field-scoped terms compile to a targeted col ILIKE '%v%' (or = ANY(tags) for tag:, which is an exact, case-sensitive element match rather than a substring match); unscoped terms to the concatenated-text match; all terms AND-ed.
- API: list_machines gains q (and reuses the SPEC-005 enriched MachineInfo payload so results render fully). Same AuthenticatedUser admin guard.
- Dashboard: the existing search input drives listMachines({ q }) (debounced ~250ms) in dashboard/src/api/machines.ts; remove the hostname-only client filter in MachinesPage.tsx. Show result count + "no matches" empty state.
- Protobuf: none (server/dashboard/DB only). Searchability of inventory fields is entirely dependent on SPEC-003 persisting them.
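The debounced dashboard wiring described above could look roughly like the following. This is a framework-free sketch, not the actual MachinesPage.tsx code: debounce, machinesUrl, and onSearchInput are hypothetical helpers, and the real handler would call listMachines({ q }) instead of recording a URL.

```typescript
// Generic trailing-edge debounce: only the last call within `waitMs` runs.
function debounce<A extends unknown[]>(
  fn: (...args: A) => void,
  waitMs: number,
): (...args: A) => void {
  let timer: ReturnType<typeof setTimeout> | undefined;
  return (...args: A) => {
    if (timer !== undefined) clearTimeout(timer);
    timer = setTimeout(() => fn(...args), waitMs);
  };
}

// Hypothetical URL builder for the list endpoint; a blank query omits `q`
// so the server returns the full list unchanged.
function machinesUrl(q: string): string {
  const trimmed = q.trim();
  return trimmed === "" ? "/api/machines" : `/api/machines?q=${encodeURIComponent(trimmed)}`;
}

// Stand-in for the network call, so the sketch runs without a server.
const requested: string[] = [];
const onSearchInput = debounce((q: string) => {
  requested.push(machinesUrl(q)); // real code would call listMachines({ q })
}, 250);

// Three quick keystrokes collapse into a single request.
onSearchInput("w");
onSearchInput("win");
onSearchInput("windows 11");
setTimeout(() => console.log(requested), 300); // ["/api/machines?q=windows%2011"]
```

A production version would probably also discard responses that arrive out of order, so a slow reply for "win" cannot overwrite the results for "windows 11".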
Implementation details
- Files to touch:
  - server/migrations/ (new migration: CREATE EXTENSION IF NOT EXISTS pg_trgm; + the GIN index / optional search_text column; idempotent, startup-applied by sqlx::migrate!(), never pre-applied via psql);
  - server/src/db/machines.rs (search_machines(q) or get_all_machines gaining a filter; parameterized, never string-interpolated);
  - server/src/main.rs:636 (list_machines reads q);
  - dashboard/src/features/machines/MachinesPage.tsx:46 (drop local filter, call server);
  - dashboard/src/api/machines.ts (q param).
- Key logic: tokenize on whitespace (respecting quotes), classify each token as scoped/unscoped, build a parameterized WHERE term1 AND term2 …. Cap query length and term count.
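The query-building step in the key logic above can be sketched as a function that turns already-parsed terms into a WHERE clause plus bound parameters. It is written in TypeScript purely for illustration; the real builder would sit in server/src/db/machines.rs. The prefix-to-column map, the search_text placeholder, and all function names are assumptions, and ip: (two columns) and tag: (array match) are left out to keep the sketch short.

```typescript
interface Term {
  field: string | null; // null = match any field
  value: string;
}

// Hypothetical prefix -> column allowlist; column names follow the Scope list.
// A Map avoids accidental hits on inherited object keys such as "constructor".
const SCOPED_COLUMNS = new Map<string, string>([
  ["os", "os_name"],
  ["user", "logged_on_user"],
  ["company", "organization"],
  ["version", "agent_version"],
]);

// Placeholder for the concatenated searchable-text expression or column.
const SEARCH_EXPR = "search_text";

// Escape LIKE wildcards so user input is matched literally.
function escapeLike(s: string): string {
  return s.replace(/[\\%_]/g, (ch) => "\\" + ch);
}

function buildSearchQuery(terms: Term[]): { sql: string; params: string[] } {
  const clauses: string[] = [];
  const params: string[] = [];
  for (const term of terms) {
    const column = term.field !== null ? SCOPED_COLUMNS.get(term.field) : undefined;
    // Unknown prefix: fall back to a literal match on the whole token.
    const text =
      term.field !== null && column === undefined ? `${term.field}:${term.value}` : term.value;
    params.push(`%${escapeLike(text)}%`);
    // Only allowlisted column names or the fixed expression reach the SQL text.
    clauses.push(`${column ?? SEARCH_EXPR} ILIKE $${params.length}`);
  }
  const where = clauses.length > 0 ? ` WHERE ${clauses.join(" AND ")}` : "";
  return { sql: `SELECT * FROM connect_machines${where}`, params };
}

const built = buildSearchQuery([
  { field: "os", value: "windows" },
  { field: null, value: "98.97" },
]);
console.log(built.sql);
// SELECT * FROM connect_machines WHERE os_name ILIKE $1 AND search_text ILIKE $2
console.log(built.params); // ["%windows%", "%98.97%"]
```

Binding parameters prevents injection, but % and _ inside a bound value still act as LIKE wildcards. That is why the sketch escapes them before wrapping the term in %…%.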
Security considerations
- SQL injection: all terms are bound parameters ($n), never concatenated into SQL. Column list for scoped search is a fixed allowlist: a field: prefix can only map to a known column, never arbitrary SQL.
- Admin-authenticated only (AuthenticatedUser), same as the existing machines list: no new unauthenticated surface; search exposes nothing the list doesn't already.
- DoS guard: cap query length, number of terms, and result page size; pg_trgm keeps worst-case substring scans index-backed so a broad query can't table-scan 900+ rows unindexed. Caveat to confirm in planning: terms shorter than three characters yield no trigrams, so they cannot be accelerated by the index.
- Treat the query as untrusted text; the response is the same admin-only machine data.
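The DoS caps above amount to bounding three numbers before any SQL is built. A minimal sketch, in TypeScript for illustration only: the limit values are placeholders (the spec leaves them open), and boundSearch is a hypothetical helper name.

```typescript
// Hypothetical limits; the spec does not fix actual values.
const MAX_QUERY_CHARS = 200;
const MAX_TERMS = 8;
const MAX_PAGE_SIZE = 200;
const DEFAULT_PAGE_SIZE = 50;

interface BoundedSearch {
  terms: string[];
  limit: number;
}

function boundSearch(rawQuery: string, requestedLimit?: number): BoundedSearch {
  // Truncate first so all later work is bounded by MAX_QUERY_CHARS.
  const q = rawQuery.slice(0, MAX_QUERY_CHARS);
  const terms = q
    .split(/\s+/)
    .filter((t) => t.length > 0)
    .slice(0, MAX_TERMS);
  // Fall back to the default when the limit is missing or not a finite number.
  const wanted =
    requestedLimit !== undefined && Number.isFinite(requestedLimit)
      ? Math.floor(requestedLimit)
      : DEFAULT_PAGE_SIZE;
  const limit = Math.min(Math.max(wanted, 1), MAX_PAGE_SIZE);
  return { terms, limit };
}

console.log(boundSearch("windows   fred", 5000));
// { terms: ["windows", "fred"], limit: 200 }
```

This sketch silently truncates oversized input. Rejecting it with a 400 response is an equally reasonable choice and may be easier to debug from the dashboard.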
Testing strategy
- Unit: tokenizer (quotes, multi-term, scoped vs. unscoped); query builder emits parameterized SQL with the right AND structure; scoped field: maps only to allowlisted columns (unknown prefix → treated as literal text, not an error/injection).
- Integration (seeded DB): "windows 11" returns all Win11 rows; a username returns rows with that logged_on_user; a partial IP returns rows whose external_ip / private_ip contains it; a tag value returns tagged rows; multi-term ANDs correctly; empty q returns the full list unchanged.
- Performance: with ~1k seeded rows, a broad substring query uses the GIN index (EXPLAIN shows index scan, not seq scan) and returns within target latency. For a GIN index this normally appears as a bitmap index scan; on a table this small the planner may still prefer a seq scan, so the check may need a larger seed or enable_seqscan = off.
- Manual: on the live console, reproduce the user's three examples (OS, username, IP).
Effort estimate & dependencies
- Size: Medium. The query builder + index is the bulk; the API/dashboard wiring is small. Field-scoped syntax adds a little parsing.
- Depends on: SPEC-003. The inventory attributes (OS detail, user, IP, MAC, serial, version, device type) must be persisted on connect_machines to be searchable; without it, search covers only the handful of existing columns. Reuses SPEC-005's enriched payload to render results; benefits from SPEC-004 dedup so results aren't padded with ghost duplicates. Migration orders after SPEC-003/004's.
- Unblocks: fast fleet triage ("who's on this IP / running this OS / logged in as X"), and the saved-search / smart-group follow-up.
Open questions
- Index strategy: concatenated-expression pg_trgm GIN vs. a maintained search_text column. Proposed: start with the expression index (less write plumbing); move to search_text if weighting/ranking is wanted later.
- Field-scoped syntax in v1 or fast-follow? Default match-any-field covers the stated use cases; scoped syntax is polish.
- Result cap / pagination — return all matches, or page (limit/offset)? At ~1k rows a cap with "N more" may suffice; confirm.
- Include live online/host state as a filter: free-text can't reach in-memory state; offer status:online as a structured filter that the server resolves against the SessionManager, or keep search DB-only in v1?