feat(agent,server): v2 secure-session-core Task 7 - HW H.264 + negotiated raw fallback

SPEC-002 Phase 1 Task 7 (the last), code-reviewed APPROVED, locally verified
(cargo fmt + clippy -D warnings exit 0 + cargo test --workspace 89 pass + build).

- Encoder trait + factory: RawEncoder (salvaged, UNCHANGED) and H264Encoder,
  selected by negotiation; factory falls back to raw on H.264 init failure.
- Negotiation: agent advertises supports_h264 (MFTEnumEx HW probe, cached) in
  AgentStatus; server picks the codec via select_video_codec(supports, prefer)
  and stamps StartStream.video_codec; agent re-guards on local HW. Policy
  constant DEFAULT_PREFER_H264 = false, so RAW is negotiated for every session
  today - H.264 stays dormant until live hardware validation (Task 8).
- MF H.264 encoder (h264.rs, FIRST-CUT / compile-verified-only): HW encoder MFT,
  BGRA->NV12 (color.rs, unit-tested), sync drain, fall-back-to-raw on any failure.
- Viewer H.264 decoder (decoder.rs, FIRST-CUT): MF decoder on a dedicated COM
  thread; drops+logs on failure, raw render path untouched.
- proto additive: VideoCodec enum, StartStream.video_codec=3,
  SessionResponse.video_codec=5, AgentStatus.supports_h264=11.
- Raw+Zstd path byte-for-byte unchanged; remains the guaranteed default/fallback.

Review confirmed unsafe impl Send for H264Encoder is sound (single-owned &mut on
the block_on thread; session future never spawned) and every MF failure degrades
to raw. H.264 is NOT claimed functional - compile/clippy/build-verified only;
live validation + force-IDR + the no-spawn-invariant doc are Task 8 go-live gates.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-30 10:35:04 -07:00
parent bb73ba667f
commit f9bdecbfdb
12 changed files with 1885 additions and 23 deletions


@@ -393,11 +393,104 @@ Reference: SPEC-002 §4.1/§4.2; salvage ledger §2; `agent/src/input/keyboard.r
---
## Task 7: Hardware H.264 encode + negotiated raw/Zstd fallback
## Task 7 [IMPLEMENTED 2026-05-30 — self-verified on local Windows toolchain: `cargo fmt --all --check` clean, `cargo clippy --workspace --all-targets --all-features -- -D warnings` exit 0, `cargo test --workspace` 89 pass (36 agent + 53 server; was 70, no regressions), `cargo build --workspace` ok; pending Code Review]: Hardware H.264 encode + negotiated raw/Zstd fallback
Files touched: `agent/src/encoder/` (`mod.rs`, `h264.rs` [new], `raw.rs` [salvaged]),
`agent/src/capture/` (feed), `agent/src/viewer/` (decode), `proto/guruconnect.proto`
(`AgentStatus` capability, `SessionResponse` codec), `server/src/session/mod.rs` (negotiation).
> [IMPLEMENTED] Raw+Zstd remains the DEFAULT and guaranteed fallback; H.264 is a
> negotiated upgrade that is COMPILE-VERIFIED ONLY (live MF encode/decode is Task
> 8 — needs real GPU + frames). The testable parts (abstraction, factory,
> negotiation, capability plumbing, color-conversion math) are done solidly with
> unit tests; the MF H.264 encoder and viewer decoder are first-cut, clearly
> marked, and gated behind a default-off policy so unvalidated H.264 never ships
> as the default.
>
> 1. ENCODER ABSTRACTION (`agent/src/encoder/mod.rs`): the existing `Encoder`
> trait (`encode(&mut self, &CapturedFrame) -> Result<EncodedFrame>`) is the
> abstraction; `RawEncoder` (salvaged raw+Zstd+dirty-rects, UNCHANGED behavior)
> and the new `H264Encoder` both implement it. Factory split into pure pieces:
> `codec_from_str` (config-string -> `VideoCodec`), `select_codec(negotiated,
> hardware_available)` (agent-side guard: H.264 only if HW present, HEVC->raw,
> else raw), and `create_encoder_for(VideoCodec, quality)` (builds the encoder;
> on H.264 init failure logs + returns a RAW encoder so the session never
> breaks). UNIT-TESTED: codec_from_str mapping, select_codec guard matrix, raw
> factory always succeeds, string path resolves to raw without HW.
> 2. CAPABILITY + NEGOTIATION (testable, done well):
> - `encoder/capability.rs`: `supports_hardware_h264()` probes MF once
> (`MFTEnumEx(MFT_CATEGORY_VIDEO_ENCODER, MFT_ENUM_FLAG_HARDWARE,
> MFVideoFormat_H264)`), caches the bool via `OnceLock`; false on non-Windows
> / no HW / MF error. Advertised in `AgentStatus.supports_h264` (proto field
> 11, additive).
> - Server (`server/src/session/mod.rs`): `select_video_codec(agent_supports,
> prefer_h264)` is a PURE decision fn — H.264 only when BOTH the agent
> supports it AND policy prefers it, else raw. Policy constant
> `DEFAULT_PREFER_H264 = false` (documented: keeps raw as the negotiated codec
> until H.264 is hardware-validated). `supports_h264` stored on the in-memory
> `Session` from `AgentStatus` (`update_agent_status` gained the param). The
> negotiated codec is stamped on `StartStream.video_codec` in
> `send_start_stream_internal` (the LIVE server->agent codec-selection point —
> SessionRequest/SessionResponse are not exchanged on the wire in v2, so the
> proto's `SessionResponse.video_codec` is kept for spec parity but the live
> path uses `StartStream`). UNIT-TESTED: the negotiation matrix, the
> default-policy guardrail (capable agent still gets raw), and the
> `AgentStatus -> supports_h264` ingest.
> - Agent applies it: `StartStream` handler decodes `video_codec`, stores
> `negotiated_codec`, and `init_streaming` builds the encoder via
> `select_codec` + `create_encoder_for` (re-guards on local HW; older server
> sends 0 = RAW, preserving the default).
> 3. MF H.264 ENCODER (`agent/src/encoder/h264.rs`, FIRST-CUT, compile-verified
> only): enumerates+activates a HW H.264 encoder MFT, sets H.264 output then
> NV12 input media types (frame size/rate, bitrate from quality), feeds frames
> (`ProcessInput`) and drains synchronously (`ProcessOutput`, NEED_MORE_INPUT =
> "no output this tick"), emitting `VideoFrame{H264(EncodedFrame{data, keyframe,
> pts, dts})}`. BGRA->NV12 via `encoder/color.rs` (BT.601 limited-range, 2x2 box
> chroma; isolated + UNIT-TESTED: size, odd-dim/short-buffer rejection, black/
> white/red reference values, plane coverage). On ANY init failure the FACTORY
> falls back to raw (logged); per-frame errors surface to the session (which
> logs + continues). Handles resolution change (re-init), keyframe flag
> (CleanPoint), MF buffer alloc for non-sample-providing MFTs. NOT yet live: the
> async-MFT event model is documented as a Task-8 refinement (this cut drains
> synchronously); precise force-IDR (CODECAPI) is a TODO; D3D11 zero-copy
> deferred (feeds CPU NV12).
> 4. VIEWER H.264 DECODE (`agent/src/viewer/decoder.rs` [new], FIRST-CUT,
> compile-verified only): MF H.264 decoder MFT -> NV12 -> BGRA
> (`nv12_to_bgra`, BT.601 inverse, UNIT-TESTED round-trip within tolerance +
> short-buffer + black). Runs on a DEDICATED OS thread (`gc-h264-decode`), NOT a
> tokio task — the MF decoder has COM thread affinity and a tokio task can
> migrate across workers at await points. The receive task forwards H.264 access
> units over a std channel; the worker decodes and pushes BGRA `FrameData`
> through the existing render path via `blocking_send`. On decoder-init failure
> it logs once and drops H.264 frames; the RAW render path is untouched. Handles
> the `MF_E_TRANSFORM_STREAM_CHANGE` NV12 output renegotiation + size discovery.
> 5. RAW STILL WORKS END-TO-END: `RawEncoder` is unchanged; with
> `DEFAULT_PREFER_H264 = false` the server negotiates RAW for every session
> (including capable agents), the agent builds the raw encoder, and the viewer's
> existing `Raw` branch renders it — the guaranteed default/fallback path is
> fully intact and is what runs today.
>
> PROTO (additive — no field renumbered): `VideoCodec` enum (RAW=0, H264=1,
> H265=2); `SessionResponse.video_codec = 5` (spec parity); `StartStream.video_codec
> = 3` (live negotiation); `AgentStatus.supports_h264 = 11` (capability). HEVC is a
> documented TODO/opt-in everywhere (never selected). Cargo.toml: added the
> `Win32_Media_MediaFoundation` + COM windows features (no new external crates).
>
> COMPILE-VERIFIED-ONLY / NEEDS LIVE HARDWARE (Task 8): the MF H.264 encoder
> init/feed/emit on a real GPU, the viewer MF decoder on a live stream, the
> BGRA<->NV12 fidelity end-to-end, and the synchronous-drain timing. The encoder/
> decoder are structured to fall back to raw (encoder) / drop frames + log
> (decoder) on any failure so they cannot break a session even if MF misbehaves.
>
> TESTS ADDED (19): agent +16 (encoder factory/select matrix x5, color BGRA->NV12
> x8, decoder NV12<->BGRA x3), server +3 (codec negotiation matrix, default-policy
> guardrail, AgentStatus capability ingest).
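
The BGRA->NV12 step from item 3 (BT.601 limited-range, 2x2 box chroma, odd-dimension and short-buffer rejection) can be sketched as below. This is an illustration only, not the contents of `encoder/color.rs`: the function name `bgra_to_nv12`, the `Option` return type, the tightly packed input layout, and the common integer BT.601 coefficients are all assumptions.

```rust
/// Hypothetical sketch: convert a tightly packed BGRA frame to NV12 using the
/// common integer approximation of BT.601 limited range.
/// Returns `None` for zero or odd dimensions, or if the buffer is too short.
pub fn bgra_to_nv12(bgra: &[u8], width: usize, height: usize) -> Option<Vec<u8>> {
    if width == 0 || height == 0 || width % 2 != 0 || height % 2 != 0 {
        return None;
    }
    if bgra.len() < width * height * 4 {
        return None;
    }

    let y_size = width * height;
    // NV12 layout: full-resolution Y plane, then interleaved UV at half resolution.
    let mut out = vec![0u8; y_size + y_size / 2];
    let (y_plane, uv_plane) = out.split_at_mut(y_size);

    // Luma: one sample per pixel.
    for i in 0..y_size {
        let b = bgra[i * 4] as i32;
        let g = bgra[i * 4 + 1] as i32;
        let r = bgra[i * 4 + 2] as i32;
        let y = ((66 * r + 129 * g + 25 * b + 128) >> 8) + 16;
        y_plane[i] = y.clamp(0, 255) as u8;
    }

    // Chroma: average each 2x2 block (box filter), then convert once per block.
    for by in 0..height / 2 {
        for bx in 0..width / 2 {
            let (mut r, mut g, mut b) = (0i32, 0i32, 0i32);
            for dy in 0..2 {
                for dx in 0..2 {
                    let i = ((by * 2 + dy) * width + bx * 2 + dx) * 4;
                    b += bgra[i] as i32;
                    g += bgra[i + 1] as i32;
                    r += bgra[i + 2] as i32;
                }
            }
            // Rounded average of the four pixels.
            let (r, g, b) = ((r + 2) / 4, (g + 2) / 4, (b + 2) / 4);
            let u = ((-38 * r - 74 * g + 112 * b + 128) >> 8) + 128;
            let v = ((112 * r - 94 * g - 18 * b + 128) >> 8) + 128;
            let o = by * width + bx * 2;
            uv_plane[o] = u.clamp(0, 255) as u8;
            uv_plane[o + 1] = v.clamp(0, 255) as u8;
        }
    }
    Some(out)
}
```

With these coefficients a uniform black frame maps to Y=16, U=V=128 and a uniform white frame to Y=235, U=V=128, which is the kind of reference value the unit tests listed above check. The real module may return a `Result` and handle stride differently.
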
Files touched: `proto/guruconnect.proto` (`VideoCodec` enum + `SessionResponse.video_codec`
+ `StartStream.video_codec` + `AgentStatus.supports_h264`), `agent/Cargo.toml` (MF/COM windows
features), `agent/src/encoder/mod.rs` (trait/factory/select), `agent/src/encoder/raw.rs`
(salvaged, unchanged), `agent/src/encoder/h264.rs` [new], `agent/src/encoder/capability.rs` [new],
`agent/src/encoder/color.rs` [new], `agent/src/session/mod.rs` (negotiated codec apply +
`supports_h264` advertise), `agent/src/viewer/mod.rs` (H.264 route + decode worker),
`agent/src/viewer/decoder.rs` [new], `server/src/session/mod.rs` (`select_video_codec` +
`DEFAULT_PREFER_H264` + `supports_h264` field/ingest + `StartStream` codec stamp),
`server/src/relay/mod.rs` (pass `supports_h264` from `AgentStatus`).
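
The two-stage codec decision described in item 2 (server policy, then agent re-guard on local hardware) can be sketched as pure functions. The names `select_video_codec`, `select_codec`, `DEFAULT_PREFER_H264` and the enum values come from the notes above; the exact signatures and the `codec_from_wire` helper are assumptions made for illustration.

```rust
/// Mirrors the proto's `VideoCodec` values (RAW=0, H264=1, H265=2).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum VideoCodec {
    Raw = 0,
    H264 = 1,
    H265 = 2,
}

/// Policy constant: raw stays the negotiated codec until H.264 is validated.
pub const DEFAULT_PREFER_H264: bool = false;

/// Server side: H.264 only when the agent supports it AND policy prefers it.
pub fn select_video_codec(agent_supports_h264: bool, prefer_h264: bool) -> VideoCodec {
    if agent_supports_h264 && prefer_h264 {
        VideoCodec::H264
    } else {
        VideoCodec::Raw
    }
}

/// Agent side: re-guard the negotiated codec against local hardware.
/// HEVC is never selected and maps to raw.
pub fn select_codec(negotiated: VideoCodec, hardware_available: bool) -> VideoCodec {
    match negotiated {
        VideoCodec::H264 if hardware_available => VideoCodec::H264,
        _ => VideoCodec::Raw,
    }
}

/// Hypothetical helper: decode the `StartStream.video_codec` wire value.
/// An older server sends 0, and unknown values are treated as raw here.
pub fn codec_from_wire(value: i32) -> VideoCodec {
    match value {
        1 => VideoCodec::H264,
        2 => VideoCodec::H265,
        _ => VideoCodec::Raw,
    }
}
```

Under the default policy, `select_video_codec(true, DEFAULT_PREFER_H264)` yields `VideoCodec::Raw`, so a capable agent still receives raw. Keeping both functions free of I/O is what makes the negotiation matrix unit-testable without a GPU.
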
- HW **H.264** via Windows Media Foundation (transparently NVENC/AMF/QuickSync) emitting the proto's
`EncodedFrame` (h264). Native viewer decodes via MF/D3D11.
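
The fall-back-to-raw behaviour of the encoder factory (item 1) can be sketched as follows. This is a simplified stand-in, not the project's code: the trait here works on byte slices rather than `CapturedFrame`/`EncodedFrame`, the `name` method is added for illustration, and the `hardware_available` parameter is an assumed hook that simulates Media Foundation initialisation succeeding or failing.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum VideoCodec {
    Raw,
    H264,
    H265,
}

/// Simplified encoder abstraction (the real trait takes a `CapturedFrame`).
pub trait Encoder {
    fn name(&self) -> &'static str;
    fn encode(&mut self, frame: &[u8]) -> Result<Vec<u8>, String>;
}

/// Stand-in for the salvaged raw encoder: passes bytes through unchanged.
pub struct RawEncoder;

impl Encoder for RawEncoder {
    fn name(&self) -> &'static str {
        "raw"
    }
    fn encode(&mut self, frame: &[u8]) -> Result<Vec<u8>, String> {
        Ok(frame.to_vec())
    }
}

/// Stand-in for the MF H.264 encoder; construction can fail.
pub struct H264Encoder;

impl H264Encoder {
    /// `hardware_available` simulates whether MF encoder init would succeed.
    pub fn new(hardware_available: bool) -> Result<Self, String> {
        if hardware_available {
            Ok(Self)
        } else {
            Err("no hardware H.264 encoder found".to_string())
        }
    }
}

impl Encoder for H264Encoder {
    fn name(&self) -> &'static str {
        "h264"
    }
    fn encode(&mut self, _frame: &[u8]) -> Result<Vec<u8>, String> {
        Err("H.264 encode is not implemented in this sketch".to_string())
    }
}

/// Build the encoder for a codec. On H.264 init failure, log and return a raw
/// encoder so the session never breaks. HEVC is never selected.
pub fn create_encoder_for(codec: VideoCodec, hardware_available: bool) -> Box<dyn Encoder> {
    match codec {
        VideoCodec::H264 => match H264Encoder::new(hardware_available) {
            Ok(encoder) => Box::new(encoder),
            Err(err) => {
                eprintln!("H.264 init failed ({}); falling back to raw", err);
                Box::new(RawEncoder)
            }
        },
        VideoCodec::Raw | VideoCodec::H265 => Box::new(RawEncoder),
    }
}
```

Because the factory returns a boxed trait object rather than a `Result`, the caller always gets a working encoder and the failure is confined to a log line.
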