The SPEC-016 Phase B credential-store guard referenced "SPEC-017" for the forthcoming SYSTEM service host, but 017 is now Mike's end-user-access spec; the service host is SPEC-018. Comment + error-string text only, no logic change.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
414 lines
18 KiB
Rust
//! At-rest storage for the per-machine operating credential (`cak_`).
//!
//! SPEC-016 Phase B, item 4 + §Security. The `cak_` minted by `/api/enroll` is
//! the high-sensitivity, per-machine, independently-revocable operating
//! credential. It is stored with **two independent layers** (Mike's locked
//! decision: "BOTH layers"):
//!
//! 1. **DPAPI-machine encryption** (`CryptProtectData` with
//!    `CRYPTPROTECT_LOCAL_MACHINE`): the on-disk bytes are a DPAPI blob keyed to
//!    THIS machine. A copied/exfiltrated file is inert on any other box, because
//!    DPAPI machine keys do not leave the machine.
//! 2. **SYSTEM/Administrators-only ACL** on the containing directory + file: a
//!    non-admin user cannot even read the ciphertext. Inheritance is removed and
//!    only `SYSTEM` and `BUILTIN\Administrators` are granted full control.
//!
//! Local admin / SYSTEM can always recover the value; that is accepted (SPEC-016
//! §Security): the blast radius of one leaked `cak_` is a single, independently
//! revocable machine.
//!
//! Storage location (chosen over an HKLM value): a file under
//! `%ProgramData%\GuruConnect\credentials\agent.cak`. Rationale: the agent
//! already keeps its config and the `machine_uid` fallback seed under
//! `%ProgramData%\GuruConnect`, so co-locating keeps a single protected
//! directory; and a directory/file ACL applied via `icacls` is auditable with far
//! less unsafe FFI than building a registry-key security descriptor by hand. Both
//! storage shapes are explicitly permitted by the spec.
//!
//! SECURITY: the plaintext `cak_` is NEVER logged. Errors describe the operation,
//! not the value.

#![cfg(windows)]

use anyhow::{anyhow, Context, Result};
use std::path::PathBuf;
use thiserror::Error;

/// Failure classes for [`load_cak`], so callers can distinguish an *operational*
/// problem (the file exists but this process cannot open/read it, e.g. running in
/// the wrong security context against a SYSTEM-only-ACL'd store) from the real
/// *tamper / wrong-machine* signal (the file was read successfully but DPAPI
/// decryption failed).
///
/// The distinction matters for the run-mode resolver (`main.rs`):
/// - [`LoadCakError::Io`] is recoverable/actionable: log it and STOP (do not
///   silently re-enroll over a store we simply can't read in this context).
/// - [`LoadCakError::Decrypt`] is a hard tamper signal: STOP, do not re-enroll.
#[derive(Debug, Error)]
pub enum LoadCakError {
    /// The store path could not be resolved (e.g. `%ProgramData%` unset).
    #[error("could not resolve credential store path: {0}")]
    Path(String),

    /// An IO/open/read error reaching the stored blob, INCLUDING
    /// `PermissionDenied` (the running context lacks rights to the SYSTEM-only
    /// store). Operational, not a tamper signal.
    #[error("credential store is present but could not be read in this context: {source}")]
    Io {
        /// Whether this was specifically an access-denied error (drives the
        /// run-mode fail-fast guard in `main.rs`).
        permission_denied: bool,
        source: std::io::Error,
    },

    /// The blob was read successfully but DPAPI decryption FAILED: the real
    /// tamper / wrong-machine / corruption signal. A hard stop; never re-enroll.
    #[error("stored credential failed to decrypt (wrong machine, tampered, or corrupted): {0}")]
    Decrypt(String),
}

/// Directory holding the protected credential file.
fn credentials_dir() -> Result<PathBuf> {
    let program_data =
        std::env::var("ProgramData").context("ProgramData environment variable is not set")?;
    Ok(PathBuf::from(program_data)
        .join("GuruConnect")
        .join("credentials"))
}

/// Full path to the DPAPI-encrypted `cak_` blob.
fn cak_path() -> Result<PathBuf> {
    Ok(credentials_dir()?.join("agent.cak"))
}

/// Persist `cak` encrypted at rest.
///
/// Ordering is security-critical (H2, TOCTOU): the directory ACL is locked
/// BEFORE any secret bytes touch the filesystem, and the temp file is written
/// INSIDE the already-locked directory, so no ciphertext ever exists at a path
/// carrying an inherited (potentially world-readable) ACL:
///
/// 1. `create_dir_all(dir)`: ensure the directory exists.
/// 2. `lock_down_acl(dir)`: remove inherited ACEs and grant SYSTEM +
///    Administrators full control, made inheritable `(OI)(CI)` so children
///    created afterward are covered. This is an explicit precondition for the
///    write that follows, NOT an unstated inheritance assumption.
/// 3. DPAPI-machine-encrypt the plaintext.
/// 4. Write the ciphertext to a temp file inside the now-locked directory, then
///    rename over the target (atomic-ish replace).
/// 5. `lock_down_acl(file)`: assert the file's own ACL (belt-and-suspenders; the
///    file already inherits the directory's restrictive ACEs).
/// 6. C1 read-back: immediately attempt [`load_cak`] to PROVE the running
///    security context can read its own store. If it cannot (e.g. a non-SYSTEM
///    run wrote a SYSTEM-only store it can no longer read), fail HERE at enroll
///    time with an actionable error, rather than silently bricking on the next
///    boot when the steady-state path tries to load it.
///
/// Returns an error (never logs the plaintext) on any failure so the caller can
/// surface it / retry.
pub fn store_cak(cak: &str) -> Result<()> {
    // 1 + 2: lock the directory ACL BEFORE writing any secret (H2 / TOCTOU).
    let dir = credentials_dir()?;
    std::fs::create_dir_all(&dir)
        .with_context(|| format!("failed to create credentials dir {dir:?}"))?;
    lock_down_acl(&dir).context("failed to restrict credentials directory ACL")?;

    // 3: encrypt only after the destination directory is locked down.
    let ciphertext = dpapi_protect(cak.as_bytes()).context("DPAPI encryption of cak_ failed")?;

    // 4: write the temp file INSIDE the already-locked directory, then rename.
    let path = cak_path()?;
    let tmp = path.with_extension("cak.tmp");
    std::fs::write(&tmp, &ciphertext)
        .with_context(|| format!("failed to write temp credential file {tmp:?}"))?;
    std::fs::rename(&tmp, &path)
        .with_context(|| format!("failed to place credential file {path:?}"))?;

    // 5: assert the file ACL too (the file already inherits the dir's ACEs).
    lock_down_acl(&path).context("failed to restrict credential file ACL")?;

    // 6: C1 read-back: confirm THIS context can read back what it just wrote.
    // Catches the "wrote a SYSTEM-only store from a non-SYSTEM context" footgun at
    // enroll time instead of as a silent brick on the next launch.
    match load_cak() {
        Ok(Some(_)) => {
            tracing::info!("[ENROLL] stored per-machine credential (encrypted at rest)");
            Ok(())
        }
        Ok(None) => Err(anyhow!(
            "stored the credential but read-back returned nothing; refusing to proceed \
             with an unverifiable credential store"
        )),
        Err(LoadCakError::Io {
            permission_denied: true,
            ..
        }) => Err(anyhow!(
            "[ENROLL] wrote the credential store but cannot read it back in THIS security \
             context (access denied). The store is ACL'd to SYSTEM + Administrators by \
             design; the managed agent must run as the GuruConnect SYSTEM service (see \
             SPEC-018) to read it. Refusing to leave an unreadable store behind."
        )),
        Err(e) => Err(anyhow::Error::new(e)
            .context("stored the credential but the immediate read-back verification failed")),
    }
}

/// Load and decrypt the stored `cak_`, or `Ok(None)` if no credential is stored.
///
/// Error classification (M1): the caller MUST treat these differently:
/// - `Ok(None)` -> no store yet (NotFound or empty); enroll is fine.
/// - [`LoadCakError::Io`] -> the store exists but is unreadable in this
///   context (open/read error, INCLUDING access-denied). Operational; the caller
///   logs it and STOPS. It must NOT silently re-enroll over a store it merely
///   cannot read here.
/// - [`LoadCakError::Decrypt`] -> the bytes were read but DPAPI decryption
///   FAILED (wrong machine / tampered / corrupted). A hard tamper signal; STOP.
///
/// Only a successful READ whose decrypt fails is the tamper signal; an IO or
/// permission error is never conflated with tamper.
pub fn load_cak() -> std::result::Result<Option<String>, LoadCakError> {
    let path = cak_path().map_err(|e| LoadCakError::Path(e.to_string()))?;
    let ciphertext = match std::fs::read(&path) {
        Ok(bytes) => bytes,
        Err(e) if e.kind() == std::io::ErrorKind::NotFound => return Ok(None),
        Err(e) => {
            let permission_denied = e.kind() == std::io::ErrorKind::PermissionDenied;
            return Err(LoadCakError::Io {
                permission_denied,
                source: e,
            });
        }
    };
    if ciphertext.is_empty() {
        return Ok(None);
    }
    // Reaching here means the READ succeeded, so a decrypt failure now IS the real
    // tamper / wrong-machine signal (never conflated with an IO/permission error).
    let plaintext =
        dpapi_unprotect(&ciphertext).map_err(|e| LoadCakError::Decrypt(e.to_string()))?;
    let cak = String::from_utf8(plaintext)
        .map_err(|e| LoadCakError::Decrypt(format!("decrypted bytes were not valid UTF-8: {e}")))?;
    if cak.is_empty() {
        return Ok(None);
    }
    Ok(Some(cak))
}

/// Remove the stored credential (e.g. on revocation / forced re-enroll).
/// Succeeds if the file is already absent.
///
/// Part of the store/load/clear API the spec requires (SPEC-016 item 4). Not yet
/// called from any code path: the relay-side `cak_` revocation / forced re-enroll
/// flow that will drive it is deferred SPEC-016 Phase B/D server work (see the
/// `TODO(SPEC-016 Phase B/D): consider revoking existing cak_ on collision` note
/// in `server/src/api/enroll.rs`). It is retained to keep the store API complete
/// and is explicitly allowed to be dead code until that server work lands.
#[allow(dead_code)]
pub fn clear_cak() -> Result<()> {
    let path = cak_path()?;
    match std::fs::remove_file(&path) {
        Ok(()) => {
            tracing::info!("[ENROLL] cleared stored per-machine credential");
            Ok(())
        }
        Err(e) if e.kind() == std::io::ErrorKind::NotFound => Ok(()),
        Err(e) => Err(e).with_context(|| format!("failed to remove {path:?}")),
    }
}

// ---------------------------------------------------------------------------
// DPAPI (machine scope)
// ---------------------------------------------------------------------------

/// DPAPI-machine-encrypt `plaintext` into a self-contained blob.
fn dpapi_protect(plaintext: &[u8]) -> Result<Vec<u8>> {
    use windows::Win32::Security::Cryptography::{
        CryptProtectData, CRYPTPROTECT_LOCAL_MACHINE, CRYPT_INTEGER_BLOB,
    };

    // CryptProtectData requires a mutable input pointer in the struct, though it
    // does not modify the bytes; copy into a local Vec to get a *mut without
    // aliasing the caller's slice.
    let mut input = plaintext.to_vec();
    let in_blob = CRYPT_INTEGER_BLOB {
        cbData: u32::try_from(input.len()).context("plaintext too large for DPAPI")?,
        pbData: input.as_mut_ptr(),
    };
    let mut out_blob = CRYPT_INTEGER_BLOB::default();

    // SAFETY: in_blob points at a valid, sized buffer; out_blob is owned here and
    // its pbData is allocated by DPAPI (freed via LocalFree below). No prompt
    // struct / entropy / reserved args.
    let protected = unsafe {
        CryptProtectData(
            &in_blob,
            windows::core::PCWSTR::null(),
            None,
            None,
            None,
            CRYPTPROTECT_LOCAL_MACHINE,
            &mut out_blob,
        )
    };

    // Best-effort scrub of the transient plaintext copy. Done before the error
    // check so the copy is also zeroed when CryptProtectData fails.
    input.iter_mut().for_each(|b| *b = 0);
    protected.context("CryptProtectData failed")?;

    copy_and_free_blob(&out_blob)
        .ok_or_else(|| anyhow!("CryptProtectData returned an empty/invalid blob"))
}

/// DPAPI-decrypt a blob previously produced by [`dpapi_protect`] on this machine.
fn dpapi_unprotect(ciphertext: &[u8]) -> Result<Vec<u8>> {
    use windows::Win32::Security::Cryptography::{
        CryptUnprotectData, CRYPTPROTECT_LOCAL_MACHINE, CRYPT_INTEGER_BLOB,
    };

    let mut input = ciphertext.to_vec();
    let in_blob = CRYPT_INTEGER_BLOB {
        cbData: u32::try_from(input.len()).context("ciphertext too large for DPAPI")?,
        pbData: input.as_mut_ptr(),
    };
    let mut out_blob = CRYPT_INTEGER_BLOB::default();

    // SAFETY: as in dpapi_protect: valid sized input, owned output freed below.
    unsafe {
        CryptUnprotectData(
            &in_blob,
            None,
            None,
            None,
            None,
            CRYPTPROTECT_LOCAL_MACHINE,
            &mut out_blob,
        )
        .context("CryptUnprotectData failed")?;
    }

    copy_and_free_blob(&out_blob)
        .ok_or_else(|| anyhow!("CryptUnprotectData returned an empty/invalid blob"))
}

/// Copy a DPAPI output blob into an owned `Vec` and `LocalFree` the DPAPI buffer.
///
/// Returns `Some(bytes)` on success, `None` if the blob is null/empty. Always
/// frees `pbData` when non-null (DPAPI allocates it with `LocalAlloc`).
fn copy_and_free_blob(
    blob: &windows::Win32::Security::Cryptography::CRYPT_INTEGER_BLOB,
) -> Option<Vec<u8>> {
    use windows::Win32::Foundation::{LocalFree, HLOCAL};

    if blob.pbData.is_null() {
        return None;
    }
    // SAFETY: DPAPI guarantees pbData points at cbData valid bytes on success.
    let bytes = unsafe { std::slice::from_raw_parts(blob.pbData, blob.cbData as usize).to_vec() };
    // SAFETY: pbData was allocated by DPAPI via LocalAlloc; free it once.
    unsafe {
        let _ = LocalFree(HLOCAL(blob.pbData as *mut core::ffi::c_void));
    }
    if bytes.is_empty() {
        None
    } else {
        Some(bytes)
    }
}

// ---------------------------------------------------------------------------
// ACL hardening
// ---------------------------------------------------------------------------

/// Restrict `path` (file or directory) to SYSTEM + Administrators full control,
/// removing inherited ACEs so a permissive parent grant cannot leak read access.
///
/// Implemented via `icacls` (the documented, auditable mechanism) rather than
/// hand-rolling a security descriptor through `SetNamedSecurityInfoW` (hundreds
/// of lines of SID/ACL FFI). `icacls` ships on every supported Windows target.
/// A failure here is surfaced: the caller treats inability to lock down the
/// credential store as a hard error. The principals are passed as the well-known
/// SIDs `*S-1-5-18` (LocalSystem) and `*S-1-5-32-544` (BUILTIN\Administrators),
/// which are language- and locale-independent, so this does not break on
/// localized Windows.
fn lock_down_acl(path: &std::path::Path) -> Result<()> {
    use std::os::windows::process::CommandExt;
    use std::process::Command;

    const CREATE_NO_WINDOW: u32 = 0x0800_0000;

    let path_str = path
        .to_str()
        .ok_or_else(|| anyhow!("credential path is not valid UTF-8: {path:?}"))?;

    // /inheritance:r -> remove inherited ACEs (drop the permissive parent grant)
    // /grant:r       -> replace any existing explicit grants for the principal
    // *S-1-5-18 -> LocalSystem; *S-1-5-32-544 -> BUILTIN\Administrators
    let output = Command::new("icacls")
        .arg(path_str)
        .args([
            "/inheritance:r",
            "/grant:r",
            "*S-1-5-18:(OI)(CI)F",
            "/grant:r",
            "*S-1-5-32-544:(OI)(CI)F",
        ])
        .creation_flags(CREATE_NO_WINDOW)
        .output()
        .context("failed to invoke icacls to harden credential ACL")?;

    if !output.status.success() {
        // icacls writes its diagnostics to stdout; surface the code only (no
        // credential material is ever passed to icacls, only the path).
        return Err(anyhow!(
            "icacls failed to harden {path_str} (exit {:?})",
            output.status.code()
        ));
    }
    Ok(())
}

#[cfg(test)]
mod tests {
    use super::*;

    /// DPAPI round-trips on the same machine: protect then unprotect must recover
    /// the exact plaintext. (Runs on the build/test host, which IS the same
    /// machine; the machine-scope key is available to any process here.)
    #[test]
    fn dpapi_roundtrip_recovers_plaintext() {
        let secret = b"cak_test_value_0123456789abcdef";
        let blob = dpapi_protect(secret).expect("DPAPI protect should succeed on this machine");
        assert_ne!(
            blob.as_slice(),
            secret.as_slice(),
            "ciphertext must differ from plaintext"
        );
        let recovered = dpapi_unprotect(&blob).expect("DPAPI unprotect should succeed");
        assert_eq!(recovered, secret, "round-trip must recover the exact bytes");
    }

    /// Plaintexts of several lengths (1 byte, 4 bytes, 256 zero bytes) round-trip
    /// exactly. Empty input is not exercised here: `copy_and_free_blob` maps an
    /// empty output blob to `None`, which the DPAPI helpers report as an error.
    #[test]
    fn dpapi_roundtrip_handles_varied_lengths() {
        for plaintext in [b"x".as_slice(), b"cak_".as_slice(), &[0u8; 256]] {
            let blob = dpapi_protect(plaintext).expect("protect");
            let back = dpapi_unprotect(&blob).expect("unprotect");
            assert_eq!(back.as_slice(), plaintext);
        }
    }

    /// Tampering with the ciphertext must make decryption FAIL rather than return
    /// garbage, because DPAPI authenticates its blobs.
    #[test]
    fn dpapi_rejects_tampered_blob() {
        let mut blob = dpapi_protect(b"cak_tamper_target").expect("protect");
        // Flip a byte in the middle of the blob.
        let mid = blob.len() / 2;
        blob[mid] ^= 0xFF;
        assert!(
            dpapi_unprotect(&blob).is_err(),
            "a tampered DPAPI blob must fail to decrypt"
        );
    }
}
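
The read classification that `load_cak` documents (a missing or empty file means nothing is stored, any other read error is operational, and only bytes that were actually read can go on to produce a tamper signal) can be sketched without any Windows APIs. This is a minimal, portable illustration under stated assumptions: `ReadOutcome` and `classify_read` are hypothetical names that do not exist in this file, and the DPAPI decrypt step is deliberately left out.

```rust
use std::fs;
use std::io::ErrorKind;
use std::path::Path;

/// Hypothetical, simplified stand-in for the outcomes `load_cak` distinguishes
/// before decryption. Not part of the real module.
#[derive(Debug, PartialEq)]
enum ReadOutcome {
    /// No file, or an empty file: nothing is stored yet.
    Absent,
    /// The file exists but could not be read in this context.
    /// `permission_denied` is true only for an access-denied error.
    Unreadable { permission_denied: bool },
    /// The raw bytes were read; decryption would be attempted next.
    Bytes(Vec<u8>),
}

/// Classify the result of reading `path`, mirroring the order of checks in
/// `load_cak`: NotFound first, then every other IO error, then the empty case.
fn classify_read(path: &Path) -> ReadOutcome {
    match fs::read(path) {
        Ok(bytes) if bytes.is_empty() => ReadOutcome::Absent,
        Ok(bytes) => ReadOutcome::Bytes(bytes),
        Err(e) if e.kind() == ErrorKind::NotFound => ReadOutcome::Absent,
        Err(e) => ReadOutcome::Unreadable {
            permission_denied: e.kind() == ErrorKind::PermissionDenied,
        },
    }
}
```

A caller following the documented contract would treat `Absent` as free to enroll, stop on `Unreadable` without re-enrolling, and hand `Bytes` to the decrypt step, where a failure is the only tamper signal.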