diff --git a/doc/tpm.md b/doc/tpm.md
index 90f7ec064..e370a4dac 100644
--- a/doc/tpm.md
+++ b/doc/tpm.md
@@ -10,8 +10,35 @@ See also: [architecture.md](architecture.md), [boot-process.md](boot-process.md)
 ## tpmr — unified TPM abstraction
 
 `initrd/bin/tpmr.sh` is a shell script wrapper that presents a single interface
-over both TPM 1.2 (`tpm` / `trousers`) and TPM 2.0 (`tpm2-tools`). All Heads
-scripts call `tpmr.sh` rather than invoking `tpm` or `tpm2` directly.
+over both TPM 1.2 and TPM 2.0. All Heads scripts call `tpmr.sh` rather than
+invoking TPM tools directly.
+
+### Boot chain and TPM tool selection
+
+```text
+initrd/init (PID 1)
+  └─ CONFIG_BOOTSCRIPT → /bin/gui-init.sh        [board config]
+       ├─ source /etc/functions.sh               [shared TPM helpers]
+       ├─ source /etc/gui_functions.sh           [whiptail wrappers]
+       └─ calls initrd/bin/tpmr.sh               [TPM abstraction]
+            ├─ TPM1: calls `tpm` (tpmtotp util/tpm)              [CONFIG_TPM2_TOOLS != y]
+            │    modules/tpmtotp → output: totp hotp qrenc util/tpm
+            │
+            └─ TPM2: calls `tpm2` (single binary, subcommands)   [CONFIG_TPM2_TOOLS=y]
+                 modules/tpm2-tss + modules/tpm2-tools
+```
+
+TPM1 support comes exclusively from the `tpmtotp` module (`modules/tpmtotp`),
+which builds `util/tpm` as part of its outputs. This binary is installed to
+the initrd as `tpm` and supports subcommands such as `physicalpresence`,
+`forceclear`, `takeown -pwdo`, `counter_create`, `counter_increment`, etc.
+
+TPM2 support comes from `modules/tpm2-tss` (TSS software stack) and
+`modules/tpm2-tools` (`tpm2` binary with subcommands like `getcap`,
+`nvdefine`, `nvincrement`).
+
+Both TPM1 and TPM2 boards may also enable `CONFIG_TPMTOTP=y` for the
+`totp` and `hotp` utilities, which are independent of the TPM version.
 
 ### PCR sizes
 
@@ -38,6 +65,8 @@ scripts call `tpmr.sh` rather than invoking `tpm` or `tpm2` directly.
 | `reset` | Reset the TPM |
 | `kexec_finalize` | Finalize PCR state before kexec (TPM2 only) |
 | `shutdown` | Orderly shutdown (TPM2 only) |
+| `da_state` | Query dictionary attack lockout state |
+| `bad_auth` | Deliberately trigger an auth failure to test DA lockout |
 
 ---
 
@@ -271,9 +300,11 @@ The rollback counter prevents **TPM swap attacks** and **/boot disk swap attacks
 
 ### How it works
 
-The counter is stored **in the TPM** (NVRAM index `0x3135106223`), ensuring
-hardware binding. A SHA-256 hash of the counter value is stored on **/boot**
-(`/boot/kexec_rollback.txt`). This creates a two-way binding:
+The counter value is stored **in the TPM** at a persistent NVRAM index
+(the index is recorded in `/boot/kexec_rollback.txt`; it is a well-known value
+for TPM1 and randomly generated per TPM2 at provisioning time). A SHA-256
+hash of the counter value is stored on **/boot** (`/boot/kexec_rollback.txt`).
+This creates a two-way binding:
 
 - Cannot swap TPM without breaking /boot consistency
 - Cannot swap /boot without breaking TPM consistency
@@ -398,3 +429,216 @@ To verify that a new board's coreboot config matches the expected RoT:
 | Auth sessions | Not used | Required for policy-based unseal |
 | `kexec_finalize` | No-op | Extends PCRs, then `tpm2 shutdown` |
 | `startsession` | No-op | Creates encryption session |
+
+### TPM1 auth retry and error detection
+
+`_tpm_auth_retry()` in `initrd/bin/tpmr.sh` provides shared retry logic for
+both TPM1 and TPM2 operations that need authorization. On auth failure
+(wrong passphrase), the passphrase cache is shredded and the user is
+re-prompted up to 3 times before giving up.
+
+Auth failure is detected by grepping the command output for known error
+patterns. TPM1 (tpmtotp) errors go to stdout via `printf()` with
+`TPM_GetErrMsg()` strings. TPM2 (tpm2-tools) errors go to stderr via
+`LOG_ERR()` and may include raw TPM response codes.
+
+| Pattern | Type | TPM version | Example error |
+| --- | --- | --- | --- |
+| `authorization\|auth\|bad\|permission` | English words | TPM1+TPM2 | `TPM_AUTHFAIL`, `bad passphrase` |
+| `defend` | English word | TPM1 | `Defend lock running` |
+| `0x98e\|0x149` | Hex codes | TPM2 | `TPM2_RC_AUTH_FAIL`, `TPM2_RC_NV_AUTHORIZATION` |
+
+### TPM1 reset defend lock
+
+`TPM_DEFEND_LOCK_RUNNING` (`tpm_error.h`: `TPM_BASE + TPM_NON_FATAL + 3`)
+is a standard TPM 1.2 error raised when the TPM's dictionary-attack
+protection is active. After too many failed authorization attempts, the
+TPM enters a time-out period and refuses all authorization operations --
+including `tpm takeown` even after a successful `tpm forceclear`
+(forceclear clears the owner but not the dictionary attack counter on
+some implementations, particularly Infineon TPMs).
+
+tpmtotp's `tpm takeown` outputs:
+```
+Error Defend lock running from TPM_TakeOwnership
+```
+
+`tpm1_reset()` in `initrd/bin/tpmr.sh` detects "defend lock" in the
+`takeown` output and attempts one recovery: cycling physical presence
+(`physicaldisable` / `physicalenable` / `physicalpresence` /
+`physicalsetdeactivated`) to re-assert PP before retrying `takeown`.
+This works on some chipsets where software presence was not properly
+honoured by the first `forceclear`.
+
+If PP cycling also fails, no software-based recovery is available.
+Further attempts (second forceclear, `TPM_ResetLockValue` with empty
+auth, sleep+retry) will not help. Use `tpmr.sh da_state` from the
+recovery shell to check the current DA state:
+
+- **TPM1**: `actionDependValue` reports remaining lockout seconds.
+- **TPM2**: the human-readable summary shows estimated unlock time
+  based on `recoveryTime` (seconds before one failure is forgotten).
+
+Alternatively, reset the TPM to clear the DA state entirely:
+`tpm-reset.sh` from the recovery shell, or GUI menu `Options ->
+TPM/TOTP/HOTP Options -> Reset the TPM` for full reprovision.
+
+#### DA lockout duration escalation
+
+TPM 1.2 dictionary attack timeouts escalate with the failure count
+(approximate; varies by vendor and TPM firmware version per Dell and
+Microsoft documentation):
+
+| Failures accumulated | Typical lockout time |
+|---------------------|---------------------|
+| 1-2 | None (counter only) |
+| 3-5 | 10 seconds |
+| 6-9 | 1 hour |
+| 10-12 | Several hours |
+| 13+ | Up to 24 hours |
+
+Each time the TPM fully locks out and the timer expires, the DA counter
+resets. If failures continue to accumulate across boots without
+waiting for the timer to expire, the escalation can reach 24 hours.
+This is what happened with the counter auth regression (3 failures per
+boot x many boots): the DA counter reached the maximum threshold.
+
+#### Diagnosing DA state
+
+Use `tpmr.sh da_state` from the recovery shell to query the current DA
+state. Available for both TPM1 and TPM2:
+
+| Information | TPM1 | TPM2 |
+|-------------|------|------|
+| Locked? | `state`: 0=inactive, 1=locked | `TPM2_PT_LOCKOUT_COUNTER` >= `TPM2_PT_MAX_AUTH_FAIL` |
+| Current failures | `currentCount` | `TPM2_PT_LOCKOUT_COUNTER` |
+| Lockout threshold | `thresholdCount` | `TPM2_PT_MAX_AUTH_FAIL` |
+| Lockout interval | -- | `TPM2_PT_LOCKOUT_INTERVAL` |
+| Time remaining | `actionDependValue` (seconds) | Estimate from `LOCKOUT_COUNTER` vs `MAX_AUTH_FAIL` times `LOCKOUT_INTERVAL` |
+
+The recovery shell can run `tpmr.sh da_state` at any time to check
+whether the TPM is locked and how much lockout time remains.
+
+##### TPM1 version check
+
+`tpm1_da_state` first queries the TPM spec version via
+`TPM_CAP_VERSION_VAL` (0x1a). If `revMajor < 103`, the TPM predates
+the `TPM_CAP_DA_LOGIC` capability (added in TCG TPM Main Part 2 rev 103)
+and the function returns immediately without attempting the DA query:
+
+```
+TPM 1.2 too old to report DA lockout state.
+```
+
+The TPM vendor and spec revision are logged to debug.log for diagnostics.
+
+TPM1 chips known to lack DA state query support:
+- STMicroelectronics ST33TP series (rev 13, confirmed on ThinkPad X230)
+- Older Infineon SLB9635/9645 (pre-rev 103 firmware)
+- Some Atmel/Microchip TPMs
+
+On such hardware, the preflight guard in `increment_tpm_counter` cannot
+detect lockout before the increment. If the TPM is locked, the increment
+fails with `TPM_DEFEND_LOCK_RUNNING`, which is caught by the error
+handling (see below). TPM2 is unaffected.
+
+##### TPM2 firmware version
+
+`tpm2_da_state` logs `TPM2_PT_FIRMWARE_VERSION_1` to debug.log when DA
+properties are successfully queried. This helps identify the TPM chip
+and firmware revision for diagnostic purposes.
+
+#### DA parameter configurability
+
+TPM2 DA parameters are configured during `tpm2_reset()` (called by
+`tpm-reset.sh` and the GUI `reset_tpm()`). Heads sets:
+- `maxTries=10`: auth failures before lockout
+- `recoveryTime=3600`: seconds before one failure is forgotten (counter
+  decrements by 1 per interval)
+- `lockoutRecovery=0`: seconds lockout auth stays blocked after a lockout-auth failure
+
+TPM1 has no software-accessible command to configure DA parameters
+(tpmtotp's `setcapability` does not expose DA threshold or timeout
+sub-capabilities). The DA policy is determined by the TPM firmware
+and cannot be changed through software on TPM1.
+
+#### Testing DA lockout
+
+Use `tpmr.sh bad_auth` from the recovery shell to test dictionary attack
+lockout behavior by deliberately triggering an auth failure:
+
+- **TPM1**: attempts `tpm counter_increment -pwdc ` with the counter
+  ID from `/boot/kexec_rollback.txt`. Each call increments the DA counter
+  by 1 until lockout is triggered. On pre-rev 103 TPMs, DA state can't be
+  read before/after; the test reports whether the increment succeeded (no
+  lockout) or failed (lockout active). Repeat until the increment fails.
+- **TPM2**: attempts `tpm2 nvincrement -P ` with NV index auth.
+  Uses `-P` (not `-C o -P`) because NV index auth failure produces
+  `TPM2_RC_AUTH_FAIL` (0x98e) and does increment `LOCKOUT_COUNTER`;
+  owner auth may not increment on some implementations. Shows DA state
+  before and after each attempt.
+
+Test headers and DA state queries are logged to debug.log for analysis;
+the increment command output goes to the console during interactive use.
+
+#### Preventing future lockouts
+
+Heads' counter auth regression caused 3 TPM auth failures per boot by
+passing the owner passphrase as the counter auth while the counter was
+created with empty auth. Restoring empty counter auth for both creation
+and increment (as per TCG spec) prevents auth failures from counter
+operations. All TPM1 boards that ran the regression code are affected
+identically; this is not platform-specific.
+
+If lockout still occurs (e.g., from deliberate `bad_auth` testing or
+other bugs), the increment failure path in `increment_tpm_counter`
+detects it: on TPM1, the captured increment output is grepped for
+`defend`/`lock`/`0x19` patterns, and the user is directed to reset
+the TPM via the GUI menu. This is the catch-all for pre-rev 103 TPMs
+where `da_state` can't report lockout state.
+
+### TPM1 physical presence
+
+TPM1.2 forceclear requires physical presence to be asserted. The
+`tpm1_reset()` function does this with `tpm physicalpresence -s` (software
+presence). On some platforms (e.g., Dell OptiPlex, some Infineon TPMs),
+software physical presence may not work: the TPM firmware only accepts
+hardware-asserted presence (GPIO set by BIOS). In that case, `forceclear`
+returns success but may not fully reset the TPM, or `takeown` may fail
+with unexpected errors.
+
+When software physical presence fails, the LOG shows:
+```
+tpm1_reset: unable to set physical presence
+```
+
+This is logged but not fatal; `tpm forceclear` is still attempted.
+If the TPM firmware ignores software physical presence, the reset fails
+and the user must use the platform's hardware TPM reset mechanism
+(typically a BIOS option or jumper).
+
+### TPM reset methods
+
+Heads has two TPM reset methods with different scope:
+
+**`tpm-reset.sh`** (CLI, recovery shell):
+- Prompts for new owner passphrase, calls `tpmr.sh reset`
+- TPM clear + re-ownership only
+- No counter creation, no /boot signing, no TOTP/HOTP generation
+- Intended for headless recovery or clearing a defend lock before running
+  the full GUI flow
+
+**`reset_tpm()`** (GUI, via Options -> TPM/TOTP/HOTP -> Reset the TPM in
+`initrd/bin/gui-init.sh`):
+- Prompts for new owner passphrase, calls `tpmr.sh reset`
+- Removes stale `/boot/kexec_rollback.txt` and `/boot/kexec_primhdl_hash.txt`
+- Creates new TPM rollback counter via `check_tpm_counter()`
+- Increments the new counter
+- Re-signs /boot with the GPG signing key
+- Generates new TOTP/HOTP secrets
+- Reseals TPM Disk Unlock Key (DUK) to LUKS
+- Regenerates TPM2 encrypted sessions
+
+After `tpm-reset.sh`, the TPM is cleared but the system is not fully
+provisioned; the user must complete the GUI `reset_tpm()` or OEM Factory
+Reset to restore counter, signing, and secrets.
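
The auth-failure patterns tabulated in the `doc/tpm.md` changes above (English words, `defend`, and the TPM2 hex codes) can be tried out in isolation. What follows is a minimal sketch under stated assumptions, not code from `tpmr.sh`: the function name `classify_tpm_error` and every sample string except the `Defend lock running` message are hypothetical. It also reports the defend-lock case separately for illustration, whereas the `tpm2_counter_inc` hunk further down folds `defend` into one combined pattern.

```shell
#!/bin/sh
# Hypothetical helper (illustration only, not part of tpmr.sh):
# classify captured TPM tool output using the documented grep patterns.
classify_tpm_error() {
	output="$1"
	# Check "defend" first: a TPM1 dictionary-attack lockout is reported
	# separately here because re-prompting for a passphrase cannot clear it.
	if printf '%s\n' "$output" | grep -qiE 'defend'; then
		echo "defend_lock"
	# English-word and raw hex patterns taken from the documentation table.
	elif printf '%s\n' "$output" | grep -qiE 'authorization|auth|bad|permission|0x98e|0x149'; then
		echo "auth_failure"
	else
		echo "other"
	fi
}

# Message quoted in the documentation for tpmtotp's `tpm takeown`.
classify_tpm_error "Error Defend lock running from TPM_TakeOwnership"
# Made-up strings, used only to exercise the other two branches.
classify_tpm_error "illustrative failure, rc=0x98e"
classify_tpm_error "no such device"
```

Because the word patterns are short and case-insensitive, unrelated messages containing `bad` or `auth` would also be classified as auth failures; the sketch inherits that breadth from the documented patterns rather than narrowing it.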
diff --git a/initrd/bin/oem-factory-reset.sh b/initrd/bin/oem-factory-reset.sh
index ece2f8f59..5dc315965 100755
--- a/initrd/bin/oem-factory-reset.sh
+++ b/initrd/bin/oem-factory-reset.sh
@@ -868,7 +868,7 @@ generate_checksums() {
 	if [ "$CONFIG_TPM" = "y" ]; then
 		if [ "$CONFIG_IGNORE_ROLLBACK" != "y" ]; then
 			tpmr.sh counter_create \
-				-pwdc "${TPM_PASS:-}" \
+				-pwdc '' \
 				-la -3135106223 |
 				tee /tmp/counter >/dev/null 2>&1 ||
 				whiptail_error_die "Unable to create TPM counter"
diff --git a/initrd/bin/tpm-reset.sh b/initrd/bin/tpm-reset.sh
index 047d49ef0..426012863 100755
--- a/initrd/bin/tpm-reset.sh
+++ b/initrd/bin/tpm-reset.sh
@@ -6,3 +6,12 @@ NOTE "This will erase all keys and secrets from the TPM"
 prompt_new_owner_password
 
 tpmr.sh reset "$tpm_owner_passphrase"
+
+# TODO: move the TPM reset + full reprovision flow (counter creation, /boot
+# signing, TOTP/HOTP generation, DUK reseal) from gui-init.sh's reset_tpm()
+# into a reusable function in functions.sh. Then tpm-reset.sh and the GUI
+# reset_tpm() can both call the same code, eliminating the inconsistency
+# between CLI and GUI reset paths.
+
+NOTE "TPM cleared. The TPM rollback counter was destroyed. /boot/kexec_rollback.txt still references the old counter."
+NOTE "Restore full functionality from the GUI: Options -> TPM/TOTP/HOTP Options -> Reset the TPM"
diff --git a/initrd/bin/tpmr.sh b/initrd/bin/tpmr.sh
index 46f4581d8..35ab13392 100755
--- a/initrd/bin/tpmr.sh
+++ b/initrd/bin/tpmr.sh
@@ -354,7 +354,7 @@ tpm2_counter_inc() {
 		rm -f "$tmp_err_file"
 		shred -n 10 -z -u /tmp/secret/tpm_owner_passphrase 2>/dev/null || true
 		DEBUG "tpm2_counter_inc attempt $attempt failed. Stderr: $tmp_err_content"
-		if ! echo "$tmp_err_content" | grep -qiE 'authorization|auth|bad|permission|0x98e|0x149'; then
+		if ! echo "$tmp_err_content" | grep -qiE 'authorization|auth|bad|permission|defend|0x98e|0x149'; then
 			DIE "Can't increment TPM counter for $index, access denied."
 		fi
 		WARN "Authentication failed, retrying..."
@@ -362,7 +362,7 @@ tpm2_counter_inc() {
 	DIE "Can't increment TPM counter for $index after 3 attempts, access denied."
 }
 
-# _tpm_auth_retry - Shared retry helper for TPM commands needing owner auth.
+# _tpm_auth_retry - Shared retry helper for TPM commands needing authorization.
 #
 # Handles both TPM1 (tpmtotp: errors to stdout, uses -pwdo/-pwdc flags)
 # and TPM2 (tpm2-tools: errors to stderr, uses -P parameter).
@@ -370,16 +370,26 @@ tpm2_counter_inc() {
 # Caching: prompt_tpm_owner_password reuses cached passphrase if available.
 # On auth failure the cache is shredded; next prompt will ask the user.
 #
+# Error stream selection:
+#   TPM1 (tpmtotp):     errors go to stdout via printf() — capture stdout+stderr
+#   TPM2 (tpm2-tools):  errors go to stderr via LOG_ERR() — capture stderr only
+#
+# Auth detection grep patterns:
+#   English words  — TPM1 (TPM_GetErrMsg returns "Authentication failed...")
+#                  — TPM2 (tpm2-tools LOG_ERR returns "TPM2_RC_AUTH_FAIL...")
+#   defend         — TPM1 "Defend lock running" (TPM_DEFEND_LOCK_RUNNING)
+#   0x98e, 0x149   — TPM2 raw hex codes (TPM2_RC_AUTH_FAIL, TPM2_RC_NV_AUTHORIZATION)
+#
 # Usage: _tpm_auth_retry