
docs(fp8): update FP8 Storage docs #9241

Open
Pfannkuchensack wants to merge 1 commit into invoke-ai:main from Pfannkuchensack:docs/fp8-storage-update-9231

Conversation

@Pfannkuchensack
Collaborator

Summary

Refresh FP8 Storage page after hook-based loader change.

PR #9231 routed every nn.Module — including diffusers ModelMixin — through InvokeAI's register_forward_pre_hook / register_forward_hook path, but the FP8 Storage docs still described the old enable_layerwise_casting implementation. This PR rewrites the relevant sections to match the merged behavior.

Also corrects two unrelated inaccuracies the rewrite surfaced:

  • FP8 Storage is not a no-op on pre-Ampere CUDA cards. The FP8 path gates only on device.type == "cuda", and float8_e4m3fn is a pure storage dtype that works on any CUDA device, so a 2080 Ti gets the same ~50% VRAM win as a 3090. The "no-op" tier now covers only MPS and CPU.
  • The UI does not grey out the toggle based on hardware. Removed the misleading "may grey the toggle out" sentence — the only UI-side hides are for Z-Image and ControlLoRA, both unrelated to GPU detection.
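
The gate and the ~50% figure from the first bullet can be sketched in a few lines of plain Python. The helper names and the parameter count are illustrative assumptions, not values taken from InvokeAI. The arithmetic covers weight storage only, comparing 2 bytes per parameter (fp16/bf16) with 1 byte (float8_e4m3fn).

```python
def fp8_storage_active(device_type: str) -> bool:
    """Hypothetical mirror of the gate described above: CUDA only."""
    return device_type == "cuda"


def weight_bytes(num_params: int, bytes_per_param: int) -> int:
    """Bytes needed to hold the weights at a given storage width."""
    return num_params * bytes_per_param


num_params = 12_000_000_000  # illustrative model size, an assumption

fp16_bytes = weight_bytes(num_params, 2)  # fp16/bf16: 2 bytes per parameter
fp8_bytes = weight_bytes(num_params, 1)   # float8_e4m3fn: 1 byte per parameter
saving = 1 - fp8_bytes / fp16_bytes

for device_type in ("cuda", "mps", "cpu"):
    print(device_type, fp8_storage_active(device_type))
print(f"weight storage saving: {saving:.0%}")
```

The measured saving on a real model will be somewhat lower than this upper bound whenever some layers are left in their original precision.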

Adds a "Reporting an FP8 issue" section to Troubleshooting so users include the info needed for triage on first contact: repro steps, exact model + variant (full-precision vs. GGUF/NF4/int8), LoRA stack, partner toggles (Low-VRAM, cpu_only), GPU + VRAM, OS, and the relevant log lines.
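
The GPU, VRAM, and OS items from that list can be gathered with a short script. The sketch below is an assumption about what a convenient helper might look like, not part of InvokeAI. `collect_fp8_report_env` is a hypothetical name, and the PyTorch import is optional so the script still runs where PyTorch is missing.

```python
import platform


def collect_fp8_report_env() -> dict:
    """Hypothetical helper: gather OS, Python, and GPU details for an issue report."""
    info = {"os": platform.platform(), "python": platform.python_version()}
    try:
        import torch
    except ImportError:
        info["torch"] = "not installed"
        return info

    info["torch"] = torch.__version__
    if torch.cuda.is_available():
        props = torch.cuda.get_device_properties(0)
        info["gpu"] = props.name
        info["vram_gib"] = round(props.total_memory / 2**30, 1)
    else:
        info["gpu"] = "no CUDA device visible"
    return info


env = collect_fp8_report_env()
for key, value in env.items():
    print(f"{key}: {value}")
```

The model variant, LoRA stack, partner toggles, and log lines still have to be added by hand.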

Related Issues / Discussions

Follow-up to #9231 — that PR's checklist had "Documentation added / updated" unchecked.

QA Instructions

Docs-only change. Build the Starlight docs locally (or preview on the docs deploy) and verify:

  • The "Hardware support tiers" section reads sensibly with four tiers (RTX 30, RTX 40/50/Hopper, older CUDA, MPS/CPU) instead of the previous three.
  • No remaining references to enable_layerwise_casting on the FP8 Storage page.
  • The new "Reporting an FP8 issue" subsection renders correctly with the bulleted checklist and the GitHub issues link.

Merge Plan

Straight merge. Docs only — no code, schema, or migration impact.

Checklist

  • The PR has a short but descriptive title, suitable for a changelog
  • Tests added / updated (if applicable) — docs-only
  • ❗Changes to a redux slice have a corresponding migration — n/a
  • Documentation added / updated (if applicable)
  • Updated What's New copy (if doing a release after this PR) — n/a

PR invoke-ai#9231 routed every nn.Module — including diffusers ModelMixin — through
InvokeAI's `register_forward_pre_hook` / `register_forward_hook` path, but
the FP8 Storage docs still described the old `enable_layerwise_casting`
implementation. Also corrects two unrelated inaccuracies the rewrite
surfaced: pre-Ampere CUDA cards are not a no-op (the FP8 path gates only
on `device.type == "cuda"`, and `float8_e4m3fn` is a pure storage dtype
that works on any CUDA device), and the UI does not grey out the toggle
based on hardware.

Tell users what to include when reporting an FP8 problem so triage isn't
blocked on follow-up questions: repro steps, exact model + variant,
LoRA stack, partner toggles (low-VRAM, cpu_only), GPU + VRAM, OS, and
the relevant log lines.
@github-actions github-actions bot added the docs (PRs that change docs) label May 27, 2026
@JPPhoto JPPhoto added the 6.13.5 Library Updates label May 27, 2026
@JPPhoto JPPhoto moved this to 6.13.5 LIBRARY UPDATES in Invoke - Community Roadmap May 27, 2026

Labels

6.13.5 Library Updates, docs (PRs that change docs)

Projects

Status: 6.13.5 LIBRARY UPDATES


2 participants