Commit ab4ebd4

sufubao and claude committed
docs: add multimodal OOM fix design spec
Design spec for eliminating the multimodal OOM class that surfaced with Qwen3.5-VL. Replaces PR #1253 in full: absorbs its Qwen stress helpers (minus the empty_cache call that released the measured peak), adds the min-max bug fix at visualserver/manager.py:87, tightens visual+audio concurrency semaphores from x8 to x1, ports _check_decode_infer from origin/qw35_stable, and re-shapes the LLM init into a two-pass probe-measure-rebuild-validate auto-profile that eliminates --mem_fraction as a tuning knob.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1 parent bbc9eba commit ab4ebd4

1 file changed

Lines changed: 759 additions & 0 deletions
