Summary
useVAD exposes stream, streamInsert, and streamStop, but all three go through runForward, which rejects calls while isGenerating === true. While a stream() call is in flight, isGenerating stays true, so every subsequent streamInsert(buffer) rejects with "model is currently generating". In other words, the streaming API cannot actually be used as a stream.
A rapid-tap repro of this behaviour was already discussed during PR #1160 review; the thread was resolved without a code change.
Repro
const vad = useVAD({ model });
vad.stream({ /* ... */ }); // starts inference, isGenerating = true
vad.streamInsert(chunk1); // rejects: "model is currently generating"
vad.streamInsert(chunk2); // rejects
Root cause
In useVAD.ts:
const stream = (input) => runForward((inst) => inst.stream(input));
const streamInsert = (waveform) => runForward((inst) => { inst.streamInsert(waveform); return Promise.resolve(); });
const streamStop = () => runForward((inst) => { inst.streamStop(); return Promise.resolve(); });
runForward is the same gate used by forward and intentionally serialises full inferences. streamInsert is a buffer push, not an inference — it must not share the gate.
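The rejection can be reproduced outside React with a minimal model of the gate. This is only a sketch: createGate, the Instance type, and the fake instance below are hypothetical stand-ins written for illustration, not the actual useModuleFactory code. It assumes the gate sets isGenerating before awaiting the callback and clears it afterwards, as described above.

```typescript
// Hypothetical stand-in for the module instance; only the two methods needed here.
type Instance = {
  stream(input: unknown): Promise<void>;
  streamInsert(waveform: Float32Array): void;
};

// Hypothetical model of the gate: one pending call at a time, everything else rejects.
function createGate(instance: Instance) {
  let isGenerating = false;

  async function runForward<T>(fn: (inst: Instance) => Promise<T>): Promise<T> {
    if (isGenerating) {
      throw new Error("model is currently generating");
    }
    isGenerating = true;
    try {
      return await fn(instance);
    } finally {
      isGenerating = false;
    }
  }

  return { runForward };
}

async function reproduce(): Promise<string> {
  let finishStream: () => void = () => {};
  const inserted: Float32Array[] = [];

  // Fake instance: stream() stays pending until finishStream() is called.
  const instance: Instance = {
    stream: () => new Promise<void>((resolve) => { finishStream = resolve; }),
    streamInsert: (waveform) => { inserted.push(waveform); },
  };
  const { runForward } = createGate(instance);

  // Start the stream; the gate is now closed.
  const streaming = runForward((inst) => inst.stream({}));

  // The buffer push goes through the same gate and is rejected before it runs.
  const outcome = await runForward((inst) => {
    inst.streamInsert(new Float32Array(160)); // chunk size is arbitrary
    return Promise.resolve();
  }).then(
    () => "accepted",
    (e: Error) => `rejected: ${e.message}`,
  );

  finishStream();
  await streaming;
  return outcome;
}
```

In this model the streamInsert callback never executes: the gate rejects before calling it, so the chunk is dropped rather than queued.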
Why this is more general than useVAD
useModuleFactory exposes a single primitive, runForward, which serialises calls via isGenerating. That's correct for one-shot inference, but a streaming-capable module needs two call paths:
- Gated (only one running at a time): forward, stream.
- Side-channel (allowed during a running stream): streamInsert (buffer push), streamStop (interrupt signal).
The other streaming hooks — useTextToSpeech and useSpeechToText — quietly worked around this by not using useModuleFactory at all and calling the module instance directly for the side-channel methods. useVAD is the first hook to migrate streaming methods onto useModuleFactory and so is the first to hit the trap. Any future streaming hook will hit it too.
Fix options
- Local fix in useVAD: bypass runForward for streamInsert/streamStop and call instance.streamInsert/streamStop directly (matching the existing TTS/STT shape). Smallest diff; doesn't help future hooks.
- Extend useModuleFactory: add a non-gating primitive (e.g. runSideChannel(fn) that only checks isReady, never isGenerating). Then streaming hooks express the two regimes explicitly, and TTS/STT can migrate onto useModuleFactory cleanly later.
Additionally, streamStop arguably should never be gated even in the local-fix path — a caller may want to stop because inference is stuck.
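One possible shape for the second option is sketched below. Everything in it is an assumption made for illustration: createFactory, the StreamingInstance interface, the fake instance, and the "model is not ready" message are hypothetical, and runSideChannel is the name proposed in this issue rather than an existing API. The sketch only shows the two regimes side by side: runForward keeps the isGenerating gate, while runSideChannel checks readiness and nothing else.

```typescript
// Hypothetical instance shape; stream() resolves with the number of chunks
// received once streamStop() is called.
interface StreamingInstance {
  stream(): Promise<number>;
  streamInsert(chunk: Float32Array): void;
  streamStop(): void;
}

function createFactory(instance: StreamingInstance, isReady: () => boolean) {
  let isGenerating = false;

  // Gated path: at most one forward/stream call in flight.
  async function runForward<T>(fn: (inst: StreamingInstance) => Promise<T>): Promise<T> {
    if (!isReady()) throw new Error("model is not ready");
    if (isGenerating) throw new Error("model is currently generating");
    isGenerating = true;
    try {
      return await fn(instance);
    } finally {
      isGenerating = false;
    }
  }

  // Side-channel path: checks readiness only, never isGenerating.
  function runSideChannel<T>(fn: (inst: StreamingInstance) => T): T {
    if (!isReady()) throw new Error("model is not ready");
    return fn(instance);
  }

  return { runForward, runSideChannel };
}

// Fake instance used only to exercise the two paths.
function createFakeInstance(): StreamingInstance {
  const chunks: Float32Array[] = [];
  let stop: (count: number) => void = () => {};
  return {
    stream: () => new Promise<number>((resolve) => { stop = resolve; }),
    streamInsert: (chunk) => { chunks.push(chunk); },
    streamStop: () => stop(chunks.length),
  };
}

async function demo(): Promise<number> {
  const { runForward, runSideChannel } = createFactory(createFakeInstance(), () => true);

  const running = runForward((inst) => inst.stream()); // gate closes here
  runSideChannel((inst) => inst.streamInsert(new Float32Array(160))); // accepted
  runSideChannel((inst) => inst.streamInsert(new Float32Array(160))); // accepted
  runSideChannel((inst) => inst.streamStop()); // ends the stream

  return running; // resolves with the number of chunks inserted
}
```

Under this split, the local fix in useVAD amounts to inlining runSideChannel: calling the instance methods directly after a readiness check. Keeping streamStop on the side channel also means a stuck stream can still be interrupted.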
Context
Follow-up to PR #1160 review: #1160 (comment)