From 1f77b45b86567f7d11dfd91b4865adc5f2e9a9f5 Mon Sep 17 00:00:00 2001
From: "Paul S. Schweigert" <paul@paulschweigert.com>
Date: Fri, 29 May 2026 13:36:06 -0400
Subject: [PATCH] update gpu recs for speech demo notebook

Signed-off-by: Paul S. Schweigert <paul@paulschweigert.com>
---
 tutorials/notebooks/granite_speech_demo.ipynb | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/tutorials/notebooks/granite_speech_demo.ipynb b/tutorials/notebooks/granite_speech_demo.ipynb
index ea41a20..2b5c04c 100644
--- a/tutorials/notebooks/granite_speech_demo.ipynb
+++ b/tutorials/notebooks/granite_speech_demo.ipynb
@@ -18,7 +18,7 @@
     "\n",
     "## Prerequisites\n",
     "\n",
-    "- **GPU runtime: A100 (Colab Pro) recommended.** L4 works. T4 will OOM — both Granite models won't fit.\n",
+    "- **GPU runtime: A100 (Colab Pro) required.** Smaller GPUs (T4, L4) don't have enough VRAM to hold both Granite models at once.\n",
     "- **HuggingFace read token.** Free; create one at https://huggingface.co/settings/tokens. Add it as a Colab Secret named `HF_TOKEN` (sidebar → 🔑 → New secret). Used for two things: downloading the Granite model weights, *and* minting per-session WebRTC TURN credentials so audio reaches your browser.\n",
     "- **Browser:** Chrome, Edge, or Firefox. Safari may behave oddly with WebRTC.\n",
     "\n",
@@ -30,7 +30,7 @@
     "## What to do\n",
     "\n",
     "1. Set the `HF_TOKEN` Colab Secret.\n",
-    "2. Switch the runtime to a GPU (Runtime → Change runtime type → A100/L4).\n",
+    "2. Switch the runtime to an A100 GPU (Runtime → Change runtime type → A100).\n",
     "3. **Runtime → Run all.**\n",
     "4. When the last cell prints a `*.trycloudflare.com` URL, open it, allow mic access, and start talking.\n",
     "\n",
@@ -225,7 +225,7 @@
     "View one with `!tail -100 logs/vllm-speech.log` (or open the file from the Colab file browser).\n",
     "\n",
     "**Common failures:**\n",
-    "- *T4 OOM:* switch the runtime to A100 or L4. Both Granite models won't fit on a T4.\n",
+    "- *GPU OOM:* switch the runtime to an A100. The two Granite models won't fit together on smaller GPUs (T4, L4).\n",
     "- *`HF_TOKEN` missing:* re-run Cell 3 after adding the secret. Without it, the backend falls back to STUN-only and audio likely won't connect through the cloudflared tunnel.\n",
     "- *Stuck \"waiting for vLLM\":* model weights are downloading. The cell waits up to 20 min — let it run.\n",
     "- *Re-running cells without cleaning up:* old processes still hold the ports. Run the kill-switch cell below, then re-run from the top.\n",