Skip to content

Update: Live streaming, working config.json layer, latency tuning#513

Open
nidhiii128 wants to merge 1 commit into
FOSSEE:eSim-Chat-Bot-Semester-Long-Internship_Autumn-2025from
nidhiii128:eSim-Chat-Bot-Semester-Long-Internship_Autumn-2025
Open

Update: Live streaming, working config.json layer, latency tuning#513
nidhiii128 wants to merge 1 commit into
FOSSEE:eSim-Chat-Bot-Semester-Long-Internship_Autumn-2025from
nidhiii128:eSim-Chat-Bot-Semester-Long-Internship_Autumn-2025

Conversation

@nidhiii128
Copy link
Copy Markdown

Update: Live streaming + working config.json layer + latency tuning

Streaming (text + vision):

  • OllamaWorker and OllamaVisionWorker now emit each token via a new
    chunk_signal.
  • Chatbot.py consumes it (_on_stream_chunk) and rewrites the bot bubble
    in place using an anchored cursor, so replies render token-by-token instead
    of appearing all at once. Generation is interruptible via the Stop button.

config.json (customizable system rules):

  • chatbot_thread.py now loads src/chatbot/config.json at startup and uses
    it for the system prompt, context window, sampling params, keep-alive, and
    history depth. Editing the file changes Copilot behavior without code edits;
    falls back to built-in defaults if the file is missing or invalid.

Latency tuning:

  • num_ctx 2048 → 1024, history window 10 → 6 turns, lower token-budget tiers
    (128/256/512), and keep_alive set to never-unload so repeat questions skip
    the model reload cost.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant