Update: Live streaming, working config.json layer, latency tuning by nidhiii128 · Pull Request #513 · FOSSEE/eSim

nidhiii128 · 2026-05-26T09:29:03Z

Update: Live streaming + working config.json layer + latency tuning

Streaming (text + vision):

OllamaWorker and OllamaVisionWorker now emit each token via a new
chunk_signal.
Chatbot.py consumes it (_on_stream_chunk) and rewrites the bot bubble
in place using an anchored cursor, so replies render token-by-token instead
of appearing all at once. Generation is interruptible via the Stop button.

config.json (customizable system rules):

chatbot_thread.py now loads src/chatbot/config.json at startup and uses
it for the system prompt, context window, sampling params, keep-alive, and
history depth. Editing the file changes Copilot behavior without code edits;
falls back to built-in defaults if the file is missing or invalid.

Latency tuning:

num_ctx 2048 → 1024, history window 10 → 6 turns, lower token-budget tiers
(128/256/512), and keep_alive set to never-unload so repeat questions skip
the model reload cost.

optimization

497f715