Handle large exec_command output and surface scan errors as TUI toasts #583

Open
mhspektr wants to merge 17 commits into usestrix:main from mhspektr:feature/issue-579

mhspektr commented Jun 22, 2026

Contributor

NB: this builds on #577 and should be merged after it.

  • Large output handling: exec_command results are now capped at STRIX_MAX_TOOL_OUTPUT_CHARS (default 65536 chars). JSON arrays are trimmed to the first 50 records (valid JSON preserved); plain text keeps the first 300 lines. Both truncation paths prepend a header with the total size and the record or line count, so the agent still knows how much output was produced. Set STRIX_MAX_TOOL_OUTPUT_CHARS=0 to disable.
  • MapReduce LLM compression: when a single chunk still overflows after truncation, a MapReduce pipeline compresses it via the LLM before it reaches the context window, preventing hard overflows.
  • Scan error toasts: scan thread failures now post a persistent Textual toast immediately rather than surfacing as a post-exit traceback. A timeout=0 bug that caused the toast to expire on mount was also fixed (Textual's default of ~5 s is now used).
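
The truncation rules in the first bullet can be illustrated with a minimal sketch. The function name `truncate_output` and the header wording are hypothetical, not the PR's actual code; only the limits (50 records, 300 lines, 65536 chars, 0 disables) come from the description above. The halving loop and SDK header preservation mentioned in the review are left out.

```python
import json

MAX_JSON_RECORDS = 50  # limit stated in the PR description
MAX_TEXT_LINES = 300   # limit stated in the PR description


def truncate_output(output: str, max_chars: int = 65_536) -> str:
    """Hypothetical helper: cap tool output while keeping JSON arrays valid."""
    if max_chars <= 0 or len(output) <= max_chars:
        return output  # a cap of 0 disables truncation; small outputs pass through

    try:
        data = json.loads(output)
    except ValueError:
        data = None  # not JSON, fall through to the plain-text path

    if isinstance(data, list):
        kept = data[:MAX_JSON_RECORDS]
        header = (
            f"[truncated: {len(output)} chars, "
            f"showing {len(kept)} of {len(data)} records]"
        )
        # The array after the header line is still parseable JSON.
        return header + "\n" + json.dumps(kept)

    lines = output.splitlines()
    kept_lines = lines[:MAX_TEXT_LINES]
    header = (
        f"[truncated: {len(output)} chars, "
        f"showing {len(kept_lines)} of {len(lines)} lines]"
    )
    # Secondary character cap so one very long line cannot bypass the line limit.
    return header + "\n" + "\n".join(kept_lines)[:max_chars]
```

In this sketch the header is a separate first line, so a consumer can split on the first newline and parse the remainder as JSON.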

Closes #579

mhspektr and others added 16 commits June 18, 2026 22:45
- Change RuntimeError to TypeError for type validation in report/writer.py
- Update pyupgrade to v3.21.2 for Python 3.14 compatibility
Mirror the layout introduced on feature/438-token_budget: pytest +
pytest-asyncio dev deps, asyncio_mode auto, a tests.* mypy override, and
pytest in the mypy pre-commit hook deps so the tests/ package type-checks.
…ix#492)

Large local targets were copied into the sandbox file-by-file via the SDK
LocalDir entry, which stalls on big repos and could leave /workspace empty.

- --mount <path> bind-mounts a host directory read-only at /workspace/<subdir>
  instead of copying it, bypassing the per-file stream.
- A size pre-flight (STRIX_MAX_LOCAL_COPY_MB, default 1024) fails fast with a
  clear message suggesting --mount when a non-mounted local target is too big.
An empty or whitespace-only --mount value resolves to the current working
directory and would silently bind-mount it into the sandbox. Reject it.
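
A rough sketch of the rejection described above, assuming an argparse-based CLI with `--mount` declared as `action="append"` (as the review notes). The function name `parse_mounts`, the program name, and the error text are hypothetical.

```python
import argparse
from pathlib import Path


def parse_mounts(argv: list[str]) -> list[Path]:
    """Hypothetical parser: collect --mount values, rejecting blank ones."""
    parser = argparse.ArgumentParser(prog="strix-sketch")
    parser.add_argument("--mount", action="append", default=[])
    args = parser.parse_args(argv)

    resolved = []
    for raw in args.mount:
        if not raw.strip():
            # Path("").resolve() is the current working directory, so a blank
            # value must be rejected before any path resolution happens.
            parser.error("--mount requires a non-empty path")
        resolved.append(Path(raw.strip()).resolve())
    return resolved
```

`parser.error` prints a usage message and exits with status 2, so a blank value never reaches the sandbox setup.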
If the same directory is passed via --target and --mount (or as duplicate
values), it previously produced two targets — copied AND bind-mounted, and
the copied one could trip the size pre-flight. Dedupe by resolved path,
preferring the bind mount.
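
The dedupe rule could look roughly like the following. The dict shape (`path`, `mounted`) and the function name `dedupe_targets` are assumptions made for illustration; the PR's own helper is `dedupe_local_targets`, whose signature is not shown here.

```python
from pathlib import Path


def dedupe_targets(targets: list[dict]) -> list[dict]:
    """Hypothetical sketch: one entry per resolved path, bind mounts win."""
    by_path: dict[Path, dict] = {}
    for target in targets:
        key = Path(target["path"]).resolve()
        existing = by_path.get(key)
        # Keep the first entry seen, unless a later one is a bind mount
        # and the stored one is a copy.
        if existing is None or (target["mounted"] and not existing["mounted"]):
            by_path[key] = target
    return list(by_path.values())
```

Keying on the resolved path means `/repo` and `/repo/` collapse into one target, so the copied duplicate can no longer trip the size pre-flight.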
Previously a value of 0 (or negative) made every local target count as
oversized, aborting all local scans. Now <= 0 disables the pre-flight.
os.walk silently swallowed directory-listing errors, so a permission-denied
subtree could make a large repo under-count and slip past the pre-flight.
Surface such omissions via an onerror warning.
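
A minimal sketch of a size walk with both fixes applied: `<= 0` disables the check, and directory-listing errors are reported through `os.walk`'s `onerror` callback. The function bodies, the `is_oversized` name, and the 1 MB = 1024 * 1024 bytes conversion are assumptions, not the PR's code.

```python
import os
import warnings


def directory_size_bytes(root: str) -> int:
    """Best-effort, stat-only size of a directory tree (hypothetical body)."""

    def _warn(err: OSError) -> None:
        # os.walk calls this instead of silently skipping unreadable directories.
        warnings.warn(f"could not list {err.filename}: {err}", stacklevel=2)

    total = 0
    for dirpath, _dirnames, filenames in os.walk(root, onerror=_warn):
        for name in filenames:
            try:
                total += os.lstat(os.path.join(dirpath, name)).st_size
            except OSError:
                continue  # file vanished or is unreadable; keep counting
    return total


def is_oversized(root: str, max_mb: int = 1024) -> bool:
    """Return True when root exceeds max_mb; values <= 0 disable the check."""
    if max_mb <= 0:
        return False
    return directory_size_bytes(root) > max_mb * 1024 * 1024
```

The warning makes an under-count visible, but the total is still only a lower bound when a subtree cannot be listed.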
Add CLI reference + example for --mount, document the size pre-flight env var,
note the read-only-is-not-a-hard-boundary caveat and that remote repos are not
size-checked, and clarify the backends docstring on when bind mounts apply.
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
ContextWindowExceededError carries status_code=400, which matched the
_INPUT_REJECTION_CODES guard and triggered image-strip retry logic.
Image stripping cannot reduce token count, so the agent wasted up to
3 retry cycles before parking as failed.

Add an explicit isinstance check before the status-code guard to detect
context-window overflow and park the agent as 'failed' immediately.
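
The ordering matters because the overflow error also carries status 400. A minimal sketch of the check order follows; the local `ContextWindowExceededError` class stands in for the real exception type, and `INPUT_REJECTION_CODES`, `classify_failure`, and the returned labels are hypothetical.

```python
class ContextWindowExceededError(Exception):
    """Stand-in for the real exception; it also reports status 400."""

    status_code = 400


# Hypothetical set; the actual _INPUT_REJECTION_CODES values are not shown in the PR text.
INPUT_REJECTION_CODES = {400}


def classify_failure(exc: Exception) -> str:
    """Decide how to react to a failed model call."""
    # The type check must come first: the overflow error would otherwise
    # match the status-code guard below and trigger pointless retries.
    if isinstance(exc, ContextWindowExceededError):
        return "failed"
    if getattr(exc, "status_code", None) in INPUT_REJECTION_CODES:
        return "retry_without_images"
    return "raise"
```

Swapping the two `if` statements reproduces the original bug, where an overflow is classified as an input rejection.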
Scan errors were stored in _scan_error and re-raised only after the
user closed the TUI, making the app appear stuck while producing a
confusing post-exit traceback.

Extract _notify_scan_error and call it from all three error branches
in _start_scan_thread so a persistent Textual toast is shown
immediately when the scan thread fails.
…missal

Textual's Notification expires when raised_at + timeout - time() <= 0.
With timeout=0 the toast expired immediately on mount, making the error
notification invisible to users. Drop the timeout argument to use
Textual's default display duration (~5 s).
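
The expiry condition quoted above can be checked in isolation. This is a standalone restatement of the formula, not Textual's own code; the function name `has_expired` is hypothetical.

```python
import time
from typing import Optional


def has_expired(raised_at: float, timeout: float, now: Optional[float] = None) -> bool:
    """Apply the quoted rule: expired when raised_at + timeout - now <= 0."""
    current = time.time() if now is None else now
    return raised_at + timeout - current <= 0


# With timeout=0 the toast is already expired at the moment it is raised.
print(has_expired(raised_at=100.0, timeout=0.0, now=100.0))
# With a 5 s timeout it is still live at that same moment.
print(has_expired(raised_at=100.0, timeout=5.0, now=100.0))
```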
exec_command output is now capped at STRIX_MAX_TOOL_OUTPUT_CHARS (default
65536). JSON arrays are parsed and trimmed to the first 50 records (valid
JSON preserved); plain text keeps the first 300 lines. Both include a
header showing the total size and record/line count so the agent retains
full awareness of what was produced.

A secondary character cap prevents single very long lines from bypassing
the line limit. Set STRIX_MAX_TOOL_OUTPUT_CHARS=0 to disable.
Replaces plain truncation with a parallel summarisation pipeline when
exec_command output exceeds STRIX_MAX_TOOL_OUTPUT_CHARS (default 65536).
Output is split at JSON record or line boundaries; each chunk is
summarised via litellm.acompletion; summaries are consolidated into a
single result that fits the context window.

Falls back to truncate_exec_result on any compression error.
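
The pipeline shape described above (split, summarise chunks in parallel, consolidate, fall back on error) might be sketched as follows. The summariser is injected as a plain async callable so the sketch does not depend on litellm; `split_lines`, `compress`, and the slice-based fallback are hypothetical stand-ins for the PR's `_split`, `compress_large_output`, and `truncate_exec_result`. Only line-boundary splitting is shown.

```python
import asyncio
from typing import Awaitable, Callable

Summarise = Callable[[str], Awaitable[str]]


def split_lines(output: str, chunk_chars: int) -> list[str]:
    """Split at line boundaries into chunks of roughly chunk_chars."""
    chunks: list[str] = []
    current: list[str] = []
    size = 0
    for line in output.splitlines(keepends=True):
        if current and size + len(line) > chunk_chars:
            chunks.append("".join(current))
            current, size = [], 0
        current.append(line)
        size += len(line)
    if current:
        chunks.append("".join(current))
    return chunks


async def compress(output: str, max_chars: int, summarise: Summarise) -> str:
    """Map: summarise chunks concurrently. Reduce: consolidate the summaries."""
    if max_chars <= 0 or len(output) <= max_chars:
        return output
    chunks = split_lines(output, max_chars)
    if len(chunks) == 1:
        return output  # single chunk: left unchanged for the caller to truncate
    try:
        summaries = await asyncio.gather(*(summarise(c) for c in chunks))
        return await summarise("\n".join(summaries))
    except Exception:
        # Stand-in for the truncation fallback on any compression error.
        return output[:max_chars]
```

Because a single oversized chunk is returned unchanged, the caller still needs a length check afterwards, which is the backstop added in the review-findings commit.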

Closes usestrix#579
Code review findings:

- factory._wrap_exec_command: add post-compression backstop that calls
  truncate_exec_result when compress_exec_result returns an oversized
  result (happens when _split produces 1 chunk, e.g. a single very
  large JSON record — compress_large_output correctly returns it
  unchanged, but the result was never guarded against threshold after).

- tests/agents/test_factory_helpers.py (new): unit tests for
  _resolve_model (run_config vs settings fallback, None/whitespace
  error cases, non-string Model object guard) and _extract_task_hint
  (valid cmd, missing cmd, non-string cmd, invalid JSON, non-dict JSON).

greptile-apps Bot commented Jun 22, 2026

Contributor

Greptile Summary

This PR adds three self-contained features on top of PR #577: structure-aware truncation and MapReduce LLM compression for oversized exec_command outputs, a ContextWindowExceededError fast-path in the agent run loop that bypasses the (unhelpful) image-strip retry, and cross-thread TUI toasts for scan-thread failures. It also adds a --mount CLI option that bind-mounts large local directories read-only instead of streaming them file-by-file.

  • Large output pipeline: outputs exceeding STRIX_MAX_TOOL_OUTPUT_CHARS are first compressed via a parallel MapReduce LLM fan-out; single-chunk outputs and compression failures fall back to structure-aware truncation (JSON arrays kept as valid JSON, plain text line-capped), with a final character-count backstop in _wrap_exec_command.
  • Context window bypass: ContextWindowExceededError (status 400) is now intercepted before the image-strip retry loop, parks the agent as "failed", and propagates correctly for root agents while returning None for child agents.
  • --mount bind-mount support: build_mount_targets_info / build_session_entries / StrixDockerSandboxClient are wired together to pass host directories to Docker at container-create time; a pre-flight size check rejects oversized --target directories and directs users to --mount.

Confidence Score: 5/5

Safe to merge; all three feature areas are well-isolated with dedicated tests, and the fallback chains preserve session correctness.

The truncation/compression pipeline has a solid backstop so oversized outputs cannot bypass the character cap. The ContextWindowExceededError handler is tested end-to-end including the non-retry assertion. The bind-mount wiring is straightforward. The two issues noted are a misleading log message in non-interactive mode and a deprecated asyncio.wait_for(coroutine) call pattern — neither affects correctness.

No files require special attention; strix/core/execution.py has a misleading log message and strix/core/mapreduce_output.py has the deprecated asyncio.wait_for(coroutine) call, but neither introduces wrong behavior.

Important Files Changed

| Filename | Overview |
| --- | --- |
| strix/core/large_output.py | New module: structure-aware truncation for oversized tool outputs. Halving loop, SDK header preservation, and text/JSON branching all look correct. Single-oversized-record edge case (noted in previous review) is handled by caller backstop. |
| strix/core/mapreduce_output.py | New MapReduce LLM compression module. Fan-out to parallel litellm calls is sound; single-chunk fast-path correctly delegates truncation to caller. asyncio.wait_for receives a bare coroutine instead of a Task (deprecated in Python 3.12+). Broad except Exception in _summarise is intentional for chunk-level resilience. |
| strix/agents/factory.py | Added MapReduce compression + truncation backstop inside _wrap_exec_command. Fallback chain (compress → backstop truncate) is correct; load_settings() called inside the hot tool-invocation path, which is fine if settings are cached by the config layer. |
| strix/core/execution.py | ContextWindowExceededError handler correctly bypasses image-strip retry. Log message "parking as failed" fires before the if not interactive: raise guard and is factually wrong in non-interactive mode. |
| strix/interface/tui/app.py | Scan error toast via call_from_thread with contextlib.suppress is the correct cross-thread notification pattern. Removed explicit timeout=0 (which caused immediate expiration) and uses Textual's default timeout instead. |
| strix/interface/utils.py | New helpers: build_mount_targets_info, dedupe_local_targets, find_oversized_local_targets, directory_size_bytes. Deduplication prefers bind-mounted entries; path resolution in build_mount_targets_info is consistent. Size walk is stat-only and best-effort as documented. |
| strix/runtime/session_manager.py | Refactored to build_session_entries which splits local sources into copied entries and bind-mount specs. Logic is correct; mounted paths are excluded from SDK manifest (preventing file-by-file copy). |
| strix/runtime/docker_client.py | Added strix_bind_mounts class attribute (now None not [] per prior review) and Docker mount injection in _create_container. Pyright suppression comments added for private SDK imports. |
| strix/runtime/backends.py | Passes bind_mounts through from session manager to Docker backend. Straightforward wiring change. |
| strix/config/settings.py | Added max_local_copy_mb and max_tool_output_chars settings with documented env-var aliases and defaults. Zero-disables convention is consistent across both fields. |
| strix/interface/main.py | Added --mount CLI argument with correct action="append", error messages updated, pre-flight size check wired in. dedupe_local_targets called after both --target and --mount lists are assembled. |

Reviews (2): Last reviewed commit: "Fix review comment"

Comment thread strix/runtime/docker_client.py Outdated
@mhspektr

Contributor Author

@greptile

@mhspektr

Contributor Author

I suggest merging #577, updating this branch, and rerunning greptile before reviewing.

Development

Successfully merging this pull request may close these issues.

[BUG] Running strix on a large repo crashed with a ContextWindowExceededError