implement zone log gathering in edera-debug-report #8
Merged
Signed-off-by: Steven Noonan <steven@edera.dev>
Adds a watchdog-based timeout mechanism (TERM-then-KILL via os.killpg) that runs each protected child in its own session, plus parent-side SIGINT/SIGTERM/SIGHUP forwarding so children can't be orphaned when the report tool is interrupted.

Why this and not the timeout(1) coreutil: the report runs in possibly-broken environments where tools like protect-ctl may spawn helpers, and we need to kill the whole group atomically without depending on an external binary.

Surfaces two entry points: run_with_timeout() for captured-output use, and timeout_s/kill_after_s on ZipArchiveWriter.add_stream_from_proc() for streaming into the ZIP. add_stream_from_proc now returns (rc, stderr_text, timed_out); the two existing callers are updated.

Signed-off-by: Steven Noonan <steven@edera.dev>
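The TERM-then-KILL behaviour described above can be sketched roughly as follows. This is an illustration only, not the code from this PR: the name `run_with_timeout_sketch`, its return tuple, and the helper `_signal_group` are hypothetical, and it relies on `Popen.communicate(timeout=...)` rather than a separate watchdog. It also omits the parent-side signal forwarding.

```python
import os
import signal
import subprocess


def _signal_group(pgid, sig):
    """Signal a whole process group, ignoring the case where it is already gone."""
    try:
        os.killpg(pgid, sig)
    except ProcessLookupError:
        pass


def run_with_timeout_sketch(argv, timeout_s=10.0, kill_after_s=1.0):
    """Run argv in its own session; on timeout send SIGTERM, then SIGKILL.

    Hypothetical helper. Returns (rc, stdout_text, stderr_text, timed_out).
    """
    proc = subprocess.Popen(
        argv,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        text=True,
        # The child becomes leader of a new session and process group, so its
        # pid doubles as the process group id and helpers it spawns are
        # signalled together with it.
        start_new_session=True,
    )
    timed_out = False
    try:
        out, err = proc.communicate(timeout=timeout_s)
    except subprocess.TimeoutExpired:
        timed_out = True
        _signal_group(proc.pid, signal.SIGTERM)
        try:
            # Grace period for a clean exit after SIGTERM.
            out, err = proc.communicate(timeout=kill_after_s)
        except subprocess.TimeoutExpired:
            _signal_group(proc.pid, signal.SIGKILL)
            out, err = proc.communicate()
    return proc.returncode, out, err, timed_out
```

A call such as `run_with_timeout_sketch(["protect-ctl", "zone", "list", "-o", "json"])` would then report `timed_out=True` if the command had to be terminated. Signalling the group instead of the single pid is what keeps spawned helpers from outliving the timeout.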
Mirrors the standalone gather-zone-logs.sh: enumerate zones via `protect-ctl zone list -o json`, then stream each zone's logs from `protect-ctl zone logs <id>` into the archive. Both calls are bounded by the in-process timeout helper (SIGTERM after 10s, then SIGKILL after a further 1s grace period), because a wedged protect-daemon is the failure mode this report is meant to capture in the first place.

Output lands under `<top>/protect/` so it's clearly distinct from general system info. The collector runs immediately before the journalctl block so the journal capture includes any activity these calls trigger, which avoids needing separate pre- and post-collection journal gathers.

Signed-off-by: Steven Noonan <steven@edera.dev>
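As a rough sketch of the enumerate-then-collect flow, the following uses an injectable command runner so it can be exercised without protect-ctl installed. Several things here are assumptions rather than facts from this PR: the function name `gather_zone_logs`, the archive member names, and in particular the JSON shape (an array of objects with an `id` key). It also buffers output in memory instead of streaming, and its default runner kills only the direct child on timeout, not the whole process group.

```python
import json
import subprocess
import zipfile


def _run(argv, timeout_s):
    """Default runner: returns (rc, stdout_text); failures become a marker string."""
    try:
        p = subprocess.run(argv, capture_output=True, text=True, timeout=timeout_s)
        return p.returncode, p.stdout
    except (subprocess.TimeoutExpired, FileNotFoundError) as exc:
        return -1, f"[failed: {exc}]\n"


def gather_zone_logs(zf, top, run=None, timeout_s=10.0):
    """Write the zone list and per-zone logs under <top>/protect/ in an open ZipFile.

    Hypothetical helper; returns the list of zone ids that were collected.
    """
    run = run or _run
    rc, out = run(["protect-ctl", "zone", "list", "-o", "json"], timeout_s)
    # Keep the raw listing even if it cannot be parsed; member name is made up.
    zf.writestr(f"{top}/protect/zone-list.json", out)
    if rc != 0:
        return []
    try:
        zones = json.loads(out)
    except json.JSONDecodeError:
        return []
    if not isinstance(zones, list):
        return []
    # ASSUMPTION: each entry is an object carrying the zone id under "id".
    ids = [z["id"] for z in zones if isinstance(z, dict) and "id" in z]
    for zone_id in ids:
        _, logs = run(["protect-ctl", "zone", "logs", zone_id], timeout_s)
        zf.writestr(f"{top}/protect/zone-logs/{zone_id}.log", logs)
    return ids
```

Writing the raw listing before parsing means the archive still contains something useful when the daemon returns malformed or partial output.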
Wires the in-process timeout helper into the commands most likely to hang in the failure modes this report is designed to capture.

Signed-off-by: Steven Noonan <steven@edera.dev>
When EDR_DEBUG=1 is set in the environment, print a one-line "[edr] running: <cmd>" message to stderr at the start of every external command, plus "[edr] timeout (...): SIG... -> <cmd>" when the watchdog fires. If the tool stalls, the last line printed identifies the offending command.

Hooked into both add_stream_from_proc and run_with_timeout, so callers that go through run_and_write or run_json_then_fallback_text, or that invoke either helper directly, all get coverage.

Also prints "Creating: <path>" to stdout up front so the output filename is visible before the progress noise (the existing "Wrote: <path>" line is unchanged).

Signed-off-by: Steven Noonan <steven@edera.dev>
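A minimal sketch of the "running" trace line might look like the following. The function names `debug_enabled` and `trace_running`, and the `stream`/`environ` parameters (added so the behaviour can be tested), are hypothetical. How the actual tool renders `<cmd>` is not shown in this PR text, so the use of `shlex.join` is an assumption. Only the "running" line is covered here.

```python
import os
import shlex
import sys


def debug_enabled(environ=None):
    """True when EDR_DEBUG is exactly "1" in the given (or process) environment."""
    env = os.environ if environ is None else environ
    return env.get("EDR_DEBUG") == "1"


def trace_running(argv, stream=None, environ=None):
    """Emit the one-line "[edr] running: <cmd>" marker before a command starts.

    ASSUMPTION: <cmd> is rendered as a shell-quoted string via shlex.join.
    """
    if not debug_enabled(environ):
        return
    out = sys.stderr if stream is None else stream
    # flush=True so the line is visible even if the command then hangs.
    print(f"[edr] running: {shlex.join(argv)}", file=out, flush=True)
```

Flushing immediately matters for the stated goal: if the next command stalls, the marker for it has already reached the terminal.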
On my system, kubelet was very spammy with log messages, so capturing that log was taking a very long time.

Signed-off-by: Steven Noonan <steven@edera.dev>
9ea704c to 23ad2c4
bleggett approved these changes May 6, 2026
This does a few things:
EDR_DEBUG=1) to help understand slow/broken report generation