Conversation
**Note: Reviews paused.** It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior in the settings. Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 Walkthrough

Introduces a new dataclass-based event system that maps `logging.LogRecord` objects to typed `LogEventData` subclasses via `from_record()`/`from_extra()`, adds a comprehensive events module, updates README examples to use dataclass instances, and replaces the previous Rich-plugin test with an event-focused test suite.

Changes
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Handler as LogHandler
    participant Record as LogRecord
    participant Events as EventsModule
    Handler->>Record: receive LogRecord (level, msg, event, extra)
    Handler->>Events: lookup cls = LOG_EVENT_CLASSES[record.event]
    Events->>Events: cls.from_record(record) -> from_extra(record.__dict__)
    Events-->>Handler: return typed LogEventData instance
    Handler->>Handler: inspect/use dataclass fields
```
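The dispatch pattern in the diagram can be sketched as follows. The registry and method names (`LOG_EVENT_CLASSES`, `from_record`) follow the PR's naming, but the dataclass body is an illustrative stub, not the actual implementation:

```python
# Minimal sketch of the handler->events dispatch described above.
# JobFinishedStub is a made-up stand-in for the real event dataclasses.
import logging
from dataclasses import dataclass, fields


@dataclass
class JobFinishedStub:
    job_id: int = 0

    @classmethod
    def from_record(cls, record: logging.LogRecord) -> "JobFinishedStub":
        # Pull only declared dataclass fields out of the record's __dict__.
        kw = {f.name: record.__dict__[f.name]
              for f in fields(cls) if f.name in record.__dict__}
        return cls(**kw)


# event name -> dataclass, mirroring the LOG_EVENT_CLASSES lookup.
LOG_EVENT_CLASSES = {"job_finished": JobFinishedStub}

record = logging.makeLogRecord({"event": "job_finished", "job_id": 42})
data = LOG_EVENT_CLASSES[record.event].from_record(record)
print(data)  # → JobFinishedStub(job_id=42)
```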
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs
🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.
Actionable comments posted: 3
🧹 Nitpick comments (2)
src/snakemake_interface_logger_plugins/events.py (2)
28-47: Review the `__post_init__` validation logic for edge cases.

The WorkflowStarted class has good type conversion handling, but there are some considerations:

- The `__post_init__` method modifies `self.snakefile` in place, which could be unexpected behavior in an immutable dataclass
- The `None` check should come before the string conversion attempt
Consider this improvement:
```diff
 def __post_init__(self) -> None:
-    if not isinstance(self.snakefile, str):
+    if self.snakefile is not None and not isinstance(self.snakefile, str):
         try:
             # Try to convert to string - this should work for PosixPath and other path-like objects
             self.snakefile = str(self.snakefile)
         except (TypeError, ValueError) as e:
             raise ValueError(f"Could not convert snakefile to string: {e}")
```
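As a runnable illustration of the suggested ordering (the `None` check before the conversion attempt) — using a stub class, not the real WorkflowStarted:

```python
# Illustrative stub only: demonstrates the None-check-before-conversion
# ordering suggested in the review comment above.
from dataclasses import dataclass
from pathlib import Path
from typing import Optional, Union


@dataclass
class WorkflowStartedStub:
    snakefile: Optional[Union[str, Path]] = None

    def __post_init__(self) -> None:
        # Only attempt str() conversion when a value is actually present.
        if self.snakefile is not None and not isinstance(self.snakefile, str):
            self.snakefile = str(self.snakefile)


print(WorkflowStartedStub(Path("Snakefile")).snakefile)  # → Snakefile
print(WorkflowStartedStub().snakefile)  # → None
```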
170-186: Review the mutually exclusive field logic.

The ResourcesInfo class uses conditional logic to populate only one field at a time, but this design might not be intuitive. The class could potentially have multiple resource-related fields populated simultaneously.
Consider allowing multiple fields to be populated or document the mutually exclusive behavior:
```diff
 @classmethod
 def from_record(cls, record: LogRecord) -> "ResourcesInfo":
-    if hasattr(record, "nodes"):
-        return cls(nodes=record.nodes)  # type: ignore
-    elif hasattr(record, "cores"):
-        return cls(cores=record.cores)  # type: ignore
-    elif hasattr(record, "provided_resources"):
-        return cls(provided_resources=record.provided_resources)  # type: ignore
-    else:
-        return cls()
+    return cls(
+        nodes=getattr(record, "nodes", None),
+        cores=getattr(record, "cores", None),
+        provided_resources=getattr(record, "provided_resources", None),
+    )
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
- README.md (1 hunks)
- src/snakemake_interface_logger_plugins/events.py (1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py
⚙️ CodeRabbit Configuration File
**/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of the `self` argument of methods.
Do not suggest type annotation of the `cls` argument of classmethods.
Do not suggest return type annotation if a function or method does not contain a `return` statement.
Files:
src/snakemake_interface_logger_plugins/events.py
🔇 Additional comments (12)
src/snakemake_interface_logger_plugins/events.py (11)
1-5: LGTM! Clean imports and type annotations.

The imports are well-organized and include all necessary types for the dataclasses.

7-26: LGTM! Error dataclass implementation is solid.

The Error dataclass properly handles optional fields and uses safe attribute access with `getattr()` and default values.

49-90: LGTM! JobInfo class handles complex resource extraction well.

The resource filtering logic correctly excludes internal fields (`_cores`, `_nodes`) and safely handles cases where the resources attribute might not exist or have the expected structure.

92-106: LGTM! Robust job ID normalization logic.

The JobStarted class properly handles different input types (None, int, list) and normalizes them to a consistent `List[int]` format.
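The normalization praised above might look like the following sketch (the helper name is hypothetical, not the actual JobStarted code):

```python
# Hypothetical helper sketching the None/int/list normalization the
# review comment describes; not the actual JobStarted implementation.
from typing import List, Union


def normalize_job_ids(value: Union[None, int, List[int]]) -> List[int]:
    if value is None:
        return []          # missing -> empty list
    if isinstance(value, int):
        return [value]     # single id -> one-element list
    return list(value)     # already a sequence -> copy as list


print(normalize_job_ids(None), normalize_job_ids(7), normalize_job_ids([1, 2]))
# → [] [7] [1, 2]
```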
132-141: LGTM! Simple and consistent implementation.

The JobError class follows the same pattern as other simple dataclasses with appropriate default values.

143-153: LGTM! Proper handling of group information.

The GroupInfo class correctly uses `default_factory` for mutable default values and provides safe defaults.

155-168: LGTM! Consistent with GroupInfo implementation.

The GroupError class follows good practices with `default_factory` for mutable fields and safe attribute access.

188-203: LGTM! Consistent implementation pattern.

The DebugDag class follows the established pattern with safe attribute access and appropriate optional types.

205-213: LGTM! Simple and effective implementation.

The Progress class correctly handles required integer fields with sensible defaults.

215-222: LGTM! Appropriate handling of complex data.

The RuleGraph class uses `Dict[str, Any]` for the flexible rulegraph structure and provides a safe default.

224-239: LGTM! Smart statistics parsing logic.

The RunInfo class correctly separates the "total" field from per-rule counts and handles missing data gracefully.
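The "total" separation described for RunInfo amounts to something like this sketch (the input dict is made up for illustration; field names follow the table below):

```python
# Sketch of splitting a stats mapping into a total and per-rule counts,
# as the RunInfo comment describes; the stats values are invented.
stats = {"total": 5, "align": 3, "sort": 2}
total_job_count = stats.get("total", 0)
per_rule_job_counts = {rule: n for rule, n in stats.items() if rule != "total"}
print(total_job_count, per_rule_job_counts)  # → 5 {'align': 3, 'sort': 2}
```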
README.md (1)
234-277: LGTM! Excellent example demonstrating proper usage.

The example function `parse_event()` demonstrates best practices:

- Proper event type checking with `hasattr()`
- Complete mapping of all LogEvent types to dataclasses
- Type-safe usage with `isinstance()` checks
- Clear documentation of the pattern

This is exactly the kind of example that will help plugin developers use the new dataclasses correctly.
> The `LogEvent` enum defines particularly important Snakemake events such as workflow starting, job submission, job failure, etc.
> For each event, a corresponding dataclass is defined in `snakemake_interface_logger_plugins.events`.
> These dataclasses provide a typed interface for accessing event-specific fields from a `LogRecord`.
>
> **To extract event data from a `LogRecord`, use the appropriate dataclass's `from_record()` method.**
>
> ### Event Types, Dataclasses, and Typical Fields
>
> | LogEvent | Dataclass | Typical Fields (see class for details) |
> | --- | --- | --- |
> | `LogEvent.ERROR` | `Error` | exception, location, rule, traceback, file, line |
> | `LogEvent.WORKFLOW_STARTED` | `WorkflowStarted` | workflow_id, snakefile |
> | `LogEvent.JOB_INFO` | `JobInfo` | jobid, rule_name, threads, input, output, log, benchmark, rule_msg, wildcards, reason, shellcmd, priority, resources |
> | `LogEvent.JOB_STARTED` | `JobStarted` | job_ids |
> | `LogEvent.JOB_FINISHED` | `JobFinished` | job_id |
> | `LogEvent.SHELLCMD` | `ShellCmd` | jobid, shellcmd, rule_name |
> | `LogEvent.JOB_ERROR` | `JobError` | jobid |
> | `LogEvent.GROUP_INFO` | `GroupInfo` | group_id, jobs |
> | `LogEvent.GROUP_ERROR` | `GroupError` | groupid, aux_logs, job_error_info |
> | `LogEvent.RESOURCES_INFO` | `ResourcesInfo` | nodes, cores, provided_resources |
> | `LogEvent.DEBUG_DAG` | `DebugDag` | status, job, file, exception |
> | `LogEvent.PROGRESS` | `Progress` | done, total |
> | `LogEvent.RULEGRAPH` | `RuleGraph` | rulegraph |
> | `LogEvent.RUN_INFO` | `RunInfo` | per_rule_job_counts, total_job_count |
>
> **Note:** These field lists are guidelines only and may change between versions.
> Always use defensive programming practices like `getattr()` with defaults or `hasattr()` checks when accessing fields.
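The defensive-access advice in the note above, sketched against a synthetic record (the event name and field values are made up for illustration):

```python
# Demonstrates getattr-with-default and hasattr probing on a LogRecord,
# as the README note recommends; the payload here is invented.
import logging

record = logging.makeLogRecord({"event": "job_error", "jobid": 3})
jobid = getattr(record, "jobid", 0)    # falls back to 0 if the field is absent
rule = getattr(record, "rule", None)   # absent field -> None, no AttributeError
print(jobid, rule, hasattr(record, "event"))  # → 3 None True
```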
💡 Verification agent
🧩 Analysis chain
Verify event mapping table accuracy against actual dataclasses.
The documentation provides a comprehensive mapping table, but we should verify that all mappings are accurate and that field lists match the actual dataclass definitions.
🏁 Script executed:
```bash
#!/bin/bash
# Verify that all LogEvent enum values have corresponding dataclasses
echo "=== Checking LogEvent enum values ==="
rg -A 20 "class LogEvent" src/
echo "=== Checking dataclass names in events.py ==="
rg "^class [A-Z]" src/snakemake_interface_logger_plugins/events.py
echo "=== Verifying field names in dataclasses match documentation ==="
# Check specific examples mentioned in the table
rg -A 10 "class Error:" src/snakemake_interface_logger_plugins/events.py
rg -A 10 "class JobInfo:" src/snakemake_interface_logger_plugins/events.py
rg -A 10 "class ResourcesInfo:" src/snakemake_interface_logger_plugins/events.py
```

Length of output: 3084
🏁 Script executed:
```bash
#!/bin/bash
echo "=== WorkflowStarted fields ==="
rg -A10 "class WorkflowStarted" src/snakemake_interface_logger_plugins/events.py
echo "=== GroupInfo fields ==="
rg -A10 "class GroupInfo" src/snakemake_interface_logger_plugins/events.py
echo "=== GroupError fields ==="
rg -A10 "class GroupError" src/snakemake_interface_logger_plugins/events.py
echo "=== RunInfo fields ==="
rg -A10 "class RunInfo" src/snakemake_interface_logger_plugins/events.py
```

Length of output: 2107
```bash
#!/bin/bash
for cls in Error WorkflowStarted JobInfo JobStarted JobFinished ShellCmd JobError GroupInfo GroupError ResourcesInfo DebugDag Progress RuleGraph RunInfo; do
    echo "=== $cls fields ==="
    rg -A20 "class $cls" src/snakemake_interface_logger_plugins/events.py || echo "No class $cls found"
    echo
done
```

**Update README event table for JobInfo fields**
All LogEvent enum values map correctly to dataclasses in events.py, and the majority of field lists match. However, the JobInfo entry in the table includes three attributes that the JobInfo dataclass does not define:
- The class defines: `jobid`, `rule_name`, `threads`, `input`, `output`, `log`, `benchmark`, `rule_msg`, `wildcards`, `reason`
- The table lists (incorrectly): `shellcmd`, `priority`, `resources`
Please remove or replace shellcmd, priority, and resources from the LogEvent.JOB_INFO row and ensure the table reflects the actual JobInfo fields (e.g., rule_msg, wildcards, reason).
🤖 Prompt for AI Agents
In README.md between lines 204 and 231, the table listing typical fields for
LogEvent.JOB_INFO incorrectly includes shellcmd, priority, and resources, which
are not defined in the JobInfo dataclass. Remove these three fields from the
JOB_INFO row and update the list to accurately reflect the actual JobInfo
dataclass fields such as rule_msg, wildcards, and reason to ensure the
documentation matches the code.
```python
@dataclass
class JobFinished:
    job_id: int

    @classmethod
    def from_record(cls, record: LogRecord) -> "JobFinished":
        return cls(job_id=getattr(record, "job_id"))
```
Missing default value for required field.
The JobFinished class accesses job_id without a default value, which could raise an AttributeError if the field is missing from the LogRecord.
Apply this fix to provide a safe default:
```diff
 def from_record(cls, record: LogRecord) -> "JobFinished":
-    return cls(job_id=getattr(record, "job_id"))
+    return cls(job_id=getattr(record, "job_id", 0))
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```python
@dataclass
class JobFinished:
    job_id: int

    @classmethod
    def from_record(cls, record: LogRecord) -> "JobFinished":
        return cls(job_id=getattr(record, "job_id", 0))
```
🤖 Prompt for AI Agents
In src/snakemake_interface_logger_plugins/events.py around lines 108 to 115, the
JobFinished dataclass requires a job_id field but does not provide a default
value, which can cause an AttributeError if job_id is missing in the LogRecord.
Modify the from_record method to safely access job_id using getattr with a
default value (e.g., None or a sentinel value) to avoid exceptions when the
attribute is absent.
johanneskoester left a comment

I like this a lot!! Very nice design.
But it will be a breaking change, right? Then make sure to add an exclamation mark behind the feat in the title, i.e. `feat!`.
This is not breaking because it's separate from the LogEvent enum.
Any plans to merge this soon?
Yes, have been side tracked. Hope to look at this tomorrow or Friday. I
wanted to see if there were any events that should be added/modified before
merged. At least I want to see if there is worth in a DAG event to solve
cademirch/snkmt#6
…On Wed, Oct 15, 2025 at 21:45, Jared Lumpe (jlumpe) wrote:
> Any plans to merge this soon?
No rush, I'm just excited to see this addition.

I've also made some of my own additions in #45 for reference.
I'm also somewhat confused about the … I feel like the error-related events should only include the Snakemake-specific information. Here's a list of what I think that could be:

This could all be implemented in one or multiple event types.
- Added `LogEventData` base class.
- Added explicit mapping between `LogEvent` enum values and `LogEventData` subclasses (`LogEventData.event` class attribute, and `LOG_EVENT_CLASSES` global var).
- Added default implementation of `from_record()` class method that just inspects the dataclass fields:
  - Allows attributes to be missing from the record if the corresponding field has a default value, raises an exception otherwise.
  - Most subclasses just use the default implementation.
  - This changes the previous behavior in a few places, where a default value was applied in `from_record()` that was different than the dataclass field default.

Still to be done:

- Add `data_from_record()` function which dispatches to proper subclass based on `record.event`.
  - Should the `from_record()` class method be made private in this case?
- Add `to_extra()` method, which converts to the dict to be passed as the `extra` argument to a logging function.
- Tests for round trip of data to/from `LogRecord` instance.
Actionable comments posted: 2
🧹 Nitpick comments (1)
tests/tests.py (1)
178-216: Add a regression case for raw field-name and legacy-key payloads.

This only round-trips dictionaries produced by `extra()`, so it never exercises the two parser inputs this PR needs to support: canonical dataclass fields (`job_id`, `job_ids`, `group_id`) and legacy Snakemake keys (`jobid`, `groupid`, `cmd`). A focused test for both shapes would catch the compatibility bugs here.

Example test shape
```diff
+    @pytest.mark.parametrize(
+        ("event", "attrs", "expected_type"),
+        [
+            (
+                LogEvent.JOB_INFO,
+                {"job_id": 123, "rule_name": "example_rule", "threads": 4},
+                logevents.JobInfoEvent,
+            ),
+            (
+                LogEvent.JOB_FINISHED,
+                {"jobid": 123},
+                logevents.JobFinishedEvent,
+            ),
+            (
+                LogEvent.GROUP_INFO,
+                {"groupid": "group1"},
+                logevents.GroupInfoEvent,
+            ),
+            (
+                LogEvent.SHELLCMD,
+                {"jobid": 123, "cmd": "echo hi"},
+                logevents.ShellCmdEvent,
+            ),
+        ],
+    )
+    def test_from_extra_accepts_canonical_and_legacy_keys(
+        self, event, attrs, expected_type
+    ):
+        data = logevents.LogEventData.from_extra({"event": event, **attrs})
+        assert isinstance(data, expected_type)
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/tests.py` around lines 178 - 216, Extend the test_conversion regression to also cover raw dataclass field names and legacy Snakemake keys by creating extra payload shapes that use canonical fields (job_id, job_ids, group_id, cmd where applicable) and legacy keys (jobid, groupid, cmd) and asserting LogEventData.from_extra and from_record correctly parse and round-trip them; update the loop over self.EVENTS in test_conversion to, for each data, build two additional extras (one using the dataclass field names and one using legacy keys) derived from data.extra() and then assert that logevents.LogEventData.from_extra(...) and logevents.LogEventData.from_record(...) return the original or equal LogEventData instances for both shapes.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/snakemake_interface_logger_plugins/events.py`:
- Around line 239-243: JobFinishedEvent.job_id and GroupInfoEvent.group_id must
retain legacy key names for compatibility: update their serialization/parsing to
accept and emit both the canonical names (job_id, group_id) and the legacy short
names ("jobid", "groupid"); specifically modify the (de)serialization mapping
used by the LogEventData subclass handling (or the to_dict/from_dict helpers) so
that JobFinishedEvent and GroupInfoEvent recognize incoming "jobid"/"groupid"
when parsing and also include the legacy keys when serializing alongside the
canonical keys.
- Around line 42-49: In `_from_extra_default` (used by `from_extra`), the loop only
checks the alias key in extra and rejects inputs that use the dataclass
canonical field names; change the lookup to accept the canonical name first
(fld.name) and fall back to the legacy alias (fld.metadata.get("alias")), so for
each fld: if fld.name in extra use that value, else if alias in extra use the
alias value, otherwise raise the existing ValueError when no default exists.
Update references to fld.name, fld.metadata.get("alias", ...),
_from_extra_default and from_extra accordingly.
---
Nitpick comments:
In `@tests/tests.py`:
- Around line 178-216: Extend the test_conversion regression to also cover raw
dataclass field names and legacy Snakemake keys by creating extra payload shapes
that use canonical fields (job_id, job_ids, group_id, cmd where applicable) and
legacy keys (jobid, groupid, cmd) and asserting LogEventData.from_extra and
from_record correctly parse and round-trip them; update the loop over
self.EVENTS in test_conversion to, for each data, build two additional extras
(one using the dataclass field names and one using legacy keys) derived from
data.extra() and then assert that logevents.LogEventData.from_extra(...) and
logevents.LogEventData.from_record(...) return the original or equal
LogEventData instances for both shapes.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 626739fa-680a-4d0b-bfa9-ff42efd30e1b
📒 Files selected for processing (2)
- src/snakemake_interface_logger_plugins/events.py
- tests/tests.py
Actionable comments posted: 2
♻️ Duplicate comments (1)
README.md (1)
220-235: ⚠️ Potential issue | 🟡 Minor

Update the table to match the shipped event classes and fields.

The rest of this PR uses `ErrorEvent`/`WorkflowStartedEvent`/`JobInfoEvent`/... with normalized fields like `job_id` and `group_id` (tests/tests.py lines 82-143; src/snakemake_interface_logger_plugins/events.py:460-481), but the table still documents `Error`, `JobInfo`, `jobid`, `groupid`, and extra `JOB_INFO` fields like `shellcmd`, `priority`, and `resources`. Readers won't be able to line this up with the actual constructors/attributes.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@README.md` around lines 220 - 235, The README table is out of sync: replace the old class/event names and field names with the shipped ones (e.g. ErrorEvent, WorkflowStartedEvent, JobInfoEvent, JobStartedEvent, JobFinishedEvent, GroupInfoEvent/GroupErrorEvent, etc.) and update field names to the normalized attributes used in the codebase (e.g. job_id, group_id, per_rule_job_counts, shell_cmd if used, aux_logs, job_error_info, etc.) so the table matches the constructors/attributes in src/snakemake_interface_logger_plugins/events.py and the tests (tests/tests.py lines around 82-143); ensure each LogEvent enum entry maps to the corresponding Event class name and its actual fields from events.py.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@README.md`:
- Around line 243-284: The example re-implements event dispatching instead of
using the shared parser; replace the custom mapping in parse_event with a call
to events.LogEventData.from_record(record) (i.e., have parse_event return
events.LogEventData.from_record(record) or None) and update the emit method to
check the returned LogEventData subclasses (e.g., compare isinstance(event_data,
events.JobErrorEvent or the exact classes exposed by events.LogEventData) and
use their fields), removing the manual event_map and dataclass_type logic so the
example uses the central parsing logic in events.LogEventData.from_record.
In `@tests/tests.py`:
- Around line 1-18: The test setup refers to RichLogHandler in
TestRegistry.reset_registry() but tests/tests.py no longer imports or defines
RichLogHandler, causing a NameError; restore an import or definition for
RichLogHandler so reset_registry() can instantiate it—specifically add an import
for RichLogHandler (or reintroduce a minimal definition) at the top of
tests/tests.py so the symbol is available to TestRegistry.reset_registry() and
any other tests that instantiate RichLogHandler.
---
Duplicate comments:
In `@README.md`:
- Around line 220-235: The README table is out of sync: replace the old
class/event names and field names with the shipped ones (e.g. ErrorEvent,
WorkflowStartedEvent, JobInfoEvent, JobStartedEvent, JobFinishedEvent,
GroupInfoEvent/GroupErrorEvent, etc.) and update field names to the normalized
attributes used in the codebase (e.g. job_id, group_id, per_rule_job_counts,
shell_cmd if used, aux_logs, job_error_info, etc.) so the table matches the
constructors/attributes in src/snakemake_interface_logger_plugins/events.py and
the tests (tests/tests.py lines around 82-143); ensure each LogEvent enum entry
maps to the corresponding Event class name and its actual fields from events.py.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: beaca467-ce01-4a92-b95a-4a43e787f9b3
📒 Files selected for processing (2)
- README.md
- tests/tests.py
README.md (Outdated)
```diff
 from snakemake_interface_logger_plugins.common import LogEvent
+from snakemake_interface_logger_plugins import events
+
+
+def parse_event(record):
+    if not hasattr(record, "event"):
+        return None
+
+    event = record.event
+
+    # Map LogEvent to the corresponding dataclass
+    event_map = {
+        LogEvent.ERROR: events.Error,
+        LogEvent.WORKFLOW_STARTED: events.WorkflowStarted,
+        LogEvent.JOB_INFO: events.JobInfo,
+        LogEvent.JOB_STARTED: events.JobStarted,
+        LogEvent.JOB_FINISHED: events.JobFinished,
+        LogEvent.SHELLCMD: events.ShellCmd,
+        LogEvent.JOB_ERROR: events.JobError,
+        LogEvent.GROUP_INFO: events.GroupInfo,
+        LogEvent.GROUP_ERROR: events.GroupError,
+        LogEvent.RESOURCES_INFO: events.ResourcesInfo,
+        LogEvent.DEBUG_DAG: events.DebugDag,
+        LogEvent.PROGRESS: events.Progress,
+        LogEvent.RULEGRAPH: events.RuleGraph,
+        LogEvent.RUN_INFO: events.RunInfo,
+    }
+
+    dataclass_type = event_map.get(event)
+    if dataclass_type is not None:
+        return dataclass_type.from_record(record)
+    else:
+        return None
+
+
 # Usage in a log handler:
 def emit(self, record):
-    if hasattr(record, 'event'):
-        if record.event == LogEvent.JOB_ERROR:
-            # Access job error fields
-            jobid = getattr(record, 'jobid', 0)
-            # Handle job errors
-            pass
-        elif record.event == LogEvent.JOB_FINISHED:
-            # Access job completion fields
-            job_id = getattr(record, 'job_id', 0)
-            # Handle job completion
-            pass
-        elif record.event == LogEvent.PROGRESS:
-            # Access progress fields
-            done = getattr(record, 'done', 0)
-            total = getattr(record, 'total', 0)
-            # Handle progress updates
-            pass
+    event_data = parse_event(record)
+    if event_data:
+        # Now you can access event-specific fields, e.g.:
+        if isinstance(event_data, events.JobError):
+            print(f"Job error for jobid: {event_data.jobid}")
+        elif isinstance(event_data, events.Progress):
+            print(f"Progress: {event_data.done}/{event_data.total}")
-Always use `getattr(record, 'field_name', default_value)` or check with `hasattr(record, 'field_name')` before accessing fields, as not all fields may be present in every record.
```
Use LogEventData.from_record() in the example instead of a custom dispatcher.
src/snakemake_interface_logger_plugins/events.py:120-154 already exposes the parsing logic. Rebuilding the mapping here is redundant, and it has already drifted from the implementation (events.Error, events.JobInfo, etc. vs the *Event classes used elsewhere in this PR).
📌 Suggested doc update

```diff
-from snakemake_interface_logger_plugins.common import LogEvent
 from snakemake_interface_logger_plugins import events

-def parse_event(record):
-    if not hasattr(record, "event"):
-        return None
-
-    event = record.event
-
-    # Map LogEvent to the corresponding dataclass
-    event_map = {
-        LogEvent.ERROR: events.Error,
-        LogEvent.WORKFLOW_STARTED: events.WorkflowStarted,
-        LogEvent.JOB_INFO: events.JobInfo,
-        LogEvent.JOB_STARTED: events.JobStarted,
-        LogEvent.JOB_FINISHED: events.JobFinished,
-        LogEvent.SHELLCMD: events.ShellCmd,
-        LogEvent.JOB_ERROR: events.JobError,
-        LogEvent.GROUP_INFO: events.GroupInfo,
-        LogEvent.GROUP_ERROR: events.GroupError,
-        LogEvent.RESOURCES_INFO: events.ResourcesInfo,
-        LogEvent.DEBUG_DAG: events.DebugDag,
-        LogEvent.PROGRESS: events.Progress,
-        LogEvent.RULEGRAPH: events.RuleGraph,
-        LogEvent.RUN_INFO: events.RunInfo,
-    }
-
-    dataclass_type = event_map.get(event)
-    if dataclass_type is not None:
-        return dataclass_type.from_record(record)
-    else:
-        return None
-
-# Usage in a log handler:
 def emit(self, record):
-    event_data = parse_event(record)
-    if event_data:
-        # Now you can access event-specific fields, e.g.:
-        if isinstance(event_data, events.JobError):
-            print(f"Job error for jobid: {event_data.jobid}")
-        elif isinstance(event_data, events.Progress):
-            print(f"Progress: {event_data.done}/{event_data.total}")
+    event_data = events.LogEventData.from_record(record)
+    if event_data is None:
+        return
+
+    if isinstance(event_data, events.JobErrorEvent):
+        print(f"Job error for job_id: {event_data.job_id}")
+    elif isinstance(event_data, events.ProgressEvent):
+        print(f"Progress: {event_data.done}/{event_data.total}")
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@README.md` around lines 243 - 284, The example re-implements event
dispatching instead of using the shared parser; replace the custom mapping in
parse_event with a call to events.LogEventData.from_record(record) (i.e., have
parse_event return events.LogEventData.from_record(record) or None) and update
the emit method to check the returned LogEventData subclasses (e.g., compare
isinstance(event_data, events.JobErrorEvent or the exact classes exposed by
events.LogEventData) and use their fields), removing the manual event_map and
dataclass_type logic so the example uses the central parsing logic in
events.LogEventData.from_record.
Force-pushed e3abc6d to 257bcf2 (Compare)
Actionable comments posted: 1
♻️ Duplicate comments (2)
README.md (1)
286-339: ⚠️ Potential issue | 🟠 Major

README still points to the removed pre-rename event classes.

The table and sample dispatcher use `events.Error`, `events.JobInfo`, etc., but `events.py` now exports `ErrorEvent`, `JobInfoEvent`, and the other `*Event` classes. Copying this example will currently raise `AttributeError`, and the hand-written `event_map` can drift again from `LogEventData.from_record()`. Please switch the docs to the shared parser and the actual exported class names.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@README.md` around lines 286 - 339, The README references removed dataclass names (e.g., events.Error, events.JobInfo) and a manual event_map instead of the current exported classes and shared parser; update the table and the example to use the real exported names (e.g., ErrorEvent, JobInfoEvent, JobStartedEvent, etc.) and replace the handwritten mapping/dispatch with the shared parser function LogEventData.from_record() (or show how to call that helper) so the docs match the actual symbols (LogEvent, LogEventData.from_record, and the *Event classes) and cannot drift.

src/snakemake_interface_logger_plugins/events.py (1)
42-49: ⚠️ Potential issue | 🟠 Major

Accept canonical dataclass keys before falling back to legacy aliases.

`_from_extra_default()` only reads the alias key once a field defines one. That means `LogEventData.from_extra({**asdict(job_event), "event": ...})` still fails for aliased fields like `job_id`/`jobid`, which is one of the integration paths this PR is documenting.

Suggested fix

```diff
 for fld in fields(cls):
     if fld.name in kw:
         continue
-    name = fld.metadata.get("alias", fld.name)
-    if name in extra:
-        kw[fld.name] = extra[name]
+    alias = fld.metadata.get("alias")
+    if fld.name in extra:
+        kw[fld.name] = extra[fld.name]
+    elif alias is not None and alias in extra:
+        kw[fld.name] = extra[alias]
     elif not field_has_default(fld):
-        raise ValueError(f"LogRecord missing required attribute {name!r}")
+        raise ValueError(
+            f"LogRecord missing required attribute {(alias or fld.name)!r}"
+        )
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/snakemake_interface_logger_plugins/events.py` around lines 42 - 49, The loop in _from_extra_default currently only checks the alias (fld.metadata["alias"]) against extra, causing canonical dataclass keys (fld.name) to be ignored; modify the loop in the function that iterates over fields(cls) so it first looks for the canonical key in extra (e.g., if fld.name in extra then kw[fld.name] = extra[fld.name]), then falls back to the alias (name = fld.metadata.get("alias", fld.name) and check extra[name]) before raising the ValueError; preserve the existing guard that skips fields already in kw and retain the field_has_default() check for required fields.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/snakemake_interface_logger_plugins/events.py`:
- Line 305: ResourcesInfoEvent.nodes is incorrectly typed as Optional[list[str]]
while the docstring describes it as the number of provided remote nodes; change
the type annotation on ResourcesInfoEvent.nodes to an integer (e.g.,
Optional[int] or int as appropriate) and update any related
validation/serialization code in the ResourcesInfoEvent class to treat nodes as
a count rather than a list so consumers receive the correct numeric contract.
---
Duplicate comments:
In `@README.md`:
- Around line 286-339: The README references removed dataclass names (e.g.,
events.Error, events.JobInfo) and a manual event_map instead of the current
exported classes and shared parser; update the table and the example to use the
real exported names (e.g., ErrorEvent, JobInfoEvent, JobStartedEvent, etc.) and
replace the handwritten mapping/dispatch with the shared parser function
LogEventData.from_record() (or show how to call that helper) so the docs match
the actual symbols (LogEvent, LogEventData.from_record, and the *Event classes)
and cannot drift.
In `@src/snakemake_interface_logger_plugins/events.py`:
- Around line 42-49: The loop in _from_extra_default currently only checks the
alias (fld.metadata["alias"]) against extra, causing canonical dataclass keys
(fld.name) to be ignored; modify the loop in the function that iterates over
fields(cls) so it first looks for the canonical key in extra (e.g., if fld.name
in extra then kw[fld.name] = extra[fld.name]), then falls back to the alias
(name = fld.metadata.get("alias", fld.name) and check extra[name]) before
raising the ValueError; preserve the existing guard that skips fields already in
kw and retain the field_has_default() check for required fields.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 0ec7606c-80d0-45cb-bc75-ab70dd6137fa
⛔ Files ignored due to path filters (1)
- `pyproject.toml` is excluded by `!pyproject.toml`
📒 Files selected for processing (3)
- `README.md`
- `src/snakemake_interface_logger_plugins/events.py`
- `tests/tests.py`
🚧 Files skipped from review as they are similar to previous changes (1)
- tests/tests.py
Actionable comments posted: 2
🧹 Nitpick comments (1)
src/snakemake_interface_logger_plugins/events.py (1)
312-319: **Consider serialization implications of storing exception objects.**

The `exception` field stores a `BaseException` directly. While valid, this may cause issues if the event is serialized (e.g., pickled or JSON-encoded). If serialization is expected, consider storing the exception type and message as strings instead.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/snakemake_interface_logger_plugins/events.py` around lines 312 - 319, The DebugDagEvent dataclass currently stores an exception object in the exception field which can break serialization; change the model to store serializable representations (e.g., exception_type: Optional[str] and exception_message: Optional[str]) instead of BaseException, or serialize the exception into these strings inside DebugDagEvent (and optionally keep a transient/non-serialized raw_exception attribute if runtime access is needed); update any code paths that construct DebugDagEvent (references to DebugDagEvent and its exception field) to populate exception_type and exception_message from type(exc).__name__ and str(exc) (or repr) and remove reliance on the BaseException instance for serialization.
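A minimal sketch of the serializable shape this comment suggests — the class name, the `exception_type`/`exception_message` fields, and the `from_exception` helper are the reviewer's proposal rendered as illustration, not the PR's actual API:

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class SerializableDebugDagEvent:
    """Stores the exception as plain strings so the event pickles/JSON-encodes cleanly."""

    exception_type: Optional[str] = None
    exception_message: Optional[str] = None
    # Transient handle for in-process consumers; excluded from repr and equality.
    raw_exception: Optional[BaseException] = field(default=None, repr=False, compare=False)

    @classmethod
    def from_exception(cls, exc: BaseException) -> "SerializableDebugDagEvent":
        # Capture type name and message once, keep the live object only as a convenience.
        return cls(
            exception_type=type(exc).__name__,
            exception_message=str(exc),
            raw_exception=exc,
        )
```

Consumers that need the live traceback can still reach `raw_exception` in-process, while anything that crosses a serialization boundary only sees the two strings.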
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/snakemake_interface_logger_plugins/events.py`:
- Around line 172-188: The WorkflowStartedEvent dataclass declares snakefile:
Optional[str] but provides no default, so _from_extra_default will treat it as
required and raise if missing; fix by giving the dataclass field a default of
None (change the declaration in WorkflowStartedEvent to snakefile: Optional[str]
= None) so missing snakefile in extra is allowed, and keep the _from_extra logic
that converts present values to str and passes snakefile=snakefile into
_from_extra_default.
In `@tests/tests.py`:
- Around line 97-105: Remove the duplicate test event: in tests.py within the
EVENTS list remove the redundant logevents.JobErrorEvent(job_id=123) so only one
JobErrorEvent for job_id=123 remains; locate the EVENTS list and delete the
second occurrence (the one after logevents.ShellCmdEvent) to prevent duplicated
test input affecting assertions that rely on unique events.
---
Nitpick comments:
In `@src/snakemake_interface_logger_plugins/events.py`:
- Around line 312-319: The DebugDagEvent dataclass currently stores an exception
object in the exception field which can break serialization; change the model to
store serializable representations (e.g., exception_type: Optional[str] and
exception_message: Optional[str]) instead of BaseException, or serialize the
exception into these strings inside DebugDagEvent (and optionally keep a
transient/non-serialized raw_exception attribute if runtime access is needed);
update any code paths that construct DebugDagEvent (references to DebugDagEvent
and its exception field) to populate exception_type and exception_message from
type(exc).__name__ and str(exc) (or repr) and remove reliance on the
BaseException instance for serialization.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: ba850ec3-7741-4342-9cea-123c605eea75
📒 Files selected for processing (2)
- `src/snakemake_interface_logger_plugins/events.py`
- `tests/tests.py`
```python
@dataclass
class WorkflowStartedEvent(LogEventData):
    event = LogEvent.WORKFLOW_STARTED

    workflow_id: uuid.UUID
    snakefile: Optional[str]

    @classmethod
    def _from_extra(cls, extra: StrMap) -> Self:
        snakefile = extra.get("snakefile", None)
        if snakefile is not None:
            try:
                # Try to convert to string - this should work for PosixPath and other path-like objects
                snakefile = str(snakefile)
            except (TypeError, ValueError) as e:
                raise ValueError(f"Could not convert snakefile to string: {e}")
        return _from_extra_default(cls, extra, snakefile=snakefile)
```
**Missing default for optional `snakefile` field.**

The `snakefile` field is typed as `Optional[str]` but has no default value. This means `_from_extra_default` will raise a `ValueError` if `snakefile` is missing from the extra dict, even though `None` is a valid value per the type hint.

Proposed fix

```diff
 @dataclass
 class WorkflowStartedEvent(LogEventData):
     event = LogEvent.WORKFLOW_STARTED
     workflow_id: uuid.UUID
-    snakefile: Optional[str]
+    snakefile: Optional[str] = None
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/snakemake_interface_logger_plugins/events.py` around lines 172 - 188, The
WorkflowStartedEvent dataclass declares snakefile: Optional[str] but provides no
default, so _from_extra_default will treat it as required and raise if missing;
fix by giving the dataclass field a default of None (change the declaration in
WorkflowStartedEvent to snakefile: Optional[str] = None) so missing snakefile in
extra is allowed, and keep the _from_extra logic that converts present values to
str and passes snakefile=snakefile into _from_extra_default.
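The behavior this comment describes can be reproduced with plain dataclass introspection. The helper below is a simplified, hypothetical stand-in for `_from_extra_default`, not the real implementation — it only shows why an `Optional` annotation without a default is still treated as required:

```python
from dataclasses import MISSING, dataclass, fields
from typing import Optional


@dataclass
class WithoutDefault:
    # Annotated Optional, but dataclasses still treat the field as required.
    snakefile: Optional[str]


@dataclass
class WithDefault:
    # The proposed fix: a real default makes the field genuinely optional.
    snakefile: Optional[str] = None


def build_from_extra(cls, extra: dict):
    """Simplified stand-in for _from_extra_default: require fields without defaults."""
    kw = {}
    for fld in fields(cls):
        if fld.name in extra:
            kw[fld.name] = extra[fld.name]
        elif fld.default is MISSING and fld.default_factory is MISSING:
            raise ValueError(f"LogRecord missing required attribute {fld.name!r}")
    return cls(**kw)
```

With this sketch, `build_from_extra(WithoutDefault, {})` raises, while `build_from_extra(WithDefault, {})` succeeds with `snakefile=None` — the exact difference the `= None` default makes.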
```python
logevents.JobErrorEvent(job_id=123),
logevents.JobStartedEvent(job_ids=[123, 321]),
logevents.JobFinishedEvent(job_id=123),
logevents.ShellCmdEvent(
    job_id=123,
    shellcmd="echo 'Hello, World!'",
    rule_name="example_rule",
),
logevents.JobErrorEvent(job_id=123),
```
**Duplicate `JobErrorEvent` in test data.**

`JobErrorEvent(job_id=123)` appears twice in the `EVENTS` list (lines 97 and 105). This is likely unintentional and could mask test failures if the duplicate causes issues.
Proposed fix - remove duplicate

```diff
     logevents.JobInfoEvent(
         job_id=123,
         rule_name="example_rule",
         threads=4,
         input=["input1.txt", "input2.txt"],
         output=["output.txt"],
         log=["log.txt"],
         benchmark="benchmark.csv",
         wildcards={"foo": "bar"},
     ),
-    logevents.JobErrorEvent(job_id=123),
     logevents.JobStartedEvent(job_ids=[123, 321]),
     logevents.JobFinishedEvent(job_id=123),
     logevents.ShellCmdEvent(
         job_id=123,
         shellcmd="echo 'Hello, World!'",
         rule_name="example_rule",
     ),
     logevents.JobErrorEvent(job_id=123),
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tests/tests.py` around lines 97 - 105, Remove the duplicate test event: in
tests.py within the EVENTS list remove the redundant
logevents.JobErrorEvent(job_id=123) so only one JobErrorEvent for job_id=123
remains; locate the EVENTS list and delete the second occurrence (the one after
logevents.ShellCmdEvent) to prevent duplicated test input affecting assertions
that rely on unique events.
This adds logevent dataclasses for conveniently parsing log records into dataclass instances.

This could also be extended to provide formatting to simplify logging upstream, moving the complex formatting out of Snakemake's current DefaultFormatter and into each LogEvent.

Example:
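A hedged sketch of what that formatting extension could look like — the `format()` method and the `emit` handler are illustrative assumptions, and the two event classes are simplified versions of those in the tests, not the PR's actual API:

```python
from dataclasses import dataclass


@dataclass
class JobErrorEvent:
    job_id: int

    def format(self) -> str:
        # Each event renders itself instead of a central DefaultFormatter
        # branching on the event type.
        return f"Error in job {self.job_id}"


@dataclass
class ShellCmdEvent:
    job_id: int
    shellcmd: str
    rule_name: str

    def format(self) -> str:
        return f"[{self.rule_name}] job {self.job_id}: {self.shellcmd}"


def emit(event) -> str:
    # A log handler just asks the event for its rendered message.
    return event.format()
```

Under this design, adding a new event type means adding one dataclass with its own `format()`, rather than extending a shared formatter's dispatch table.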
Summary by CodeRabbit

- **New Features**
- **Documentation**
- **Tests**