feat(logging): per-run log directory with unified main.jsonl (DIM-685) #1437
spomichter wants to merge 1 commit into dev
Conversation
Greptile Summary
This PR consolidates per-process log files into a single main.jsonl per run. Key issues found:
Confidence Score: 2/5
Important Files Changed
Sequence Diagram

```mermaid
sequenceDiagram
    participant M as Main Process<br/>(dimos.py import)
    participant LC as logging_config
    participant E as os.environ
    participant W as Worker Process
    M->>LC: setup_logger() [at import time]
    LC->>LC: _configure_structlog()<br/>_RUN_LOG_DIR=None, env var unset
    LC->>LC: _get_log_file_path() → legacy dimos_<ts>_<pid>.jsonl
    LC-->>M: FileHandler → legacy path (cached in stdlib logger)
    Note over M: run() is called
    M->>LC: set_run_log_dir(log_dir)
    LC->>LC: _RUN_LOG_DIR = log_dir<br/>_LOG_FILE_PATH = None ← cache cleared
    LC->>E: DIMOS_RUN_LOG_DIR = log_dir
    M->>M: logger.info("Logging to ...")
    Note over M,LC: ⚠️ Still uses legacy FileHandler!<br/>main.jsonl is NOT written by main process
    M->>M: blueprint.build() → forkserver starts workers
    W->>E: inherit DIMOS_RUN_LOG_DIR
    W->>LC: setup_logger() [fresh process]
    LC->>LC: _RUN_LOG_DIR=None, env var SET
    LC->>LC: _get_log_file_path() → logs/<run-id>/main.jsonl
    LC-->>W: FileHandler → main.jsonl ✅
    Note over M,W: Main process → legacy file<br/>Workers → main.jsonl<br/>Goal of unified log NOT fully achieved
```
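The caching problem the diagram flags can be reproduced in isolation. This is a minimal sketch under assumed behaviour, not the dimos implementation: a `FileHandler` attached at import time keeps writing to its original path even after the run log dir is published via the environment, because the stdlib logger holds the already-opened handler.

```python
import logging
import os
import tempfile

# Handler attached "at import time", before any run dir exists.
logger = logging.getLogger("dimos_demo")
legacy_path = os.path.join(tempfile.mkdtemp(), "dimos_legacy.jsonl")
logger.addHandler(logging.FileHandler(legacy_path, mode="a", encoding="utf-8"))
logger.setLevel(logging.INFO)

# Equivalent of set_run_log_dir(): publishing the env var is too late
# for the handler that is already cached on the logger.
os.environ["DIMOS_RUN_LOG_DIR"] = "/tmp/run-123"
logger.info("still goes to the legacy file")

for handler in logger.handlers:
    handler.flush()
print(os.path.getsize(legacy_path) > 0)  # prints True: legacy file got the record
```

Fixing this would require `set_run_log_dir()` to also swap the handler on the main process's logger, not just clear the cached path.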
```python
blueprint_name = "-".join(robot_types)
safe_name = blueprint_name.replace("/", "-").replace(" ", "-")
run_id = f"{_time.strftime('%Y%m%d-%H%M%S')}-{safe_name}"
```
Incomplete safe_name sanitization for directory names
The current sanitization only replaces / and space, but filesystem-unfriendly characters in robot type names (e.g., :, *, ?, \, |, <, >) would still be included in the directory name and could cause failures on Windows or certain Linux filesystems/tools.
Consider using a more comprehensive approach:
```diff
  blueprint_name = "-".join(robot_types)
- safe_name = blueprint_name.replace("/", "-").replace(" ", "-")
+ safe_name = "".join(c if c.isalnum() or c in "-_." else "-" for c in blueprint_name)
  run_id = f"{_time.strftime('%Y%m%d-%H%M%S')}-{safe_name}"
```
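The suggested sanitization can be wrapped in a small helper. A sketch — the function name `make_safe_name` and the example inputs are hypothetical, not from the PR:

```python
def make_safe_name(blueprint_name: str) -> str:
    # Keep alphanumerics plus "-_."; map every other character
    # (/, space, :, *, ?, \, |, <, >, ...) to "-".
    return "".join(c if c.isalnum() or c in "-_." else "-" for c in blueprint_name)

print(make_safe_name("unitree/go2 lab:v1"))  # -> unitree-go2-lab-v1
```

This whitelist approach is safer than enumerating bad characters, since it also handles control characters and anything Windows rejects.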
```diff
-file_handler = RotatingFileHandler(
+file_handler = logging.FileHandler(
     log_file_path,
     mode="a",
-    maxBytes=10 * 1024 * 1024,  # 10MiB
-    backupCount=20,
     encoding="utf-8",
 )
```
No file size limit after removing RotatingFileHandler
Switching to plain FileHandler is the right call for multi-process safety (rotation races with forkserver workers are a real problem), but it means main.jsonl can now grow without bound. For long-running robot sessions with verbose logging this could fill the disk.
Consider adding an explicit size guard at startup (e.g. warn or truncate if the file already exceeds a threshold), or at least document the expected maintenance story (e.g. "rotate manually" or "logrotate config provided") so operators know what to do.
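One way to implement the startup guard suggested above. This is a hedged sketch — the threshold value and the function name `warn_if_log_oversized` are assumptions, not part of the PR:

```python
import logging
import os

MAX_LOG_BYTES = 10 * 1024 * 1024  # assumed 10 MiB threshold

def warn_if_log_oversized(log_file_path: str) -> None:
    """One-shot startup check: plain FileHandler never rotates, so warn
    operators when main.jsonl has already grown past the threshold."""
    try:
        size = os.path.getsize(log_file_path)
    except OSError:
        return  # file does not exist yet; nothing to warn about
    if size > MAX_LOG_BYTES:
        logging.getLogger(__name__).warning(
            "%s is %d bytes (> %d); rotate it manually or via logrotate",
            log_file_path, size, MAX_LOG_BYTES,
        )
```

Running this once per `run()` keeps the multi-process append path race-free while still surfacing unbounded growth.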
Closes DIM-685

Problem

Every forkserver worker creates its own log file — a single run produces 6+ scattered dimos_<ts>_<pid>.jsonl files. Agents can't grep one file for all output, and there's no way to tell which files belong to which run.

Solution
All processes (main + workers) now write to one file per run: logs/<run-id>/main.jsonl

How it works:

- set_run_log_dir() sets os.environ["DIMOS_RUN_LOG_DIR"] before blueprint.build()
- workers inherit the env var, so _get_log_file_path() → write to the same main.jsonl
- FileHandler instead of RotatingFileHandler (avoids multi-process rotation races)
- zero changes to worker.py
Breaking Changes
None
How to Test
Contributor License Agreement