
[CI verification] Fix pdb / breakpoint() hang in workflow code #3

Open: elidlocke wants to merge 3 commits into main from pdb-hang-repro

@elidlocke

Self-PR on fork to run the full CI matrix for upstream PR
temporalio#1568 while upstream CI is gated on
first-time-contributor approval.

Do not merge. This branch is what we want upstream to take.

Diff should be exactly 4 files:

  • temporalio/worker/_workflow.py
  • temporalio/worker/_debugger.py (new)
  • tests/worker/test_breakpoint_hang.py (new)
  • README.md

elidlocke and others added 3 commits June 8, 2026 12:31
When debug_mode=True (or TEMPORAL_DEBUG=1), breakpoint() inside workflow
code now opens an interactive pdb prompt -- including from a sandboxed
workflow run under pytest. Four pieces:

- Inline dispatch on the asyncio main thread (via loop.call_soon to
  avoid nesting inside the dispatch task's __step() and tripping
  Python 3.14's task-entry validation).
- breakpoint removed from the sandbox's invalid builtins so the call
  reaches the worker hook. Nothing else is relaxed.
- A Pdb subclass that lands at the workflow's own frame, suspends
  sandbox checks during each REPL interaction, and overrides q/Ctrl-D
  to continue the workflow instead of failing it with BdbQuit.
- A defensive sys.breakpointhook that raises a clear RuntimeError when
  breakpoint() is called from a workflow worker thread without
  debug_mode, replacing the previous silent hang.
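
The q/Ctrl-D override in the third piece can be illustrated with a minimal sketch. This is not the code in temporalio/worker/_debugger.py; the class name `ContinueOnQuitPdb` and the helper `run_with_scripted_input` are hypothetical, and the sketch only shows the general idea of routing pdb's quit commands to `continue` so that BdbQuit never propagates into the debugged code. Frame selection and sandbox suspension are not modeled.

```python
import io
import pdb


class ContinueOnQuitPdb(pdb.Pdb):
    """Illustrative Pdb subclass: quitting resumes execution instead of raising BdbQuit."""

    def do_quit(self, arg):
        # Delegate to `continue` so the debugged code keeps running.
        return self.do_continue(arg)

    # pdb.Pdb binds these aliases to its own do_quit, so rebind them here.
    do_q = do_exit = do_quit

    def do_EOF(self, arg):
        # Ctrl-D (or exhausted input) arrives as EOF; treat it like `continue`.
        return self.do_continue(arg)


def run_with_scripted_input(commands: str) -> str:
    """Drive the debugger from a string so the sketch runs without a terminal."""
    debugger = ContinueOnQuitPdb(
        stdin=io.StringIO(commands),
        stdout=io.StringIO(),
        nosigint=True,  # leave the process's SIGINT handler alone
        readrc=False,   # ignore any local .pdbrc
    )
    debugger.set_trace()  # the prompt opens at the next line
    return "workflow continued"
```

With the stock `pdb.Pdb`, feeding `q` at that prompt would raise BdbQuit out of `run_with_scripted_input`; with the subclass, the function returns normally.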

When debug_mode is not set, the worker's dispatch and sandbox config
are unchanged.
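
For the non-debug path, the defensive hook described above might look roughly like the following sketch. The thread registry `_WORKFLOW_THREAD_IDS`, the function names, and the error text are all assumptions made for illustration; how the PR actually identifies workflow worker threads is not shown here.

```python
import sys
import threading

# Hypothetical bookkeeping: ids of threads running workflow code without debug_mode.
_WORKFLOW_THREAD_IDS: set[int] = set()


def guarded_breakpointhook(*args, **kwargs):
    """Fail fast on workflow threads; defer to the default hook everywhere else."""
    if threading.get_ident() in _WORKFLOW_THREAD_IDS:
        # Illustrative message only, not the wording used in the PR.
        raise RuntimeError(
            "breakpoint() was called from workflow code without debug_mode; "
            "set debug_mode=True or TEMPORAL_DEBUG=1 to debug workflows"
        )
    return sys.__breakpointhook__(*args, **kwargs)


def call_breakpoint_on_workflow_thread() -> str:
    """Install the hook, call breakpoint() on a registered thread, return the error text."""
    captured: list[str] = []

    def target() -> None:
        _WORKFLOW_THREAD_IDS.add(threading.get_ident())
        try:
            breakpoint()  # dispatches to sys.breakpointhook
        except RuntimeError as err:
            captured.append(str(err))
        finally:
            _WORKFLOW_THREAD_IDS.discard(threading.get_ident())

    previous_hook = sys.breakpointhook
    sys.breakpointhook = guarded_breakpointhook
    try:
        worker = threading.Thread(target=target)
        worker.start()
        worker.join()
    finally:
        sys.breakpointhook = previous_hook  # restore whatever was installed before
    return captured[0]
```

The point of the design is that the failure is immediate and visible in the workflow's error rather than a prompt waiting on a terminal nobody is attached to.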

Adds a README subsection on debugging workflows and five tests in
tests/worker/test_breakpoint_hang.py. Verified on Python 3.13 and 3.14.

Closes temporalio#1104.

* Fall back to model_dump_json for OpenAI payload serialization

OpenAI response and stream event types whose pydantic serializer is a
lazily-built MockValSer cannot be serialized by the generic any-schema
serializer, raising PydanticSerializationError (e.g. when streaming via
WorkflowStreamClient). The model's own model_dump_json() handles them.

Fixes temporalio#1585

* Dispatch pydantic models to their own serializer

OpenAI's BaseModel sets defer_build=True, so a model's serializer is a
MockValSer placeholder until pydantic's lazy build runs. The generic any-schema
serializer reaches for that placeholder directly without triggering the build
and raises PydanticSerializationError. Route pydantic models through their own
model_dump_json (which triggers the build) by type instead of catching the
error; non-model values continue through the generic serializer unchanged.
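
A rough sketch of dispatch-by-type, assuming pydantic v2 is installed. `StreamEvent` and `to_json_bytes` are hypothetical names used only for illustration, and the converter code in the PR may be structured differently.

```python
from typing import Any

from pydantic import BaseModel, ConfigDict, TypeAdapter

# Generic serializer used for values that are not pydantic models.
_ANY_ADAPTER = TypeAdapter(Any)


class StreamEvent(BaseModel):
    """Hypothetical stand-in for an OpenAI event type that defers its build."""

    model_config = ConfigDict(defer_build=True)
    delta: str


def to_json_bytes(value: Any) -> bytes:
    # Pydantic models go through their own serializer, selected by type
    # rather than by catching a serialization error after the fact.
    if isinstance(value, BaseModel):
        return value.model_dump_json().encode("utf-8")
    # Everything else continues through the generic any-schema serializer.
    return _ANY_ADAPTER.dump_json(value)
```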

* Build streamed events at the source instead of in the converter

Force the deferred pydantic build on each streamed event before it is published
or returned, so it serializes regardless of build state. This also covers the
activity's list return value, which the payload converter serializes generically
and cannot build on its own. Drop the now-redundant to_payload override.