Merged

27 commits
7d9f34e
Add Deno/Pyodide executor skeleton
xpcmdshell Feb 8, 2026
a387967
Implement micropip deps install + network presets for Deno/Pyodide
xpcmdshell Feb 8, 2026
0a1f671
Type network_profile as Literal
xpcmdshell Feb 8, 2026
602f940
Fix tools proxy + expand Deno/Pyodide executor integration tests
xpcmdshell Feb 8, 2026
3dd8722
Add deeper Deno/Pyodide executor validation tests
xpcmdshell Feb 8, 2026
dca25fa
DenoPyodide: async sandbox RPC + chunked responses
xpcmdshell Feb 8, 2026
8c53419
DenoPyodide: test chunked RPC for large tool output
xpcmdshell Feb 8, 2026
e9b0338
Async sandbox: awaitable namespaces in in-process and subprocess
xpcmdshell Feb 8, 2026
448fd9c
ToolsNamespace: allow awaitable calls even when loop is set
xpcmdshell Feb 8, 2026
67f8e2e
DenoPyodide: add workflows.invoke
xpcmdshell Feb 8, 2026
985c83b
docs: document DenoPyodide executor and async-first sandbox API
xpcmdshell Feb 8, 2026
84a5ff2
Deno sandbox: friendly aliases + docs
xpcmdshell Feb 8, 2026
4ccb284
Deno sandbox: rename + keep agent API sync
xpcmdshell Feb 8, 2026
492c877
Docs: clarify DenoSandbox tool execution boundary
xpcmdshell Feb 8, 2026
163d055
Tools: add middleware plumbing (DenoSandbox wired)
xpcmdshell Feb 9, 2026
4561beb
Tools middleware: enforce on new adapters + unify dispatch
xpcmdshell Feb 9, 2026
ae3949e
Docs: mention tool middleware hook
xpcmdshell Feb 9, 2026
bdb7f55
Docs: note tool middleware in architecture
xpcmdshell Feb 9, 2026
e72f2aa
Examples: add DenoSandboxExecutor
xpcmdshell Feb 9, 2026
0a00e25
DenoSandbox: test workflow invoking another workflow
xpcmdshell Feb 9, 2026
f2038e4
CI: run DenoSandbox integration tests
xpcmdshell Feb 9, 2026
d2e4011
Tests: define pytest markers + auto-mark xdist groups
xpcmdshell Feb 9, 2026
df7a182
Docs: update AGENTS.md for DenoSandbox + test markers
xpcmdshell Feb 9, 2026
88d30e3
Docs: expand AGENTS.md test subset commands
xpcmdshell Feb 9, 2026
664cd7e
Prune performative tests and improve docker test gating
xpcmdshell Feb 9, 2026
73fba7f
Mark redis testcontainers as docker and skip when Docker unavailable
xpcmdshell Feb 9, 2026
1ea6930
Align pre-commit ruff with CI
xpcmdshell Feb 9, 2026
20 changes: 20 additions & 0 deletions .github/workflows/ci.yml
@@ -49,3 +49,23 @@ jobs:
          verbose: true
        env:
          CODECOV_TOKEN: ${{ secrets.CODECOV_TOKEN }}

  test-deno-sandbox:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v6
      - uses: astral-sh/setup-uv@v7
      - uses: denoland/setup-deno@v2
      - name: Cache Deno modules
        uses: actions/cache@v5
        with:
          path: ~/.cache/deno
          key: deno-${{ runner.os }}-${{ hashFiles('src/py_code_mode/execution/deno_sandbox/runner/**') }}
          restore-keys: |
            deno-${{ runner.os }}-
      - run: uv sync --all-extras
      - name: Run Deno sandbox integration tests
        env:
          PY_CODE_MODE_TEST_DENO: "1"
          DENO_DIR: ~/.cache/deno
        run: uv run pytest -n 0 tests/test_deno_sandbox_imports.py tests/test_deno_sandbox_executor.py -v
4 changes: 2 additions & 2 deletions .pre-commit-config.yaml
@@ -1,8 +1,8 @@
repos:
  - repo: https://github.com/astral-sh/ruff-pre-commit
    rev: v0.8.0
    # Keep in sync with the ruff version pinned in uv.lock / used in CI.
    rev: v0.14.9
    hooks:
      - id: ruff
        args: [--fix]
      - id: ruff-format

18 changes: 18 additions & 0 deletions AGENTS.md
@@ -34,6 +34,7 @@ src/py_code_mode/
subprocess/ # Jupyter kernel-based subprocess executor
container/ # Docker container executor
in_process/ # Same-process executor
deno_sandbox/ # Deno + Pyodide (WASM) sandbox executor (experimental)
workflows/ # Skill storage, library, and vector stores
tools/ # Tool adapters: CLI, MCP, HTTP
adapters/ # CLI, MCP, HTTP adapter implementations
@@ -76,6 +77,7 @@ When agents write code, four namespaces are available:
| SubprocessExecutor | Recommended default. Process isolation via Jupyter kernel. |
| ContainerExecutor | Docker isolation for untrusted code. |
| InProcessExecutor | Maximum speed for trusted code. |
| DenoSandboxExecutor | Sandboxed Python via Deno + Pyodide (WASM). Tools execute host-side. |

---

@@ -98,6 +100,21 @@ uv run pytest -k "test_workflow"

# Run without parallelism (for debugging)
uv run pytest -n 0

# Run Deno sandbox integration tests (requires Deno installed)
PY_CODE_MODE_TEST_DENO=1 uv run pytest -n 0 tests/test_deno_sandbox_executor.py -v

# Filter subsets (markers are defined in pyproject.toml)
uv run pytest -m "not docker"

# Common subsets
uv run pytest -m "not slow"
uv run pytest -m docker
uv run pytest -m subprocess
uv run pytest -m "not docker and not subprocess"

# CI uses all extras; for local repro of CI failures you may want:
uv sync --all-extras
```

### Linting and Type Checking
@@ -192,6 +209,7 @@ Tools are defined in YAML files. Key patterns:
- Container tests are in `tests/container/` - require Docker
- Use `@pytest.mark.xdist_group("group_name")` for tests that need isolation
- Redis tests use testcontainers - spin up automatically
- Markers like `docker`, `subprocess`, `venv`, and `deno` are used to filter test subsets

---

16 changes: 11 additions & 5 deletions docs/ARCHITECTURE.md
@@ -791,7 +791,7 @@ async with Session(storage=storage, executor=executor) as session:
### Tool Execution

```
Agent writes: "tools.curl.get(url='...')"
Agent writes: "tools.curl.get(url='...')" (use `await` only in DenoSandboxExecutor)
|
v
+------------------------+
@@ -808,6 +808,12 @@ Agent writes: "tools.curl.get(url='...')"
+--------------+
```

Note on sandboxing:
- `DenoSandboxExecutor` sandboxes Python execution in Pyodide, but **tools execute host-side** (the sandbox calls back to the host over RPC to run tools). If you need strict sandbox boundaries, avoid `tools.*` and stick to pure Python plus `deps.*` in the sandbox.

Note on tool middleware:
- Tool calls can be wrapped by a host-side middleware chain (audit logging, approvals, allow/deny, retries, etc.). Enforcement guarantees are strongest for `DenoSandboxExecutor`, because sandboxed Python can only access tools via host RPC.

### ToolProxy Methods

```
@@ -819,7 +825,7 @@ Agent writes: "tools.curl.get(url='...')"
| |
| .call_async(**kwargs) |--> Always returns awaitable
| .call_sync(**kwargs) |--> Always blocks, returns result
| .__call__(**kwargs) |--> Context-aware (sync/async detection)
| .__call__(**kwargs) |--> Synchronous invocation
+------------------------+
|
v
@@ -828,14 +834,14 @@ Agent writes: "tools.curl.get(url='...')"
| |
| .call_async(**kwargs) |--> Always returns awaitable
| .call_sync(**kwargs) |--> Always blocks, returns result
| .__call__(**kwargs) |--> Context-aware (sync/async detection)
| .__call__(**kwargs) |--> Synchronous invocation
+------------------------+
```

### Skill Execution

```
Agent writes: "workflows.analyze_repo(repo='...')"
Agent writes: "workflows.analyze_repo(repo='...')" (use `await` only in DenoSandboxExecutor)
|
v
+------------------------+
@@ -880,7 +886,7 @@ Skill has access to:
### Artifact Storage

```
Agent writes: "artifacts.save('data.json', b'...', 'description')"
Agent writes: "artifacts.save('data.json', b'...', 'description')" (use `await` only in DenoSandboxExecutor)
|
v
+------------------------+
78 changes: 68 additions & 10 deletions docs/executors.md
@@ -1,6 +1,60 @@
# Executors

Executors determine where and how agent code runs. Three backends are available: Subprocess, Container, and InProcess.
Executors determine where and how agent code runs. Four backends are available: Subprocess, Container, InProcess, and DenoSandbox (experimental).

## DenoSandboxExecutor (Experimental, Sandboxed)

`DenoSandboxExecutor` runs Python in **Pyodide (WASM)** inside a **Deno** subprocess. It relies on the Deno permission model for sandboxing.

Notes:
- Backend key: `"deno-sandbox"`.
- Example: `examples/deno-sandbox/`.

Key differences vs the other executors:
- **Async-first sandbox API**: use `await tools.*`, `await workflows.*`, `await artifacts.*`, `await deps.*`.
- **Best-effort deps**: dependency installs run through Pyodide `micropip`, so many packages (especially those that require native extensions) will not work.
- **Tool middleware support (host-side)**: you can attach tool call middleware via `DenoSandboxConfig.tool_middlewares` (useful for audit logging, approvals, allow/deny, etc.).

```python
from pathlib import Path
from py_code_mode import Session, FileStorage
from py_code_mode.execution import DenoSandboxConfig, DenoSandboxExecutor

storage = FileStorage(base_path=Path("./data"))

config = DenoSandboxConfig(
    tools_path=Path("./tools"),
    deno_dir=Path("./.deno-cache"),  # Deno cache directory (used with --cached-only)
    network_profile="deps-only",  # "none" | "deps-only" | "full"
    default_timeout=60.0,
)

executor = DenoSandboxExecutor(config)

async with Session(storage=storage, executor=executor) as session:
    result = await session.run("await tools.list()")
```

### Security Model: Where Tools Execute

`DenoSandboxExecutor` sandboxes **Python execution** (the Pyodide runtime) inside a Deno subprocess. However, **tool execution is host-side**:
- If your agent calls `tools.*` while using `DenoSandboxExecutor`, the call is proxied over RPC back to the host Python process, and the tool runs there (using the configured ToolAdapters).
- This means a YAML tool that can read files, run commands, or access the network will do so with **host permissions**, not Deno sandbox permissions.

Practical guidance:
- If you want "true sandboxed code exec", keep agent code to **pure Python + `deps.*`** (Pyodide `micropip`) and avoid `tools.*`.
- If you attach host tools, treat them as a privileged escape hatch from the sandbox boundary.
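
For example, a minimal sketch of keeping agent code entirely inside the sandbox boundary, reusing the `storage`/`executor` setup above. The `deps.install` call is assumed for illustration; check the deps namespace reference for the actual method name:

```python
# Pure Python plus deps.* only -- no tools.*, so nothing is proxied back to the host.
# Requires a network_profile that allows micropip ("deps-only" or "full").
sandboxed_code = """
await deps.install("python-dateutil")  # best-effort micropip install (assumed API)
from dateutil import parser
parser.isoparse("2026-02-09T12:00:00Z").year
"""

async with Session(storage=storage, executor=executor) as session:
    result = await session.run(sandboxed_code)
```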

### Network Profiles

`DenoSandboxConfig.network_profile` controls network access for the Deno subprocess:
- `none`: deny all network access (no runtime dep installs)
- `deps-only`: allow access to PyPI/CDN hosts needed for common `micropip` installs
- `full`: allow all network access
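
A minimal sketch of the trade-off, reusing `DenoSandboxConfig` from the example above (only `network_profile` changes; other fields as in your setup):

```python
from pathlib import Path
from py_code_mode.execution import DenoSandboxConfig

# Fully offline sandbox: runtime deps.* installs will fail.
offline = DenoSandboxConfig(tools_path=Path("./tools"), network_profile="none")

# Allow only the PyPI/CDN hosts that common micropip installs need.
deps_only = DenoSandboxConfig(tools_path=Path("./tools"), network_profile="deps-only")

# No network restrictions on the Deno subprocess.
open_net = DenoSandboxConfig(tools_path=Path("./tools"), network_profile="full")
```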

### Timeout And Reset

Timeouts are **soft** (the host stops waiting). If an execution times out, the session may be wedged until you call `session.reset()`, which restarts the sandbox.
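
A hedged sketch of recovering after a timed-out execution; how the timeout is surfaced, and whether `reset()` is awaitable, are assumptions here:

```python
async with Session(storage=storage, executor=executor) as session:
    try:
        # With default_timeout set on DenoSandboxConfig, this eventually times out.
        await session.run("import time; time.sleep(10_000)")
    except Exception:
        # Exact exception type depends on the executor. After a soft timeout the
        # session may stay wedged until the sandbox is restarted.
        await session.reset()  # if reset() is synchronous in your version, drop the await

    result = await session.run("1 + 1")
```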

## Quick Decision Guide

@@ -17,20 +71,24 @@ Need stronger isolation? → ContainerExecutor
- Filesystem and network isolation
- Requires Docker

Want sandboxing without Docker (and can accept Pyodide limitations)? → DenoSandboxExecutor (experimental)
- WASM-based Python runtime + Deno permission model
- Network and filesystem sandboxing via Deno permissions

Need maximum speed AND trust the code completely? → InProcessExecutor
- No isolation (runs in your process)
- Only for trusted code you control
```

| Requirement | Subprocess | Container | InProcess |
|-------------|------------|-----------|-----------|
| **Recommended for most users** | **Yes** | | |
| Process isolation | Yes | Yes | No |
| Crash recovery | Yes | Yes | No |
| Container isolation | No | Yes | No |
| No Docker required | Yes | No | Yes |
| Resource limits | Partial | Full | No |
| Untrusted code | No | Yes | No |
| Requirement | Subprocess | Container | DenoSandbox | InProcess |
|-------------|------------|-----------|------------|-----------|
| **Recommended for most users** | **Yes** | | | |
| Process isolation | Yes | Yes | Yes | No |
| Crash recovery | Yes | Yes | Yes | No |
| Container isolation | No | Yes | No | No |
| No Docker required | Yes | No | Yes | Yes |
| Resource limits | Partial | Full | Partial | No |
| Untrusted code | No | Yes | Yes (experimental) | No |

---

34 changes: 31 additions & 3 deletions docs/tools.md
@@ -8,6 +8,23 @@ Tools wrap external capabilities as callable functions. Three adapter types supp

Define command-line tools with YAML schema + recipes.

## Tool Middleware (Experimental)

py-code-mode supports a host-side middleware chain around tool execution. This is intended for:
- Audit logging and metrics
- Allow/deny decisions and interactive approvals
- Argument rewriting, retries, caching, etc.

Notes:
- Middleware runs where tools execute (host-side ToolAdapters).
- Enforcement guarantees are strongest with `DenoSandboxExecutor` because sandboxed Python can only access tools via RPC back to the host.

API surface:
- `ToolMiddleware`: `async def __call__(ctx: ToolCallContext, call_next) -> Any`
- `ToolCallContext`: includes `tool_name`, `callable_name`, `args`, and metadata like `executor_type`, `origin`, `request_id`.

To enable for `DenoSandboxExecutor`, pass `tool_middlewares` in `DenoSandboxConfig`.
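
A minimal audit-logging sketch against the surface described above. The context field names come from the list above; whether `call_next` takes the context argument, and the exact `DenoSandboxConfig` constructor, are assumptions for illustration:

```python
import logging
from pathlib import Path
from typing import Any

from py_code_mode.execution import DenoSandboxConfig

logger = logging.getLogger("tool-audit")

class AuditMiddleware:
    """Log every host-side tool call before and after it runs."""

    async def __call__(self, ctx, call_next) -> Any:
        logger.info("tool call: %s.%s args=%r origin=%s",
                    ctx.tool_name, ctx.callable_name, ctx.args, ctx.origin)
        result = await call_next(ctx)  # assumed to forward the context
        logger.info("tool done: %s.%s", ctx.tool_name, ctx.callable_name)
        return result

config = DenoSandboxConfig(
    tools_path=Path("./tools"),
    tool_middlewares=[AuditMiddleware()],
)
```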

### Schema Definition

```yaml
@@ -89,6 +106,13 @@ recipes:

### Agent Usage

Tool calls inside `Session.run()` are **synchronous** in the default executors (Subprocess/Container/InProcess).

Notes:
- If you need async tool calls in Python code, use `call_async(...)` explicitly.
- In `DenoSandboxExecutor`, tool calls are **async-first** and you must use `await tools.*`.
- In `DenoSandboxExecutor`, tool calls execute **outside** the sandbox: `await tools.*` is an RPC back to the host Python process, and the tool runs with host permissions (or container permissions if the tool adapter/executor is containerized).

```python
# Recipe invocation (recommended)
tools.curl.get(url="https://api.github.com/repos/owner/repo")
@@ -103,9 +127,13 @@ tools.curl(
)

# Discovery
tools.list() # All tools
tools.search("http") # Search by name/description/tags
tools.curl.list() # Recipes for a specific tool
tools.list() # All tools
tools.search("http") # Search by name/description/tags
tools.curl.list() # Recipes for a specific tool

# Explicit async (if you need it)
await tools.curl.call_async(url="https://example.com")
await tools.curl.get.call_async(url="https://example.com")
```

## MCP Tools
21 changes: 12 additions & 9 deletions docs/workflows.md
@@ -10,13 +10,14 @@ Over time, the workflow library grows. Simple workflows become building blocks f

## Creating Workflows

Workflows are async Python functions with an `async def run()` entry point:
Workflows are Python functions with a `run()` entry point. Both `def run(...)` and
`async def run(...)` are supported:

```python
# workflows/fetch_json.py
"""Fetch and parse JSON from a URL."""

async def run(url: str, headers: dict = None) -> dict:
def run(url: str, headers: dict = None) -> dict:
"""Fetch JSON data from a URL.

Args:
Expand All @@ -37,7 +38,7 @@ async def run(url: str, headers: dict = None) -> dict:
raise RuntimeError(f"Invalid JSON from {url}: {e}") from e
```

> **Note:** All workflows must use `async def run()`. Synchronous `def run()` is not supported.
> **Note:** If your workflow uses `async def run(...)`, it can still call tools/workflows/artifacts synchronously.
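
For example, a small sketch of an async workflow that still calls a tool synchronously (the `curl` tool mirrors the examples below; names are illustrative):

```python
# workflows/fetch_length.py
async def run(url: str) -> int:
    """Fetch a URL and return the length of the response body."""
    # Tool calls stay synchronous even inside an async workflow.
    response = tools.curl.get(url=url)
    return len(str(response))
```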

### Runtime Creation

@@ -46,7 +47,7 @@
```python
workflows.create(
name="fetch_json",
source='''async def run(url: str) -> dict:
source='''def run(url: str) -> dict:
"""Fetch and parse JSON from a URL."""
import json
response = tools.curl.get(url=url)
@@ -96,7 +97,7 @@ Workflows can invoke other workflows, enabling layered workflows:

```python
# workflows/fetch_json.py
async def run(url: str) -> dict:
def run(url: str) -> dict:
"""Fetch and parse JSON from a URL."""
import json
response = tools.curl.get(url=url)
@@ -107,11 +108,13 @@ async def run(url: str) -> dict:

```python
# workflows/get_repo_metadata.py
async def run(owner: str, repo: str) -> dict:
def run(owner: str, repo: str) -> dict:
"""Get GitHub repository metadata."""
# Uses the fetch_json workflow
data = workflows.invoke("fetch_json",
url=f"https://api.github.com/repos/{owner}/{repo}")
data = workflows.invoke(
"fetch_json",
url=f"https://api.github.com/repos/{owner}/{repo}",
)

return {
"name": data["name"],
@@ -125,7 +128,7 @@ async def run(owner: str, repo: str) -> dict:

```python
# workflows/analyze_multiple_repos.py
async def run(repos: list) -> dict:
def run(repos: list) -> dict:
"""Analyze multiple GitHub repositories."""
summaries = []
for repo in repos:
5 changes: 5 additions & 0 deletions examples/deno-sandbox/.gitignore
@@ -0,0 +1,5 @@
data/
.deno-cache/
__pycache__/
.pytest_cache/
