feat (mcp server): Add sandbox MCP server with secure Python code execution by florenzi002 · Pull Request #187 · IBM/AssetOpsBench

florenzi002 · 2026-03-03T16:10:35Z

Adds a sandbox for code execution in a secure Python container via MCP (#166).
The main contribution is in mcp/servers/sandbox.

Downgrades the required Python version to >=3.12 for pydantic compatibility.

Conceptually, what this PR proposes is largely the same as what the mcp-sandbox utility provides. It exposes a couple of tools to run Python code, passed either as a string or as a file, in a dedicated, lightweight, and secure container.

This PR works by registering the sandbox tools as a top-level MCP server that all the other tools, servers, or agents can reach, including the planner executor if needed.
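To make the "run a string in a throwaway container" idea concrete, here is a minimal sketch of how such a tool might assemble its container command. It is not the code in mcp/servers/sandbox: the function name `build_sandbox_command`, the `python:3.12-slim` image, and the 256m memory cap are assumptions chosen for illustration. The flags themselves (`--rm`, `--network none`, `--memory`) are accepted by both `docker run` and `podman run`.

```python
from typing import List


def build_sandbox_command(runtime: str, image: str, code: str) -> List[str]:
    """Return an argv list that runs `code` in a throwaway container.

    Hypothetical helper: name, image and limits are illustrative only.
    """
    if runtime not in ("docker", "podman"):
        raise ValueError(f"unsupported runtime: {runtime}")
    return [
        runtime, "run",
        "--rm",               # delete the container once the process exits
        "--network", "none",  # no network access from inside the sandbox
        "--memory", "256m",   # memory cap; the value is an arbitrary example
        image,
        "python", "-c", code,  # the code string is passed as one argument
    ]


cmd = build_sandbox_command("podman", "python:3.12-slim", "print(1 + 1)")
print(" ".join(cmd[:3]))  # podman run --rm
```

Passing the command as an argv list (for example to `subprocess.run`) rather than as a shell string avoids shell quoting issues with the user-supplied code.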

DhavalRepo18 · 2026-03-03T17:04:27Z

@florenzi002, in my understanding, if there is an existing MCP server, do I need to write the code again, or is there an economical way of just registering it into the current ecosystem?

I was originally under the impression that existing MCP servers stay where they are and only the MCP client gets registered, so we would not need to write any code.

florenzi002 · 2026-03-04T09:55:43Z

> @florenzi002, in my understanding, if there is an existing MCP server, do I need to write the code again, or is there an economical way of just registering it into the current ecosystem?
>
> I was originally under the impression that existing MCP servers stay where they are and only the MCP client gets registered, so we would not need to write any code.

My understanding is that #166 was about registering a new MCP server to run code. This PR addresses the following point in #166:

> Given that we are rebasing AssetOpsBench in the MCP-compliant environment, we would like to provide a Sandbox as part of the AssetOpsBench MCP ecosystem.

All the other servers stay the same; this is just an additional one registered alongside the others (e.g., the utility server).

DhavalRepo18 · 2026-03-05T03:20:10Z

@florenzi002 I discussed this with @ShuxinLin, and I will be primarily reviewing this PR.

DhavalRepo18 · 2026-03-05T03:22:43Z

@florenzi002 - Any further comments on what makes the existing MCP-Sandbox tough will be highly valuable.

#166 (comment)

florenzi002 · 2026-03-05T09:24:03Z

> @florenzi002 - Any further comments on what makes the existing MCP-Sandbox tough will be highly valuable.
>
> #166 (comment)

@DhavalRepo18

I've found that sandbox-mcp is primarily a Go utility, which means installing a whole Go toolchain for a single dependency. It also offers no way to install only a subset of the sandboxes, so it always installs about 6GB of them, some of which we would probably never use; I think a Python-only sandbox is all we need for a start. Furthermore, AssetOpsBench can currently be used with either a Docker or a Podman backend, while the proposed library works with Docker only because of some hardcoded paths in its source code, so we would lose that flexibility. This matters because, although Docker offers a free tier, it is a commercial product, and part of the open source community is moving to (or is required by their institutions to use) open source alternatives like Podman. Finally, development of sandbox-mcp seems to have stalled in May 2025, so I think this small in-house alternative might prove more stable.

Conceptually, what this PR proposes is largely the same as what the library provides. Ultimately, it exposes a couple of tools to run Python code, passed either as a string or as a file, in a dedicated, lightweight, and secure container.

This PR works by registering the sandbox tools as a top-level MCP server that all the other tools, servers, or agents can reach, including the planner executor if needed.
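The Docker-or-Podman flexibility mentioned above can be kept with a small runtime lookup instead of hardcoded paths. The sketch below is an assumption about how that could look, not the code in this PR; `pick_runtime` and its `which` parameter are hypothetical names, and the lookup relies only on `shutil.which` from the standard library.

```python
import shutil
from typing import Callable, Optional, Sequence


def pick_runtime(
    candidates: Sequence[str] = ("podman", "docker"),
    which: Callable[[str], Optional[str]] = shutil.which,
) -> str:
    """Return the first container runtime found on PATH.

    `which` is injectable so the lookup can be exercised without
    having either runtime installed.
    """
    for name in candidates:
        if which(name):
            return name
    raise RuntimeError(f"none of {list(candidates)} found on PATH")


# Simulate a machine where only Docker is installed.
def fake_which(name: str) -> Optional[str]:
    return "/usr/bin/docker" if name == "docker" else None


print(pick_runtime(which=fake_which))  # docker
```

The candidate order is a preference, so a deployment that must avoid Docker could pass `candidates=("podman",)`.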

DhavalRepo18

- This PR needs test cases at the tool level.
- This PR also needs some example scenarios to be tested, where we typically download data from IoT and then perform the same data aggregation (obtain first-order statistics) using the Python code sandbox. An example query: "Give me the mean and max value of temperature for Chiller 6."
- The Docker image should also expose the available libraries as part of the MCP docstring; this will enable efficient coding on the LLM side.
- Version (typically we fix library versions to avoid a mismatch in APIs, etc.)
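For the example scenario in the second bullet, the code handed to the sandbox could be as small as the sketch below. The readings are made-up illustrative values, not data from the benchmark; in the real flow they would come from the IoT server. The row layout and the `summarize` helper are likewise assumptions.

```python
from statistics import mean
from typing import Dict, List

# Hypothetical readings for illustration only; not benchmark data.
readings: List[Dict] = [
    {"asset": "Chiller 6", "sensor": "temperature", "value": 6.0},
    {"asset": "Chiller 6", "sensor": "temperature", "value": 7.0},
    {"asset": "Chiller 6", "sensor": "temperature", "value": 8.0},
    {"asset": "Chiller 9", "sensor": "temperature", "value": 20.0},
]


def summarize(rows: List[Dict], asset: str, sensor: str) -> Dict[str, float]:
    """Return first-order statistics for one sensor of one asset."""
    values = [
        row["value"]
        for row in rows
        if row["asset"] == asset and row["sensor"] == sensor
    ]
    if not values:
        raise ValueError(f"no {sensor} readings for {asset}")
    return {"mean": mean(values), "max": max(values)}


print(summarize(readings, "Chiller 6", "temperature"))
# {'mean': 7.0, 'max': 8.0}
```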

DhavalRepo18 · 2026-03-11T12:46:29Z

One thing that came to mind is stateless versus stateful. Does this environment maintain state?
Storage mounting: data exchange will be via files most of the time. How is this storage maintained during agent execution?
Currently the script or program is passed as a string, but what happens if the LLM decides to execute an existing .py file? (A sandbox receives Python code and data.) Some design thinking should be done here.
As a shortcut, the docstring of the MCP tool can embed an example, but MCP also offers two additional primitives, Resources and Prompts.

florenzi002 · 2026-03-18T13:55:32Z

> There is one thing that came to my mind about stateless and stateful. Does this environment maintain the states?

No, it doesn't. Containers are ephemeral and stateless, and so are the alternative libraries. If state becomes important at some point, the container could return the result of the script plus a dump of the environment (e.g., variables).
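A rough sketch of the "result plus environment dump" idea follows. It is an assumption about one possible shape, not something this PR implements: `run_and_dump` is a hypothetical helper, it keeps only JSON-serializable variables, and it calls `exec` in-process purely for illustration. In the sandbox, the same logic would run inside the container.

```python
import contextlib
import io
import json
from typing import Any, Dict


def run_and_dump(code: str) -> Dict[str, Any]:
    """Run `code` in a fresh namespace; return stdout plus its variables."""
    namespace: Dict[str, Any] = {}
    stdout = io.StringIO()
    with contextlib.redirect_stdout(stdout):
        exec(code, namespace)  # illustration only: no isolation here
    variables = {}
    for name, value in namespace.items():
        if name.startswith("__"):
            continue  # skip __builtins__ and other dunder entries
        try:
            json.dumps(value)  # keep only values that can be serialized
        except (TypeError, ValueError):
            continue
        variables[name] = value
    return {"stdout": stdout.getvalue(), "variables": variables}


result = run_and_dump("x = 2\ny = x * 21\nprint(y)")
print(result)
# {'stdout': '42\n', 'variables': {'x': 2, 'y': 42}}
```

A caller that wants continuity between rounds could feed the returned variables back in as a preamble of the next script, which keeps the containers themselves stateless.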

> storage mounting, the data exchange will be via file, most of the time, how this storage is being maintained during agent execution

I think files produced by the agent during execution and needed for a particular coding round can be dynamically mounted in the container before running the code, as part of the MCP call, perhaps as b64-encoded strings or via some other network protocol. An alternative would be to mount persistent storage on the MCP server and let the agent upload files there for long-term storage, though that is more complex. Currently, when AssetOpsBench runs locally with both the MCP server and the agent on the same machine, the agent workspace can be mounted directly into the sandbox container, which solves the use case.
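The b64 option could look roughly like the sketch below: the caller encodes the files into the tool arguments, and the server writes them into a scratch directory that is then mounted into the container. `encode_payload` and `materialize` are hypothetical names, and the payload shape (file name mapped to a base64 string) is an assumption.

```python
import base64
import tempfile
from pathlib import Path
from typing import Dict, List


def encode_payload(files: Dict[str, bytes]) -> Dict[str, str]:
    """Client side: map file name to base64 text, safe to embed in JSON."""
    return {
        name: base64.b64encode(data).decode("ascii")
        for name, data in files.items()
    }


def materialize(payload: Dict[str, str], workdir: Path) -> List[Path]:
    """Server side: decode each entry into `workdir` before the run."""
    written = []
    for name, blob in payload.items():
        # Keep only the final path component so a name such as
        # "../../etc/passwd" cannot escape the scratch directory.
        target = workdir / Path(name).name
        target.write_bytes(base64.b64decode(blob))
        written.append(target)
    return written


with tempfile.TemporaryDirectory() as tmp:
    payload = encode_payload({"readings.csv": b"ts,value\n1,6.5\n"})
    paths = materialize(payload, Path(tmp))
    roundtrip = paths[0].read_text()

print(roundtrip)
```

Base64 inflates the payload by about a third, so this suits small per-round files; larger datasets are better served by the mounted-workspace approach described above.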

> Currently, the script or program will go maybe as a string, but what happens if LLM makes a decision on executing an existing .py file? (A sandbox receives Python code and data), Some design thinking should be done here.

It is very similar to the use case above, if I understand correctly.

> Ideally, for a shortcut, the doc string of the MCP tool is embedded with an example, but the MCP allows you to have an additional two Resources and Prompts.

Is the suggestion here to add a tool call / MCP call example in the docstring?
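If the suggestion is read that way, a tool docstring could carry both an example call and the pinned library list, so the LLM sees them in the tool description. The sketch below is only an illustration: `with_library_docs`, `run_python`, and the pins in `PINNED_LIBRARIES` are hypothetical and do not reflect what the sandbox image actually ships.

```python
from typing import Callable, Dict

# Example pins for illustration; not the versions used by this PR.
PINNED_LIBRARIES: Dict[str, str] = {"numpy": "2.1.0", "pandas": "2.2.3"}


def with_library_docs(libraries: Dict[str, str]) -> Callable:
    """Decorator that appends a pinned-library list to a docstring."""
    def decorator(func: Callable) -> Callable:
        lines = [
            f"  - {name}=={version}"
            for name, version in sorted(libraries.items())
        ]
        base = (func.__doc__ or "").rstrip()
        func.__doc__ = base + "\n\nAvailable libraries:\n" + "\n".join(lines)
        return func
    return decorator


@with_library_docs(PINNED_LIBRARIES)
def run_python(code: str) -> str:
    """Run Python code in the sandbox and return its stdout.

    Example: run_python("print(1 + 1)")
    """
    raise NotImplementedError("placeholder for the container call")


print(run_python.__doc__)
```

Generating the list from one dictionary keeps the docstring and the image's requirements from drifting apart, provided the same dictionary also drives the image build.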

DhavalRepo18 · 2026-03-18T14:07:03Z

@florenzi002 we would like to merge this PR early next week. At present we are running all the submitted code to reduce future work.

DhavalRepo18 · 2026-03-19T20:08:08Z

This one is now actively being reviewed. Please address the conflicts and the naming (we do not use the word Agent, if it appears anywhere).

DhavalRepo18 · 2026-03-19T22:33:08Z

> - This PR needs test cases at the tool level.
> - This PR also needs some example scenarios to be tested, where we typically download data from IoT and then perform the same data aggregation (obtain first-order statistics) using the Python code sandbox. An example query: "Give me the mean and max value of temperature for Chiller 6."
> - The Docker image should also expose the available libraries as part of the MCP docstring; this will enable efficient coding on the LLM side.
> - Version (typically we fix library versions to avoid a mismatch in APIs, etc.)

Please address the second bullet. We need one example that runs on the existing dataset (connecting IoT and Sandbox), and then add the instruction.MD just like the other examples.

Add sandbox MCP server with secure Python code execution

71970fb

florenzi002 requested review from DhavalRepo18 and ShuxinLin March 3, 2026 16:10

florenzi002 self-assigned this Mar 3, 2026

Merge remote-tracking branch 'origin/main' into code-sandbox

fb86e06

florenzi002 mentioned this pull request Mar 4, 2026

feat: add VibrationAgent MCP server for vibration analysis benchmarks #190

Open

DhavalRepo18 removed the request for review from ShuxinLin March 5, 2026 03:20

DhavalRepo18 added the In Review label Mar 5, 2026

DhavalRepo18 requested changes Mar 9, 2026

Fabio Lorenzi added 4 commits March 9, 2026 16:10

adds libraries to mcp docstring with fixed version

c4e02fd

fixes path directories

4408455

fixes path directories

497f086

adds tests

e16b20f

Merge remote-tracking branch 'origin/main' into code-sandbox

b645e14

florenzi002 mentioned this pull request Mar 18, 2026

Delete all unnecessary branches #204

Open

Fabio Lorenzi added 2 commits March 19, 2026 21:27

INSTRUCTIONs for mcp server

1846c56

merge main

88a8eca

DhavalRepo18 marked this pull request as ready for review March 19, 2026 22:02

florenzi002 changed the title ~~Add sandbox MCP server with secure Python code execution~~ feat (mcp server): Add sandbox MCP server with secure Python code execution Mar 20, 2026

florenzi002 wants to merge 9 commits into main from code-sandbox

2 participants
