feat: add shell tool #1107
Open: akihikokuroda wants to merge 22 commits into generative-computing:main from akihikokuroda:shell
+3,764 −0
2e49ae2 shell executor (akihikokuroda)
de4775c shell executor (akihikokuroda)
11ed4f4 update an example (akihikokuroda)
d6e53c2 review comments (akihikokuroda)
6994227 review comments (akihikokuroda)
06390ef review comments (akihikokuroda)
9ec3dba fix CI issue (akihikokuroda)
47d56ec review comments (akihikokuroda)
9573819 review comments (akihikokuroda)
86c704a review comment (akihikokuroda)
1b401f8 review comment (akihikokuroda)
e5f1208 review comment (akihikokuroda)
ae35cfb review comments (akihikokuroda)
e3300f0 Local by Default + Opt-In Sandboxing (akihikokuroda)
3f2d61c bash guardrails (akihikokuroda)
fd95ed5 add bash audit (akihikokuroda)
414701c remove unsafe_local_bash_executor() refernces (akihikokuroda)
7ed72b2 remove sandbox=true (akihikokuroda)
3f9da6e review comments (akihikokuroda)
c9494f7 rebase main (akihikokuroda)
66cac4c review comments (akihikokuroda)
38099e8 review comment (akihikokuroda)
@@ -0,0 +1,237 @@
# pytest: e2e, ollama, qualitative
"""Example usage patterns for bash_executor tool.

Demonstrates multiple ways to use Mellea's bash execution capabilities:
1. Direct execution for local commands
2. Wrapping as a MelleaTool for agent use
3. LLM-based tool calling with forced tool use
4. Working directory and path restrictions
5. Demonstrate safety features
6. Integration with error handling

⚠️ Security note: bash_executor runs commands locally with a conservative
safety denylist (recommended for typical agentic workflows). The denylist
enforces: no sudo, no rm -rf, no destructive git operations, no writes to
/etc, /sys, /proc, etc. Write operations can also be constrained with
``working_dir`` and explicit ``allowed_paths``.

For higher isolation requirements (untrusted code, security research),
provide isolation at the application layer (containers, VMs).

Note: Commands must use argv-friendly syntax (no pipes, redirects, or shell builtins).
Use individual commands and compose them in Python instead.
"""

from mellea import MelleaSession, start_session
from mellea.backends import ModelOption
from mellea.backends.tools import MelleaTool
from mellea.stdlib.requirements import uses_tool
from mellea.stdlib.tools.shell import bash_executor


def example_1_direct_execution() -> None:
    """Example 1: Execute bash commands locally (default)."""
    print("=== Example 1: Local Execution (Default) ===")

    # Execute a simple command locally
    result = bash_executor("echo 'Hello from Bash'")
    print("Command: echo 'Hello from Bash'")
    print(f"Success: {result.success}")
    print(f"Output: {result.stdout}")
    print()

    # Execute a command to list files (no pipes/redirects)
    result = bash_executor("ls -la")
    print("Command: ls -la")
    print(f"Success: {result.success}")
    if result.stdout:
        # Show first few lines
        lines = result.stdout.split("\n")[:3]
        print("Output (first 3 lines):\n" + "\n".join(lines))
    print()

    # Demonstrate that pipes are blocked (for security)
    result = bash_executor("ls -la | wc -l")
    print("Command: ls -la | wc -l (pipe operator blocked)")
    print(f"Rejected: {result.skipped}")
    print(f"Reason: {result.skip_message}")
    print()

    # Attempt a dangerous command (will be rejected)
    result = bash_executor("sudo echo unsafe")
    print("Command: sudo echo unsafe")
    print(f"Skipped: {result.skipped}")
    print(f"Reason: {result.skip_message}")
    print()


def example_2_wrapped_as_tool() -> None:
    """Example 2: Wrap bash executor as a MelleaTool for LLM use."""
    print("=== Example 2: Wrapped as MelleaTool ===")

    # Create tool from bash executor (local execution by default)
    bash_tool = MelleaTool.from_callable(bash_executor)
    print(f"Tool name: {bash_tool.name}")
    print(f"Tool schema keys: {bash_tool.as_json_tool.keys()}")
    print()

    # Invoke the tool directly (normally LLM would call this)
    result = bash_tool.run("pwd")
    print("Tool invocation result:")
    print(f" Success: {result.success}")
    print(f" Output: {result.stdout}")
    print()


def example_3_llm_with_forced_tool_use(m: MelleaSession) -> None:
    """Example 3: LLM generates bash commands with forced tool use (requires Ollama).

    This mirrors the Python interpreter pattern: ask the LLM to generate
    a bash command, force it to use the tool, then execute the command.

    Requirements:
    - Ollama running locally (or compatible LLM configured)
    - Run: ollama serve
    """
    print("=== Example 3: LLM-Generated Bash Commands with Forced Tool Use ===")

    result = m.instruct(
        description="Use bash to find Python files in the current directory. "
        "Generate a single command using find or ls (no pipes, redirects, or shell operators allowed).",
        requirements=[uses_tool(bash_executor)],
        model_options={ModelOption.TOOLS: [MelleaTool.from_callable(bash_executor)]},
        tool_calls=True,
    )

    if result.tool_calls is None:
        raise ValueError("Expected tool_calls but got None")

    if "bash_executor" not in result.tool_calls:
        available_tools = list(result.tool_calls.keys())
        raise ValueError(
            f"Expected tool 'bash_executor' in tool_calls, but got: {available_tools}"
        )

    # Extract the bash command the LLM generated
    tool_call = result.tool_calls["bash_executor"]
    if "command" not in tool_call.args:
        raise ValueError(
            f"Expected 'command' argument in tool call args, "
            f"but got: {list(tool_call.args.keys())}"
        )

    command = tool_call.args["command"]
    print(f"LLM generated bash command:\n {command}\n")

    # Execute the command
    exec_result = tool_call.call_func()

    print("Execution result:")
    print(f" Success: {exec_result.success}")
    print(f" Skipped: {exec_result.skipped}")
    if exec_result.skip_message:
        print(f" Skip reason: {exec_result.skip_message}")
    print(f" Output: {exec_result.stdout}")
    if exec_result.stderr:
        print(f" Error: {exec_result.stderr}")
    print()


def example_4_with_working_dir() -> None:
    """Example 4: Restrict write validation and execution cwd to a directory."""
    print("=== Example 4: Working Directory Restriction ===")

    import os
    import tempfile

    with tempfile.TemporaryDirectory() as tmpdir:
        print(f"Working directory: {tmpdir}")

        # Create a file using touch within the working directory (redirects blocked)
        result = bash_executor("touch myfile.txt", working_dir=tmpdir)
        print(f"Command: touch myfile.txt (relative path, executed in {tmpdir})")
        print(f"Success: {result.success}")
        print()

        # Verify the file was created
        file_path = os.path.join(tmpdir, "myfile.txt")
        if os.path.exists(file_path):
            print(f"✓ File created at: {file_path}")
        print()

        # Read it back
        result = bash_executor("cat myfile.txt", working_dir=tmpdir)
        print("Command: cat myfile.txt")
        print(f"Output: {result.stdout}")
        print()

        # Writing to /tmp is always allowed (temp directory exception)
        result = bash_executor("touch /tmp/tmpfile.txt", working_dir=tmpdir)
        print(f"Command: touch /tmp/tmpfile.txt (with working_dir={tmpdir})")
        print(f"Success: {result.success} (note: /tmp is always allowed)")
        print()

        # Attempt to write to system paths (will be rejected)
        result = bash_executor("touch /etc/config.txt", working_dir=tmpdir)
        print(f"Command: touch /etc/config.txt (with working_dir={tmpdir})")
        print(f"Rejected: {result.skipped}")
        print(f"Reason: {result.skip_message}")
        print()


def example_5_safety_features() -> None:
    """Example 5: Demonstrate safety features."""
    print("=== Example 5: Safety Features ===")

    dangerous_commands = [
        ("rm -rf /home", "Recursive force delete"),
        ("git push --force", "Force git push"),
        ("sudo whoami", "Privilege escalation"),
        ("bash -i", "Interactive shell"),
        ("touch /etc/config", "Write to system path"),
    ]

    for cmd, description in dangerous_commands:
        result = bash_executor(cmd)
        print(f"{description}: {cmd}")
        print(f" Rejected: {result.skipped}")
        print(f" Reason: {result.skip_message}")
        print()


def example_6_error_handling() -> None:
    """Example 6: Handle execution errors gracefully."""
    print("=== Example 6: Error Handling ===")

    # Command that fails (returns non-zero exit code)
    result = bash_executor("false")
    print("Command: false (POSIX command that returns exit code 1)")
    print(f"Success: {result.success}")
    print(f"Return code indicates failure: {not result.success}")
    print()

    # Command that doesn't exist
    result = bash_executor("nonexistent_command_xyz")
    print("Command: nonexistent_command_xyz")
    print(f"Success: {result.success}")
    if not result.success and result.stderr is not None:
        print(f"Error output: {result.stderr[:100]}")
    print()


if __name__ == "__main__":
    example_1_direct_execution()
    example_2_wrapped_as_tool()

    # Example 3: Run with LLM-based tool calling (requires Ollama or compatible LLM)
    try:
        m = start_session()
        example_3_llm_with_forced_tool_use(m)
    except Exception as e:
        print(f"Example 3 skipped: {e!s}")
        print(" Requires: Ollama running locally or compatible LLM configured")
        print(" See: https://docs.ollama.ai/")

    example_4_with_working_dir()
    example_5_safety_features()
    example_6_error_handling()
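The module docstring says to run individual commands and compose them in Python rather than with pipes, but the examples only show the pipe being rejected. A minimal sketch of that composition follows, using only the standard library so it runs without Mellea. `run_argv`, `count_entries`, and `SHELL_OPERATORS` are hypothetical names introduced for illustration; the operator check is deliberately crude and is not the denylist from this PR.

```python
import shlex
import subprocess
from typing import Optional

# Hypothetical, deliberately crude operator check; not Mellea's actual denylist.
# It also rejects quoted arguments that merely contain one of these characters.
SHELL_OPERATORS = ("|", ">", "<", ";", "&")


def run_argv(command: str, cwd: Optional[str] = None) -> str:
    """Run one command without a shell and return its stdout."""
    if any(op in command for op in SHELL_OPERATORS):
        raise ValueError(f"shell operators are not supported: {command!r}")
    completed = subprocess.run(
        shlex.split(command),  # argv list, so no shell ever parses the string
        cwd=cwd,
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return completed.stdout


def count_entries(directory: str) -> int:
    """Python-side equivalent of `ls -1 -A | wc -l`."""
    listing = run_argv("ls -1 -A", cwd=directory)
    return len(listing.splitlines())
```

With `bash_executor`, the same pattern would presumably be a single `ls` call whose `result.stdout` is split and counted in Python, instead of piping into `wc -l`.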
Verified fixed: renamed to `example_4_with_working_dir` with correct numbering; `__main__` calls the LLM example. LGTM.
Would you check this comment? There may be some confusion.
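For readers following the `example_4_with_working_dir` thread: that example exercises three write-validation outcomes (a relative path inside `working_dir` is accepted, `/tmp` is accepted, `/etc` is rejected). The sketch below shows one way such a check could be expressed with `pathlib`. It is not the implementation in this PR: `is_write_allowed` is a hypothetical helper, and treating `tempfile.gettempdir()` as the always-allowed temp location is an assumption.

```python
import tempfile
from pathlib import Path
from typing import Iterable


def is_write_allowed(
    target: str, working_dir: str, allowed_paths: Iterable[str] = ()
) -> bool:
    """Return True if target resolves inside an allowed root (hypothetical check)."""
    base = Path(working_dir).resolve()
    # Relative targets are interpreted against working_dir, like a command's cwd;
    # an absolute target replaces base entirely when joined.
    resolved = (base / target).resolve()
    # Assumption: the system temp directory is always writable, mirroring the
    # "/tmp is always allowed" note in Example 4.
    roots = [base, Path(tempfile.gettempdir()).resolve()]
    roots.extend(Path(p).resolve() for p in allowed_paths)
    return any(resolved == root or root in resolved.parents for root in roots)
```

Resolving before comparing is the design point: it collapses `..` segments and symlinks, so a target such as `../other/notes.txt` is judged by where it actually lands rather than by its spelling.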