v0.23.0

@mlikasam-askui mlikasam-askui released this 22 Jan 10:26
39ec572

Release Notes - v0.23.0

🎉 Overview

This release introduces a major overhaul of the prompting system, a new Tool Store for extending agent capabilities, automatic AgentOS injection, and numerous improvements and bug fixes. With 106 files changed, 5,972 additions, and 1,767 deletions, this is one of the most significant updates to the Vision Agent.

✨ New Features

Advanced Prompting System

A completely redesigned prompting paradigm with a structured system prompt architecture built from up to six parts:

  • System Capabilities: Defines what the agent can do and how it should behave
  • Device Information: Provides platform-specific context (desktop, mobile, web)
  • UI Information: Custom information about your specific UI (strongly recommended)
  • Report Format: Specifies how to format execution results
  • Cache Use: New optional prompt part that specifies when and how the agent should use cache files
  • Additional Rules: Optional special handling for edge cases or known issues

The new system provides better structure, flexibility, and control over agent behavior. A comprehensive prompting guide has been added to help you create effective custom prompts.

Breaking Change: System prompts passed as strings will now show a deprecation warning. Use ActSystemPrompt instead.
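
As a rough illustration of the structure above, the sketch below models the six prompt parts as a plain dataclass and shows how a plain-string prompt could trigger a deprecation warning. PromptParts, render, and normalize_prompt are hypothetical names invented for this example. They are not the askui ActSystemPrompt API, whose fields and import path are not shown in these notes.

```python
import warnings
from dataclasses import dataclass
from typing import Union


@dataclass
class PromptParts:
    """Hypothetical stand-in for a structured system prompt (not the askui class)."""

    system_capabilities: str = ""
    device_information: str = ""
    ui_information: str = ""
    report_format: str = ""
    cache_use: str = ""         # optional part
    additional_rules: str = ""  # optional part

    def render(self) -> str:
        """Join the non-empty parts into one prompt, one titled section per part."""
        sections = [
            ("System Capabilities", self.system_capabilities),
            ("Device Information", self.device_information),
            ("UI Information", self.ui_information),
            ("Report Format", self.report_format),
            ("Cache Use", self.cache_use),
            ("Additional Rules", self.additional_rules),
        ]
        return "\n\n".join(f"## {title}\n{body}" for title, body in sections if body)


def normalize_prompt(prompt: Union[str, PromptParts]) -> str:
    """Accept both forms, warning when the legacy plain-string form is used."""
    if isinstance(prompt, str):
        warnings.warn(
            "Plain-string system prompts are deprecated; use a structured prompt.",
            DeprecationWarning,
            stacklevel=2,
        )
        return prompt
    return prompt.render()
```

Empty parts are skipped when rendering, which mirrors the idea that Cache Use and Additional Rules can be left out.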

Tool Store

Introducing a new Tool Store that provides optional, extensible tools organized by category:

  • Universal Tools (askui.tools.store.universal): Work with any agent type

    • ListFilesTool: List files in a directory
    • ReadFromFileTool: Read content from files
    • WriteToFileTool: Write content to files
    • PrintToConsoleTool: Print messages to console during execution
  • Computer Tools (askui.tools.store.computer): Require ComputerAgentOs

    • ComputerSaveScreenshotTool: Save screenshots during execution
    • Experimental window management tools:
      • AddWindowAsVirtualDisplayTool
      • ListProcessWindowsTool
      • ListProcessTool
      • SetProcessInFocusTool
      • SetWindowInFocusTool
  • Android Tools (askui.tools.store.android): Require AndroidAgentOs

    • AndroidSaveScreenshotTool: Save screenshots from Android devices

Tools can be passed to agent.act() or to the agent constructor as act_tools for persistent availability.
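
A minimal sketch of the per-call variant, for cases where a tool is only needed for a single task. It assumes that agent.act() accepts the tools through a keyword argument named tools; that parameter name is an assumption and is not confirmed by these notes. The function name is hypothetical.

```python
def act_with_per_call_tools() -> None:
    """Pass a Tool Store tool for one act() call only (sketch, not verified)."""
    # Imports are kept inside the function so nothing runs at import time.
    from askui import VisionAgent
    from askui.tools.store.universal import PrintToConsoleTool

    with VisionAgent() as agent:
        # Assumption: the keyword argument is called `tools`.
        agent.act(
            "Print the title of the active window to the console",
            tools=[PrintToConsoleTool()],
        )
```

Passing the same tools as act_tools to the constructor instead keeps them available for every act() call made by that agent.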

AgentOS Auto-Injection

Tools that require AgentOS (like computer or Android tools) now automatically receive the appropriate AgentOS instance. This simplifies tool usage and eliminates the need for manual AgentOS management in most cases.
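
To make the idea concrete, here is a toy sketch of how such injection could work in principle. Every name in it (FakeComputerAgentOs, Tool, SaveScreenshotTool, inject_agent_os, requires_agent_os) is hypothetical, and it does not reflect how askui actually implements auto-injection.

```python
from typing import List, Optional, Protocol


class AgentOs(Protocol):
    """Minimal interface a tool expects from an AgentOS (hypothetical)."""

    def screenshot(self) -> bytes: ...


class FakeComputerAgentOs:
    """Stand-in for a real AgentOS connection."""

    def screenshot(self) -> bytes:
        return b"fake-image-bytes"


class Tool:
    """Base class; plain tools need no AgentOS."""

    requires_agent_os = False


class SaveScreenshotTool(Tool):
    """Tool that can only run once an AgentOS is available."""

    requires_agent_os = True

    def __init__(self, agent_os: Optional[AgentOs] = None) -> None:
        self.agent_os = agent_os

    def __call__(self) -> int:
        if self.agent_os is None:
            raise RuntimeError("AgentOS was not injected")
        # Return the screenshot size in bytes instead of writing a file.
        return len(self.agent_os.screenshot())


def inject_agent_os(tools: List[Tool], agent_os: AgentOs) -> None:
    """Give the agent's AgentOS to every tool that needs one and has none yet."""
    for tool in tools:
        if tool.requires_agent_os and getattr(tool, "agent_os", None) is None:
            setattr(tool, "agent_os", agent_os)
```

In this sketch an explicitly supplied AgentOS is never overwritten, which is one way to keep manual control available for the cases that still need it.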

Computer Tools Refactoring

Computer tools have been completely refactored and reorganized:

  • Tools are now properly modularized in askui.tools.computer
  • Removed deprecated AskUiComputerBaseTool
  • Improved separation of concerns and better code organization
  • New tools added:
    • GetSystemInfoTool: Retrieve system information
    • GetActiveProcessTool: Get information about the active process
    • ConnectTool and DisconnectTool: Manage computer connections in chat

Enhanced AgentOS Capabilities

  • Updated AgentOS JSON schema with expanded capabilities
  • New system information retrieval methods
  • Improved window management capabilities
  • Better error handling for gRPC invalid argument errors

SSL Verification Control

Added the ability to disable SSL verification for the user identification API via the ASKUI_HTTP_SSL_VERIFICATION environment variable. This is useful for development environments with self-signed certificates.

Note: SSL verification is enabled by default for security.
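
The sketch below shows one way a boolean environment flag with a secure default could be read. Only the variable name ASKUI_HTTP_SSL_VERIFICATION comes from these notes; the accepted values ("0", "false", "no", "off") and the helper ssl_verification_enabled are assumptions made for illustration, so check the documentation for the values askui actually accepts.

```python
import os
from typing import Mapping

ENV_VAR = "ASKUI_HTTP_SSL_VERIFICATION"
# Assumed spellings for "disabled"; the real parser may accept a different set.
_FALSE_VALUES = {"0", "false", "no", "off"}


def ssl_verification_enabled(env: Mapping[str, str] = os.environ) -> bool:
    """Return True unless the variable is explicitly set to a false-like value."""
    raw = env.get(ENV_VAR)
    if raw is None:
        return True  # secure default: verification stays on
    return raw.strip().lower() not in _FALSE_VALUES
```

Treating anything other than an explicit false-like value as "enabled" means a typo in the variable leaves verification on rather than silently turning it off.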

🔧 Improvements

Chat API Enhancements

  • Added computer connect/disconnect tools for chat interface
  • Improved chat history management
  • Enhanced MCP server integration for computer operations

Agent Improvements

  • Cleaned up agent and agent_base code
  • Fixed typechecking bugs
  • Improved reporter encoding handling
  • Added more reporter messages for better observability
  • Enhanced overlay support during e2e controller tests

Prompting Improvements

  • Refined how system prompts are provided
  • Introduced cache_use prompt part for better cache control
  • Improved prompt structure and organization
  • Better handling of prompt parts

Tool System Improvements

  • Better tool organization and categorization
  • Improved tool initialization and lifecycle management
  • Enhanced tool tagging system
  • Better support for tools with AgentOS requirements

🐛 Bug Fixes

  • Fixed issues with act prompts
  • Fixed reporter encoding problems
  • Fixed tool initialization bugs
  • Fixed typechecking issues in agent
  • Fixed linter issues across the codebase
  • Fixed typos in documentation and code

📚 Documentation

  • Added a comprehensive prompting guide for creating custom prompts

🔄 Code Quality

  • Extensive code cleanup and refactoring
  • Improved type hints and type safety
  • Better code organization and structure
  • Enhanced test coverage

📊 Statistics

  • 106 files changed
  • 5,972 lines added
  • 1,767 lines removed
  • Net change: +4,205 lines

⚠️ Breaking Changes

  1. System Prompt Format: System prompts should now use ActSystemPrompt instead of plain strings. Passing strings will show a deprecation warning.

  2. Computer Tools: The AskUiComputerBaseTool has been removed. Use tools from askui.tools.computer or askui.tools.store.computer instead.

  3. Tool Organization: Computer tools have been reorganized. If you import tools directly from askui.tools.computer, verify your import paths against the new module layout.

🚀 Migration Guide

Using the Tool Store

from askui import VisionAgent
from askui.tools.store.universal import PrintToConsoleTool, WriteToFileTool
from askui.tools.store.computer import ComputerSaveScreenshotTool

# Tools passed as act_tools stay available for every act() call of this agent.
with VisionAgent(act_tools=[
    PrintToConsoleTool(),
    WriteToFileTool(base_dir="./output"),  # files are written below ./output
    ComputerSaveScreenshotTool(base_dir="./screenshots"),  # AgentOS is injected automatically
]) as agent:
    agent.act("Take a screenshot and save it")

📝 Full Changelog

For a complete list of changes, see the git log.


Upgrade: pip install --upgrade askui

Documentation: docs.askui.com