llm-apple

LLM plugin for Apple Foundation Models (Apple Intelligence)

This plugin exposes Apple's on-device Foundation Models through the llm CLI tool and its Python API.

Requirements

  • macOS 26 with Apple Intelligence enabled in Settings
  • An Apple Silicon Mac
  • The llm CLI tool (installed in the step below if you don't already have it)

Installation

pip install llm # if llm is not already installed
llm install llm-apple

Usage

Basic usage (streaming is enabled by default):

llm -m apple "What is the capital of France?"

Without streaming:

llm -m apple "Tell me a story" --no-stream

With options:

llm -m apple "Write a poem" -o temperature 1.5 -o max_tokens 500

With system instructions:

llm -m apple "What is Python?" --system "You are a helpful programming tutor"

Conversations

The plugin supports conversations, maintaining context across multiple prompts:

# Start a conversation
llm -m apple "My name is Alice"

# Continue the most recent conversation with -c / --continue
llm -m apple -c "What is my name?"
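
The same context tracking is available from the Python API via llm's conversation support; a minimal sketch (the prompts are illustrative):

import llm

model = llm.get_model("apple")
conversation = model.conversation()

# Each prompt in the conversation sees the earlier exchanges
print(conversation.prompt("My name is Alice").text())
print(conversation.prompt("What is my name?").text())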

Tool Calling

The plugin supports tool calling, allowing the model to call Python functions to access real-time data, perform actions, or integrate with external systems.

CLI Tool Usage

You can define tools on the command line with the --functions option:

# Define a function inline
llm -m apple "What's the weather in Paris?" \
  --functions 'def get_weather(location: str) -> str:
    """Get the current weather for a location."""
    return f"Weather in {location}: 72°F, sunny"'

Or load functions from a Python file:

# Create a tools.py file
cat > tools.py << 'EOF'
def get_current_time() -> str:
    """Get the current time."""
    from datetime import datetime
    return datetime.now().strftime("%I:%M %p")

def get_weather(location: str) -> str:
    """Get weather for a location."""
    return f"Weather in {location}: 72°F, sunny"
EOF

# Use the functions from the file
llm -m apple "What time is it and what's the weather in Tokyo?" --functions tools.py

You can also use registered tool plugins with the -T or --tool flag (see the llm tools documentation for more details).
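
For example, recent llm releases ship a built-in llm_version tool; the command below is an illustration and assumes llm 0.26 or later:

llm -m apple "What version of llm is this?" -T llm_version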

Python API Tool Usage

import llm

def get_weather(location: str) -> str:
    """Get the current weather for a location."""
    # In a real implementation, this would call a weather API
    return f"Weather in {location}: 72°F, sunny"

model = llm.get_model("apple")
response = model.prompt(
    "What's the weather in San Francisco?",
    tools=[llm.Tool(
        name="get_weather",
        description="Get current weather for a location",
        implementation=get_weather
    )]
)
print(response.text())
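
llm can also run the tool loop for you: pass plain functions in tools and use model.chain(), which executes any tool calls and returns the final answer. A minimal sketch, assuming llm 0.26 or later:

import llm

def get_weather(location: str) -> str:
    """Get the current weather for a location."""
    return f"Weather in {location}: 72°F, sunny"

model = llm.get_model("apple")

# chain() resolves tool calls automatically and returns the final response
chain_response = model.chain(
    "What's the weather in San Francisco?",
    tools=[get_weather],
)
print(chain_response.text())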

Tool Types Supported

Tools can have various parameter signatures:

No parameters:

def get_current_time() -> str:
    """Get the current time."""
    return "2:30 PM"

Single parameter:

def search_docs(query: str) -> str:
    """Search documentation."""
    return f"Results for: {query}"

Multiple parameters with mixed types:

def calculate(operation: str, x: int, y: int) -> str:
    """Perform a calculation."""
    ops = {"add": x + y, "multiply": x * y}
    return str(ops.get(operation, 0))

Optional parameters:

def get_temperature(city: str, units: str = "celsius") -> str:
    """Get temperature for a city."""
    return f"Temperature in {city}: 20°{units[0].upper()}"

Multiple Tools

You can register multiple tools in a single call:

def get_time() -> str:
    """Get the current time."""
    return "2:30 PM"

def get_date() -> str:
    """Get the current date."""
    return "November 7, 2024"

def get_weather(location: str) -> str:
    """Get weather for a location."""
    return f"Weather in {location}: 72°F, sunny"

tools = [
    llm.Tool(name="get_time", description="Get current time", implementation=get_time),
    llm.Tool(name="get_date", description="Get current date", implementation=get_date),
    llm.Tool(name="get_weather", description="Get weather", implementation=get_weather),
]

response = model.prompt(
    "What's the date, time, and weather in Paris?",
    tools=tools
)

The model will automatically select and call the appropriate tools based on the prompt.

Structured Output (Schemas)

The plugin supports structured output using JSON schemas, allowing you to get consistently formatted responses from the model.

CLI Schema Usage

Define the structure you want using JSON schema syntax:

# Simple person extraction
llm -m apple "Alice is 28 and lives in Paris" \
  --schema '{"type": "object", "properties": {"name": {"type": "string"}, "age": {"type": "integer"}, "city": {"type": "string"}}, "required": ["name", "age", "city"]}'

Or use the concise schema syntax:

# Same as above but simpler
llm -m apple "Alice is 28 and lives in Paris" --schema 'name, age int, city'

Extract multiple entities:

llm -m apple "Bob is 35 from London and Carol is 42 from Tokyo" \
  --schema 'items: [{"name": "string", "age": "integer", "city": "string"}]'

Save schemas for reuse:

# Save a schema
llm -m apple "Alice, 28, Paris" --schema 'name, age int, city' --save person-schema

# Reuse it
llm -m apple "Bob is 35 and lives in London" --schema t:person-schema

Python API Schema Usage

import llm

model = llm.get_model("apple")

schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "age": {"type": "integer"},
        "city": {"type": "string"}
    },
    "required": ["name", "age", "city"]
}

response = model.prompt(
    "Extract person info: Alice is 28 and lives in Paris",
    schema=schema
)

# Response will be JSON-formatted
print(response.text())  # {"name": "Alice", "age": 28, "city": "Paris"}
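
Because the response body is plain JSON, it can be parsed with the standard library:

import json

# Continuing from the example above
person = json.loads(response.text())
print(person["name"], person["age"], person["city"])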

Important Notes

  • Streaming is disabled automatically: when you provide a schema, the plugin turns off streaming because structured output requires the complete response. You do not need to pass stream=False yourself.

Available Options

  • temperature (float, 0.0-2.0, default: 1.0): Controls randomness in generation
    • 0.0 = deterministic
    • 2.0 = very random
  • max_tokens (int, default: 1024): Maximum tokens to generate

System prompts can be provided using llm's built-in --system or -s flag.
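
From the Python API, the same options can be passed as keyword arguments to prompt(), alongside a system prompt; a minimal sketch (the values are illustrative):

import llm

model = llm.get_model("apple")

response = model.prompt(
    "Explain quantum computing",
    system="You are a concise technical writer",
    temperature=0.3,
    max_tokens=200,
)
print(response.text())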

Availability

The plugin checks Apple Intelligence availability on startup. If Apple Intelligence is not available, you'll see an error message explaining why.

Common reasons:

  • Device not eligible (requires Apple Silicon)
  • Apple Intelligence not enabled in Settings
  • Model not ready (downloading or initializing)
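
You can also confirm that the plugin is installed and the model is registered using llm's built-in commands:

# List installed plugins (llm-apple should appear)
llm plugins

# List registered models (look for the "apple" model)
llm models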

Examples

Creative writing with higher temperature:

llm -m apple "Write a creative story about a robot" -o temperature 1.8

Factual query with lower temperature:

llm -m apple "Explain quantum computing" -o temperature 0.3

With system prompt for career guidance:

llm -m apple "Should I learn Python or JavaScript?" \
  --system "You are a career counselor specializing in tech"

Development

Running Tests

# Run all tests (unit tests with mocks)
uv run pytest

# Run tests with coverage
uv run pytest --cov=llm_apple --cov-report=html --cov-report=term

# Run integration tests (requires Apple Intelligence)
uv run pytest tests/test_integration_tools.py -v -s

Most tests use mocks to simulate the Apple Foundation Models API, so they can run on any platform without requiring actual Apple Intelligence hardware.

Integration tests in tests/test_integration_tools.py require Apple Intelligence to be available and will be automatically skipped if not present.
