LLM plugin for Apple Foundation Models (Apple Intelligence)
This plugin exposes Apple's on-device Foundation Models through the llm CLI tool.
Requirements:

- macOS 26.0 or later
- Apple Intelligence enabled
- Python 3.9 or later
- apple-foundation-models >= 0.2.0 installed
Installation:

```bash
pip install llm  # if llm is not already installed
llm install llm-apple
```
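To confirm the plugin registered its model, list the models llm knows about; an apple entry should appear (exact output varies with your installed plugins):

```bash
llm models
```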
llm -m apple "What is the capital of France?"Without streaming:
llm -m apple "Tell me a story" --no-streamWith options:
llm -m apple "Write a poem" -o temperature 1.5 -o max_tokens 500With system instructions:
llm -m apple "What is Python?" --system "You are a helpful programming tutor"The plugin supports conversations, maintaining context across multiple prompts:
```bash
# Start a conversation
llm -m apple "My name is Alice"

# Continue the most recent conversation
llm -m apple "What is my name?" -c
```
The plugin supports tool calling, allowing the model to call Python functions to access real-time data, perform actions, or integrate with external systems.

You can use tools from the command line using the --functions option:
```bash
# Define a function inline
llm -m apple "What's the weather in Paris?" \
  --functions 'def get_weather(location: str) -> str:
    """Get the current weather for a location."""
    return f"Weather in {location}: 72°F, sunny"'
```

Or load functions from a Python file:
```bash
# Create a tools.py file
cat > tools.py << 'EOF'
def get_current_time() -> str:
    """Get the current time."""
    from datetime import datetime
    return datetime.now().strftime("%I:%M %p")

def get_weather(location: str) -> str:
    """Get weather for a location."""
    return f"Weather in {location}: 72°F, sunny"
EOF

# Use the functions from the file
llm -m apple "What time is it and what's the weather in Tokyo?" --functions tools.py
```

You can also use registered tool plugins with the -T or --tool flag (see the llm tools documentation for more details), as shown below.
```python
import llm

def get_weather(location: str) -> str:
    """Get the current weather for a location."""
    # In a real implementation, this would call a weather API
    return f"Weather in {location}: 72°F, sunny"

model = llm.get_model("apple")
response = model.prompt(
    "What's the weather in San Francisco?",
    tools=[llm.Tool(
        name="get_weather",
        description="Get current weather for a location",
        implementation=get_weather
    )]
)
print(response.text())
```

Tools can have various parameter signatures:
No parameters:

```python
def get_current_time() -> str:
    """Get the current time."""
    return "2:30 PM"
```

Single parameter:

```python
def search_docs(query: str) -> str:
    """Search documentation."""
    return f"Results for: {query}"
```

Multiple parameters with mixed types:

```python
def calculate(operation: str, x: int, y: int) -> str:
    """Perform a calculation."""
    ops = {"add": x + y, "multiply": x * y}
    return str(ops.get(operation, 0))
```

Optional parameters:

```python
def get_temperature(city: str, units: str = "celsius") -> str:
    """Get temperature for a city."""
    return f"Temperature in {city}: 20°{units[0].upper()}"
```

You can register multiple tools in a single call:
```python
def get_time() -> str:
    """Get the current time."""
    return "2:30 PM"

def get_date() -> str:
    """Get the current date."""
    return "November 7, 2024"

def get_weather(location: str) -> str:
    """Get weather for a location."""
    return f"Weather in {location}: 72°F, sunny"

tools = [
    llm.Tool(name="get_time", description="Get current time", implementation=get_time),
    llm.Tool(name="get_date", description="Get current date", implementation=get_date),
    llm.Tool(name="get_weather", description="Get weather", implementation=get_weather),
]

response = model.prompt(
    "What's the date, time, and weather in Paris?",
    tools=tools
)
```

The model will automatically select and call the appropriate tools based on the prompt.
The plugin supports structured output using JSON schemas, allowing you to get consistently formatted responses from the model.
Define the structure you want using JSON schema syntax:
```bash
# Simple person extraction
llm -m apple "Alice is 28 and lives in Paris" \
  --schema '{"type": "object", "properties": {"name": {"type": "string"}, "age": {"type": "integer"}, "city": {"type": "string"}}, "required": ["name", "age", "city"]}'
```

Or use the concise schema syntax:

```bash
# Same as above but simpler
llm -m apple "Alice is 28 and lives in Paris" --schema 'name, age int, city'
```

Extract multiple entities with --schema-multi, which wraps the schema in an array of items:

```bash
llm -m apple "Bob is 35 from London and Carol is 42 from Tokyo" \
  --schema-multi 'name, age int, city'
```

Save schemas for reuse:

```bash
# Save a schema
llm -m apple "Alice, 28, Paris" --schema 'name, age int, city' --save person-schema

# Reuse it
llm -m apple "Bob is 35 and lives in London" --schema t:person-schema
```

Schemas can also be passed from the Python API:
model = llm.get_model("apple")
schema = {
"type": "object",
"properties": {
"name": {"type": "string"},
"age": {"type": "integer"},
"city": {"type": "string"}
},
"required": ["name", "age", "city"]
}
response = model.prompt(
"Extract person info: Alice is 28 and lives in Paris",
schema=schema
)
# Response will be JSON-formatted
print(response.text()) # {"name": "Alice", "age": 28, "city": "Paris"}- Streaming automatically disabled: When you provide a schema, streaming is automatically disabled since structured output requires the full response. You don't need to explicitly set
stream=False- it happens automatically.
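Because the response body is plain JSON, it can be parsed directly (a minimal sketch continuing the example above):

```python
import json

data = json.loads(response.text())
print(data["name"], data["age"], data["city"])  # Alice 28 Paris
```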
Available options:

- temperature (float, 0.0-2.0, default: 1.0): controls randomness in generation
  - 0.0 = deterministic
  - 2.0 = very random
- max_tokens (int, default: 1024): maximum number of tokens to generate
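In the Python API, these options follow llm's usual convention of being passed as keyword arguments to prompt() (a sketch; the option names are assumed to match the CLI flags above):

```python
import llm

model = llm.get_model("apple")

# Option names mirror the -o flags used on the command line
response = model.prompt(
    "Write a short poem about autumn",
    temperature=1.5,
    max_tokens=200,
)
print(response.text())
```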
System prompts can be provided using llm's built-in --system or -s flag.
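The Python API equivalent is the system= argument to prompt() (standard llm behavior, reusing the model object from the sketch above):

```python
# System prompts map to the system= argument in the Python API
response = model.prompt(
    "What is Python?",
    system="You are a helpful programming tutor",
)
print(response.text())
```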
The plugin checks Apple Intelligence availability on startup. If Apple Intelligence is not available, you'll see an error message explaining why.
Common reasons:
- Device not eligible (requires Apple Silicon)
- Apple Intelligence not enabled in Settings
- Model not ready (downloading or initializing)
Creative writing with higher temperature:

```bash
llm -m apple "Write a creative story about a robot" -o temperature 1.8
```

Factual query with lower temperature:

```bash
llm -m apple "Explain quantum computing" -o temperature 0.3
```

With system prompt for career guidance:

```bash
llm -m apple "Should I learn Python or JavaScript?" \
  --system "You are a career counselor specializing in tech"
```
Running the tests:

```bash
# Run all tests (unit tests with mocks)
uv run pytest

# Run tests with coverage
uv run pytest --cov=llm_apple --cov-report=html --cov-report=term

# Run integration tests (requires Apple Intelligence)
uv run pytest tests/test_integration_tools.py -v -s
```

Most tests use mocks to simulate the Apple Foundation Models API, so they can run on any platform without requiring actual Apple Intelligence hardware.
Integration tests in tests/test_integration_tools.py require Apple Intelligence to be available and will be automatically skipped if not present.
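How such a guard is commonly written (a hypothetical sketch; the real test file may detect availability differently):

```python
import pytest

def apple_intelligence_available() -> bool:
    """Best-effort availability probe; the module name is an assumption."""
    try:
        import apple_foundation_models  # assumed import name for apple-foundation-models
    except ImportError:
        return False
    return True

# Skip every test in the module when the runtime isn't available
pytestmark = pytest.mark.skipif(
    not apple_intelligence_available(),
    reason="Apple Intelligence not available",
)
```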