The production-grade GenAI metaframework built on Pydantic AI.
Copyright 2026 Firefly Software Foundation. Licensed under the Apache License 2.0.
- Why fireflyframework-agentic?
- Key Principles
- Architecture at a Glance
- Feature Highlights
- Requirements
- Installation
- 5-Minute Quick Start
- Using in Jupyter Notebooks
- Learn the Framework
- Development
- Contributing
- License
Pydantic AI provides an excellent foundation: type-safe, model-agnostic agents with structured output. But a production GenAI system demands far more than a single agent call. You need to orchestrate multi-step reasoning, validate and retry LLM outputs against schemas, manage conversation memory across turns, observe every call with traces and metrics, and run A/B experiments to compare models — all without coupling your domain logic to infrastructure concerns.
fireflyframework-agentic is the production framework built on top of Pydantic AI. It extends the engine with composable layers — from core configuration through agent management, intelligent reasoning, experimentation, and pipeline orchestration — so that every concern has a dedicated, protocol-driven module. You write your business logic; the framework provides the architecture.
What "metaframework" means in practice:
- You keep Pydantic AI's familiar `Agent`, `Tool`, and `RunContext` APIs unchanged.
- The framework wraps them with lifecycle hooks, registries, delegation routers, memory managers, reasoning patterns, validation loops, and DAG pipelines, all optional, all composable, and all swappable through Python protocols.
- No vendor lock-in: switch models, swap memory backends, or replace components without touching your agent code.
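The wrapping described above can be pictured as plain function composition. The following is a minimal sketch of that idea, not the framework's own `MiddlewareChain`: `Handler`, `Middleware`, `compose`, `echo_agent`, and both middlewares are hypothetical names introduced for illustration.

```python
import asyncio
from typing import Awaitable, Callable

# A handler takes a prompt and returns a reply (hypothetical shape).
Handler = Callable[[str], Awaitable[str]]
# A middleware takes the next handler and returns a wrapped handler.
Middleware = Callable[[Handler], Handler]


def logging_middleware(log: list[str]) -> Middleware:
    """Record each prompt and reply in `log`."""
    def wrap(next_handler: Handler) -> Handler:
        async def handler(prompt: str) -> str:
            log.append(f"prompt: {prompt}")
            reply = await next_handler(prompt)
            log.append(f"reply: {reply}")
            return reply
        return handler
    return wrap


def strip_middleware(next_handler: Handler) -> Handler:
    """Trim surrounding whitespace before the prompt reaches the agent."""
    async def handler(prompt: str) -> str:
        return await next_handler(prompt.strip())
    return handler


def compose(core: Handler, middlewares: list[Middleware]) -> Handler:
    """Wrap `core` so that the first middleware in the list runs outermost."""
    handler = core
    for middleware in reversed(middlewares):
        handler = middleware(handler)
    return handler


async def echo_agent(prompt: str) -> str:
    """Stand-in for a real agent call."""
    return f"echo({prompt})"


log: list[str] = []
run = compose(echo_agent, [logging_middleware(log), strip_middleware])
reply = asyncio.run(run("  hello  "))
```

Because each middleware only sees "the next handler", any one of them can be removed or reordered without touching `echo_agent`, which is the property the bullets above describe.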
- Protocol-driven contracts: Every extension point is defined as a `@runtime_checkable` `Protocol` or abstract base class. The framework ships twelve protocols (`AgentLike`, `ToolProtocol`, `GuardProtocol`, `ReasoningPattern`, `DelegationStrategy`, `StepExecutor`, `CompressionStrategy`, `MemoryStore`, `ValidationRule`, `Chunker`, `EmbeddingProtocol`, `VectorStoreProtocol`) so you can swap or extend any component without modifying framework internals.
- Convention over configuration: Sensible defaults everywhere. `FireflyAgenticConfig` is a Pydantic Settings singleton that reads from environment variables prefixed with `FIREFLY_AGENTIC_` and from `.env` files. One config object governs model defaults, retry counts, token limits, telemetry emission (`observability_enabled`), strict-cost mode (`cost_strict`), memory backends, and validation thresholds; override only what you need.
- Layered composition: Layers follow a strict top-down dependency flow: Core → Agent → Intelligence → Experimentation → Orchestration. Higher layers depend on lower layers but never the reverse, keeping the dependency graph acyclic and each module independently testable.
- Optional dependencies: Heavy libraries (`chromadb`, `pinecone`, `openai`, `asyncpg`) are declared as pip extras (`[openai-embeddings]`, `[vectorstores-chroma]`, `[postgres]`, `[all]`). The core framework imports them lazily inside factory functions so that you install only what your deployment requires.
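The lazy-import idea in the last principle can be sketched with the standard library alone. This is an illustration under assumptions, not the framework's code: `require`, `MissingExtraError`, and both factory functions are hypothetical, and the error message wording is invented for the example.

```python
import importlib
from types import ModuleType


class MissingExtraError(ImportError):
    """Raised when an optional dependency is not installed (hypothetical name)."""


def require(module_name: str, extra: str) -> ModuleType:
    """Import `module_name` on demand, or say which extra provides it."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise MissingExtraError(
            f"'{module_name}' is not installed; install the [{extra}] extra"
        ) from exc


def create_json_store() -> str:
    """Factory whose dependency is in the standard library, so it succeeds."""
    json = require("json", extra="all")
    return json.dumps({"backend": "memory"})


def create_missing_store() -> ModuleType:
    """Factory whose dependency is deliberately absent in this sketch."""
    return require("a_module_that_does_not_exist_xyz", extra="vectorstores-chroma")
```

Importing the module that defines these factories never touches the heavy library; the cost (or the error) appears only when a factory is actually called.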
graph TD
subgraph Orchestration Layer
PIPE["Pipeline / DAG Engine<br/><small>DAG · DAGNode · DAGEdge<br/>PipelineEngine · PipelineBuilder<br/>AgentStep · ReasoningStep · CallableStep<br/>FanOutStep · FanInStep · BranchStep<br/>BatchLLMStep · EmbeddingStep · RetrievalStep<br/>Checkpointer · AuditLog · state reducers</small>"]
end
subgraph Embeddings / Vector Stores
EMB["Embeddings<br/><small>EmbeddingProtocol · BaseEmbedder<br/>OpenAI · Azure · Cohere · Google<br/>Mistral · Voyage · Bedrock · Ollama<br/>EmbedderRegistry · similarity</small>"]
VS["Vector Stores<br/><small>VectorStoreProtocol · BaseVectorStore<br/>InMemory · ChromaDB · Pinecone · Qdrant<br/>pgvector · sqlite-vec<br/>Scoped/Tenant-scoped layer<br/>VectorStoreRegistry · search_text</small>"]
end
subgraph Experimentation Layer
EXP["Experiments<br/><small>Experiment · Variant<br/>ExperimentRunner · VariantComparator<br/>ExperimentTracker</small>"]
LAB["Lab<br/><small>LabSession · Benchmark<br/>EvalOrchestrator · EvalDataset<br/>ModelComparison</small>"]
end
subgraph Intelligence Layer
REASON["Reasoning Patterns<br/><small>ReAct · CoT · PlanAndExecute<br/>Reflexion · ToT · GoalDecomposition<br/>ReasoningPipeline</small>"]
VAL["Validation & QoS<br/><small>OutputReviewer · OutputValidator<br/>ConfidenceScorer · ConsistencyChecker<br/>GroundingChecker · 5 rule types</small>"]
OBS["Observability<br/><small>FireflyTracer · FireflyMetrics<br/>FireflyEvents · UsageTracker<br/>resolve_cost · BudgetGate<br/>@traced · @metered</small>"]
EXPL["Explainability<br/><small>TraceRecorder · ExplanationGenerator<br/>AuditTrail · ReportBuilder</small>"]
end
subgraph Agent Layer
AGT["Agents<br/><small>FireflyAgent · AgentRegistry<br/>DelegationRouter · 7 strategies<br/>MiddlewareChain · 10 middleware<br/>FallbackModelWrapper · ResultCache<br/>AgentLifecycle · @firefly_agent · 5 templates</small>"]
TOOLS["Tools<br/><small>BaseTool · ToolBuilder · ToolKit<br/>5 guards · 3 composers<br/>ToolRegistry · 9 built-ins</small>"]
PROMPTS["Prompts<br/><small>PromptTemplate · PromptRegistry<br/>3 composers · PromptValidator<br/>PromptLoader</small>"]
CONTENT["Content<br/><small>TextChunker · DocumentSplitter<br/>ImageTiler · BatchProcessor<br/>ContextCompressor · SlidingWindowManager</small>"]
MEM["Memory<br/><small>MemoryManager · ConversationMemory<br/>WorkingMemory · TokenEstimator<br/>InMemoryStore · FileStore · SQLiteStore</small>"]
end
subgraph Core Layer
CFG["Config<br/><small>FireflyAgenticConfig<br/>get_config · reset_config</small>"]
TYPES["Types & Protocols<br/><small>AgentLike protocol<br/>TypeVars · type aliases<br/>(other protocols live in their modules)</small>"]
EXC["Exceptions<br/><small>FireflyAgenticError hierarchy<br/>34 exception classes</small>"]
PLUG["Plugin System<br/><small>PluginDiscovery<br/>3 entry-point groups</small>"]
RESIL["Resilience<br/><small>CircuitBreaker<br/>CircuitBreakerMiddleware<br/>CircuitState</small>"]
STORE["Storage<br/><small>DatabaseStore · LocalBackend<br/>StorageBackend · WriteSession<br/>LockToken · RetryPolicy</small>"]
SEC["Security<br/><small>PromptGuard · OutputGuard<br/>AESEncryptionProvider<br/>EncryptedMemoryStore</small>"]
end
PIPE --> AGT
PIPE --> REASON
PIPE --> VAL
PIPE --> EMB
PIPE --> VS
VS --> EMB
EMB --> CFG
VS --> CFG
EXP --> AGT
LAB --> EXP
REASON --> AGT
OBS --> AGT
EXPL --> OBS
VAL --> AGT
AGT --> TOOLS
AGT --> PROMPTS
AGT --> CONTENT
AGT --> MEM
AGT --> CFG
TOOLS --> CFG
PROMPTS --> CFG
CONTENT --> CFG
MEM --> CFG
REASON --> CFG
VAL --> CFG
AGT --> RESIL
AGT --> SEC
MEM --> STORE
SEC --> CFG
RESIL --> CFG
STORE --> CFG
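One way to read the diagram above is as a rule that can be checked mechanically: no edge may point from a lower layer to a higher one. The sketch below encodes a subset of the diagram's nodes and edges; `LAYER_RANK`, `MODULE_LAYER`, `EDGES`, and `upward_edges` are hypothetical names for illustration and are not part of the framework.

```python
# Rank of each layer: lower numbers are lower layers.
LAYER_RANK = {
    "Core": 0,
    "Agent": 1,
    "Intelligence": 2,
    "Experimentation": 3,
    "Orchestration": 4,
}

# Layer of a few modules, read off the diagram's subgraphs.
MODULE_LAYER = {
    "CFG": "Core", "STORE": "Core", "RESIL": "Core",
    "AGT": "Agent", "MEM": "Agent", "TOOLS": "Agent",
    "REASON": "Intelligence", "VAL": "Intelligence",
    "OBS": "Intelligence", "EXPL": "Intelligence",
    "EXP": "Experimentation", "LAB": "Experimentation",
    "PIPE": "Orchestration",
}

# A subset of the diagram's edges, written as (dependent, dependency).
EDGES = [
    ("PIPE", "AGT"), ("PIPE", "REASON"), ("PIPE", "VAL"),
    ("EXP", "AGT"), ("LAB", "EXP"),
    ("REASON", "AGT"), ("OBS", "AGT"), ("EXPL", "OBS"),
    ("AGT", "TOOLS"), ("AGT", "MEM"), ("AGT", "CFG"),
    ("MEM", "STORE"), ("AGT", "RESIL"),
]


def upward_edges(edges):
    """Return edges whose dependent sits in a lower layer than its dependency."""
    def rank(module):
        return LAYER_RANK[MODULE_LAYER[module]]
    return [(src, dst) for src, dst in edges if rank(src) < rank(dst)]
```

Edges inside one layer (for example `EXPL` to `OBS`) are allowed by this check; only a dependency that climbs to a higher layer is reported.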
Every extension point is a @runtime_checkable protocol. Implement the protocol to
create your own components; the framework discovers them via duck typing.
classDiagram
class AgentLike {
<<Protocol>>
+run(prompt, **kwargs) Any
}
class ToolProtocol {
<<Protocol>>
+name: str
+description: str
+execute(**kwargs) Any
}
class GuardProtocol {
<<Protocol>>
+check(tool_name, kwargs) GuardResult
}
class ReasoningPattern {
<<Protocol>>
+execute(agent, input, **kwargs) ReasoningResult
}
class StepExecutor {
<<Protocol>>
+execute(context, inputs) Any
}
class DelegationStrategy {
<<Protocol>>
+select(agents, prompt, **kwargs) Any
}
class CompressionStrategy {
<<Protocol>>
+compress(text, max_tokens) str
}
class MemoryStore {
<<Protocol>>
+save(namespace, entry)
+load(namespace) list
+delete(namespace, entry_id)
+clear(namespace)
}
class ValidationRule {
<<Protocol>>
+name: str
+validate(value) ValidationRuleResult
}
class EmbeddingProtocol {
<<Protocol>>
+embed(texts) EmbeddingResult
+embed_one(text) list~float~
}
class VectorStoreProtocol {
<<Protocol>>
+upsert(documents, namespace)
+search(query_embedding, top_k) list~SearchResult~
+search_text(query, top_k) list~SearchResult~
+delete(ids, namespace)
}
AgentLike <|.. FireflyAgent
AgentLike <|.. pydantic_ai.Agent
ToolProtocol <|.. BaseTool
ToolProtocol <|.. SequentialComposer
ToolProtocol <|.. FallbackComposer
ToolProtocol <|.. ConditionalComposer
GuardProtocol <|.. ValidationGuard
GuardProtocol <|.. RateLimitGuard
GuardProtocol <|.. ApprovalGuard
GuardProtocol <|.. SandboxGuard
GuardProtocol <|.. CompositeGuard
ReasoningPattern <|.. AbstractReasoningPattern
ReasoningPattern <|.. ReasoningPipeline
StepExecutor <|.. AgentStep
StepExecutor <|.. ReasoningStep
StepExecutor <|.. CallableStep
StepExecutor <|.. FanOutStep
StepExecutor <|.. FanInStep
DelegationStrategy <|.. RoundRobinStrategy
DelegationStrategy <|.. CapabilityStrategy
DelegationStrategy <|.. ContentBasedStrategy
DelegationStrategy <|.. CostAwareStrategy
DelegationStrategy <|.. ChainStrategy
DelegationStrategy <|.. FallbackStrategy
DelegationStrategy <|.. WeightedStrategy
CompressionStrategy <|.. TruncationStrategy
CompressionStrategy <|.. SummarizationStrategy
CompressionStrategy <|.. MapReduceStrategy
MemoryStore <|.. InMemoryStore
MemoryStore <|.. FileStore
ValidationRule <|.. RegexRule
ValidationRule <|.. FormatRule
ValidationRule <|.. RangeRule
ValidationRule <|.. EnumRule
ValidationRule <|.. CustomRule
EmbeddingProtocol <|.. BaseEmbedder
VectorStoreProtocol <|.. BaseVectorStore
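To make the duck-typing claim concrete, here is a small sketch using `typing.Protocol` and `runtime_checkable` from the standard library. `ToolLike` is a hypothetical stand-in that mirrors the `ToolProtocol` shape shown in the diagram (the real protocol may differ, for example by being async); `Shout` and `NotATool` are invented for the example.

```python
from typing import Any, Protocol, runtime_checkable


@runtime_checkable
class ToolLike(Protocol):
    """Hypothetical protocol mirroring the ToolProtocol box in the diagram."""

    name: str
    description: str

    def execute(self, **kwargs: Any) -> Any: ...


class Shout:
    """Satisfies ToolLike structurally; it does not inherit from it."""

    name = "shout"
    description = "Upper-case the given text"

    def execute(self, **kwargs: Any) -> Any:
        return str(kwargs.get("text", "")).upper()


class NotATool:
    """Lacks `execute`, so the isinstance check fails."""

    name = "broken"
    description = "Has no execute method"


tool = Shout()
is_tool = isinstance(tool, ToolLike)
is_broken_tool = isinstance(NotATool(), ToolLike)
result = tool.execute(text="hello")
```

Note that a runtime `isinstance` check against a protocol only confirms that the members exist; it does not verify signatures or return types, so a static type checker is still useful.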
- Agents: `FireflyAgent` wraps `pydantic_ai.Agent` with metadata, lifecycle hooks, and automatic registration. `AgentRegistry` provides singleton name-based discovery. `DelegationRouter` routes prompts across agent pools via seven strategies (`RoundRobinStrategy`, `CapabilityStrategy`, `ContentBasedStrategy`, `CostAwareStrategy`, `ChainStrategy`, `FallbackStrategy`, `WeightedStrategy`). A composable middleware stack (`MiddlewareChain` over `AgentMiddleware`) wraps every run: `LoggingMiddleware` is always wired and `ObservabilityMiddleware` is added when `observability_enabled` is set, with `PromptGuardMiddleware`, `OutputGuardMiddleware`, `CostGuardMiddleware`, `CacheMiddleware`, `PromptCacheMiddleware`, `ExplainabilityMiddleware`, `ValidationMiddleware`, and `RetryMiddleware` available to add. `FallbackModelWrapper` / `run_with_fallback` provide automatic model failover, and `ResultCache` / `CacheStatistics` back response caching. The `@firefly_agent` decorator defines an agent in one statement. Five template factories (`create_summarizer_agent`, `create_classifier_agent`, `create_extractor_agent`, `create_conversational_agent`, `create_router_agent`) cover common use cases out of the box.
- Tools: `ToolProtocol` (duck-typed) and `BaseTool` (inheritance) let you choose your extensibility style. `ToolBuilder` provides a fluent API for building tools without subclassing. Five guard types (`ValidationGuard`, `RateLimitGuard`, `ApprovalGuard`, `SandboxGuard`, `CompositeGuard`) intercept calls before execution. Three composition patterns (`SequentialComposer`, `FallbackComposer`, `ConditionalComposer`) build higher-order tools. `ToolKit` groups tools for bulk registration. Nine built-in tools (calculator, datetime, filesystem, HTTP, JSON, search, shell, text, database) are ready to attach to any agent.
- Prompts: `PromptTemplate` renders Jinja2 templates with variable validation and token estimation. `PromptRegistry` maps names to versioned templates. Three composers (`SequentialComposer`, `ConditionalComposer`, `MergeComposer`) combine templates at render time. `PromptValidator` enforces token limits and required sections. `PromptLoader` loads templates from strings, files, or entire directories.
- Reasoning: Six pluggable patterns implement `AbstractReasoningPattern`'s template-method loop (`_reason` → `_act` → `_observe` → `_should_continue`): ReAct (observe-think-act), Chain of Thought (step-by-step), Plan-and-Execute (goal → plan → steps with optional replanning), Reflexion (execute → critique → retry), Tree of Thoughts (branch → evaluate → select), and Goal Decomposition (goal → phases → tasks). All produce a structured `ReasoningResult` with a `ReasoningTrace`. Prompts are slot-overridable. `OutputReviewer` can validate final outputs. `ReasoningPipeline` chains patterns sequentially.
- Content: `TextChunker` splits by tokens, sentences, or paragraphs with configurable overlap; `MarkdownChunker` performs structure-aware chunking on Markdown headings. `DocumentSplitter` detects page breaks and section separators. `ImageTiler` computes tile coordinates for VLM processing. `BatchProcessor` runs chunks through an agent concurrently, bounded by a semaphore. `ContextCompressor` delegates to pluggable strategies (`TruncationStrategy`, `SummarizationStrategy`, `MapReduceStrategy`); `ContextCompressor.compress` is async. `SlidingWindowManager` maintains a rolling token-budgeted context window. The `[binary]`-gated `content.binary` submodule normalises uploaded files into consumer-ready artifacts: `BinaryNormalizer` (with `BinaryConfig`) produces `BinaryArtifacts`, `sniff_media_type` detects formats, `build_office_converter` selects an `OfficeConverter` (`GotenbergConverter`, `LibreOfficeConverter`, `NoOpOfficeConverter`), and `PdfGuard`, `ImageNormalizer`, `ArchiveUnpacker`, and `EmailUnpacker` handle PDFs, images, archives, and emails.
- Memory: `ConversationMemory` stores per-conversation turn history with token-budget enforcement (FIFO eviction: the oldest turns are dropped first, so the newest are retained). `WorkingMemory` provides a scoped key-value scratchpad backed by a `MemoryStore` (`InMemoryStore`, `FileStore`, or `SQLiteStore`). `MemoryManager` composes both behind a unified API and supports `fork()` for isolating working memory in delegated agents or pipeline branches while sharing conversation context. `create_llm_summarizer` builds an LLM-backed history summarizer for long conversations.
- Validation: Five composable rules (`RegexRule`, `FormatRule`, `RangeRule`, `EnumRule`, `CustomRule`) feed into `FieldValidator` and `OutputValidator`. `OutputReviewer` wraps agent calls with parse-then-validate retry logic: on failure it builds a feedback prompt and retries up to N times. `RubricReviewer` adds LLM-as-judge grading against a rubric (`RubricReviewer.from_rubric_file`). QoS guards (`ConfidenceScorer`, `ConsistencyChecker`, `GroundingChecker`, plus the `QoSGuard` aggregator returning a `QoSResult`) detect hallucinations and low-quality extractions before they propagate downstream.
- Pipeline: `DAG` holds `DAGNode` and `DAGEdge` objects with cycle detection and topological sort. `PipelineEngine` executes nodes level-by-level via `asyncio.gather` for maximum concurrency, with per-node condition gates, retries, and timeouts. `PipelineBuilder` offers a fluent API (`add_node` / `add_edge` / `chain`). Step types adapt agents, patterns, and functions to DAG nodes: `AgentStep`, `ReasoningStep`, `CallableStep`, `FanOutStep`, `FanInStep`, `BranchStep`, `BatchLLMStep`, `EmbeddingStep`, and `RetrievalStep`. State reducers (`append`, `extend`, `merge_dict`, `replace`) merge fan-out results, and control signals (`Pause`, `Send`) drive branching and human-in-the-loop pauses. `Checkpointer` / `FileCheckpointer` (with `CheckpointRecord`) persist and resume long runs, and a pluggable audit-log family (`AuditLog`, `FileAuditLog`, `LoggingAuditLog`, `OtelAuditLog`, `QueryableAuditLog` over `AuditEntry`) records execution traces.
- Observability: `FireflyTracer` creates OpenTelemetry spans scoped to agents, tools, and reasoning steps. `FireflyMetrics` records tokens (total, prompt, completion), latency, cost, errors, and reasoning depth via the OTel metrics API. `FireflyEvents` emits structured log records. The `@traced` and `@metered` decorators instrument any function with one line. The framework emits model and agent telemetry purely through the OpenTelemetry API; the host application owns OTel SDK and exporter configuration. `UsageTracker` automatically records token usage, cost estimates, and latency for every agent run, reasoning step, and pipeline execution. Cost is computed through a resolver chain (`resolve_cost`, `genai_prices_cost`, `provider_reported_cost`, `DEFAULT_RESOLVERS`); set `cost_strict` to raise `UnknownModelCostError` when no price is found. `BudgetGate` enforces token/cost budgets per scope (`BudgetRule`, `BudgetMode`, `BudgetWindow`), and a pluggable sink family (`LoggingSink`, `JSONLFileSink`, `OTelMetricsSink`, `EventBusSink`, `CostSink`) routes usage records wherever you need them.
- Explainability: `TraceRecorder` captures every LLM call, tool invocation, and reasoning step as `DecisionRecord` objects. `ExplanationGenerator` turns records into human-readable narratives. `AuditTrail` provides an append-only, immutable log with JSON export for compliance. `ReportBuilder` produces Markdown and JSON reports with statistics.
- Security: `PromptGuard` scans inbound prompts for injection and jailbreak patterns; `OutputGuard` redacts secrets and PII from model output (`default_prompt_guard` / `default_output_guard` provide ready-to-use instances). At-rest protection comes from `AESEncryptionProvider` (behind the `EncryptionProvider` protocol) and `EncryptedMemoryStore`, which encrypts `MemoryEntry.content` while leaving keys, metadata, and timestamps in plaintext. Inbound request authentication and authorization are a hosting concern, not the framework's.
- Resilience: `CircuitBreaker` trips after a configurable failure threshold and rejects calls with `CircuitBreakerOpenError` while open, transitioning through `CircuitState` (closed → open → half-open). `CircuitBreakerMiddleware` plugs it into the agent middleware chain so a failing model is short-circuited before it drains your budget.
- Storage: `StorageBackend` abstracts blob/object storage with `LocalBackend` out of the box; `DatabaseStore` persists artifacts with leasing (`WriteSession`, `LockToken`), a configurable `RetryPolicy`, and `StorageMetadata`. Typed errors (`StorageUploadError`, `StorageDownloadError`, `StorageLeaseError`, `StorageTransientError`, `StoreUnavailableError`) make failure handling explicit.
- Experiments: `Experiment` defines variants with model, temperature, and prompt overrides. `ExperimentRunner` executes all variants against a dataset via an `agent_factory` callable. `ExperimentTracker` persists results with optional JSON export. `VariantComparator` computes latency, output length, and comparison summaries.
- Lab: `LabSession` manages interactive agent sessions with history. `Benchmark` runs agents against standardised inputs and reports p95 latency. `EvalOrchestrator` scores agent outputs with pluggable `Scorer` functions. `EvalDataset` loads/saves test cases from JSON. `ModelComparison` runs the same prompts across multiple agents for side-by-side analysis.

  Optional developer tooling: `fireflyframework_agentic.experiments` (A/B experiments) and `fireflyframework_agentic.lab` (offline evaluation / benchmarking) are leaf modules; nothing in the core imports them and they add no third-party dependencies. Import them only if you run experiments or evaluations; agent-building consumers can ignore them.
- Embeddings: `EmbeddingProtocol` (duck-typed) and `BaseEmbedder` (inheritance with auto-batching) provide provider-agnostic text embedding. Eight providers ship out of the box: OpenAI, Azure OpenAI, Cohere, Google, Mistral, Voyage AI, AWS Bedrock, and Ollama (local). `EmbedderRegistry` manages named instances. Built-in similarity utilities (`cosine_similarity`, `euclidean_distance`, `dot_product`) compare vectors without external dependencies. Configure via `embedding_batch_size`, `embedding_max_retries`, and `default_embedding_model`.
- Vector Stores: `VectorStoreProtocol` and `BaseVectorStore` provide pluggable storage and retrieval with six backends: `InMemoryVectorStore` (zero-dependency, brute-force cosine), `ChromaVectorStore`, `PineconeVectorStore`, `QdrantVectorStore`, `PgVectorVectorStore` (Postgres + pgvector), and `SqliteVecVectorStore` (embedded sqlite-vec). A multi-tenant isolation layer (`ScopedVectorStore`, `TenantScopedVectorStore`, plus the `scope_namespace` / `parse_scope_namespace` helpers) namespaces documents per tenant or scope. Auto-embedding upserts documents without pre-computed vectors. `search_text` embeds a query string and searches in one call, and `SearchFilter` narrows results by metadata. Namespace scoping isolates document collections. `VectorStoreRegistry` manages named instances. `EmbeddingStep` and `RetrievalStep` integrate directly into DAG pipelines for retrieval-augmented workflows.
- Studio: moved to its own repository, fireflyframework-agentic-studio. A browser-based visual IDE for building agent pipelines (drag-and-drop canvas, code generation, AI assistant, time-travel debugging). Install with `pip install fireflyframework-agentic-studio` and launch with `firefly studio`.
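The Pipeline entry above describes level-by-level execution with `asyncio.gather`. A minimal sketch of that scheduling idea follows, assuming a Kahn-style level computation; it is not `PipelineEngine` itself, and `levels`, `run_dag`, and the four toy steps are hypothetical names for illustration.

```python
import asyncio
from typing import Awaitable, Callable

Step = Callable[[dict], Awaitable[object]]


def levels(nodes: list[str], edges: list[tuple[str, str]]) -> list[list[str]]:
    """Group nodes into levels; each level depends only on earlier levels."""
    indegree = {n: 0 for n in nodes}
    children: dict[str, list[str]] = {n: [] for n in nodes}
    for src, dst in edges:
        indegree[dst] += 1
        children[src].append(dst)
    current = [n for n in nodes if indegree[n] == 0]
    result = []
    while current:
        result.append(current)
        following = []
        for node in current:
            for child in children[node]:
                indegree[child] -= 1
                if indegree[child] == 0:
                    following.append(child)
        current = following
    # Nodes on a cycle never reach indegree 0, so they are never scheduled.
    if sum(len(level) for level in result) != len(nodes):
        raise ValueError("cycle detected")
    return result


async def run_dag(steps: dict[str, Step], edges: list[tuple[str, str]]) -> dict:
    """Run each level concurrently; a step reads outputs of earlier levels."""
    outputs: dict[str, object] = {}
    for level in levels(list(steps), edges):
        results = await asyncio.gather(*(steps[name](outputs) for name in level))
        outputs.update(zip(level, results))
    return outputs


async def load(_):
    return "a b c"

async def count(out):
    return len(out["load"].split())

async def upper(out):
    return out["load"].upper()

async def report(out):
    return f"{out['upper']} has {out['count']} words"


steps = {"load": load, "count": count, "upper": upper, "report": report}
edges = [("load", "count"), ("load", "upper"), ("count", "report"), ("upper", "report")]
outputs = asyncio.run(run_dag(steps, edges))
```

Here `count` and `upper` share a level and run concurrently, while `report` waits for both. Condition gates, retries, timeouts, and reducers are deliberately left out of the sketch.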
Runtime:
Core dependencies (installed automatically):
- pydantic-ai `>=1.99.0`: Agent engine (model calls, tool dispatch, streaming)
- pydantic `>=2.10.0`: Data validation and settings
- pydantic-settings `>=2.7.0`: Environment-based configuration
- Jinja2 `>=3.1.0`: Prompt template engine
- httpx `>=0.28.0`: Async HTTP client (built-in HTTP tool, Gotenberg converter)
- OpenTelemetry API `>=1.29.0`: Tracing and metrics
- OpenTelemetry SDK `>=1.29.0`: Telemetry primitives
- genai-prices `>=0.0.1`: LLM pricing data for cost resolution
- markdown-it-py `>=3.0`: Structure-aware Markdown chunking
- python-dotenv `>=1.0.0`: `.env` loading for example scripts
Optional dependencies (installed via extras):
- `[embeddings]`: numpy for fast in-memory vector math
- `[openai-embeddings]`: openai `>=1.0.0` for OpenAI/Azure embeddings
- `[vectorstores-chroma]`: chromadb `>=0.5.0`
- `[vectorstores-pinecone]`: pinecone `>=5.0.0`
- `[vectorstores-qdrant]`: qdrant-client `>=1.12.0`
- `[vectorstores-pgvector]`: asyncpg `>=0.30.0` for Postgres + pgvector
- `[vectorstores-sqlite-vec]`: sqlite-vec `>=0.1.6` for embedded vector search
- `[binary]`: pypdf, Pillow, pillow-heif, cairosvg, py7zr, extract-msg for `content.binary`
- `[all]`: Everything (memory backends, security, all embedding providers, all vector stores, watch, binary)
LLM provider keys (at least one):
- `OPENAI_API_KEY` for OpenAI models
- `ANTHROPIC_API_KEY` for Anthropic models
- `GEMINI_API_KEY` for Google Gemini models
- `GROQ_API_KEY` for Groq models
- Or any Pydantic AI-supported provider
The interactive installer detects your platform, checks Python and UV, lets you choose extras, and installs everything with progress indicators and verification.
macOS / Linux:

```bash
curl -fsSL https://raw.githubusercontent.com/fireflyframework/fireflyframework-agentic/main/install.sh | bash
```

Windows (PowerShell):

```powershell
irm https://raw.githubusercontent.com/fireflyframework/fireflyframework-agentic/main/install.ps1 | iex
```

Both installers support non-interactive mode for CI/CD:

```bash
# macOS / Linux: install with all extras, no prompts
curl -fsSL https://raw.githubusercontent.com/fireflyframework/fireflyframework-agentic/main/install.sh | bash
```

```powershell
# Windows: install with all extras, no prompts
.\install.ps1 -NonInteractive -Extras all
```

Manual installation from a clone:

```bash
git clone https://github.com/fireflyframework/fireflyframework-agentic.git
cd fireflyframework-agentic
uv sync --all-extras  # or: pip install -e ".[all]"
```

| Extra | What it adds | When you need it |
|---|---|---|
| `postgres` | asyncpg, SQLAlchemy | PostgreSQL memory / storage persistence |
| `mongodb` | motor, pymongo | MongoDB memory persistence |
| `security` | cryptography | At-rest encryption (`EncryptedMemoryStore`, `AESEncryptionProvider`) |
| `embeddings` | numpy | Fast in-memory vector math |
| `openai-embeddings` | openai | OpenAI / Azure text embeddings |
| `cohere-embeddings` | cohere | Cohere text embeddings |
| `google-embeddings` | google-generativeai | Google text embeddings |
| `mistral-embeddings` | mistralai | Mistral text embeddings |
| `voyage-embeddings` | voyageai | Voyage AI text embeddings |
| `azure-embeddings` | openai | Azure OpenAI text embeddings |
| `bedrock-embeddings` | boto3 | AWS Bedrock text embeddings |
| `ollama-embeddings` | httpx | Ollama local text embeddings |
| `vectorstores-chroma` | chromadb | ChromaDB vector store backend |
| `vectorstores-pinecone` | pinecone | Pinecone vector store backend |
| `vectorstores-qdrant` | qdrant-client | Qdrant vector store backend |
| `vectorstores-pgvector` | asyncpg | Postgres + pgvector vector store backend |
| `vectorstores-sqlite-vec` | sqlite-vec | Embedded sqlite-vec vector store backend |
| `binary` | pypdf, Pillow, pillow-heif, cairosvg, py7zr, extract-msg | `content.binary` file normalisation |
| `watch` | watchfiles | File-watching for content sources |
| `all` | Everything above | Full install with all integrations |
Verify the installation:

```bash
python -c "import fireflyframework_agentic; print(fireflyframework_agentic.__version__)"
```

Uninstall on macOS / Linux:

```bash
curl -fsSL https://raw.githubusercontent.com/fireflyframework/fireflyframework-agentic/main/uninstall.sh | bash
```

Uninstall on Windows (PowerShell):

```powershell
irm https://raw.githubusercontent.com/fireflyframework/fireflyframework-agentic/main/uninstall.ps1 | iex
```

Or manually remove the cloned directory and its virtual environment.
Create a `.env` file (or set environment variables):

```bash
# Provider API key (Pydantic AI reads these automatically)
OPENAI_API_KEY=sk-...
# ANTHROPIC_API_KEY=sk-ant-...
# GEMINI_API_KEY=...
# GROQ_API_KEY=gsk_...

# Framework settings
FIREFLY_AGENTIC_DEFAULT_MODEL=openai:gpt-4o
FIREFLY_AGENTIC_DEFAULT_TEMPERATURE=0.3
```

The model string format is `"provider:model_name"`, e.g. `"openai:gpt-4o"`, `"anthropic:claude-sonnet-4-20250514"`, `"google:gemini-2.0-flash"`. Pydantic AI resolves the matching API key from environment variables automatically. For programmatic credential management (Azure, Bedrock, custom endpoints), pass a Pydantic AI `Model` object directly to `FireflyAgent(model=...)`; see the tutorial.
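To illustrate how prefixed environment variables and the `"provider:model_name"` format fit together, here is a standard-library sketch. It does not use or reproduce `FireflyAgenticConfig`: `Settings`, `load_settings`, and `split_model` are hypothetical, and the default values are placeholders copied from the example `.env` above, not the framework's actual defaults.

```python
import os
from dataclasses import dataclass

PREFIX = "FIREFLY_AGENTIC_"


@dataclass(frozen=True)
class Settings:
    """Hypothetical two-field subset of the settings (placeholder defaults)."""

    default_model: str = "openai:gpt-4o"
    default_temperature: float = 0.3


def load_settings(environ=None) -> Settings:
    """Read prefixed variables, falling back to the dataclass defaults."""
    env = os.environ if environ is None else environ
    overrides = {}
    if PREFIX + "DEFAULT_MODEL" in env:
        overrides["default_model"] = env[PREFIX + "DEFAULT_MODEL"]
    if PREFIX + "DEFAULT_TEMPERATURE" in env:
        overrides["default_temperature"] = float(env[PREFIX + "DEFAULT_TEMPERATURE"])
    return Settings(**overrides)


def split_model(model: str) -> tuple[str, str]:
    """Split "provider:model_name" on the first colon only."""
    provider, sep, name = model.partition(":")
    if not sep or not provider or not name:
        raise ValueError(f"expected 'provider:model_name', got {model!r}")
    return provider, name


settings = load_settings(
    {"FIREFLY_AGENTIC_DEFAULT_MODEL": "anthropic:claude-sonnet-4-20250514"}
)
provider, model_name = split_model(settings.default_model)
```

Passing a plain dict instead of `os.environ` keeps the sketch testable; only the variable that is set overrides its default, which is the "override only what you need" behaviour described earlier.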
Define an agent:

```python
from fireflyframework_agentic.agents import firefly_agent

@firefly_agent(name="assistant", model="openai:gpt-4o")
def assistant_instructions(ctx):
    return "You are a helpful conversational assistant."
```

Define a tool:

```python
from fireflyframework_agentic.tools import firefly_tool

@firefly_tool(name="lookup", description="Look up a term")
async def lookup(query: str) -> str:
    return f"Result for {query}"
```

Add conversation memory:

```python
from fireflyframework_agentic.agents import FireflyAgent
from fireflyframework_agentic.memory import MemoryManager

memory = MemoryManager(max_conversation_tokens=32_000)
agent = FireflyAgent(name="bot", model="openai:gpt-4o", memory=memory)

# Both calls share one conversation ID, so the second sees the first turn
cid = memory.new_conversation()
result = await agent.run("Hello!", conversation_id=cid)
result = await agent.run("What did I just say?", conversation_id=cid)
```

Apply a reasoning pattern:

```python
from fireflyframework_agentic.reasoning import ReActPattern

react = ReActPattern(max_steps=5)
result = await react.execute(agent, "What is the weather in London?")
print(result.output)
```

Validate structured output:

```python
from pydantic import BaseModel
from fireflyframework_agentic.validation import OutputReviewer

class Answer(BaseModel):
    answer: str
    confidence: float

reviewer = OutputReviewer(output_type=Answer, max_retries=2)
result = await reviewer.review(agent, "What is 2+2?")
print(result.output)  # Answer(answer="4", confidence=0.99)
```

Build a pipeline:

```python
from fireflyframework_agentic.pipeline.builder import PipelineBuilder
from fireflyframework_agentic.pipeline.steps import AgentStep, CallableStep

# classifier_agent, extractor_agent, and validate_fn are assumed to be defined elsewhere
pipeline = (
    PipelineBuilder("my-pipeline")
    .add_node("classify", AgentStep(classifier_agent))
    .add_node("extract", AgentStep(extractor_agent))
    .add_node("validate", CallableStep(validate_fn))
    .chain("classify", "extract", "validate")
    .build()
)
result = await pipeline.run(inputs="Process this document")
```

Embed and search documents:

```python
from fireflyframework_agentic.embeddings.providers import OpenAIEmbedder
from fireflyframework_agentic.vectorstores import InMemoryVectorStore, VectorDocument

embedder = OpenAIEmbedder(model="text-embedding-3-small")
store = InMemoryVectorStore(embedder=embedder)

# Upsert documents (auto-embedded)
await store.upsert([
    VectorDocument(id="1", text="Python is great for AI"),
    VectorDocument(id="2", text="Rust is fast and safe"),
])

# Search by text
results = await store.search_text("machine learning languages", top_k=1)
print(results[0].document.text)  # Python is great for AI
```

fireflyframework-agentic works seamlessly in Jupyter notebooks and JupyterLab. Since the framework is async-first, use `await` directly in notebook cells (Jupyter provides a running event loop automatically).
```bash
# From your clone directory
cd fireflyframework-agentic
source .venv/bin/activate   # activate the venv created by the installer
pip install ipykernel       # install Jupyter kernel support
python -m ipykernel install --user --name fireflyagentic --display-name "Firefly Agentic"
jupyter lab                 # or: jupyter notebook
```

Then select the Firefly Agentic kernel when creating a new notebook.
```python
# Cell 1: configure
import os
os.environ["OPENAI_API_KEY"] = "sk-..."  # or set in .env
os.environ["FIREFLY_AGENTIC_DEFAULT_MODEL"] = "openai:gpt-4o"
```

```python
# Cell 2: create an agent
from fireflyframework_agentic.agents import FireflyAgent

agent = FireflyAgent(name="notebook-bot", model="openai:gpt-4o")
result = await agent.run("Explain quantum entanglement in two sentences.")
print(result.output)
```

```python
# Cell 3: use memory for multi-turn conversations
from fireflyframework_agentic.memory import MemoryManager

memory = MemoryManager(max_conversation_tokens=32_000)
agent_with_mem = FireflyAgent(name="chat", model="openai:gpt-4o", memory=memory)
cid = memory.new_conversation()

result = await agent_with_mem.run("My name is Alice.", conversation_id=cid)
print(result.output)
result = await agent_with_mem.run("What is my name?", conversation_id=cid)
print(result.output)  # Alice
```

```python
# Cell 4: reasoning patterns
from fireflyframework_agentic.reasoning import ReActPattern

react = ReActPattern(max_steps=5)
result = await react.execute(agent, "What are the top 3 uses of Python in 2026?")
print(result.output)
```

```python
# Cell 5: structured output with validation
from pydantic import BaseModel
from fireflyframework_agentic.validation import OutputReviewer

class Summary(BaseModel):
    title: str
    bullet_points: list[str]
    confidence: float

reviewer = OutputReviewer(output_type=Summary, max_retries=2)
result = await reviewer.review(agent, "Summarize the benefits of async Python.")
result.output  # displays the structured Summary object in the notebook
```

Tip: You do not need `asyncio.run()` or `nest_asyncio` in Jupyter; `await` works at the top level of any cell because Jupyter runs its own event loop.
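Outside a notebook there is no running event loop, so the same `await` calls have to live inside a coroutine that is started with `asyncio.run()`. The sketch below shows that shape with a stand-in agent so it runs without API keys; `StubAgent` and `StubResult` are hypothetical and only mimic the `await agent.run(...)` / `result.output` pattern used in the cells above.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class StubResult:
    """Hypothetical result object exposing an `output` attribute."""

    output: str


class StubAgent:
    """Hypothetical stand-in with the same `await agent.run(...)` shape."""

    async def run(self, prompt: str) -> StubResult:
        await asyncio.sleep(0)  # yield to the event loop, as a real call would
        return StubResult(output=f"stub answer to: {prompt}")


async def main() -> str:
    agent = StubAgent()
    result = await agent.run("Explain quantum entanglement in two sentences.")
    return result.output


if __name__ == "__main__":
    # In a plain script, asyncio.run() creates and closes the event loop.
    print(asyncio.run(main()))
```

In a real script, replacing `StubAgent()` with a configured agent should be the only change needed; the `asyncio.run(main())` wrapper stays the same.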
docs/tutorial.md is an 18-chapter, hands-on guide that teaches every concept from zero to expert through a real-world Intelligent Document Processing pipeline. Start here if you want to learn the framework thoroughly.
docs/use-case-idp.md is a focused walkthrough of building a 7-phase IDP pipeline that ingests, splits, classifies, extracts, validates, assembles, and explains data from corporate documents — using agents, reasoning, document splitting, content processing, validation, explainability, and pipelines.
Detailed guides for each module:
- Architecture — Design principles and layer diagram
- Agents — Lifecycle, registry, delegation, decorators
- Template Agents — Summarizer, classifier, extractor, conversational, router
- Tools — Protocol, builder, guards, composition, built-ins
- Prompts — Templates, versioning, composition, validation
- Reasoning Patterns — 6 patterns, structured outputs, custom patterns
- Content — Chunking, compression, batch processing
- Memory — Conversation history, working memory, storage backends
- Validation — Rules, QoS guards, output reviewer
- Embeddings — 8 providers, auto-batching, similarity, registry
- Vector Stores — 6 backends, tenant scoping, auto-embedding, search_text, namespaces
- Pipeline — DAG orchestrator, parallel execution, checkpointing, audit log, retries
- Observability — Tracing, metrics, events, cost resolvers, budget gates
- Explainability — Decision recording, audit trails, reports
- Security — Prompt/output guards, at-rest encryption
- Experiments — A/B testing, variant comparison
- Lab — Benchmarks, datasets, evaluators
- Studio — moved to fireflyframework-agentic-studio
```bash
git clone https://github.com/fireflyframework/fireflyframework-agentic.git
cd fireflyframework-agentic
uv sync --all-extras
```

```bash
uv run pytest                                        # Run the test suite
uv run ruff check fireflyframework_agentic/ tests/   # Lint
uv run pyright fireflyframework_agentic/             # Type check
```

Dev dependencies (installed with `uv sync`): pytest `>=8.3.0`, pytest-asyncio `>=0.24.0`, pytest-cov `>=6.0.0`, ruff `>=0.9.0`, pyright `>=1.1.0`, httpx `>=0.28.0`.
See CONTRIBUTING.md for guidelines.
See CHANGELOG.md for notable changes.
Apache License 2.0. See LICENSE for the full text.