Merged
2 changes: 1 addition & 1 deletion .github/workflows/nightly.yml
@@ -19,7 +19,7 @@ jobs:
- uses: actions/setup-python@v6
with:
python-version: '3.13'
- run: uv sync --extra dev --extra rest --extra binary --extra vectorstores-sqlite-vec --extra openai-embeddings
- run: uv sync --extra dev --extra binary --extra vectorstores-sqlite-vec --extra vectorstores-pgvector --extra openai-embeddings
- run: uv run pytest --cov --cov-report=term-missing --durations=50

report-failure:
4 changes: 2 additions & 2 deletions .github/workflows/pr-gate.yml
@@ -57,7 +57,7 @@ jobs:
- uses: actions/setup-python@v6
with:
python-version: '3.13'
- run: uv sync --extra dev --extra rest --extra binary --extra vectorstores-sqlite-vec --extra openai-embeddings
- run: uv sync --extra dev --extra binary --extra vectorstores-sqlite-vec --extra openai-embeddings
- run: uv run pyright

test:
@@ -72,7 +72,7 @@ jobs:
- uses: actions/setup-python@v6
with:
python-version: '3.13'
- run: uv sync --extra dev --extra rest --extra binary --extra vectorstores-sqlite-vec --extra openai-embeddings
- run: uv sync --extra dev --extra binary --extra vectorstores-sqlite-vec --extra vectorstores-pgvector --extra openai-embeddings
- run: uv run pytest -m "not nightly" --cov --cov-report=term-missing

build:
27 changes: 27 additions & 0 deletions CHANGELOG.md
@@ -7,6 +7,33 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

Copyright 2026 Firefly Software Foundation. Licensed under the Apache License 2.0.

## [26.05.33] - 2026-05-31

### Removed

- **BREAKING — REST/queue exposure layer.** Deleted the `fireflyframework_agentic.exposure`
package (FastAPI app factory, HTTP/WS controllers, health probes, SSE, CORS/rate-limit/auth
middleware, and Kafka/RabbitMQ/Redis consumer/producer hosts), the `rest`/`kafka`/`rabbitmq`/
`redis`/`queues` extras, the `ExposureError`/`QueueConnectionError` exceptions, and the
REST-serving config fields `auth_api_keys`/`auth_bearer_tokens`/`cors_allowed_origins`.
Serving/hosting is now owned by the consuming service. The framework is a pure in-process
library: it serves no port and consumes no broker.
- **BREAKING — service/infra observability.** Removed `observability.configure_exporters`
(global OTel SDK provider/exporter wiring), the W3C trace-context propagation helpers
(`inject_trace_context`/`extract_trace_context`/`get_trace_context`/`set_trace_context`/
`trace_context_scope`), the `WebhookSink`, and the `otlp_endpoint` config field. The
framework still emits model/agent spans/metrics via the OpenTelemetry API; configuring the
SDK/exporters and cross-service trace propagation is now the host's responsibility.
- **BREAKING — inbound RBAC auth.** Removed `security.RBACManager`/`require_permission`, the
`rbac_enabled`/`rbac_jwt_secret`/`rbac_multi_tenant` config fields, and the `pyjwt`
dependency from the `security` extra (`cryptography` stays for `EncryptedMemoryStore`).
Inbound-request authorization is a hosting concern owned by the service.

### Changed

- **`experiments`/`lab` documented as optional** leaf developer-tooling modules (no code or
dependency change; they were already not imported by the core).

## [26.05.32] - 2026-05-31

### Fixed
123 changes: 28 additions & 95 deletions README.md
@@ -40,65 +40,57 @@ Copyright 2026 Firefly Software Foundation. Licensed under the Apache License 2.
model-agnostic agents with structured output. But a production GenAI system demands
far more than a single agent call. You need to orchestrate multi-step reasoning,
validate and retry LLM outputs against schemas, manage conversation memory across
turns, observe every call with traces and metrics, run A/B experiments to compare
models, and expose the whole thing over REST or message queues — all without coupling
your domain logic to infrastructure concerns.
turns, observe every call with traces and metrics, and run A/B experiments to compare
models — all without coupling your domain logic to infrastructure concerns.

**fireflyframework-agentic is the production framework built on top of Pydantic AI.**
It extends the engine with six composable layers — from core configuration through
agent management, intelligent reasoning, experimentation, pipeline orchestration,
and service exposure — so that every concern has a dedicated, protocol-driven module.
It extends the engine with composable layers — from core configuration through
agent management, intelligent reasoning, experimentation, and pipeline orchestration,
so that every concern has a dedicated, protocol-driven module.
You write your business logic; the framework provides the architecture.

**What "metaframework" means in practice:**

- You keep Pydantic AI's familiar `Agent`, `Tool`, and `RunContext` APIs unchanged.
- The framework wraps them with lifecycle hooks, registries, delegation routers,
memory managers, reasoning patterns, validation loops, DAG pipelines, and exposure
endpoints — all optional, all composable, all swappable through Python protocols.
- No vendor lock-in: switch models, swap memory backends, or replace the REST layer
memory managers, reasoning patterns, validation loops, and DAG pipelines — all
optional, all composable, all swappable through Python protocols.
- No vendor lock-in: switch models, swap memory backends, or replace components
without touching your agent code.

---

## Key Principles

1. **Protocol-driven contracts** — Every extension point is defined as a
`@runtime_checkable` `Protocol` or abstract base class. The framework ships thirteen
`@runtime_checkable` `Protocol` or abstract base class. The framework ships twelve
protocols (`AgentLike`, `ToolProtocol`, `GuardProtocol`, `ReasoningPattern`,
`DelegationStrategy`, `StepExecutor`, `CompressionStrategy`, `MemoryStore`,
`ValidationRule`, `Chunker`, `EmbeddingProtocol`, `VectorStoreProtocol`,
`QueueConsumer` / `QueueProducer`) so you can swap or extend any component
without modifying framework internals.
`ValidationRule`, `Chunker`, `EmbeddingProtocol`, `VectorStoreProtocol`) so you can
swap or extend any component without modifying framework internals.

2. **Convention over configuration** — Sensible defaults everywhere.
`FireflyAgenticConfig` is a Pydantic Settings singleton that reads from environment
variables prefixed with `FIREFLY_AGENTIC_` and `.env` files. One config object
governs model defaults, retry counts, token limits, observability endpoints,
memory backends, and validation thresholds — override only what you need.

3. **Layered composition** — Six layers with strict top-down dependency flow:
**Core → Agent → Intelligence → Experimentation → Orchestration → Exposure**.
3. **Layered composition** — Layers with strict top-down dependency flow:
**Core → Agent → Intelligence → Experimentation → Orchestration**.
Higher layers depend on lower layers but never the reverse, keeping the
dependency graph acyclic and each module independently testable.

4. **Optional dependencies** — Heavy libraries (`fastapi`, `aiokafka`, `aio-pika`,
`redis`, `chromadb`, `pinecone`, `openai`) are declared as pip extras (`[rest]`,
`[kafka]`, `[rabbitmq]`, `[redis]`, `[openai-embeddings]`,
`[vectorstores-chroma]`, `[all]`). The core framework imports them lazily inside
factory functions so that you install only what your deployment requires.
4. **Optional dependencies** — Heavy libraries (`chromadb`, `pinecone`, `openai`,
`asyncpg`) are declared as pip extras (`[openai-embeddings]`,
`[vectorstores-chroma]`, `[postgres]`, `[all]`). The core framework imports them
lazily inside factory functions so that you install only what your deployment requires.
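
The first principle above can be illustrated with a small standard-library sketch. `ChunkerLike` and `FixedSizeChunker` are hypothetical names invented for this example; the real `Chunker` protocol's method names and signatures are not shown in this README and may differ.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class ChunkerLike(Protocol):
    """Hypothetical stand-in for a framework protocol such as `Chunker`."""

    def chunk(self, text: str) -> list[str]: ...


class FixedSizeChunker:
    """Satisfies ChunkerLike structurally; no inheritance is required."""

    def __init__(self, size: int) -> None:
        self.size = size

    def chunk(self, text: str) -> list[str]:
        # Slice the text into consecutive pieces of at most `size` characters.
        return [text[i:i + self.size] for i in range(0, len(text), self.size)]


chunker = FixedSizeChunker(size=4)
# A runtime_checkable isinstance check verifies method presence only,
# not parameter or return types.
assert isinstance(chunker, ChunkerLike)
print(chunker.chunk("abcdefghij"))  # ['abcd', 'efgh', 'ij']
```

Because the check is structural, a replacement component only needs to provide the expected methods, which is what makes swapping implementations possible without touching framework internals.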

---

## Architecture at a Glance

```mermaid
graph TD
subgraph Exposure Layer
REST["REST API<br/><small>create_agentic_app · SSE streaming<br/>health · middleware · router</small>"]
QUEUES["Message Queues<br/><small>Kafka · RabbitMQ · Redis<br/>consumers · producers · QueueRouter</small>"]
end

subgraph Orchestration Layer
PIPE["Pipeline / DAG Engine<br/><small>DAG · DAGNode · DAGEdge<br/>PipelineEngine · PipelineBuilder<br/>AgentStep · ReasoningStep · CallableStep<br/>FanOutStep · FanInStep<br/>EmbeddingStep · RetrievalStep</small>"]
end
@@ -116,7 +108,7 @@ graph TD
subgraph Intelligence Layer
REASON["Reasoning Patterns<br/><small>ReAct · CoT · PlanAndExecute<br/>Reflexion · ToT · GoalDecomposition<br/>ReasoningPipeline</small>"]
VAL["Validation & QoS<br/><small>OutputReviewer · OutputValidator<br/>ConfidenceScorer · ConsistencyChecker<br/>GroundingChecker · 5 rule types</small>"]
OBS["Observability<br/><small>FireflyTracer · FireflyMetrics<br/>FireflyEvents · UsageTracker<br/>CostCalculator · @traced · @metered<br/>configure_exporters</small>"]
OBS["Observability<br/><small>FireflyTracer · FireflyMetrics<br/>FireflyEvents · UsageTracker<br/>CostCalculator · @traced · @metered</small>"]
EXPL["Explainability<br/><small>TraceRecorder · ExplanationGenerator<br/>AuditTrail · ReportBuilder</small>"]
end

@@ -135,8 +127,6 @@ graph TD
PLUG["Plugin System<br/><small>PluginDiscovery<br/>3 entry-point groups</small>"]
end

REST --> PIPE
QUEUES --> PIPE
PIPE --> AGT
PIPE --> REASON
PIPE --> VAL
@@ -213,15 +203,6 @@ classDiagram
+name: str
+validate(value) ValidationRuleResult
}
class QueueConsumer {
<<Protocol>>
+start()
+stop()
}
class QueueProducer {
<<Protocol>>
+publish(message)
}
class EmbeddingProtocol {
<<Protocol>>
+embed(texts) EmbeddingResult
@@ -265,12 +246,6 @@ classDiagram
ValidationRule <|.. RangeRule
ValidationRule <|.. EnumRule
ValidationRule <|.. CustomRule
QueueConsumer <|.. KafkaAgentConsumer
QueueConsumer <|.. RabbitMQAgentConsumer
QueueConsumer <|.. RedisAgentConsumer
QueueProducer <|.. KafkaAgentProducer
QueueProducer <|.. RabbitMQAgentProducer
QueueProducer <|.. RedisAgentProducer
EmbeddingProtocol <|.. BaseEmbedder
VectorStoreProtocol <|.. BaseVectorStore
```
@@ -346,8 +321,9 @@ classDiagram
tools, and reasoning steps. `FireflyMetrics` records tokens (total, prompt,
completion), latency, cost, errors, and reasoning depth via the OTel metrics API.
`FireflyEvents` emits structured log records. `@traced` and `@metered` decorators
instrument any function with one line. `configure_exporters` sets up OTLP or
console exporters. `UsageTracker` automatically records token usage, cost
instrument any function with one line. The framework emits model and agent
telemetry purely through the OpenTelemetry API; the host application owns OTel
SDK and exporter configuration. `UsageTracker` automatically records token usage, cost
estimates, and latency for every agent run, reasoning step, and pipeline
execution. `CostCalculator` supports a built-in static price table and optional
`genai-prices` integration. Budget enforcement logs warnings when configurable
@@ -370,14 +346,11 @@ classDiagram
`EvalDataset` loads/saves test cases from JSON. `ModelComparison` runs the
same prompts across multiple agents for side-by-side analysis.

- **Exposure** — `create_agentic_app()` produces a FastAPI application with
auto-generated `POST /agents/{name}/run` endpoints, SSE streaming via
`sse_stream`, health/readiness/liveness checks, CORS and request-ID middleware,
and multimodal input support. Queue consumers (`KafkaAgentConsumer`,
`RabbitMQAgentConsumer`, `RedisAgentConsumer`) route messages to agents.
Queue producers (`KafkaAgentProducer`, `RabbitMQAgentProducer`,
`RedisAgentProducer`) publish results back. `QueueRouter` provides
pattern-based message routing across agents.
> **Optional developer tooling.** `fireflyframework_agentic.experiments` (A/B
> experiments) and `fireflyframework_agentic.lab` (offline evaluation /
> benchmarking) are leaf modules — nothing in the core imports them and they add
> no third-party dependencies. Import them only if you run experiments or
> evaluations; agent-building consumers can ignore them.

- **Embeddings** — `EmbeddingProtocol` (duck-typed) and `BaseEmbedder`
(inheritance with auto-batching) provide provider-agnostic text embedding.
Expand Down Expand Up @@ -425,17 +398,12 @@ classDiagram

**Optional dependencies** (installed via extras):

- `[rest]` — [FastAPI](https://fastapi.tiangolo.com/) `>=0.115.0`, [Uvicorn](https://www.uvicorn.org/) `>=0.34.0`, [sse-starlette](https://github.com/sysid/sse-starlette) `>=2.0.0`
- `[kafka]` — [aiokafka](https://aiokafka.readthedocs.io/) `>=0.12.0`
- `[rabbitmq]` — [aio-pika](https://aio-pika.readthedocs.io/) `>=9.5.0`
- `[redis]` — [redis-py](https://redis-py.readthedocs.io/) `>=5.2.0`
- `[costs]` — [genai-prices](https://pypi.org/project/genai-prices/) for up-to-date LLM pricing data
- `[queues]` — All queue backends (Kafka + RabbitMQ + Redis)
- `[openai-embeddings]` — [openai](https://github.com/openai/openai-python) `>=1.0.0` for OpenAI/Azure embeddings
- `[vectorstores-chroma]` — [chromadb](https://www.trychroma.com/) `>=0.5.0`
- `[vectorstores-pinecone]` — [pinecone](https://www.pinecone.io/) `>=5.0.0`
- `[vectorstores-qdrant]` — [qdrant-client](https://qdrant.tech/) `>=1.12.0`
- `[all]` — Everything (REST + queues + embeddings + vector stores + costs + security + HTTP)
- `[all]` — Everything (embeddings + vector stores + costs + security + HTTP)
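
The README states that extras like these are imported lazily inside factory functions. A rough sketch of that pattern, using only `importlib` from the standard library, might look like the following. `load_optional` and `create_vector_store` are hypothetical helpers written for illustration; they are not the framework's actual factories.

```python
import importlib
from types import ModuleType


def load_optional(module_name: str, extra: str) -> ModuleType:
    """Import an optional dependency on first use, with an actionable error."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise ImportError(
            f"'{module_name}' is required for this feature; "
            f"install the '{extra}' extra."
        ) from exc


def create_vector_store(backend: str) -> object:
    """Hypothetical factory: the heavy import happens only when it is called."""
    if backend == "chroma":
        chromadb = load_optional("chromadb", "vectorstores-chroma")
        # A real factory would construct and return a client here.
        return chromadb
    raise ValueError(f"unknown backend: {backend}")
```

With this shape, importing the package never fails for a missing extra; the error surfaces only when the caller asks for a backend whose dependency is absent.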

**LLM provider keys** (at least one):

@@ -490,11 +458,6 @@ uv sync --all-extras # or: pip install -e ".[all]"

| Extra | What it adds | When you need it |
|---|---|---|
| `rest` | FastAPI, Uvicorn, SSE | Exposing agents as REST endpoints |
| `kafka` | aiokafka | Consuming/producing via Apache Kafka |
| `rabbitmq` | aio-pika | Consuming/producing via RabbitMQ |
| `redis` | redis-py | Consuming/producing via Redis Pub/Sub |
| `queues` | All of the above | Any message queue integration |
| `postgres` | asyncpg, SQLAlchemy | PostgreSQL memory persistence |
| `mongodb` | motor, pymongo | MongoDB memory persistence |
| `security` | PyJWT, cryptography | RBAC, encryption, JWT auth |
@@ -657,34 +620,6 @@ results = await store.search_text("machine learning languages", top_k=1)
print(results[0].document.text) # Python is great for AI
```

### 9. Expose via REST

```python
from fireflyframework_agentic.exposure.rest import create_agentic_app

app = create_agentic_app(title="My GenAI Service")
# uvicorn myapp:app --reload
```

### 10. Expose via Queues (Consumer)

```python
from fireflyframework_agentic.exposure.queues.kafka import KafkaAgentConsumer

consumer = KafkaAgentConsumer("assistant", topic="requests", bootstrap_servers="localhost:9092")
await consumer.start()
```

### 11. Publish via Queues (Producer)

```python
from fireflyframework_agentic.exposure.queues.kafka import KafkaAgentProducer

producer = KafkaAgentProducer(topic="results", bootstrap_servers="localhost:9092")
await producer.publish({"agent": "assistant", "output": "Done processing."})
await producer.close()
```

## Using in Jupyter Notebooks

firefly-agentic works seamlessly in Jupyter notebooks and JupyterLab.
@@ -779,7 +714,7 @@ pipeline. Start here if you want to learn the framework thoroughly.
**[docs/use-case-idp.md](docs/use-case-idp.md)** is a focused walkthrough of building a
7-phase IDP pipeline that ingests, splits, classifies, extracts, validates, assembles,
and explains data from corporate documents — using agents, reasoning, document splitting,
content processing, validation, explainability, pipelines, and REST exposure.
content processing, validation, explainability, and pipelines.

### Module Reference

@@ -801,8 +736,6 @@ Detailed guides for each module:
- [Explainability](docs/explainability.md) — Decision recording, audit trails, reports
- [Experiments](docs/experiments.md) — A/B testing, variant comparison
- [Lab](docs/lab.md) — Benchmarks, datasets, evaluators
- [Exposure REST](docs/exposure-rest.md) — FastAPI integration, SSE streaming
- [Exposure Queues](docs/exposure-queues.md) — Kafka, RabbitMQ, Redis integration
- Studio — moved to [fireflyframework-agentic-studio](https://github.com/fireflyframework/fireflyframework-agentic-studio)
---

21 changes: 9 additions & 12 deletions docs/README.md
@@ -8,9 +8,9 @@ Copyright 2026 Firefly Software Foundation. Licensed under the Apache License 2.
---

**fireflyframework-agentic** is the production-grade GenAI metaframework built on
[Pydantic AI](https://ai.pydantic.dev/). It extends the engine with six composable
[Pydantic AI](https://ai.pydantic.dev/). It extends the engine with composable
layers — from core configuration through agent management, intelligent reasoning,
experimentation, pipeline orchestration, and service exposure — so that every concern
experimentation, and pipeline orchestration — so that every concern
has a dedicated, protocol-driven module.

---
@@ -28,14 +28,14 @@ has a dedicated, protocol-driven module.

## Documentation Map

The framework is organised into six layers. Each layer depends only on the layers
The framework is organised into layered modules. Each layer depends only on the layers
below it, keeping the dependency graph acyclic and each module independently testable.

### Core Layer

| | |
|---|---|
| **[Architecture](architecture.md)** | Design principles, six-layer model, protocol hierarchy, dependency flow |
| **[Architecture](architecture.md)** | Design principles, layered model, protocol hierarchy, dependency flow |

### Agent Layer

@@ -82,19 +82,16 @@ below it, keeping the dependency graph acyclic and each module independently tes
| **[Experiments](experiments.md)** | `Experiment`, `Variant`, `ExperimentRunner`, `ExperimentTracker`, `VariantComparator` |
| **[Lab](lab.md)** | `LabSession`, `Benchmark`, `EvalOrchestrator`, `EvalDataset`, `ModelComparison` |

> **Optional developer tooling.** `experiments` and `lab` are leaf modules — nothing
> in the core imports them and they add no third-party dependencies. Import them only
> if you run experiments or evaluations; agent-building consumers can ignore them.

### Orchestration Layer

| | |
|---|---|
| **[Pipeline](pipeline.md)** | `DAG`, `PipelineEngine`, `PipelineBuilder`, step types, parallel execution, retries |

### Exposure Layer

| | |
|---|---|
| **[REST Exposure](exposure-rest.md)** | `create_agentic_app()`, auto-generated routes, SSE streaming, WebSocket, auth middleware, conversation CRUD, rate limiting, health checks |
| **[Queue Exposure](exposure-queues.md)** | Kafka, RabbitMQ, Redis consumers/producers, `QueueRouter` |

### Studio

Studio (visual IDE, project API, scheduling, tunnel exposure, BPM tutorial)
Expand All @@ -109,7 +106,7 @@ lives in a separate repository:
every concept from zero to expert through a real-world **Intelligent Document
Processing** pipeline. It covers configuration, agents, tools, prompts, reasoning,
content processing, memory, validation, pipelines, observability, explainability,
experiments, lab, REST and queue exposure, deployment, and advanced patterns.
experiments, lab, deployment, and advanced patterns.

---

9 changes: 0 additions & 9 deletions docs/agents.md
@@ -909,15 +909,6 @@ async with await agent.run_stream("Question", streaming_mode="incremental") as s
print(token, end="", flush=True)
```

### REST API Integration

The framework's REST API exposes both streaming modes:

- **`POST /agents/{name}/stream`** — Buffered streaming (SSE)
- **`POST /agents/{name}/stream/incremental`** — Incremental streaming (SSE)

See [REST API Guide](exposure-rest.md) for details.

---

## Run Timeout