fireflyframework · ancongui · May 31, 2026
diff --git a/.github/workflows/nightly.yml b/.github/workflows/nightly.yml
@@ -19,7 +19,7 @@ jobs:
       - uses: actions/setup-python@v6
         with:
           python-version: '3.13'
-      - run: uv sync --extra dev --extra rest --extra binary --extra vectorstores-sqlite-vec --extra openai-embeddings
+      - run: uv sync --extra dev --extra binary --extra vectorstores-sqlite-vec --extra vectorstores-pgvector --extra openai-embeddings
       - run: uv run pytest --cov --cov-report=term-missing --durations=50
 
   report-failure:

diff --git a/.github/workflows/pr-gate.yml b/.github/workflows/pr-gate.yml
@@ -57,7 +57,7 @@ jobs:
       - uses: actions/setup-python@v6
         with:
           python-version: '3.13'
-      - run: uv sync --extra dev --extra rest --extra binary --extra vectorstores-sqlite-vec --extra openai-embeddings
+      - run: uv sync --extra dev --extra binary --extra vectorstores-sqlite-vec --extra openai-embeddings
       - run: uv run pyright
 
   test:
@@ -72,7 +72,7 @@ jobs:
       - uses: actions/setup-python@v6
         with:
           python-version: '3.13'
-      - run: uv sync --extra dev --extra rest --extra binary --extra vectorstores-sqlite-vec --extra openai-embeddings
+      - run: uv sync --extra dev --extra binary --extra vectorstores-sqlite-vec --extra vectorstores-pgvector --extra openai-embeddings
       - run: uv run pytest -m "not nightly" --cov --cov-report=term-missing
 
   build:

diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -7,6 +7,33 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 
 Copyright 2026 Firefly Software Foundation. Licensed under the Apache License 2.0.
 
+## [26.05.33] - 2026-05-31
+
+### Removed
+
+- **BREAKING — REST/queue exposure layer.** Deleted the `fireflyframework_agentic.exposure`
+  package (FastAPI app factory, HTTP/WS controllers, health probes, SSE, CORS/rate-limit/auth
+  middleware, and Kafka/RabbitMQ/Redis consumer/producer hosts), the `rest`/`kafka`/`rabbitmq`/
+  `redis`/`queues` extras, the `ExposureError`/`QueueConnectionError` exceptions, and the
+  REST-serving config fields `auth_api_keys`/`auth_bearer_tokens`/`cors_allowed_origins`.
+  Serving/hosting is now owned by the consuming service. The framework is a pure in-process
+  library: it serves no port and consumes no broker.
+- **BREAKING — service/infra observability.** Removed `observability.configure_exporters`
+  (global OTel SDK provider/exporter wiring), the W3C trace-context propagation helpers
+  (`inject_trace_context`/`extract_trace_context`/`get_trace_context`/`set_trace_context`/
+  `trace_context_scope`), the `WebhookSink`, and the `otlp_endpoint` config field. The
+  framework still emits model/agent spans/metrics via the OpenTelemetry API; configuring the
+  SDK/exporters and cross-service trace propagation is now the host's responsibility.
+- **BREAKING — inbound RBAC auth.** Removed `security.RBACManager`/`require_permission`, the
+  `rbac_enabled`/`rbac_jwt_secret`/`rbac_multi_tenant` config fields, and the `pyjwt`
+  dependency from the `security` extra (`cryptography` stays for `EncryptedMemoryStore`).
+  Inbound-request authorization is a hosting concern owned by the service.
+
+### Changed
+
+- **`experiments`/`lab` documented as optional** leaf developer-tooling modules (no code or
+  dependency change; they were already not imported by the core).
+
 ## [26.05.32] - 2026-05-31
 
 ### Fixed

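The changelog entry above moves serving out of the framework and into the consuming service. As a rough illustration of what that can look like on the host side, here is a minimal sketch using only the Python standard library. `run_agent`, `handle_run`, `RunHandler`, and `serve` are hypothetical names invented for this note; `run_agent` stands in for whatever agent call the host actually makes, and none of this is framework API.

```python
import asyncio
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


async def run_agent(prompt: str) -> str:
    """Hypothetical stand-in for the host's real agent call."""
    return f"echo: {prompt}"


def handle_run(body: bytes) -> tuple[int, bytes]:
    """Turn a JSON request body into an (HTTP status, JSON response body) pair."""
    try:
        payload = json.loads(body)
        prompt = payload["prompt"]
    except (ValueError, KeyError, TypeError):
        # Malformed JSON, a missing key, or a non-object payload all land here.
        return 400, json.dumps({"error": "expected JSON object with 'prompt'"}).encode()
    output = asyncio.run(run_agent(prompt))
    return 200, json.dumps({"output": output}).encode()


class RunHandler(BaseHTTPRequestHandler):
    """HTTP adapter owned by the host; the library itself opens no port."""

    def do_POST(self) -> None:
        length = int(self.headers.get("Content-Length", 0))
        status, response = handle_run(self.rfile.read(length))
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(response)))
        self.end_headers()
        self.wfile.write(response)


def serve(port: int = 8000) -> None:
    """Blocking server loop; the host decides the port, auth, and lifecycle."""
    HTTPServer(("127.0.0.1", port), RunHandler).serve_forever()
```

Keeping the request logic in `handle_run` and the transport in `RunHandler` means the same function could sit behind a different web framework or a queue consumer, which is the kind of choice the changelog now leaves to the service.
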
diff --git a/README.md b/README.md
@@ -40,65 +40,57 @@ Copyright 2026 Firefly Software Foundation. Licensed under the Apache License 2.
 model-agnostic agents with structured output. But a production GenAI system demands
 far more than a single agent call. You need to orchestrate multi-step reasoning,
 validate and retry LLM outputs against schemas, manage conversation memory across
-turns, observe every call with traces and metrics, run A/B experiments to compare
-models, and expose the whole thing over REST or message queues — all without coupling
-your domain logic to infrastructure concerns.
+turns, observe every call with traces and metrics, and run A/B experiments to compare
+models — all without coupling your domain logic to infrastructure concerns.
 
 **fireflyframework-agentic is the production framework built on top of Pydantic AI.**
-It extends the engine with six composable layers — from core configuration through
-agent management, intelligent reasoning, experimentation, pipeline orchestration,
-and service exposure — so that every concern has a dedicated, protocol-driven module.
+It extends the engine with composable layers — from core configuration through
+agent management, intelligent reasoning, experimentation, and pipeline orchestration —
+so that every concern has a dedicated, protocol-driven module.
 You write your business logic; the framework provides the architecture.
 
 **What "metaframework" means in practice:**
 
 - You keep Pydantic AI's familiar `Agent`, `Tool`, and `RunContext` APIs unchanged.
 - The framework wraps them with lifecycle hooks, registries, delegation routers,
-  memory managers, reasoning patterns, validation loops, DAG pipelines, and exposure
-  endpoints — all optional, all composable, all swappable through Python protocols.
-- No vendor lock-in: switch models, swap memory backends, or replace the REST layer
+  memory managers, reasoning patterns, validation loops, and DAG pipelines — all
+  optional, all composable, all swappable through Python protocols.
+- No vendor lock-in: switch models, swap memory backends, or replace components
   without touching your agent code.
 
 ---
 
 ## Key Principles
 
 1. **Protocol-driven contracts** — Every extension point is defined as a
-   `@runtime_checkable` `Protocol` or abstract base class. The framework ships thirteen
+   `@runtime_checkable` `Protocol` or abstract base class. The framework ships twelve
    protocols (`AgentLike`, `ToolProtocol`, `GuardProtocol`, `ReasoningPattern`,
    `DelegationStrategy`, `StepExecutor`, `CompressionStrategy`, `MemoryStore`,
-   `ValidationRule`, `Chunker`, `EmbeddingProtocol`, `VectorStoreProtocol`,
-   `QueueConsumer` / `QueueProducer`) so you can swap or extend any component
-   without modifying framework internals.
+   `ValidationRule`, `Chunker`, `EmbeddingProtocol`, `VectorStoreProtocol`) so you can
+   swap or extend any component without modifying framework internals.
 
 2. **Convention over configuration** — Sensible defaults everywhere.
    `FireflyAgenticConfig` is a Pydantic Settings singleton that reads from environment
    variables prefixed with `FIREFLY_AGENTIC_` and `.env` files. One config object
    governs model defaults, retry counts, token limits, observability endpoints,
    memory backends, and validation thresholds — override only what you need.
 
-3. **Layered composition** — Six layers with strict top-down dependency flow:
-   **Core → Agent → Intelligence → Experimentation → Orchestration → Exposure**.
+3. **Layered composition** — Layers with strict top-down dependency flow:
+   **Core → Agent → Intelligence → Experimentation → Orchestration**.
    Higher layers depend on lower layers but never the reverse, keeping the
    dependency graph acyclic and each module independently testable.
 
-4. **Optional dependencies** — Heavy libraries (`fastapi`, `aiokafka`, `aio-pika`,
-   `redis`, `chromadb`, `pinecone`, `openai`) are declared as pip extras (`[rest]`,
-   `[kafka]`, `[rabbitmq]`, `[redis]`, `[openai-embeddings]`,
-   `[vectorstores-chroma]`, `[all]`). The core framework imports them lazily inside
-   factory functions so that you install only what your deployment requires.
+4. **Optional dependencies** — Heavy libraries (`chromadb`, `pinecone`, `openai`,
+   `asyncpg`) are declared as pip extras (`[openai-embeddings]`,
+   `[vectorstores-chroma]`, `[postgres]`, `[all]`). The core framework imports them
+   lazily inside factory functions so that you install only what your deployment requires.
 
 ---
 
 ## Architecture at a Glance
 
 ```mermaid
 graph TD
-    subgraph Exposure Layer
-        REST["REST API<br/><small>create_agentic_app · SSE streaming<br/>health · middleware · router</small>"]
-        QUEUES["Message Queues<br/><small>Kafka · RabbitMQ · Redis<br/>consumers · producers · QueueRouter</small>"]
-    end
-
     subgraph Orchestration Layer
         PIPE["Pipeline / DAG Engine<br/><small>DAG · DAGNode · DAGEdge<br/>PipelineEngine · PipelineBuilder<br/>AgentStep · ReasoningStep · CallableStep<br/>FanOutStep · FanInStep<br/>EmbeddingStep · RetrievalStep</small>"]
     end
@@ -116,7 +108,7 @@ graph TD
     subgraph Intelligence Layer
         REASON["Reasoning Patterns<br/><small>ReAct · CoT · PlanAndExecute<br/>Reflexion · ToT · GoalDecomposition<br/>ReasoningPipeline</small>"]
         VAL["Validation & QoS<br/><small>OutputReviewer · OutputValidator<br/>ConfidenceScorer · ConsistencyChecker<br/>GroundingChecker · 5 rule types</small>"]
-        OBS["Observability<br/><small>FireflyTracer · FireflyMetrics<br/>FireflyEvents · UsageTracker<br/>CostCalculator · @traced · @metered<br/>configure_exporters</small>"]
+        OBS["Observability<br/><small>FireflyTracer · FireflyMetrics<br/>FireflyEvents · UsageTracker<br/>CostCalculator · @traced · @metered</small>"]
         EXPL["Explainability<br/><small>TraceRecorder · ExplanationGenerator<br/>AuditTrail · ReportBuilder</small>"]
     end
 
@@ -135,8 +127,6 @@ graph TD
         PLUG["Plugin System<br/><small>PluginDiscovery<br/>3 entry-point groups</small>"]
     end
 
-    REST --> PIPE
-    QUEUES --> PIPE
     PIPE --> AGT
     PIPE --> REASON
     PIPE --> VAL
@@ -213,15 +203,6 @@ classDiagram
         +name: str
         +validate(value) ValidationRuleResult
     }
-    class QueueConsumer {
-        <<Protocol>>
-        +start()
-        +stop()
-    }
-    class QueueProducer {
-        <<Protocol>>
-        +publish(message)
-    }
     class EmbeddingProtocol {
         <<Protocol>>
         +embed(texts) EmbeddingResult
@@ -265,12 +246,6 @@ classDiagram
     ValidationRule <|.. RangeRule
     ValidationRule <|.. EnumRule
     ValidationRule <|.. CustomRule
-    QueueConsumer <|.. KafkaAgentConsumer
-    QueueConsumer <|.. RabbitMQAgentConsumer
-    QueueConsumer <|.. RedisAgentConsumer
-    QueueProducer <|.. KafkaAgentProducer
-    QueueProducer <|.. RabbitMQAgentProducer
-    QueueProducer <|.. RedisAgentProducer
     EmbeddingProtocol <|.. BaseEmbedder
     VectorStoreProtocol <|.. BaseVectorStore
 ```
@@ -346,8 +321,9 @@ classDiagram
   tools, and reasoning steps. `FireflyMetrics` records tokens (total, prompt,
   completion), latency, cost, errors, and reasoning depth via the OTel metrics API.
   `FireflyEvents` emits structured log records. `@traced` and `@metered` decorators
-  instrument any function with one line. `configure_exporters` sets up OTLP or
-  console exporters. `UsageTracker` automatically records token usage, cost
+  instrument any function with one line. The framework emits model and agent
+  telemetry purely through the OpenTelemetry API; the host application owns OTel
+  SDK and exporter configuration. `UsageTracker` automatically records token usage, cost
   estimates, and latency for every agent run, reasoning step, and pipeline
   execution. `CostCalculator` supports a built-in static price table and optional
   `genai-prices` integration. Budget enforcement logs warnings when configurable
@@ -370,14 +346,11 @@ classDiagram
   `EvalDataset` loads/saves test cases from JSON. `ModelComparison` runs the
   same prompts across multiple agents for side-by-side analysis.
 
-- **Exposure** — `create_agentic_app()` produces a FastAPI application with
-  auto-generated `POST /agents/{name}/run` endpoints, SSE streaming via
-  `sse_stream`, health/readiness/liveness checks, CORS and request-ID middleware,
-  and multimodal input support. Queue consumers (`KafkaAgentConsumer`,
-  `RabbitMQAgentConsumer`, `RedisAgentConsumer`) route messages to agents.
-  Queue producers (`KafkaAgentProducer`, `RabbitMQAgentProducer`,
-  `RedisAgentProducer`) publish results back. `QueueRouter` provides
-  pattern-based message routing across agents.
+  > **Optional developer tooling.** `fireflyframework_agentic.experiments` (A/B
+  > experiments) and `fireflyframework_agentic.lab` (offline evaluation /
+  > benchmarking) are leaf modules — nothing in the core imports them and they add
+  > no third-party dependencies. Import them only if you run experiments or
+  > evaluations; agent-building consumers can ignore them.
 
 - **Embeddings** — `EmbeddingProtocol` (duck-typed) and `BaseEmbedder`
   (inheritance with auto-batching) provide provider-agnostic text embedding.
@@ -425,17 +398,12 @@ classDiagram
 
 **Optional dependencies** (installed via extras):
 
-- `[rest]` — [FastAPI](https://fastapi.tiangolo.com/) `>=0.115.0`, [Uvicorn](https://www.uvicorn.org/) `>=0.34.0`, [sse-starlette](https://github.com/sysid/sse-starlette) `>=2.0.0`
-- `[kafka]` — [aiokafka](https://aiokafka.readthedocs.io/) `>=0.12.0`
-- `[rabbitmq]` — [aio-pika](https://aio-pika.readthedocs.io/) `>=9.5.0`
-- `[redis]` — [redis-py](https://redis-py.readthedocs.io/) `>=5.2.0`
 - `[costs]` — [genai-prices](https://pypi.org/project/genai-prices/) for up-to-date LLM pricing data
-- `[queues]` — All queue backends (Kafka + RabbitMQ + Redis)
 - `[openai-embeddings]` — [openai](https://github.com/openai/openai-python) `>=1.0.0` for OpenAI/Azure embeddings
 - `[vectorstores-chroma]` — [chromadb](https://www.trychroma.com/) `>=0.5.0`
 - `[vectorstores-pinecone]` — [pinecone](https://www.pinecone.io/) `>=5.0.0`
 - `[vectorstores-qdrant]` — [qdrant-client](https://qdrant.tech/) `>=1.12.0`
-- `[all]` — Everything (REST + queues + embeddings + vector stores + costs + security + HTTP)
+- `[all]` — Everything (embeddings + vector stores + costs + security + HTTP)
 
 **LLM provider keys** (at least one):
 
@@ -490,11 +458,6 @@ uv sync --all-extras # or: pip install -e ".[all]"
 
 | Extra | What it adds | When you need it |
 |---|---|---|
-| `rest` | FastAPI, Uvicorn, SSE | Exposing agents as REST endpoints |
-| `kafka` | aiokafka | Consuming/producing via Apache Kafka |
-| `rabbitmq` | aio-pika | Consuming/producing via RabbitMQ |
-| `redis` | redis-py | Consuming/producing via Redis Pub/Sub |
-| `queues` | All of the above | Any message queue integration |
 | `postgres` | asyncpg, SQLAlchemy | PostgreSQL memory persistence |
 | `mongodb` | motor, pymongo | MongoDB memory persistence |
 | `security` | PyJWT, cryptography | RBAC, encryption, JWT auth |
@@ -657,34 +620,6 @@ results = await store.search_text("machine learning languages", top_k=1)
 print(results[0].document.text)  # Python is great for AI
 ```
 
-### 9. Expose via REST
-
-```python
-from fireflyframework_agentic.exposure.rest import create_agentic_app
-
-app = create_agentic_app(title="My GenAI Service")
-# uvicorn myapp:app --reload
-```
-
-### 10. Expose via Queues (Consumer)
-
-```python
-from fireflyframework_agentic.exposure.queues.kafka import KafkaAgentConsumer
-
-consumer = KafkaAgentConsumer("assistant", topic="requests", bootstrap_servers="localhost:9092")
-await consumer.start()
-```
-
-### 11. Publish via Queues (Producer)
-
-```python
-from fireflyframework_agentic.exposure.queues.kafka import KafkaAgentProducer
-
-producer = KafkaAgentProducer(topic="results", bootstrap_servers="localhost:9092")
-await producer.publish({"agent": "assistant", "output": "Done processing."})
-await producer.close()
-```
-
 ## Using in Jupyter Notebooks
 
 firefly-agentic works seamlessly in Jupyter notebooks and JupyterLab.
@@ -779,7 +714,7 @@ pipeline. Start here if you want to learn the framework thoroughly.
 **[docs/use-case-idp.md](docs/use-case-idp.md)** is a focused walkthrough of building a
 7-phase IDP pipeline that ingests, splits, classifies, extracts, validates, assembles,
 and explains data from corporate documents — using agents, reasoning, document splitting,
-content processing, validation, explainability, pipelines, and REST exposure.
+content processing, validation, explainability, and pipelines.
 
 ### Module Reference
 
@@ -801,8 +736,6 @@ Detailed guides for each module:
 - [Explainability](docs/explainability.md) — Decision recording, audit trails, reports
 - [Experiments](docs/experiments.md) — A/B testing, variant comparison
 - [Lab](docs/lab.md) — Benchmarks, datasets, evaluators
-- [Exposure REST](docs/exposure-rest.md) — FastAPI integration, SSE streaming
-- [Exposure Queues](docs/exposure-queues.md) — Kafka, RabbitMQ, Redis integration
 - Studio — moved to [fireflyframework-agentic-studio](https://github.com/fireflyframework/fireflyframework-agentic-studio)
 ---
 

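The README's "protocol-driven contracts" principle rests on structural typing: a class satisfies a `@runtime_checkable` `Protocol` by having the right members, not by inheriting from it. The sketch below shows the mechanism with the standard `typing` module only. `Rule`, `NonEmptyRule`, `NotARule`, and `apply_rules` are illustrative names made up for this note, loosely modelled on the `name` / `validate(value)` shape in the class diagram; they are not the framework's own `ValidationRule`.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class Rule(Protocol):
    """Illustrative contract, not the framework's ValidationRule."""

    name: str

    def validate(self, value: object) -> bool: ...


class NonEmptyRule:
    """Satisfies Rule structurally; it does not inherit from Rule."""

    name = "non_empty"

    def validate(self, value: object) -> bool:
        return bool(value)


class NotARule:
    name = "broken"  # has the attribute but no validate() method


def apply_rules(value: object, rules: list[Rule]) -> dict[str, bool]:
    """Run every rule and report pass/fail keyed by rule name."""
    return {rule.name: rule.validate(value) for rule in rules}
```

Note that `isinstance(obj, Rule)` only checks that the members exist; it does not verify signatures or return types, so a static type checker is still the stronger guarantee.
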
diff --git a/docs/README.md b/docs/README.md
@@ -8,9 +8,9 @@ Copyright 2026 Firefly Software Foundation. Licensed under the Apache License 2.
 ---
 
 **fireflyframework-agentic** is the production-grade GenAI metaframework built on
-[Pydantic AI](https://ai.pydantic.dev/). It extends the engine with six composable
+[Pydantic AI](https://ai.pydantic.dev/). It extends the engine with composable
 layers — from core configuration through agent management, intelligent reasoning,
-experimentation, pipeline orchestration, and service exposure — so that every concern
+experimentation, and pipeline orchestration — so that every concern
 has a dedicated, protocol-driven module.
 
 ---
@@ -28,14 +28,14 @@ has a dedicated, protocol-driven module.
 
 ## Documentation Map
 
-The framework is organised into six layers. Each layer depends only on the layers
+The framework is organised into layered modules. Each layer depends only on the layers
 below it, keeping the dependency graph acyclic and each module independently testable.
 
 ### Core Layer
 
 | | |
 |---|---|
-| **[Architecture](architecture.md)** | Design principles, six-layer model, protocol hierarchy, dependency flow |
+| **[Architecture](architecture.md)** | Design principles, layered model, protocol hierarchy, dependency flow |
 
 ### Agent Layer
 
@@ -82,19 +82,16 @@ below it, keeping the dependency graph acyclic and each module independently tes
 | **[Experiments](experiments.md)** | `Experiment`, `Variant`, `ExperimentRunner`, `ExperimentTracker`, `VariantComparator` |
 | **[Lab](lab.md)** | `LabSession`, `Benchmark`, `EvalOrchestrator`, `EvalDataset`, `ModelComparison` |
 
+> **Optional developer tooling.** `experiments` and `lab` are leaf modules — nothing
+> in the core imports them and they add no third-party dependencies. Import them only
+> if you run experiments or evaluations; agent-building consumers can ignore them.
+
 ### Orchestration Layer
 
 | | |
 |---|---|
 | **[Pipeline](pipeline.md)** | `DAG`, `PipelineEngine`, `PipelineBuilder`, step types, parallel execution, retries |
 
-### Exposure Layer
-
-| | |
-|---|---|
-| **[REST Exposure](exposure-rest.md)** | `create_agentic_app()`, auto-generated routes, SSE streaming, WebSocket, auth middleware, conversation CRUD, rate limiting, health checks |
-| **[Queue Exposure](exposure-queues.md)** | Kafka, RabbitMQ, Redis consumers/producers, `QueueRouter` |
-
 ### Studio
 
 Studio (visual IDE, project API, scheduling, tunnel exposure, BPM tutorial)
@@ -109,7 +106,7 @@ lives in a separate repository:
 every concept from zero to expert through a real-world **Intelligent Document
 Processing** pipeline. It covers configuration, agents, tools, prompts, reasoning,
 content processing, memory, validation, pipelines, observability, explainability,
-experiments, lab, REST and queue exposure, deployment, and advanced patterns.
+experiments, lab, deployment, and advanced patterns.
 
 ---
 

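The documentation map states that each layer depends only on the layers below it. One way a project could check such a rule is to rank the layers and flag any dependency that points upward. This is a minimal sketch under that assumption: the layer names follow the order given in the README (Core, Agent, Intelligence, Experimentation, Orchestration), while `LAYERS`, `RANK`, and `upward_edges` are hypothetical helpers, not part of the framework or its test suite.

```python
# Lowest layer first, following the order stated in the README.
LAYERS = ["core", "agent", "intelligence", "experimentation", "orchestration"]
RANK = {name: i for i, name in enumerate(LAYERS)}


def upward_edges(deps: dict[str, set[str]]) -> list[tuple[str, str]]:
    """Return (importer, imported) pairs where a layer imports a layer above it."""
    return sorted(
        (src, dst)
        for src, targets in deps.items()
        for dst in targets
        if RANK[dst] > RANK[src]  # same-layer and downward edges are allowed
    )
```

If this function returns an empty list, every cross-layer edge points strictly downward, so no dependency cycle can span two layers.
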
diff --git a/docs/agents.md b/docs/agents.md
@@ -909,15 +909,6 @@ async with await agent.run_stream("Question", streaming_mode="incremental") as s
         print(token, end="", flush=True)
 ```
 
-### REST API Integration
-
-The framework's REST API exposes both streaming modes:
-
-- **`POST /agents/{name}/stream`** — Buffered streaming (SSE)
-- **`POST /agents/{name}/stream/incremental`** — Incremental streaming (SSE)
-
-See [REST API Guide](exposure-rest.md) for details.
-
 ---
 
 ## Run Timeout