diff --git a/README.md b/README.md
index 3e838ed0..bea66003 100644
--- a/README.md
+++ b/README.md
@@ -4,7 +4,7 @@
[](https://github.com/fireflyframework/fireflyframework-genai/actions/workflows/ci.yml)
[](https://www.python.org/downloads/)
[](LICENSE)
-[]()
+[]()
[](https://docs.astral.sh/ruff/)
Copyright 2026 Firefly Software Solutions Inc. Licensed under the Apache License 2.0.
@@ -63,11 +63,12 @@ You write your business logic; the framework provides the architecture.
## Key Principles
1. **Protocol-driven contracts** — Every extension point is defined as a
- `@runtime_checkable` `Protocol` or abstract base class. The framework ships eleven
+ `@runtime_checkable` `Protocol` or abstract base class. The framework ships thirteen
protocols (`AgentLike`, `ToolProtocol`, `GuardProtocol`, `ReasoningPattern`,
`DelegationStrategy`, `StepExecutor`, `CompressionStrategy`, `MemoryStore`,
- `ValidationRule`, `Chunker`, `QueueConsumer` / `QueueProducer`) so you can swap or
- extend any component without modifying framework internals.
+ `ValidationRule`, `Chunker`, `EmbeddingProtocol`, `VectorStoreProtocol`,
+ `QueueConsumer` / `QueueProducer`) so you can swap or extend any component
+ without modifying framework internals.
2. **Convention over configuration** — Sensible defaults everywhere.
`FireflyGenAIConfig` is a Pydantic Settings singleton that reads from environment
@@ -81,9 +82,10 @@ You write your business logic; the framework provides the architecture.
dependency graph acyclic and each module independently testable.
4. **Optional dependencies** — Heavy libraries (`fastapi`, `aiokafka`, `aio-pika`,
- `redis`) are declared as pip extras (`[rest]`, `[studio]`, `[kafka]`, `[rabbitmq]`,
- `[redis]`, `[all]`). The core framework imports them lazily inside factory
- functions so that you install only what your deployment requires.
+ `redis`, `chromadb`, `pinecone`, `openai`) are declared as pip extras (`[rest]`,
+ `[studio]`, `[kafka]`, `[rabbitmq]`, `[redis]`, `[openai-embeddings]`,
+ `[vectorstores-chroma]`, `[all]`). The core framework imports them lazily inside
+ factory functions so that you install only what your deployment requires.
---
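The lazy-import idea in principle 4 can be sketched roughly as follows. This is a minimal illustration, not the framework's code: `require_extra` and `make_json_codec` are hypothetical names, and the standard-library `json` module stands in for a heavy optional dependency such as `redis`.

```python
import importlib
from types import ModuleType


def require_extra(module_name: str, extra: str) -> ModuleType:
    """Import an optional dependency on first use, or explain how to install it."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        # Re-raise with a hint that names the pip extra to install.
        raise ImportError(
            f"'{module_name}' is required for this feature; "
            f"install the '[{extra}]' extra to enable it."
        ) from exc


def make_json_codec() -> ModuleType:
    # Hypothetical factory: the import happens here, not at package import time.
    # `json` is only a placeholder for a heavy optional library.
    return require_extra("json", extra="example")
```

With this shape, importing the package never touches the optional library; only the call to the factory does, and a missing dependency fails with a message that points at the extra.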
@@ -97,7 +99,12 @@ graph TD
end
subgraph Orchestration Layer
- PIPE["Pipeline / DAG Engine<br/>DAG · DAGNode · DAGEdge<br/>PipelineEngine · PipelineBuilder<br/>AgentStep · ReasoningStep · CallableStep<br/>FanOutStep · FanInStep"]
+ PIPE["Pipeline / DAG Engine<br/>DAG · DAGNode · DAGEdge<br/>PipelineEngine · PipelineBuilder<br/>AgentStep · ReasoningStep · CallableStep<br/>FanOutStep · FanInStep<br/>EmbeddingStep · RetrievalStep"]
+ end
+
+ subgraph Embeddings / Vector Stores
+ EMB["Embeddings<br/>EmbeddingProtocol · BaseEmbedder<br/>OpenAI · Azure · Cohere · Google<br/>Mistral · Voyage · Bedrock · Ollama<br/>EmbedderRegistry · similarity"]
+ VS["Vector Stores<br/>VectorStoreProtocol · BaseVectorStore<br/>InMemory · ChromaDB · Pinecone · Qdrant<br/>VectorStoreRegistry · search_text"]
end
subgraph Experimentation Layer
@@ -132,6 +139,11 @@ graph TD
PIPE --> AGT
PIPE --> REASON
PIPE --> VAL
+ PIPE --> EMB
+ PIPE --> VS
+ VS --> EMB
+ EMB --> CFG
+ VS --> CFG
EXP --> AGT
LAB --> EXP
REASON --> AGT
@@ -209,6 +221,18 @@ classDiagram
<>
+publish(message)
}
+ class EmbeddingProtocol {
+ <>
+ +embed(texts) EmbeddingResult
+ +embed_one(text) list~float~
+ }
+ class VectorStoreProtocol {
+ <>
+ +upsert(documents, namespace)
+ +search(query_embedding, top_k) list~SearchResult~
+ +search_text(query, top_k) list~SearchResult~
+ +delete(ids, namespace)
+ }
AgentLike <|.. FireflyAgent
AgentLike <|.. pydantic_ai.Agent
@@ -246,6 +270,8 @@ classDiagram
QueueProducer <|.. KafkaAgentProducer
QueueProducer <|.. RabbitMQAgentProducer
QueueProducer <|.. RedisAgentProducer
+ EmbeddingProtocol <|.. BaseEmbedder
+ VectorStoreProtocol <|.. BaseVectorStore
```
---
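The duck-typed contracts in the class diagram above rely on `@runtime_checkable` protocols. A minimal sketch of how such a check behaves, assuming a simplified synchronous `embed_one` method: `EmbedsText`, `LengthEmbedder`, and `NotAnEmbedder` are hypothetical names for illustration, not the framework's classes.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class EmbedsText(Protocol):
    """Hypothetical stand-in for an embedding contract."""

    def embed_one(self, text: str) -> list[float]: ...


class LengthEmbedder:
    """Toy embedder: no base class, it only provides the expected method."""

    def embed_one(self, text: str) -> list[float]:
        return [float(len(text)), float(text.count(" "))]


class NotAnEmbedder:
    """Has a similar purpose but the wrong method name."""

    def encode(self, text: str) -> list[float]:
        return [0.0]


def describe(candidate: object) -> str:
    # isinstance() against a runtime_checkable Protocol checks only that the
    # method names exist; it does not verify signatures or return types.
    return "accepted" if isinstance(candidate, EmbedsText) else "rejected"
```

Because the check is structural, a third-party class can satisfy the contract without importing anything from the framework, which is the usual motivation for pairing a protocol with an optional base class.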
@@ -352,6 +378,24 @@ classDiagram
`RedisAgentProducer`) publish results back. `QueueRouter` provides
pattern-based message routing across agents.
+- **Embeddings** — `EmbeddingProtocol` (duck-typed) and `BaseEmbedder`
+ (inheritance with auto-batching) provide provider-agnostic text embedding.
+ Eight providers ship out of the box: **OpenAI**, **Azure OpenAI**, **Cohere**,
+ **Google**, **Mistral**, **Voyage AI**, **AWS Bedrock**, and **Ollama** (local).
+ `EmbedderRegistry` manages named instances. Built-in similarity utilities
+ (`cosine_similarity`, `euclidean_distance`, `dot_product`) compare vectors
+ without external dependencies. Configure embedding behavior through
+ `embedding_batch_size`, `embedding_max_retries`, and `default_embedding_model`.
+
+- **Vector Stores** — `VectorStoreProtocol` and `BaseVectorStore` provide
+ pluggable storage and retrieval with four backends: **InMemoryVectorStore**
+ (zero-dependency, brute-force cosine), **ChromaDB**, **Pinecone**, and **Qdrant**.
+ Documents upserted without pre-computed vectors are embedded automatically. `search_text`
+ embeds a query string and searches in one call. Namespace scoping isolates
+ document collections. `VectorStoreRegistry` manages named instances.
+ `EmbeddingStep` and `RetrievalStep` integrate directly into DAG pipelines
+ for RAG workflows.
+
- **Studio** — `firefly studio` launches a browser-based visual IDE for
building agent pipelines. Drag and connect Agent, Tool, Reasoning, and
Condition nodes on an interactive canvas. The Code tab generates Python
@@ -388,7 +432,11 @@ classDiagram
- `[redis]` — [redis-py](https://redis-py.readthedocs.io/) `>=5.2.0`
- `[costs]` — [genai-prices](https://pypi.org/project/genai-prices/) for up-to-date LLM pricing data
- `[queues]` — All queue backends (Kafka + RabbitMQ + Redis)
-- `[all]` — Everything (REST + all queues + costs + security + HTTP)
+- `[openai-embeddings]` — [openai](https://github.com/openai/openai-python) `>=1.0.0` for OpenAI/Azure embeddings
+- `[vectorstores-chroma]` — [chromadb](https://www.trychroma.com/) `>=0.5.0`
+- `[vectorstores-pinecone]` — [pinecone](https://www.pinecone.io/) `>=5.0.0`
+- `[vectorstores-qdrant]` — [qdrant-client](https://qdrant.tech/) `>=1.12.0`
+- `[all]` — Everything (REST + queues + embeddings + vector stores + costs + security + HTTP)
**LLM provider keys** (at least one):
@@ -454,6 +502,17 @@ uv sync --all-extras # or: pip install -e ".[all]"
| `security` | PyJWT, cryptography | RBAC, encryption, JWT auth |
| `http` | httpx | HTTP connection pooling for tools |
| `costs` | genai-prices | Up-to-date LLM pricing data |
+| `openai-embeddings` | openai | OpenAI / Azure text embeddings |
+| `cohere-embeddings` | cohere | Cohere text embeddings |
+| `google-embeddings` | google-generativeai | Google text embeddings |
+| `mistral-embeddings` | mistralai | Mistral text embeddings |
+| `voyage-embeddings` | voyageai | Voyage AI text embeddings |
+| `azure-embeddings` | openai | Azure OpenAI text embeddings |
+| `bedrock-embeddings` | boto3 | AWS Bedrock text embeddings |
+| `ollama-embeddings` | httpx | Ollama local text embeddings |
+| `vectorstores-chroma` | chromadb | ChromaDB vector store backend |
+| `vectorstores-pinecone` | pinecone | Pinecone vector store backend |
+| `vectorstores-qdrant` | qdrant-client | Qdrant vector store backend |
| `all` | Everything above | Full deployment with all integrations |
### Verify Installation
@@ -587,7 +646,27 @@ pipeline = (
result = await pipeline.run(inputs="Process this document")
```
-### 8. Expose via REST
+### 8. Embed and Search (RAG)
+
+```python
+from fireflyframework_genai.embeddings.providers import OpenAIEmbedder
+from fireflyframework_genai.vectorstores import InMemoryVectorStore, VectorDocument
+
+embedder = OpenAIEmbedder(model="text-embedding-3-small")
+store = InMemoryVectorStore(embedder=embedder)
+
+# Upsert documents (auto-embedded)
+await store.upsert([
+ VectorDocument(id="1", text="Python is great for AI"),
+ VectorDocument(id="2", text="Rust is fast and safe"),
+])
+
+# Search by text
+results = await store.search_text("machine learning languages", top_k=1)
+print(results[0].document.text)  # likely "Python is great for AI" (depends on the embedding model)
+```
+
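To make "brute-force cosine" concrete, here is a standalone sketch of what an in-memory search of this kind might do. It is an assumption about the general technique, not the `InMemoryVectorStore` implementation; `brute_force_search` and the two-dimensional toy vectors are hypothetical, and `cosine_similarity` here is a local reimplementation rather than the framework utility of the same name.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def brute_force_search(
    query: list[float], vectors: dict[str, list[float]], top_k: int = 1
) -> list[tuple[str, float]]:
    """Score every stored vector against the query and keep the best top_k."""
    scored = [(doc_id, cosine_similarity(query, vec)) for doc_id, vec in vectors.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy vectors standing in for real embeddings (illustrative values only).
vectors = {"1": [0.9, 0.1], "2": [0.1, 0.9]}
hits = brute_force_search([1.0, 0.0], vectors, top_k=1)
```

Every query scans all stored vectors, so the cost grows linearly with the collection size. That is fine for tests and small corpora, and it is the usual reason to move to a dedicated backend for larger ones.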
+### 9. Expose via REST
```python
from fireflyframework_genai.exposure.rest import create_genai_app
@@ -596,7 +675,7 @@ app = create_genai_app(title="My GenAI Service")
# uvicorn myapp:app --reload
```
-### 9. Expose via Queues (Consumer)
+### 10. Expose via Queues (Consumer)
```python
from fireflyframework_genai.exposure.queues.kafka import KafkaAgentConsumer
@@ -605,7 +684,7 @@ consumer = KafkaAgentConsumer("assistant", topic="requests", bootstrap_servers="
await consumer.start()
```
-### 10. Publish via Queues (Producer)
+### 11. Publish via Queues (Producer)
```python
from fireflyframework_genai.exposure.queues.kafka import KafkaAgentProducer
@@ -724,6 +803,8 @@ Detailed guides for each module:
- [Content](docs/content.md) — Chunking, compression, batch processing
- [Memory](docs/memory.md) — Conversation history, working memory, storage backends
- [Validation](docs/validation.md) — Rules, QoS guards, output reviewer
+- [Embeddings](docs/embeddings.md) — 8 providers, auto-batching, similarity, registry
+- [Vector Stores](docs/vectorstores.md) — 4 backends, auto-embedding, search_text, namespaces
- [Pipeline](docs/pipeline.md) — DAG orchestrator, parallel execution, retries
- [Observability](docs/observability.md) — Tracing, metrics, events
- [Explainability](docs/explainability.md) — Decision recording, audit trails, reports
@@ -743,7 +824,7 @@ uv sync --all-extras
```
```bash
-uv run pytest # Run 367+ tests
+uv run pytest # Run 1383+ tests
uv run ruff check src/ tests/ # Lint
uv run pyright src/ # Type check
```