Stop sending your sensitive documents to third-party services. FLAMEHAVEN FileSearch is a production-grade RAG search engine — BM25+hybrid retrieval, 34 file formats, multi-LLM (Gemini, OpenAI, Claude, Ollama) — running self-hosted in minutes, not days.
```bash
# Gemini (cloud) — one command, three minutes
docker run -d -p 8000:8000 -e GEMINI_API_KEY="your_key" flamehaven-filesearch:1.6.1

# Ollama — fully local, zero API cost (Gemma, Llama, Mistral, Qwen, Phi …)
# Step 1: pull a model → ollama pull gemma4:27b
docker run -d -p 8000:8000 \
  -e LLM_PROVIDER=ollama \
  -e LOCAL_MODEL=gemma4:27b \
  -e OLLAMA_BASE_URL=http://host.docker.internal:11434 \
  flamehaven-filesearch:1.6.1
```

- Production deployment in 3 minutes
- 100% self-hosted
- Free tier: 1,500 queries/month
| Capability | Detail |
|---|---|
| Search Modes | Keyword, semantic, and hybrid (BM25+RRF) with automatic typo correction |
| 34 File Formats | PDF, DOCX/DOC, XLSX, PPTX, RTF, HTML, CSV, LaTeX, WebVTT, images + plain text — see Document Parsing |
| RAG Pipeline | Structure-aware chunking, KnowledgeAtom 2-level indexing, sliding-window context enrichment, mtime parse cache |
| Ultra-Fast Vectors | DSP v2.0 generates embeddings in <1ms — no ML frameworks required |
| Source Attribution | Every answer links back to the originating document and chunk |
| Framework SDKs | LangChain, LlamaIndex, Haystack, CrewAI adapters out of the box |
| Enterprise Auth | API key hashing (SHA256+salt), OAuth2/OIDC, fine-grained permissions |
| Admin Dashboard | Real-time metrics, quota management, batch processing (1–100 queries) |
| Flexible Storage | SQLite (default) · PostgreSQL + pgvector · Redis cache (optional) |
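Hybrid search fuses the BM25 keyword ranking with the semantic ranking via Reciprocal Rank Fusion (RRF), where each document's score is the sum of 1/(k + rank) over the lists it appears in. A minimal sketch of the fusion step (the function and sample IDs are illustrative, not FLAMEHAVEN's internal API):

```python
def rrf_fuse(rankings, k=60):
    """Reciprocal Rank Fusion: score(d) = sum over lists of 1 / (k + rank)."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first
    return sorted(scores, key=scores.get, reverse=True)

bm25_hits = ["doc_a", "doc_b", "doc_c"]      # keyword ranking
semantic_hits = ["doc_b", "doc_d", "doc_a"]  # vector ranking
fused = rrf_fuse([bm25_hits, semantic_hits])
# doc_b ranks first: it sits near the top of both lists
```

The constant k (60 is the value from the original RRF paper) damps the influence of top ranks so one list cannot dominate the fusion.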
What changed in each release? See CHANGELOG.md for the full version history.
The fastest path to production:

```bash
docker run -d \
  -p 8000:8000 \
  -e GEMINI_API_KEY="your_gemini_api_key" \
  -e FLAMEHAVEN_ADMIN_KEY="secure_admin_password" \
  -v $(pwd)/data:/app/data \
  flamehaven-filesearch:1.6.1
```

✅ Server running at http://localhost:8000
Perfect for integrating into existing applications:

```python
from flamehaven_filesearch import FlamehavenFileSearch, FileSearchConfig

# Initialize
config = FileSearchConfig(google_api_key="your_gemini_key")
fs = FlamehavenFileSearch(config)

# Upload and search
fs.upload_file("company_handbook.pdf", store="docs")
result = fs.search("What is our remote work policy?", store="docs")
print(result['answer'])
# Output: "Employees can work remotely up to 3 days per week..."
```

For language-agnostic integration:
```bash
# 1. Generate API key
curl -X POST http://localhost:8000/api/admin/keys \
  -H "X-Admin-Key: your_admin_key" \
  -d '{"name":"production","permissions":["upload","search"]}'

# 2. Upload document
curl -X POST http://localhost:8000/api/upload/single \
  -H "Authorization: Bearer sk_live_abc123..." \
  -F "file=@document.pdf" \
  -F "store=my_docs"

# 3. Search
curl -X POST http://localhost:8000/api/search \
  -H "Authorization: Bearer sk_live_abc123..." \
  -H "Content-Type: application/json" \
  -d '{
    "query": "What are the main findings?",
    "store": "my_docs",
    "search_mode": "hybrid"
  }'
```

```bash
# Core package (HTML, CSV, LaTeX, WebVTT, plain-text parsing included — zero extra deps)
pip install flamehaven-filesearch

# + Document parsers: PDF (pymupdf/pypdf), DOCX, XLSX, PPTX, RTF
pip install flamehaven-filesearch[parsers]

# + Image OCR (Pillow + pytesseract; requires the Tesseract system binary)
pip install flamehaven-filesearch[vision]

# + Google Gemini API
pip install flamehaven-filesearch[google]

# + REST API server (FastAPI + uvicorn)
pip install flamehaven-filesearch[api]

# + HNSW vector index
pip install flamehaven-filesearch[vector]

# + PostgreSQL backend
pip install flamehaven-filesearch[postgres]

# Everything
pip install flamehaven-filesearch[all]
```

```bash
# Build from source
git clone https://github.com/flamehaven01/Flamehaven-Filesearch.git
cd Flamehaven-Filesearch
docker build -t flamehaven-filesearch:1.6.1 .
```

Framework SDKs (LangChain, LlamaIndex, etc.) are imported lazily — install only what you need:
```python
# LangChain (pip install langchain-core)
from flamehaven_filesearch.integrations import FlamehavenLangChainLoader
docs = FlamehavenLangChainLoader("report.pdf", chunk=True).load()

# LlamaIndex (pip install llama-index-core)
from flamehaven_filesearch.integrations import FlamehavenLlamaIndexReader
nodes = FlamehavenLlamaIndexReader(chunk=True).load_data(["report.pdf", "slides.pptx"])

# Haystack (pip install haystack-ai)
from flamehaven_filesearch.integrations import FlamehavenHaystackConverter
result = FlamehavenHaystackConverter().run(sources=["report.pdf"])

# CrewAI (pip install crewai)
from flamehaven_filesearch.integrations import FlamehavenCrewAITool
tool = FlamehavenCrewAITool()  # pass to your agent's tools list
```

FLAMEHAVEN supports four LLM backends — switch with a single env var:
| `LLM_PROVIDER` | Required variables | Notes |
|---|---|---|
| `gemini` (default) | `GEMINI_API_KEY` | Google Gemini file-search API |
| `ollama` | `LOCAL_MODEL`, `OLLAMA_BASE_URL` | Local inference via Ollama — Gemma 4/3, Llama 3.2, Qwen 2.5, Mistral, Phi-4 … |
| `openai` | `OPENAI_API_KEY` | OpenAI or any OpenAI-compatible endpoint |
| `anthropic` | `ANTHROPIC_API_KEY` | Anthropic Claude |
| `openai_compatible` | `OPENAI_API_KEY`, `OPENAI_BASE_URL` | vLLM, LM Studio, Kimi, etc. |
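The required-variable rules in the table are easy to validate up front, so a deployment can fail fast before the server boots. The `REQUIRED` mapping and `missing_vars` helper below simply mirror the table and are illustrative, not part of the package:

```python
import os

# Required env vars per provider (mirrors the table above; illustrative only)
REQUIRED = {
    "gemini": ["GEMINI_API_KEY"],
    "ollama": ["LOCAL_MODEL", "OLLAMA_BASE_URL"],
    "openai": ["OPENAI_API_KEY"],
    "anthropic": ["ANTHROPIC_API_KEY"],
    "openai_compatible": ["OPENAI_API_KEY", "OPENAI_BASE_URL"],
}

def missing_vars(provider, env=None):
    """Return the names of required variables that are unset for `provider`."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED[provider] if not env.get(name)]
```

A startup script can call `missing_vars(os.environ.get("LLM_PROVIDER", "gemini"))` and exit with a clear message if the list is non-empty.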
```bash
# Gemini (default)
export GEMINI_API_KEY="your_google_gemini_api_key"

# Ollama (fully local)
export LLM_PROVIDER=ollama
export LOCAL_MODEL=gemma4:27b   # or gemma4:4b, qwen2.5:7b, llama3.2 …
export OLLAMA_BASE_URL=http://localhost:11434

# OpenAI
export LLM_PROVIDER=openai
export OPENAI_API_KEY="sk-..."
export DEFAULT_MODEL=gpt-4o-mini   # optional override

# Anthropic
export LLM_PROVIDER=anthropic
export ANTHROPIC_API_KEY="sk-ant-..."
```

```bash
export FLAMEHAVEN_ADMIN_KEY="your_secure_admin_password"
# Plus the provider credentials above (at least one provider)
```

```bash
export HOST="0.0.0.0"              # Bind address
export PORT="8000"                 # Server port
export REDIS_HOST="localhost"      # Distributed caching
export REDIS_PORT="6379"           # Redis port
export MAX_OUTPUT_TOKENS="1024"    # Max answer tokens
export TEMPERATURE="0.5"           # Model temperature (0.0–1.0)
export MAX_SOURCES="5"             # Max source documents per answer
```

Create a config.yaml for fine-tuned control:
```yaml
vector_store:
  quantization: int8
  compression: gravitas_pack

search:
  default_mode: hybrid
  typo_correction: true
  max_results: 10

security:
  rate_limit: 100           # requests per minute
  max_file_size: 52428800   # 50MB
```

| Metric | Value | Notes |
|---|---|---|
| Vector Generation | <1ms | DSP v2.0, zero ML dependencies |
| Memory Footprint | 75% reduced | Int8 quantization vs float32 |
| Metadata Size | 90% smaller | Gravitas-Pack compression |
| Test Suite | 476 tests | All passing (pytest) |
| Cold Start | 3 seconds | Docker container ready |
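The 75% memory figure follows directly from storage widths: float32 uses 4 bytes per dimension, int8 uses 1. A toy scale-quantization round trip in pure Python (illustrative only; the shipped quantizer may differ in details):

```python
def quantize_int8(vec):
    """Map floats into [-127, 127] ints: 1 byte/dim instead of 4 (75% smaller)."""
    scale = max(abs(x) for x in vec) / 127.0 or 1.0  # avoid div-by-zero on all-zero vectors
    return [round(x / scale) for x in vec], scale

def dequantize(q, scale):
    return [x * scale for x in q]

vec = [0.12, -0.98, 0.45, 0.0]
q, scale = quantize_int8(vec)
# per-dimension round-trip error is bounded by scale / 2
```

The trade-off is precision: reconstruction error per dimension is at most half the scale factor, which is usually negligible for cosine-similarity ranking.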
Environment: Docker on Apple M1 Mac, 16GB RAM
Document Set: 500 PDFs, ~2GB total

```
Health Check:          8ms
Search (cache hit):    9ms
Search (cache miss):   1,250ms   (includes Gemini API call)
Batch Search (10):     2,500ms   (parallel processing)
Upload (50MB file):    3,200ms   (with indexing)
```
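The gap between the cache-hit and cache-miss rows is the usual shape of response caching: the expensive step (the LLM round trip) runs once, and repeats are served from memory. A toy illustration with stdlib memoization (the `sleep` stands in for the remote call; none of this is FLAMEHAVEN code):

```python
import functools
import time

@functools.lru_cache(maxsize=1024)
def cached_search(query):
    time.sleep(0.05)  # stand-in for the slow LLM / API round trip
    return f"answer for {query!r}"

t0 = time.perf_counter()
cached_search("remote work policy")            # miss: pays the full cost
miss = time.perf_counter() - t0

t0 = time.perf_counter()
cached_search("remote work policy")            # hit: served from memory
hit = time.perf_counter() - t0
```

In the benchmark above the same effect shows up as 9ms (hit) versus 1,250ms (miss).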
```mermaid
flowchart TD
    Client(["Client\n(HTTP / SDK)"])
    subgraph API["REST API Layer (FastAPI)"]
        Upload["/api/upload"]
        Search["/api/search"]
        Admin["/api/admin"]
    end
    subgraph Engine["Engine Layer"]
        FP["FileParser\n+ BackendRegistry\n(34 formats)"]
        Cache["ParseCache\n(mtime-based)"]
        Chunker["TextChunker\n+ KnowledgeAtom\n(chunk atoms)"]
        DSP["DSP v2.0\nEmbedding Generator\n(<1ms, zero-ML)"]
        BM25["BM25 + RRF\nHybrid Search\n(v1.6.0)"]
        Scorer["SemanticScorer\n+ TypoCorrector"]
    end
    subgraph Storage["Storage Layer"]
        SQLite[("SQLite\nMetadata Store")]
        Vec[("Vector Store\n(local / pgvector)")]
        Redis[("Redis Cache\n(optional)")]
    end
    subgraph LLM["LLM Provider (env: LLM_PROVIDER)"]
        Gemini["Gemini\n(cloud)"]
        Ollama["Ollama\n(local)"]
        OAI["OpenAI /\nAnthropic /\nCompatible"]
    end
    Metrics["Metrics Logger"]

    Client --> Upload & Search & Admin
    Upload --> FP
    FP <-->|"cache hit/miss"| Cache
    FP --> Chunker
    Chunker --> DSP
    DSP --> Vec
    FP --> SQLite
    Search --> Scorer
    Scorer --> DSP
    Scorer -->|"gemini"| Gemini
    Scorer -->|"ollama"| Ollama
    Scorer -->|"openai/anthropic"| OAI
    LLM --> Client
    Admin --> Metrics
    Admin --> SQLite
    Storage <-->|"read / write"| Redis
```
Full layer detail: Architecture.md
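One engine component from the diagram worth unpacking is the sliding-window context enrichment: each stored chunk carries a slice of its neighboring chunks, so a retrieved chunk arrives with surrounding context for the LLM. A minimal sketch (names and dict shape are illustrative, not the shipped ContextExtractor):

```python
def enrich_chunks(chunks, window=1):
    """Attach up to `window` neighboring chunks on each side as context."""
    enriched = []
    for i, chunk in enumerate(chunks):
        before = chunks[max(0, i - window):i]   # preceding neighbors
        after = chunks[i + 1:i + 1 + window]    # following neighbors
        enriched.append({"text": chunk, "context": " ".join(before + after)})
    return enriched

out = enrich_chunks(["A", "B", "C"])
```

The window size trades prompt length against context: a larger window gives the LLM more surrounding text per retrieved chunk at the cost of bigger prompts.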
FLAMEHAVEN takes security seriously:
- ✅ API Key Hashing - SHA256 with salt
- ✅ Rate Limiting - Per-key quotas (default: 100/min)
- ✅ Permission System - Granular access control
- ✅ Audit Logging - Complete request history
- ✅ OWASP Headers - Security headers enabled by default
- ✅ Input Validation - Strict file type and size checks
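The key-hashing scheme above (SHA256 with salt) stores only a salt and digest, never the raw key, and compares digests in constant time to avoid timing attacks. A sketch under those assumptions (not the exact shipped implementation):

```python
import hashlib
import hmac
import os

def hash_key(api_key, salt=None):
    """Return (salt_hex, digest_hex); store these instead of the raw key."""
    salt = salt if salt is not None else os.urandom(16)
    digest = hashlib.sha256(salt + api_key.encode()).hexdigest()
    return salt.hex(), digest

def verify_key(api_key, salt_hex, expected_digest):
    digest = hashlib.sha256(bytes.fromhex(salt_hex) + api_key.encode()).hexdigest()
    return hmac.compare_digest(digest, expected_digest)  # constant-time compare

salt_hex, digest = hash_key("sk_live_abc123")
```

A unique random salt per key means two identical keys never share a digest, which defeats precomputed lookup tables.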
```bash
# Use strong admin keys
export FLAMEHAVEN_ADMIN_KEY=$(openssl rand -base64 32)

# Enable HTTPS in production
# (use nginx/traefik as reverse proxy)

# Rotate API keys regularly
curl -X DELETE http://localhost:8000/api/admin/keys/old_key_id \
  -H "X-Admin-Key: $FLAMEHAVEN_ADMIN_KEY"
```

Full roadmap: ROADMAP.md
- Multimodal search (image + text)
- HNSW vector indexing for faster search
- OAuth2/OIDC integration
- PostgreSQL backend (metadata + pgvector)
- Usage-budget controls and reporting
- pgvector tuning and reliability hardening
- CI/CD — ruff replaces flake8; pipelines fully green
- Universal Document Parser — 34 formats, zero doc-AI dependency (v1.5.0)
- Internal text chunker — structure-aware + token-aware, zero ML deps (v1.5.0)
- Framework integrations — LangChain, LlamaIndex, Haystack, CrewAI (v1.5.0)
- Backend Plugin Architecture — `AbstractFormatBackend` + `BackendRegistry` (v1.5.2)
- Parse cache — mtime-based, `extract_text(use_cache=True)` (v1.5.2)
- ContextExtractor — sliding-window RAG chunk enrichment (v1.5.2)
- Multi-provider LLM support — OpenAI, Claude, Ollama, Gemini (v1.5.3)
- BM25 + RRF hybrid search — Korean+English tokenizer, lazy per-store index
- KnowledgeAtom 2-level indexing — chunk atoms with fragment URIs
- Stable URI scheme — `local://<store>/<quote(abs_path)>`, collision-free
- core.py mixin segmentation — 1258 → 221 lines, 3 focused modules
- Fix: `search_stream` double intent-refine bug
- CC reduction — `seek_vector_resonance` CC 8→2, `_get_admin_user` CC 10→1
- Dispatch table pattern — `_transform_dict` unifies GravitasPacker compress/decompress
- `_record_upload_failure` helper — eliminates 2× duplicated metrics blocks in api.py
- `/health` exposes `llm_provider` + `llm_model` — frontend can detect active backend
- `config.to_dict()` exposes `llm_provider`, `local_model`, `ollama_base_url`
- Frontend: provider-aware model selector (Gemini dropdown ↔ local model badge)
- Frontend: upload accept list expanded to all 34 supported formats
- Frontend: store datalist auto-populated from `/api/metrics`
- Frontend: version badge synced to `v1.6.1` across all 6 dashboard pages
- Ruff F401/F841 — 5 lint errors resolved, CI green
- Admin: Stores tab — create / list / delete stores (`POST|GET|DELETE /api/stores`)
- Admin: Ops tab — usage stats (`GET /api/admin/usage`) + vector ops (stats / reindex / vacuum)
- Landing: "Manage" deep-link to `admin.html#stores` with hash-based tab routing
- Multi-language support (15+ languages) — multilingual stopwords + jieba
- Kubernetes Helm charts
- Distributed indexing
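The mtime-based parse cache listed in the v1.5.2 changes follows a common pattern: key the cached text by the file's modification time and re-parse only when it changes. A minimal sketch (the class below is illustrative and only mimics the `extract_text(use_cache=True)` shape from the changelog, not the shipped code):

```python
import os

class ParseCache:
    """Cache parsed text keyed by (path, mtime); re-parse only on change."""

    def __init__(self, parse_fn):
        self.parse_fn = parse_fn
        self._cache = {}  # path -> (mtime, text)

    def extract_text(self, path, use_cache=True):
        mtime = os.path.getmtime(path)
        if use_cache and self._cache.get(path, (None, None))[0] == mtime:
            return self._cache[path][1]   # cache hit: skip parsing entirely
        text = self.parse_fn(path)        # cache miss: parse and remember
        self._cache[path] = (mtime, text)
        return text
```

Because the key includes the mtime, editing a document automatically invalidates its cached parse on the next upload or index pass.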
❌ 401 Unauthorized Error

Problem: API returns 401 when making requests.

Solutions:
- Verify the `FLAMEHAVEN_ADMIN_KEY` environment variable is set
- Check the `Authorization: Bearer sk_live_...` header format
- Ensure the API key hasn't expired (check the admin dashboard)

```bash
# Debug: Check if admin key is set
echo $FLAMEHAVEN_ADMIN_KEY

# Regenerate API key
curl -X POST http://localhost:8000/api/admin/keys \
  -H "X-Admin-Key: $FLAMEHAVEN_ADMIN_KEY" \
  -d '{"name":"debug","permissions":["search"]}'
```

🐌 Slow Search Performance
Problem: Searches taking >5 seconds.

Solutions:
- Check the cache hit rate: `FLAMEHAVEN_METRICS_ENABLED=1 curl http://localhost:8000/metrics`
- Enable Redis for distributed caching
- Verify Gemini API latency (should be <1.5s)

```bash
# Enable Redis caching
docker run -d --name redis redis:7-alpine
export REDIS_HOST=localhost
```

💾 High Memory Usage
Problem: Container using >2GB RAM.

Solutions:
- Enable Redis with an LRU eviction policy
- Reduce the max file size in config
- Monitor with the Prometheus endpoint

```bash
# Configure Redis memory limit
docker run -d \
  -p 6379:6379 \
  redis:7-alpine \
  --maxmemory 512mb \
  --maxmemory-policy allkeys-lru
```

More solutions in our Wiki Troubleshooting Guide.
Use the links below to jump to the most relevant guide.
| Topic | Description |
|---|---|
| Document Parsing | Supported formats, internal parsers, RAG chunking |
| Hybrid Search | BM25+RRF, KnowledgeAtom indexing, stable URI scheme (v1.6.0) |
| Framework Integrations | LangChain, LlamaIndex, Haystack, CrewAI adapters |
| API Reference | REST endpoints, payloads, rate limits |
| Architecture | How all layers fit together (v1.6.0) |
| Configuration Reference | Full list of environment variables and config fields |
| Production Deployment | Docker, systemd, reverse proxy, scaling tips |
| Troubleshooting | Step-by-step debugging playbook |
| Benchmarks | Performance measurements and methodology |
These Markdown files live inside the repository so they stay versioned alongside the code. Feel free to contribute improvements via pull requests.
- Interactive API Docs - OpenAPI/Swagger interface (when server is running)
- CHANGELOG - Version history and breaking changes
- CONTRIBUTING - How to contribute code
- Examples - Sample integrations and use cases
We love contributions! FLAMEHAVEN is better because of developers like you.
- 🟢 [Easy] Add dark mode to admin dashboard (1-2 hours)
- 🟡 [Medium] PostgreSQL backend for usage tracker (multi-instance deployments)
- 🔴 [Advanced] Kubernetes Helm charts for production deployment
See CONTRIBUTING.md for development setup and guidelines.
- 💬 Discussions: GitHub Discussions
- 🐛 Bug Reports: GitHub Issues
- 🔒 Security: security@flamehaven.space
- 📧 General: info@flamehaven.space
Distributed under the MIT License. See LICENSE for more information.
Built with amazing open source tools:
- FastAPI - Modern Python web framework
- Google Gemini - Semantic understanding and reasoning
- SQLite - Lightweight, embedded database
- Redis - In-memory caching (optional)
⭐ Star us on GitHub • 📖 Read the Docs • 🚀 Deploy Now
Built with 🔥 by the Flamehaven Core Team
Last updated: April 20, 2026 • Version 1.6.1
