MindsDB Query Engine connects to 200+ data sources — databases, warehouses, applications, files — and lets you query them live in one SQL dialect, with no ETL. Index unstructured content into knowledge bases, then search it by meaning, by keyword, or both at once, with plain SQL filters on top. Everything is reachable from any MySQL- or PostgreSQL-compatible client.
Where this fits: MindsDB now builds MindsHub — a hub for open AI agents. The Query Engine remains a standalone open-source project, and it pairs well with MindsHub agents: connect it to give an agent live, SQL-queryable access to your data and semantic search. The full story: MindsHub vs MindsDB.
 MySQL clients · PostgreSQL clients · BI tools · ORMs · HTTP API
                                │
                 ┌──────────────▼───────────────┐
                 │     MindsDB Query Engine     │
                 │     one SQL dialect over     │
                 │  a federated query planner   │
                 └──────────────┬───────────────┘
                                │
          ┌─────────────────────┼─────────────────────┐
          │                     │                     │
┌─────────▼─────────┐ ┌─────────▼─────────┐ ┌─────────▼─────────┐
│ Databases         │ │ Apps & files      │ │ Knowledge bases   │
│ Postgres, MySQL,  │ │ Slack, web crawler│ │ embeddings +      │
│ MongoDB, Snowflake│ │ docs, sheets,     │ │ vector store +    │
│ BigQuery, S3, …   │ │ email, calendars… │ │ BM25 index        │
└───────────────────┘ └───────────────────┘ └───────────────────┘
          queried live, in place; data is never copied
- One server, three interfaces. The engine ships a built-in SQL editor on HTTP (:47334) and speaks the MySQL (:47335) and PostgreSQL (:47336) wire protocols, so mysql, psql, DBeaver, SQLAlchemy, or any BI tool connects directly.
- Federated queries, no pipelines. CREATE DATABASE attaches a live data source through an integration handler. The planner translates each query, pushes work down to the source, and streams results back; your data stays where it is. Source-specific syntax is still available via native queries.
- Knowledge bases are the semantic layer. A knowledge base combines an embedding model, an optional reranking model, and a vector store (e.g. pgvector). INSERT INTO it to chunk, embed, and index content; SELECT from it to retrieve by meaning, filtered by metadata columns like any other table.
- Hybrid retrieval. Hybrid search runs vector similarity and BM25 keyword matching in parallel and merges the results. It suits queries that mix natural language with exact identifiers, codes, or acronyms.
- Organize and automate. Projects namespace your work, views save cross-source transformations, and jobs schedule any SQL to run on an interval — e.g. to keep knowledge bases fresh.
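To make the knowledge-base bullet concrete, here is a toy Python sketch of the insert-then-search cycle. It is not how MindsDB is implemented: the `ToyKnowledgeBase` class, the word-count "embedding", and the fixed-size chunker are all hypothetical stand-ins for a real embedding model and vector store, chosen so the example runs with the standard library only.

```python
import math
import re
from collections import Counter


def embed(text):
    """Stand-in embedding: a bag-of-words count vector (assumption, not a real model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def chunk(text, max_words=8):
    """Split text into fixed-size word windows (hypothetical chunking rule)."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


class ToyKnowledgeBase:
    """In-memory illustration of chunk -> embed -> index -> search."""

    def __init__(self):
        self.rows = []  # each row: (doc_id, chunk_text, vector, metadata)

    def insert(self, doc_id, content, **metadata):
        # Upsert: drop chunks previously stored for this id, then re-index.
        self.rows = [r for r in self.rows if r[0] != doc_id]
        for piece in chunk(content):
            self.rows.append((doc_id, piece, embed(piece), metadata))

    def search(self, query, where=lambda meta: True, limit=10):
        # Score every chunk that passes the metadata filter, best first.
        q = embed(query)
        scored = [
            (cosine(q, vec), doc_id, text)
            for doc_id, text, vec, meta in self.rows
            if where(meta)
        ]
        scored.sort(key=lambda r: r[0], reverse=True)
        return scored[:limit]


kb = ToyKnowledgeBase()
kb.insert(1, "cannot connect to the server after the latest update", priority=1)
kb.insert(2, "invoice total is wrong for annual plans", priority=3)
kb.insert(3, "connection drops after update on mobile", priority=2)

# Semantic-style lookup restricted by a metadata predicate.
hits = kb.search("cannot connect after update", where=lambda m: m["priority"] <= 2)
for score, doc_id, text in hits:
    print(f"{score:.2f}  ticket {doc_id}: {text}")
```

The metadata predicate plays the role of the plain SQL filter: ticket 2 never reaches the scoring step because its priority fails the condition.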
Run with Docker:
docker run --name mindsdb_container \
-e MINDSDB_APIS=http,mysql \
-p 47334:47334 -p 47335:47335 \
mindsdb/mindsdb

Or install from PyPI:
pip install mindsdb # add extras as needed, e.g. mindsdb[pgvector,openai,postgres]
python -m mindsdb

Then open the editor at http://127.0.0.1:47334, or connect any MySQL client to port 47335. The quickstart walks through the rest.
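Before pointing a client at the server, it can help to confirm which interfaces are actually listening. The following is a small sketch using only Python's standard library. The port numbers come from the text above, while the `is_listening` helper and the `INTERFACES` mapping are illustrative names, not part of MindsDB.

```python
import socket

# Ports as stated above; the Docker command shown publishes only the first two.
INTERFACES = {"http": 47334, "mysql": 47335, "postgres": 47336}


def is_listening(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    for name, port in INTERFACES.items():
        state = "up" if is_listening("127.0.0.1", port) else "not reachable"
        print(f"{name:<9} :{port}  {state}")
```

A successful TCP connection only shows that something is bound to the port; it does not verify that the process behind it is MindsDB.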
Six SQL statements, start to finish. Full syntax for every statement is in the SQL reference.
1. Attach your data sources (docs) — they are queried live, nothing is imported:
CREATE DATABASE my_pg
WITH ENGINE = 'postgres',
PARAMETERS = {
"host": "localhost", "port": 5432,
"user": "user", "password": "pass",
"database": "mydb"
};
CREATE DATABASE my_mongo
WITH ENGINE = 'mongodb',
PARAMETERS = {
"host": "mongodb+srv://user:pass@cluster.example.net",
"database": "support"
};

2. Query across sources in one dialect (docs), even non-SQL stores like MongoDB, and save the result as a view:
CREATE VIEW open_tickets_by_product AS (
SELECT p.name, COUNT(t.ticket_id) AS open_tickets
FROM my_mongo.support_tickets AS t
JOIN my_pg.products AS p
ON t.product_id = p.id
WHERE t.status = 'open'
GROUP BY p.name
);

3. Create a knowledge base (docs), an embedding model plus a vector store addressable as a table:
CREATE KNOWLEDGE_BASE support_kb
USING
embedding_model = {
"provider": "openai",
"model_name": "text-embedding-3-large",
"api_key": "sk-..."
},
storage = my_pgvector.support_kb_store, -- a pgvector connection
content_columns = ['subject', 'body'],
metadata_columns = ['product_name', 'priority', 'created_at'],
id_column = 'ticket_id';

4. Index your content (docs); rows are chunked, embedded, and upserted:
INSERT INTO support_kb
SELECT ticket_id, subject, body, product_name, priority, created_at
FROM my_mongo.support_tickets;

5. Search by meaning, filter by metadata (docs):
SELECT chunk_content, product_name, relevance
FROM support_kb
WHERE content = 'cannot connect after the latest update'
AND priority <= 2
AND relevance >= 0.5
LIMIT 10;
-- hybrid search: blend vector similarity with BM25 keyword matching
SELECT *
FROM support_kb
WHERE content = 'error ERR-4421'
AND hybrid_search = true;

▶ How to use semantic search with metadata filters: a good explainer of this feature.
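One common way to merge a vector ranking with a keyword ranking is reciprocal rank fusion (RRF). The sketch below shows that general technique in Python. It is an assumption for illustration, not a description of how MindsDB merges results internally. The ticket ids, the two input rankings, and the constant `k = 60` are made up.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Merge ranked id lists: each list contributes 1 / (k + rank) per id."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by id for a stable order.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical results for the query 'error ERR-4421'.
vector_hits = ["t7", "t2", "t9"]    # semantically similar tickets
keyword_hits = ["t2", "t5", "t7"]   # tickets containing the literal token ERR-4421

fused = reciprocal_rank_fusion([vector_hits, keyword_hits])
print(fused)
```

Tickets that appear in both lists accumulate two contributions and rise to the top, which is why an exact identifier match can outrank a purely semantic neighbour.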
6. Keep the index fresh with a job (docs):
CREATE JOB refresh_support_kb (
INSERT INTO support_kb
SELECT ticket_id, subject, body, product_name, priority, created_at
FROM my_mongo.support_tickets
WHERE created_at > LAST
)
EVERY hour;

| You need | Go to |
|---|---|
| Ask a question | Discord |
| Report a bug | GitHub Issues — please include reproduction steps |
| Commercial support | Contact the team |
Security note: if you find a vulnerability, please do not open a public issue — follow our security policy instead.
Contributions are welcome — code, integrations, docs, and bug reports alike. We follow the fork-and-pull workflow: see the contribution guide to get set up, and browse the open issues for somewhere to start. Good first areas are new integration handlers, bug fixes, and documentation improvements.
- Documentation
- MindsHub — open AI agents, from the same team
- MindsHub vs MindsDB — how the product evolved
- Discord
- Contact
MindsDB Core is licensed under the Elastic License 2.0; some directories carry their own license — see the LICENSE file for the full structure.