diff --git a/.github/PULL_REQUEST_TEMPLATE.md b/.github/PULL_REQUEST_TEMPLATE.md
new file mode 100644
index 0000000..3d3a9a5
--- /dev/null
+++ b/.github/PULL_REQUEST_TEMPLATE.md
@@ -0,0 +1,18 @@
+## Linked issue
+
+Closes #
+
+
+
+## Summary
+
+
+
+## Checklist
+
+- [ ] I have read [CONTRIBUTING.md](../blob/main/CONTRIBUTING.md)
+- [ ] This PR is linked to an existing issue (above)
+- [ ] `make test` passes
+- [ ] `make lint` and `make typecheck` pass (or `make pre-commit`)
+- [ ] Added or updated tests for any behaviour changes
+- [ ] Updated docstrings / docs for any public API changes
diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
index 547cac2..61da255 100644
--- a/CONTRIBUTING.md
+++ b/CONTRIBUTING.md
@@ -4,27 +4,28 @@ Thanks for your interest in semble. This document explains how contributions wor
## tl;dr
-- **Bug fix or typo?** Open a PR directly.
-- **New feature or behaviour change?** Open an issue first to discuss with us.
+- **Every PR must link to an existing issue.** Open an issue to discuss before writing code, then link it from your PR (e.g. `Closes #123`).
- **AI-generated PRs** will be closed without review if they weren't discussed beforehand.
---
## Discuss before building
-Our libraries are small and focused by design. We care a lot about keeping it that way. Before you invest time writing code for a new feature, please open an issue describing:
+Our libraries are small and focused by design. We care a lot about keeping it that way. Before you invest time writing code, please open an issue describing:
- What problem you're solving
- Why it belongs in semble (as opposed to a wrapper or separate tool)
-- What API or behaviour change it would involve
+- What API or behaviour change it would involve, if any
- A minimal (code) example of how it would work
-**PRs that add features without a prior issue will be closed.**
+This applies to small PRs (e.g. bug fixes and documentation updates) as well. A quick issue lets us confirm the fix is wanted and aligned with how we'd want to solve it, so you don't waste time on a PR we'd need to reject or rework.
-## What we welcome
+**PRs without a linked issue will be closed.**
-- Bug fixes (with a test that reproduces the issue)
-- Documentation improvements and example fixes
+## What we generally welcome
+
+- Bug fixes (with a linked issue and a test that reproduces the bug)
+- Documentation improvements and example fixes (with a linked issue)
## What we generally won't accept
@@ -47,6 +48,7 @@ If you want a feature, include the things listed under "Discuss before building"
Before opening a PR:
+- [ ] Link to an existing issue (e.g. `Closes #123`). PRs without one will be closed.
- [ ] Run `make test` and confirm all tests pass
- [ ] Run `make lint` and `make typecheck`
- [ ] Run `make fix` to auto-fix any lint issues
diff --git a/README.md b/README.md
index e6e1b5a..6a583c2 100644
--- a/README.md
+++ b/README.md
@@ -18,35 +18,41 @@
[Quickstart](#quickstart) •
[MCP Server](#mcp-server) •
-[Bash / AGENTS.md](#bash-agentsmd) •
+[AGENTS.md](#agentsmd) •
[CLI](#cli) •
[Benchmarks](#benchmarks)
-Semble is a code search library built for agents. It returns the exact code snippets they need instantly, using ~98% fewer tokens than grep+read. Indexing and searching a full codebase end-to-end takes under a second, with ~200x faster indexing and ~10x faster queries than a code-specialized transformer, at 99% of its retrieval quality (see [benchmarks](#benchmarks)). Everything runs on CPU with no API keys, GPU, or external services. Run it as an [MCP server](#mcp-server) or call it from the shell via [AGENTS.md](#bash-agentsmd) and any agent (Claude Code, Cursor, Codex, OpenCode, etc.) gets instant access to any repo.
+Semble is a code search library built for agents. It returns the exact code snippets they need instantly, using ~98% fewer tokens than grep+read. Indexing and searching a full codebase end-to-end takes under a second, with ~200x faster indexing and ~10x faster queries than a code-specialized transformer, at 99% of its retrieval quality (see [benchmarks](#benchmarks)). Everything runs on CPU with no API keys, GPU, or external services. Run it as an [MCP server](#mcp-server) or call it from the shell via [AGENTS.md](#agentsmd) and any agent (Claude Code, Cursor, Codex, OpenCode, etc.) gets instant access to any repo.
## Quickstart
-Your agent queries Semble in natural language (e.g. `"How is authentication handled?"`) and gets back only the relevant code snippets, without grepping or reading full files. Set it up as an MCP server or via AGENTS.md:
+Your agent queries Semble in natural language (e.g. `"How is authentication handled?"`) and gets back only the relevant code snippets, without grepping or reading full files.
-### MCP (Claude Code)
+Semble has three complementary setup paths. We recommend using all three, but you can pick and choose based on your needs:
-Add Semble to Claude Code (requires [uv](https://docs.astral.sh/uv/getting-started/installation/)):
+- **[MCP server](#mcp-server)**: exposes Semble as a native tool your agent can call directly.
+- **[AGENTS.md](#agentsmd)**: an AGENTS.md snippet with instructions for calling Semble via the CLI.
+- **[Sub-agent](#sub-agent-setup)**: a dedicated `semble-search` sub-agent for harnesses that support it.
+
+### MCP
+
+Expose Semble as a native tool via MCP so your agent can call it directly. Add it to Claude Code (requires [uv](https://docs.astral.sh/uv/getting-started/installation/)):
```bash
claude mcp add semble -s user -- uvx --from "semble[mcp]" semble
```
-Using another agent harness? See [MCP Server](#mcp-server) below for per-agent setup.
+See [MCP Server](#mcp-server) below for other harnesses (Cursor, Codex, OpenCode, etc.).
-### Bash / AGENTS.md
+### AGENTS.md
-Install Semble, then add the snippet below to your `AGENTS.md` or `CLAUDE.md`:
+Add Semble usage instructions to your agent's context so it knows when and how to call the CLI. Install the Semble CLI, then add the snippet below to your `AGENTS.md` or `CLAUDE.md`:
```bash
-pip install semble # Install with pip
-uv tool install semble # Or install with uv
+uv tool install semble # Install with uv (recommended)
+pip install semble # Or with pip
```
@@ -109,15 +115,23 @@ If `semble` is not on `$PATH`, use `uvx --from "semble[mcp]" semble` in its plac
-Note that sub-agents cannot call MCP tools directly, see [Bash / AGENTS.md](#bash-agentsmd) and [sub-agent setup](#sub-agent-setup) below for details.
+### Sub-agent
+
+For harnesses that support sub-agents, install a dedicated `semble-search` sub-agent so search runs in its own context (requires the CLI):
+
+```bash
+semble init # Claude Code → .claude/agents/semble-search.md
+```
+
+See [Sub-agent setup](#sub-agent-setup) below for other harnesses (Cursor, Codex, OpenCode, etc.).
Updating Semble
```bash
-pip install --upgrade semble # with pip
uv tool upgrade semble # with uv
uv cache clean semble # for MCP users (restart your MCP client after)
+pip install --upgrade semble # with pip
```
@@ -316,70 +330,7 @@ Add to `~/.config/zed/settings.json` (or `.zed/settings.json` in your project):
By default the MCP server indexes only code files. To also index documentation, config, or everything, append `--content docs`, `--content config`, or `--content all` to the server command, or a combination, e.g. `--content code docs`. For example, in Claude Code: `claude mcp add semble -s user -- uvx --from "semble[mcp]" semble --content all`.
-
-
-## Bash / AGENTS.md
-
-An alternative to MCP is to invoke Semble via Bash. Sub-agents cannot call MCP tools directly, so this is the only option for sub-agent support; it can also be used alongside MCP for the top-level agent.
-
-To add Bash support, append the following to your `AGENTS.md`, `CLAUDE.md`, `GEMINI.md`, or equivalent:
-
-```markdown
-## Code Search
-
-Use `semble search` to find code by describing what it does or naming a symbol/identifier, instead of grep:
-
-```bash
-semble search "authentication flow" ./my-project
-semble search "save_pretrained" ./my-project
-semble search "save model to disk" ./my-project --top-k 10
-```
-
-If you anticipate doing more than one search, use `semble index` to create an index.
-
-```bash
-semble index ./my-project -o my_index
-```
-
-You can then reuse this index later on:
-
-```bash
-semble search "save_pretrained" --index my_index
-```
-
-An index is not automatically updated, so if the code changes significantly, reindex. If you notice stale results while resolving searches to files, reindex.
-
-Use `--content docs` to search documentation and prose, `--content config` for config files (yaml, toml, etc.), or `--content all` to search code, docs, and config:
-
-```bash
-semble search "deployment guide" ./my-project --content docs
-semble search "database host port" ./my-project --content config
-semble search "authentication" ./my-project --content all
-```
-
-Use `semble find-related` to discover code similar to a known location (pass `file_path` and `line` from a prior search result):
-
-```bash
-semble find-related src/auth.py 42 ./my-project
-```
-
-Like search, `find-related` also accepts an `--index` argument.
-
-`path` defaults to the current directory when omitted; git URLs are accepted.
-
-If `semble` is not on `$PATH`, use `uvx --from "semble[mcp]" semble` in its place.
-
-### Workflow
-
-1. Index the repo using `semble index -o cached_index`.
-2. Start with `semble search` to find relevant chunks. Pass the index to achieve results faster.
-3. Use `--content docs` for documentation, `--content config` for config files, or `--content all` for everything.
-4. Inspect full files only when the returned chunk does not give enough context.
-5. Optionally use `semble find-related` with a promising result's `file_path` and `line` to discover related implementations.
-6. Use grep only when you need exhaustive literal matches or quick confirmation of an exact string.
-```
-
-### Sub-agent setup
+## Sub-agent setup
Claude Code, Gemini CLI, Cursor, OpenCode, GitHub Copilot CLI, and Kiro all support a dedicated semble search sub-agent. Run `semble init` once in your project root:
@@ -399,17 +350,12 @@ If semble is not on `$PATH`, prefix the command with `uvx --from "semble[mcp]"`.
Semble also ships as a standalone CLI. This is useful in scripts or anywhere you want search results without an MCP session.
```bash
-# Index a local repository
-semble index ./my-project -o my-index
-
# Search a local repo
semble search "authentication flow" ./my-project
-# Or with index (significantly faster)
-# the index flag applies to all commands below.
-semble search "authentication flow" --index my-index
-# Search for a symbol or identifier
-semble search "save_pretrained" ./my-project
+# Index first for faster repeated searches (--index works with any command below)
+semble index ./my-project -o my-index
+semble search "authentication flow" --index my-index
# Search a remote repo (cloned on demand)
semble search "save model to disk" https://github.com/MinishLab/model2vec
@@ -417,14 +363,8 @@ semble search "save model to disk" https://github.com/MinishLab/model2vec
# Limit results
semble search "save model to disk" ./my-project --top-k 10
-# Search docs and prose (markdown, rst, etc.) instead of code
-semble search "deployment guide" ./my-project --content docs
-
-# Search config files (yaml, toml, terraform, etc.)
-semble search "database host port" ./my-project --content config
-
-# Search everything (code, docs, and config)
-semble search "authentication" ./my-project --content all
+# Search docs/config/everything instead of just code
+semble search "deployment guide" ./my-project --content docs # or: config, all
# Find code similar to a known location
semble find-related src/auth.py 42 ./my-project