From cecfffbfb4352204cf5c941e05761610096c75c3 Mon Sep 17 00:00:00 2001 From: Nick Woolmer <29717167+nwoolmer@users.noreply.github.com> Date: Mon, 30 Mar 2026 18:28:37 +0100 Subject: [PATCH 01/19] Add posting index and covering index documentation New page: concepts/deep-dive/posting-index.md - When to use, creating with INDEX TYPE POSTING and INCLUDE - Covering index: how it works, supported types, choosing columns - Verifying with EXPLAIN, comparison with bitmap index - All accelerated query patterns with examples - SQL optimizer hints (no_covering, no_index) - Trade-offs: storage, write performance, memory - Architecture: file types, generations, sealing, FSST compression - Limitations Updated pages: - indexes.md: added index type comparison table - create-table.md: added posting index and INCLUDE syntax - alter-table-alter-column-add-index.md: added posting + INCLUDE examples - sql-optimizer-hints.md: added no_covering and no_index hints - schema-design-essentials.md: added indexing decision guide - sidebars.js: added posting-index to Deep Dive navigation --- documentation/concepts/deep-dive/indexes.md | 10 + .../concepts/deep-dive/posting-index.md | 328 ++++++++++++++++++ .../concepts/deep-dive/sql-optimizer-hints.md | 34 ++ .../sql/alter-table-alter-column-add-index.md | 34 +- documentation/query/sql/create-table.md | 46 ++- documentation/schema-design-essentials.md | 38 ++ documentation/sidebars.js | 1 + 7 files changed, 487 insertions(+), 4 deletions(-) create mode 100644 documentation/concepts/deep-dive/posting-index.md diff --git a/documentation/concepts/deep-dive/indexes.md b/documentation/concepts/deep-dive/indexes.md index 68a865032..baa0f7d26 100644 --- a/documentation/concepts/deep-dive/indexes.md +++ b/documentation/concepts/deep-dive/indexes.md @@ -14,6 +14,16 @@ Indexing is available for [symbol](/docs/concepts/symbol/) columns in both table and [materialized views](/docs/concepts/materialized-views). 
Index support for other types will be added over time. +QuestDB supports two index types: + +| Index type | Syntax | Covering support | Best for | +|------------|--------|-----------------|----------| +| **Bitmap** (default) | `INDEX` or `INDEX TYPE BITMAP` | No | General-purpose, low write overhead | +| **Posting** | `INDEX TYPE POSTING` | Yes (via `INCLUDE`) | Read-heavy workloads, selective queries, wide tables | + +See [Posting index and covering index](/docs/concepts/deep-dive/posting-index/) +for the detailed guide on the posting index and its covering query capabilities. + ## Index creation and deletion The following are ways to index a `symbol` column: diff --git a/documentation/concepts/deep-dive/posting-index.md b/documentation/concepts/deep-dive/posting-index.md new file mode 100644 index 000000000..e84aad5ac --- /dev/null +++ b/documentation/concepts/deep-dive/posting-index.md @@ -0,0 +1,328 @@ +--- +title: Posting index and covering index +sidebar_label: Posting index +description: + The posting index is a compact, high-performance index for symbol columns + that supports covering queries. Learn how it works, when to use it, and + how to optimize queries with INCLUDE columns. +--- + +The **posting index** is an advanced index type for +[symbol](/docs/concepts/symbol/) columns that provides better compression, +faster reads, and **covering index** support compared to the default bitmap +index. + +A **covering index** stores additional column values alongside the index +entries, so queries that only need those columns can be answered entirely from +the index without reading the main column files. 
+ +## When to use the posting index + +Use the posting index when: + +- You frequently filter on a symbol column (`WHERE symbol = 'X'`) +- Your queries select a small set of columns alongside the symbol filter +- You want to reduce I/O by reading from compact sidecar files instead of + full column files +- You need efficient `DISTINCT` queries on a symbol column +- You need efficient `LATEST ON` queries partitioned by a symbol column + +The posting index is especially effective for high-cardinality symbol columns +(hundreds to thousands of distinct values) and wide tables where reading full +column files is expensive. + +## Creating a posting index + +### At table creation + +```questdb-sql +CREATE TABLE trades ( + timestamp TIMESTAMP, + symbol SYMBOL INDEX TYPE POSTING, + exchange SYMBOL, + price DOUBLE, + quantity DOUBLE +) TIMESTAMP(timestamp) PARTITION BY DAY WAL; +``` + +### With covering columns (INCLUDE) + +```questdb-sql +CREATE TABLE trades ( + timestamp TIMESTAMP, + symbol SYMBOL INDEX TYPE POSTING INCLUDE (exchange, price, timestamp), + exchange SYMBOL, + price DOUBLE, + quantity DOUBLE +) TIMESTAMP(timestamp) PARTITION BY DAY WAL; +``` + +The `INCLUDE` clause specifies which columns are stored in the index sidecar +files. Queries that only read these columns plus the indexed symbol column +can be served entirely from the index. + +### On an existing table + +```questdb-sql +ALTER TABLE trades + ALTER COLUMN symbol ADD INDEX TYPE POSTING INCLUDE (exchange, price); +``` + +## Covering index + +The covering index is the most powerful feature of the posting index. When all +columns in a query's `SELECT` list are either: + +- The indexed symbol column itself (from the `WHERE` clause) +- Listed in the `INCLUDE` clause + +...the query engine reads data directly from the index sidecar files, bypassing +the main column files entirely. This is significantly faster for selective +queries on wide tables. 
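+
+As a rough illustration, assume the `trades` table from the `ALTER TABLE`
+example above, indexed with `INCLUDE (exchange, price)`. Under that
+assumption, the first query below should qualify for the covering path, while
+the second should not, because `quantity` is not in the `INCLUDE` list. The
+comments describe expected behavior and are not captured output:
+
+```questdb-sql
+-- Every selected column is covered: expected to be served from the sidecar
+SELECT exchange, price FROM trades WHERE symbol = 'AAPL';
+
+-- quantity is not covered: expected to fall back to reading column files
+SELECT price, quantity FROM trades WHERE symbol = 'AAPL';
+```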
+ +### Supported column types in INCLUDE + +| Type | Supported | Notes | +|------|-----------|-------| +| BOOLEAN, BYTE, SHORT, CHAR | Yes | Fixed-width, 1-2 bytes per value | +| INT, FLOAT, IPv4 | Yes | Fixed-width, 4 bytes per value | +| LONG, DOUBLE, TIMESTAMP, DATE | Yes | Fixed-width, 8 bytes per value | +| GEOBYTE, GEOSHORT, GEOINT, GEOLONG | Yes | Fixed-width, 1-8 bytes depending on precision | +| DECIMAL8, DECIMAL16, DECIMAL32, DECIMAL64 | Yes | Fixed-width, 1-8 bytes depending on precision | +| SYMBOL | Yes | Stored as integer key, resolved at query time | +| VARCHAR | Yes | Variable-width, FSST compressed in sealed partitions | +| STRING | Yes | Variable-width, FSST compressed in sealed partitions | +| BINARY | No | Not yet supported | +| UUID, LONG256 | No | Not yet supported (requires multi-long sidecar format) | +| DECIMAL128, DECIMAL256 | No | Not yet supported | +| Arrays (DOUBLE[][], etc.) | No | Not supported | + +### How to choose INCLUDE columns + +Include columns that you frequently select together with the indexed symbol: + +```questdb-sql +-- If your typical queries look like this: +SELECT timestamp, price, quantity FROM trades WHERE symbol = 'AAPL'; + +-- Then include those columns: +CREATE TABLE trades ( + timestamp TIMESTAMP, + symbol SYMBOL INDEX TYPE POSTING INCLUDE (timestamp, price, quantity), + exchange SYMBOL, + price DOUBLE, + quantity DOUBLE, + -- other columns not needed in hot queries + raw_data VARCHAR, + metadata VARCHAR +) TIMESTAMP(timestamp) PARTITION BY DAY WAL; +``` + +:::tip + +Only include columns that appear in your most frequent queries. Each included +column adds storage overhead and slows down writes slightly. Columns not in +the `INCLUDE` list can still be queried — they just won't benefit from the +covering optimization and will be read from column files. 
+ +::: + +### Verifying covering index usage + +Use `EXPLAIN` to verify that a query uses the covering index: + +```questdb-sql +EXPLAIN SELECT timestamp, price FROM trades WHERE symbol = 'AAPL'; +``` + +If the covering index is used, the plan shows `CoveringIndex`: + +``` +SelectedRecord + CoveringIndex on: symbol with: timestamp, price + filter: symbol='AAPL' +``` + +If you see `DeferredSingleSymbolFilterPageFrame` or `PageFrame` instead, the +query is reading from column files. This happens when the `SELECT` list +includes columns not in the `INCLUDE` list. + +## Comparison with bitmap index + +| Feature | Bitmap index | Posting index | +|---------|-------------|---------------| +| Storage size | 8-16 bytes/value | ~1 byte/value | +| Covering index (INCLUDE) | No | Yes | +| DISTINCT acceleration | No | Yes | +| Write overhead | Minimal | Minimal (without INCLUDE) | +| Write overhead with INCLUDE | N/A | Moderate (depends on INCLUDE columns) | +| LATEST ON optimization | Yes | Yes | +| Syntax | `INDEX` or `INDEX TYPE BITMAP` | `INDEX TYPE POSTING` | + +## Query patterns accelerated + +### Point queries (WHERE symbol = 'X') + +```questdb-sql +-- Reads from sidecar if price is in INCLUDE +SELECT price FROM trades WHERE symbol = 'AAPL'; +``` + +### Point queries with additional filters + +If the additional filter columns are also in INCLUDE, the covering index +is still used with a filter applied on top: + +```questdb-sql +-- Covering index + filter on covered column +SELECT price FROM trades WHERE symbol = 'AAPL' AND price > 100; +``` + +### IN-list queries + +```questdb-sql +-- Multiple keys, still uses covering index +SELECT price FROM trades WHERE symbol IN ('AAPL', 'GOOGL', 'MSFT'); +``` + +### LATEST ON queries + +```questdb-sql +-- Latest row per symbol, reads from sidecar +SELECT timestamp, symbol, price +FROM trades +WHERE symbol = 'AAPL' +LATEST ON timestamp PARTITION BY symbol; +``` + +### DISTINCT queries + +```questdb-sql +-- Enumerates keys from index 
metadata, O(keys x partitions) instead of full scan +SELECT DISTINCT symbol FROM trades; + +-- Also works with timestamp filters +SELECT DISTINCT symbol FROM trades WHERE timestamp > '2024-01-01'; +``` + +### COUNT queries + +```questdb-sql +-- Uses index to scan only matching rows instead of full table +SELECT COUNT(*) FROM trades WHERE symbol = 'AAPL'; +``` + +### Aggregate queries on covered columns + +```questdb-sql +-- Vectorized GROUP BY reads from sidecar page frames +SELECT count(*), min(price), max(price) +FROM trades +WHERE symbol = 'AAPL'; +``` + +## SQL optimizer hints + +Two hints control index usage: + +### no_covering + +Forces the query to read from column files instead of the covering index +sidecar. Useful for benchmarking or when the covering path has an issue. + +```questdb-sql +SELECT /*+ no_covering */ price FROM trades WHERE symbol = 'AAPL'; +``` + +### no_index + +Completely disables index usage, falling back to a full table scan with +filter. Also implies `no_covering`. + +```questdb-sql +SELECT /*+ no_index */ price FROM trades WHERE symbol = 'AAPL'; +``` + +## Trade-offs + +### Storage + +The posting index itself is very compact (~1 byte per indexed value). +The covering sidecar adds storage proportional to the included columns: + +- Fixed-width columns (DOUBLE, INT, etc.): exact column size, compressed + with ALP (Adaptive Lossless floating-Point) and Frame-of-Reference bitpacking +- Variable-width columns (VARCHAR, STRING): FSST compressed in sealed + partitions, typically 2-5x smaller than raw column data +- The sidecar is typically 0.5-5% of the total column file size for the + included columns + +### Write performance + +Write overhead depends on the number and type of INCLUDE columns. 
Typical +ranges (measured with 100K row inserts, 50 symbol keys): + +- **Posting index without INCLUDE**: ~15-20% slower than no index +- **Posting index with fixed-width INCLUDE** (DOUBLE, INT): ~40-50% slower +- **Posting index with VARCHAR INCLUDE**: ~2x slower + +Actual overhead varies with row size, cardinality, and hardware. Query +performance improvements typically far outweigh the write cost for +read-heavy workloads. + +### Memory + +The posting index uses native memory for encoding/decoding buffers. +The covering index's FSST symbol tables use ~70KB of native memory per +compressed column per active reader. + +## Architecture + +The posting index stores data in three file types per partition: + +- **`.pk`** — Key file: double-buffered metadata pages with generation + directory (32 bytes per generation entry) +- **`.pv`** — Value file: delta + Frame-of-Reference bitpacked row IDs, + organized into stride-indexed generations +- **`.pci` + `.pc0`, `.pc1`, ...** — Sidecar files: covered column values + stored alongside the posting list, one file per INCLUDE column + +### Generations and sealing + +Data is written incrementally as **generations** (one per commit). Each +generation contains a sparse block of key→rowID mappings. Periodically, +generations are **sealed** into a single dense generation with stride-indexed +layout for optimal read performance. + +Sealing happens automatically when the generation count reaches the maximum +(125) or when the partition is closed. Sealed data uses two encoding modes +per stride (256 keys): + +- **Delta mode**: per-key delta encoding with bitpacking +- **Flat mode**: stride-wide Frame-of-Reference with contiguous bitpacking + +The encoder trial-encodes both modes and picks the smaller one per stride. + +### FSST compression for strings + +VARCHAR and STRING columns in the INCLUDE list are compressed using FSST +(Fast Static Symbol Table) compression during sealing. 
FSST replaces +frequently occurring 1-8 byte patterns with single-byte codes, typically +achieving 2-5x compression on string data with repetitive patterns. + +The FSST symbol table is trained per stride block and stored inline in the +sidecar file. Decompression is transparent to the query engine. + +## Limitations + +:::warning + +- INCLUDE is only supported for POSTING index type (not BITMAP) +- Array columns (DOUBLE[][], etc.) cannot be included +- BINARY, UUID, LONG256, DECIMAL128, and DECIMAL256 columns cannot yet be included +- SAMPLE BY queries do not currently use the covering index + (they fall back to the regular index path) +- REINDEX on WAL tables requires dropping and re-adding the index + (this applies to all index types, not just posting) + +::: diff --git a/documentation/concepts/deep-dive/sql-optimizer-hints.md b/documentation/concepts/deep-dive/sql-optimizer-hints.md index 93f207598..7a989a54b 100644 --- a/documentation/concepts/deep-dive/sql-optimizer-hints.md +++ b/documentation/concepts/deep-dive/sql-optimizer-hints.md @@ -358,3 +358,37 @@ your symbol set is high-cardinality. - superseded by `asof_index` - `asof_memoized_search` - superseded by `asof_memoized` + +----- + +## Index hints + +These hints control whether the query optimizer uses indexes (bitmap or posting) +for symbol column lookups. + +### no_covering + +Disables the [covering index](/docs/concepts/deep-dive/posting-index/) +optimization, forcing the query to read from column files instead of the +index sidecar. The index is still used for row ID lookup, but column values +are read from the main column files. + +```questdb-sql +SELECT /*+ no_covering */ price FROM trades WHERE symbol = 'AAPL'; +``` + +This is useful for benchmarking covering index performance or working around +a specific issue with the covering path. + +### no_index + +Completely disables all index usage for the query, including bitmap index, +posting index, and covering index. 
The query falls back to a full table scan +with a filter applied to every row. Also implies `no_covering`. + +```questdb-sql +SELECT /*+ no_index */ price FROM trades WHERE symbol = 'AAPL'; +``` + +This is useful for benchmarking index effectiveness or forcing a table scan +when you know the filter selectivity is low (many rows match). diff --git a/documentation/query/sql/alter-table-alter-column-add-index.md b/documentation/query/sql/alter-table-alter-column-add-index.md index 3ddea73cb..bff10995f 100644 --- a/documentation/query/sql/alter-table-alter-column-add-index.md +++ b/documentation/query/sql/alter-table-alter-column-add-index.md @@ -10,13 +10,41 @@ Indexes an existing [`symbol`](/docs/concepts/symbol/) column. ![Flow chart showing the syntax of the ALTER TABLE ALTER COLUMN ADD INDEX keyword](/images/docs/diagrams/alterTableAddIndex.svg) - Adding an [index](/docs/concepts/deep-dive/indexes/) is an atomic, non-blocking, and non-waiting operation. Once complete, the SQL optimizer will start using the new index for SQL executions. 
-## Example +## Examples + +### Adding a bitmap index (default) -```questdb-sql title="Adding an index" +```questdb-sql ALTER TABLE trades ALTER COLUMN instrument ADD INDEX; ``` + +### Adding a posting index + +```questdb-sql +ALTER TABLE trades ALTER COLUMN instrument ADD INDEX TYPE POSTING; +``` + +### Adding a posting index with covering columns + +The `INCLUDE` clause stores additional column values in the index sidecar +files, enabling covering queries that bypass column file reads: + +```questdb-sql +ALTER TABLE trades + ALTER COLUMN symbol ADD INDEX TYPE POSTING INCLUDE (price, quantity, timestamp); +``` + +After this, queries that only select columns from the `INCLUDE` list (plus the +indexed symbol column) are served from the index sidecar: + +```questdb-sql +-- This query reads from the index sidecar, not from column files +SELECT timestamp, price FROM trades WHERE symbol = 'AAPL'; +``` + +See [Posting index and covering index](/docs/concepts/deep-dive/posting-index/) +for supported column types and performance details. diff --git a/documentation/query/sql/create-table.md b/documentation/query/sql/create-table.md index ac77b4e20..8ec7a0299 100644 --- a/documentation/query/sql/create-table.md +++ b/documentation/query/sql/create-table.md @@ -475,6 +475,8 @@ must be of type [symbol](/docs/concepts/symbol/). ![Flow chart showing the syntax of the index function](/images/docs/diagrams/indexDef.svg) +### Bitmap index (default) + ```questdb-sql CREATE TABLE trades ( timestamp TIMESTAMP, @@ -484,13 +486,55 @@ CREATE TABLE trades ( ), INDEX(symbol) TIMESTAMP(timestamp); ``` +### Posting index + +The posting index offers better compression and read performance than the +default bitmap index. 
Use `INDEX TYPE POSTING`: + +```questdb-sql +CREATE TABLE trades ( + timestamp TIMESTAMP, + symbol SYMBOL INDEX TYPE POSTING, + price DOUBLE, + amount DOUBLE +) TIMESTAMP(timestamp) PARTITION BY DAY WAL; +``` + +### Posting index with covering columns (INCLUDE) + +The `INCLUDE` clause stores additional column values in the index sidecar +files. Queries that only need these columns plus the indexed symbol can be +served entirely from the index, bypassing column files: + +```questdb-sql +CREATE TABLE trades ( + timestamp TIMESTAMP, + symbol SYMBOL INDEX TYPE POSTING INCLUDE (price, timestamp, exchange), + exchange SYMBOL, + price DOUBLE, + amount DOUBLE +) TIMESTAMP(timestamp) PARTITION BY DAY WAL; +``` + +With this schema, the following query reads only from the index sidecar: + +```questdb-sql +SELECT timestamp, price FROM trades WHERE symbol = 'AAPL'; +``` + +See [Posting index and covering index](/docs/concepts/deep-dive/posting-index/) +for a comprehensive guide including supported column types, query patterns, +and performance characteristics. + :::warning - The **index capacity** and [**symbol capacity**](/docs/concepts/symbol/) are different settings. - The index capacity value should not be changed, unless a user is aware of all - the implications. ::: + the implications. + +::: See the [Index concept](/docs/concepts/deep-dive/indexes/#how-indexes-work) for more information about indexes. diff --git a/documentation/schema-design-essentials.md b/documentation/schema-design-essentials.md index 592e9d09f..ea585a933 100644 --- a/documentation/schema-design-essentials.md +++ b/documentation/schema-design-essentials.md @@ -75,6 +75,44 @@ TIMESTAMP(ts) PARTITION BY MONTH; See [Partitions](/docs/concepts/partitions/) for details. +## Indexing + +Index your primary filter columns to speed up `WHERE` clause queries. 
QuestDB +supports two index types for SYMBOL columns: + +```questdb-sql +-- Default bitmap index — low overhead, good for most cases +CREATE TABLE trades ( + ts TIMESTAMP, + symbol SYMBOL INDEX, + price DOUBLE +) TIMESTAMP(ts) PARTITION BY DAY WAL; + +-- Posting index with covering columns — best for read-heavy, selective queries +CREATE TABLE trades ( + ts TIMESTAMP, + symbol SYMBOL INDEX TYPE POSTING INCLUDE (price, ts), + price DOUBLE, + raw_data VARCHAR -- not in INCLUDE, read from column files +) TIMESTAMP(ts) PARTITION BY DAY WAL; +``` + +**When to choose each:** + +| Scenario | Recommendation | +|----------|---------------| +| General purpose, write-heavy | Bitmap index (`INDEX`) | +| Read-heavy, filtering on symbol | Posting index (`INDEX TYPE POSTING`) | +| Frequent queries on a few columns | Posting with `INCLUDE` | +| Wide table, queries select subset | Posting with `INCLUDE` — biggest win | + +The covering index (`INCLUDE`) lets queries that only select covered columns +read from compact sidecar files instead of full column files. Use `EXPLAIN` to +verify your queries use the `CoveringIndex` plan. + +See [Indexes](/docs/concepts/deep-dive/indexes/) and +[Posting index](/docs/concepts/deep-dive/posting-index/) for details. 
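+
+As a minimal sketch, assuming the second `trades` definition above (the
+posting index with `INCLUDE`) is the one in use, a check along these lines
+could confirm whether a query takes the covering path. The plan text is not
+reproduced here, since its exact shape may vary:
+
+```questdb-sql
+-- Both selected columns are covered, so the plan is expected
+-- to mention CoveringIndex rather than a page frame scan
+EXPLAIN SELECT ts, price FROM trades WHERE symbol = 'AAPL';
+```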
+ ## Data types ### SYMBOL vs VARCHAR diff --git a/documentation/sidebars.js b/documentation/sidebars.js index 0a83522d1..9c3850465 100644 --- a/documentation/sidebars.js +++ b/documentation/sidebars.js @@ -538,6 +538,7 @@ module.exports = { collapsed: true, items: [ "concepts/deep-dive/indexes", + "concepts/deep-dive/posting-index", "concepts/deep-dive/interval-scan", "concepts/deep-dive/jit-compiler", "concepts/deep-dive/query-tracing", From 1527456806d2c81c31c171ff61e1d95ef002efb5 Mon Sep 17 00:00:00 2001 From: Nick Woolmer <29717167+nwoolmer@users.noreply.github.com> Date: Fri, 10 Apr 2026 14:29:43 +0100 Subject: [PATCH 02/19] Update posting/covering index docs to match current codebase - All column types now supported in INCLUDE (UUID, LONG256, BINARY, DECIMAL128/256, arrays were added since initial docs) - Document encoding options: POSTING DELTA and POSTING EF syntax - Document designated timestamp auto-inclusion in covering index - Add out-of-line INDEX(col TYPE POSTING) syntax examples - Add SHOW COLUMNS indexType and indexInclude columns to show.md, meta.md (table_columns), and posting-index.md - Add SHOW CREATE TABLE example with posting index - Note CAPACITY restriction (bitmap only) across all relevant pages - Note INCLUDE restrictions (inline syntax only, cannot include indexed column itself) - Update storage/compression details per column type category Co-Authored-By: Claude Opus 4.6 --- documentation/concepts/deep-dive/indexes.md | 7 +- .../concepts/deep-dive/posting-index.md | 140 ++++++++++++++---- documentation/query/functions/meta.md | 16 +- .../sql/alter-table-alter-column-add-index.md | 15 +- documentation/query/sql/create-table.md | 29 +++- documentation/query/sql/show.md | 35 ++++- documentation/schema-design-essentials.md | 9 +- 7 files changed, 201 insertions(+), 50 deletions(-) diff --git a/documentation/concepts/deep-dive/indexes.md b/documentation/concepts/deep-dive/indexes.md index baa0f7d26..7f2b7d861 100644 --- 
a/documentation/concepts/deep-dive/indexes.md +++ b/documentation/concepts/deep-dive/indexes.md @@ -107,6 +107,9 @@ Consider the following query applied to the above table :::warning +Index capacity applies to **bitmap indexes only**. Posting indexes manage +their own storage layout and do not use this setting. + We strongly recommend to rely on the default index capacity. Misconfiguring this property might lead to worse performance and increased disk usage. @@ -124,8 +127,8 @@ When in doubt, reach out via the QuestDB support channels for advice. ::: -When a symbol column is indexed, an additional **index capacity** can be defined -to specify how many row IDs to store in a single storage block on disk: +When a symbol column has a bitmap index, an additional **index capacity** can be +defined to specify how many row IDs to store in a single storage block on disk: - Server-wide setting: `cairo.index.value.block.size` with a default of `256` - Column-wide setting: The diff --git a/documentation/concepts/deep-dive/posting-index.md b/documentation/concepts/deep-dive/posting-index.md index e84aad5ac..55309dc82 100644 --- a/documentation/concepts/deep-dive/posting-index.md +++ b/documentation/concepts/deep-dive/posting-index.md @@ -35,6 +35,8 @@ column files is expensive. 
### At table creation +Inline syntax (index defined alongside the column): + ```questdb-sql CREATE TABLE trades ( timestamp TIMESTAMP, @@ -45,12 +47,25 @@ CREATE TABLE trades ( ) TIMESTAMP(timestamp) PARTITION BY DAY WAL; ``` +Out-of-line syntax (index defined separately): + +```questdb-sql +CREATE TABLE trades ( + timestamp TIMESTAMP, + symbol SYMBOL, + exchange SYMBOL, + price DOUBLE, + quantity DOUBLE +), INDEX(symbol TYPE POSTING) +TIMESTAMP(timestamp) PARTITION BY DAY WAL; +``` + ### With covering columns (INCLUDE) ```questdb-sql CREATE TABLE trades ( timestamp TIMESTAMP, - symbol SYMBOL INDEX TYPE POSTING INCLUDE (exchange, price, timestamp), + symbol SYMBOL INDEX TYPE POSTING INCLUDE (exchange, price), exchange SYMBOL, price DOUBLE, quantity DOUBLE @@ -61,6 +76,23 @@ The `INCLUDE` clause specifies which columns are stored in the index sidecar files. Queries that only read these columns plus the indexed symbol column can be served entirely from the index. +:::tip + +The designated timestamp column is automatically included in the covering +index when an `INCLUDE` clause is present — you do not need to list it +explicitly. This means timestamp-filtered covering queries work out of the +box. + +::: + +:::note + +The `INCLUDE` clause is only supported with inline column syntax and +`ALTER TABLE`. The out-of-line `INDEX(col TYPE POSTING)` syntax does not +support `INCLUDE`. + +::: + ### On an existing table ```questdb-sql @@ -68,6 +100,34 @@ ALTER TABLE trades ALTER COLUMN symbol ADD INDEX TYPE POSTING INCLUDE (exchange, price); ``` +### Encoding options + +The posting index supports two internal row ID encoding strategies. 
In most +cases the default is optimal and no keyword is needed: + +| Syntax | Encoding | Description | +|--------|----------|-------------| +| `INDEX TYPE POSTING` | Adaptive (default) | Trial-encodes delta and flat modes per stride, picks the smaller | +| `INDEX TYPE POSTING EF` | Adaptive (explicit) | Same as above — `EF` makes the choice explicit | +| `INDEX TYPE POSTING DELTA` | Delta-only | Forces per-key delta encoding, skipping flat-mode trial | + +```questdb-sql +-- Default adaptive encoding (recommended) +CREATE TABLE t1 (ts TIMESTAMP, s SYMBOL INDEX TYPE POSTING) + TIMESTAMP(ts) PARTITION BY DAY WAL; + +-- Force delta-only encoding +CREATE TABLE t2 (ts TIMESTAMP, s SYMBOL INDEX TYPE POSTING DELTA) + TIMESTAMP(ts) PARTITION BY DAY WAL; +``` + +:::note + +`CAPACITY` is only supported for bitmap indexes. Using `CAPACITY` with a +posting index will produce an error. + +::: + ## Covering index The covering index is the most powerful feature of the posting index. When all @@ -82,20 +142,20 @@ queries on wide tables. ### Supported column types in INCLUDE -| Type | Supported | Notes | -|------|-----------|-------| -| BOOLEAN, BYTE, SHORT, CHAR | Yes | Fixed-width, 1-2 bytes per value | -| INT, FLOAT, IPv4 | Yes | Fixed-width, 4 bytes per value | -| LONG, DOUBLE, TIMESTAMP, DATE | Yes | Fixed-width, 8 bytes per value | -| GEOBYTE, GEOSHORT, GEOINT, GEOLONG | Yes | Fixed-width, 1-8 bytes depending on precision | -| DECIMAL8, DECIMAL16, DECIMAL32, DECIMAL64 | Yes | Fixed-width, 1-8 bytes depending on precision | -| SYMBOL | Yes | Stored as integer key, resolved at query time | -| VARCHAR | Yes | Variable-width, FSST compressed in sealed partitions | -| STRING | Yes | Variable-width, FSST compressed in sealed partitions | -| BINARY | No | Not yet supported | -| UUID, LONG256 | No | Not yet supported (requires multi-long sidecar format) | -| DECIMAL128, DECIMAL256 | No | Not yet supported | -| Arrays (DOUBLE[][], etc.) 
| No | Not supported | +All column types except the indexed symbol column itself can be included: + +| Type | Compression | Notes | +|------|-------------|-------| +| BOOLEAN, BYTE, GEOBYTE, DECIMAL8 | Raw copy | 1 byte per value | +| SHORT, CHAR, GEOSHORT, DECIMAL16 | Frame-of-Reference | 2 bytes uncompressed | +| INT, FLOAT, IPv4, GEOINT, DECIMAL32 | FoR (int) / ALP (float) | 4 bytes uncompressed | +| LONG, DOUBLE, TIMESTAMP, DATE, GEOLONG, DECIMAL64 | FoR / ALP / linear prediction | 8 bytes uncompressed | +| SYMBOL | Frame-of-Reference | Stored as integer key, resolved at query time | +| UUID, DECIMAL128 | Raw copy | 16 bytes per value | +| LONG256, DECIMAL256 | Raw copy | 32 bytes per value | +| VARCHAR, STRING | FSST compressed | Variable-width, typically 2-5x compression | +| BINARY | Variable-width sidecar | Stored in offset-based format | +| Arrays (DOUBLE[], INT[], etc.) | Variable-width sidecar | Stored in offset-based format | ### How to choose INCLUDE columns @@ -105,10 +165,10 @@ Include columns that you frequently select together with the indexed symbol: -- If your typical queries look like this: SELECT timestamp, price, quantity FROM trades WHERE symbol = 'AAPL'; --- Then include those columns: +-- Then include those columns (timestamp is auto-included as designated timestamp): CREATE TABLE trades ( timestamp TIMESTAMP, - symbol SYMBOL INDEX TYPE POSTING INCLUDE (timestamp, price, quantity), + symbol SYMBOL INDEX TYPE POSTING INCLUDE (price, quantity), exchange SYMBOL, price DOUBLE, quantity DOUBLE, @@ -127,6 +187,26 @@ covering optimization and will be read from column files. 
::: +### Inspecting indexes with SHOW COLUMNS + +`SHOW COLUMNS` displays index metadata for each column, including the index +type and covered columns: + +```questdb-sql +SHOW COLUMNS FROM trades; +``` + +| column | type | indexed | indexBlockCapacity | indexType | indexInclude | symbolCached | symbolCapacity | designated | upsertKey | +|--------|------|---------|-------------------|-----------|-------------|-------------|----------------|------------|-----------| +| timestamp | TIMESTAMP | false | 0 | | | false | 0 | true | false | +| symbol | SYMBOL | true | 256 | POSTING | exchange,price | true | 128 | false | false | +| exchange | SYMBOL | false | 0 | | | true | 128 | false | false | +| price | DOUBLE | false | 0 | | | false | 0 | false | false | +| quantity | DOUBLE | false | 0 | | | false | 0 | false | false | + +The `indexType` column shows `POSTING`, `BITMAP`, or is empty for +non-indexed columns. The `indexInclude` column lists covered column names. + ### Verifying covering index usage Use `EXPLAIN` to verify that a query uses the covering index: @@ -250,12 +330,16 @@ SELECT /*+ no_index */ price FROM trades WHERE symbol = 'AAPL'; The posting index itself is very compact (~1 byte per indexed value). 
The covering sidecar adds storage proportional to the included columns: -- Fixed-width columns (DOUBLE, INT, etc.): exact column size, compressed - with ALP (Adaptive Lossless floating-Point) and Frame-of-Reference bitpacking -- Variable-width columns (VARCHAR, STRING): FSST compressed in sealed +- **Floating-point columns** (DOUBLE, FLOAT): compressed with ALP (Adaptive + Lossless floating-Point) and Frame-of-Reference bitpacking +- **Integer columns** (INT, LONG, etc.): Frame-of-Reference bitpacking; + TIMESTAMP additionally uses linear-prediction encoding +- **Small fixed-width types** (BYTE, BOOLEAN, etc.): stored as raw copies +- **Wide fixed-width types** (UUID, LONG256, DECIMAL128/256): stored as + raw copies with a count header +- **Variable-width columns** (VARCHAR, STRING): FSST compressed in sealed partitions, typically 2-5x smaller than raw column data -- The sidecar is typically 0.5-5% of the total column file size for the - included columns +- **BINARY and arrays**: stored in an offset-based variable-width sidecar ### Write performance @@ -317,12 +401,14 @@ sidecar file. Decompression is transparent to the query engine. :::warning -- INCLUDE is only supported for POSTING index type (not BITMAP) -- Array columns (DOUBLE[][], etc.) 
cannot be included -- BINARY, UUID, LONG256, DECIMAL128, and DECIMAL256 columns cannot yet be included -- SAMPLE BY queries do not currently use the covering index +- `INCLUDE` is only supported for the posting index type (not bitmap) +- `INCLUDE` cannot list the indexed symbol column itself +- `INCLUDE` is not supported with out-of-line `INDEX(col ...)` syntax — + use inline column syntax or `ALTER TABLE` instead +- `CAPACITY` is not supported for posting indexes (bitmap only) +- `SAMPLE BY` queries do not currently use the covering index (they fall back to the regular index path) -- REINDEX on WAL tables requires dropping and re-adding the index +- `REINDEX` on WAL tables requires dropping and re-adding the index (this applies to all index types, not just posting) ::: diff --git a/documentation/query/functions/meta.md b/documentation/query/functions/meta.md index 832f56bb2..d7d450b9f 100644 --- a/documentation/query/functions/meta.md +++ b/documentation/query/functions/meta.md @@ -594,6 +594,10 @@ Returns a `table` with the following columns: - `indexed` - if indexing is applied to this column - `indexBlockCapacity` - how many row IDs to store in a single storage block on disk +- `indexType` - the [index type](/docs/concepts/deep-dive/indexes/) + (`POSTING`, `BITMAP`, or empty) +- `indexInclude` - comma-separated names of columns included in a + [posting index's](/docs/concepts/deep-dive/posting-index/) covering sidecar - `symbolCached` - whether this `symbol` column is cached - `symbolCapacity` - how many distinct values this column of `symbol` type is expected to have @@ -611,12 +615,12 @@ For more details on the meaning and use of these values, see the table_columns('my_table'); ``` -| column | type | indexed | indexBlockCapacity | symbolCached | symbolCapacity | designated | upsertKey | -| ------ | --------- | ------- | ------------------ | ------------ | -------------- | ---------- | --------- | -| symb | SYMBOL | true | 1048576 | false | 256 | false | false 
| -| price | DOUBLE | false | 0 | false | 0 | false | false | -| ts | TIMESTAMP | false | 0 | false | 0 | true | false | -| s | VARCHAR | false | 0 | false | 0 | false | false | +| column | type | indexed | indexBlockCapacity | indexType | indexInclude | symbolCached | symbolCapacity | designated | upsertKey | +| ------ | --------- | ------- | ------------------ | --------- | ------------ | ------------ | -------------- | ---------- | --------- | +| symb | SYMBOL | true | 1048576 | | | false | 256 | false | false | +| price | DOUBLE | false | 0 | | | false | 0 | false | false | +| ts | TIMESTAMP | false | 0 | | | false | 0 | true | false | +| s | VARCHAR | false | 0 | | | false | 0 | false | false | ```questdb-sql title="Get designated timestamp column" SELECT "column", type, designated FROM table_columns('my_table') WHERE designated = true; diff --git a/documentation/query/sql/alter-table-alter-column-add-index.md b/documentation/query/sql/alter-table-alter-column-add-index.md index bff10995f..12d77467a 100644 --- a/documentation/query/sql/alter-table-alter-column-add-index.md +++ b/documentation/query/sql/alter-table-alter-column-add-index.md @@ -28,6 +28,13 @@ ALTER TABLE trades ALTER COLUMN instrument ADD INDEX; ALTER TABLE trades ALTER COLUMN instrument ADD INDEX TYPE POSTING; ``` +An encoding variant can be specified: + +```questdb-sql +-- Force delta-only encoding +ALTER TABLE trades ALTER COLUMN instrument ADD INDEX TYPE POSTING DELTA; +``` + ### Adding a posting index with covering columns The `INCLUDE` clause stores additional column values in the index sidecar @@ -35,11 +42,15 @@ files, enabling covering queries that bypass column file reads: ```questdb-sql ALTER TABLE trades - ALTER COLUMN symbol ADD INDEX TYPE POSTING INCLUDE (price, quantity, timestamp); + ALTER COLUMN symbol ADD INDEX TYPE POSTING INCLUDE (price, quantity); ``` +The designated timestamp column is automatically included in the covering +index — you do not need to list it explicitly. 
+ After this, queries that only select columns from the `INCLUDE` list (plus the -indexed symbol column) are served from the index sidecar: +indexed symbol column and designated timestamp) are served from the index +sidecar: ```questdb-sql -- This query reads from the index sidecar, not from column files diff --git a/documentation/query/sql/create-table.md b/documentation/query/sql/create-table.md index 8ec7a0299..6705f7753 100644 --- a/documentation/query/sql/create-table.md +++ b/documentation/query/sql/create-table.md @@ -489,15 +489,26 @@ CREATE TABLE trades ( ### Posting index The posting index offers better compression and read performance than the -default bitmap index. Use `INDEX TYPE POSTING`: +default bitmap index. Use `INDEX TYPE POSTING` with either inline or +out-of-line syntax: ```questdb-sql +-- Inline syntax CREATE TABLE trades ( timestamp TIMESTAMP, symbol SYMBOL INDEX TYPE POSTING, price DOUBLE, amount DOUBLE ) TIMESTAMP(timestamp) PARTITION BY DAY WAL; + +-- Out-of-line syntax +CREATE TABLE trades ( + timestamp TIMESTAMP, + symbol SYMBOL, + price DOUBLE, + amount DOUBLE +), INDEX(symbol TYPE POSTING) +TIMESTAMP(timestamp) PARTITION BY DAY WAL; ``` ### Posting index with covering columns (INCLUDE) @@ -509,19 +520,29 @@ served entirely from the index, bypassing column files: ```questdb-sql CREATE TABLE trades ( timestamp TIMESTAMP, - symbol SYMBOL INDEX TYPE POSTING INCLUDE (price, timestamp, exchange), + symbol SYMBOL INDEX TYPE POSTING INCLUDE (price, exchange), exchange SYMBOL, price DOUBLE, amount DOUBLE ) TIMESTAMP(timestamp) PARTITION BY DAY WAL; ``` -With this schema, the following query reads only from the index sidecar: +The designated timestamp column is automatically included — you do not need +to list it in the `INCLUDE` clause. 
With this schema, the following query +reads only from the index sidecar: ```questdb-sql SELECT timestamp, price FROM trades WHERE symbol = 'AAPL'; ``` +:::note + +`INCLUDE` is only supported with inline column syntax (not out-of-line +`INDEX(col ...)`). Use `ALTER TABLE` to add covering columns to an existing +table. + +::: + See [Posting index and covering index](/docs/concepts/deep-dive/posting-index/) for a comprehensive guide including supported column types, query patterns, and performance characteristics. @@ -533,6 +554,8 @@ and performance characteristics. settings. - The index capacity value should not be changed, unless a user is aware of all the implications. +- `CAPACITY` is only supported for bitmap indexes — it cannot be used with + posting indexes. ::: diff --git a/documentation/query/sql/show.md b/documentation/query/sql/show.md index 5d14121fe..6bdcf9c12 100644 --- a/documentation/query/sql/show.md +++ b/documentation/query/sql/show.md @@ -57,13 +57,18 @@ SHOW TABLES; SHOW COLUMNS FROM trades; ``` -| column | type | indexed | indexBlockCapacity | symbolCached | symbolCapacity | symbolTableSize | designated | upsertKey | -| --------- | --------- | ------- | ------------------ | ------------ | -------------- | --------------- | ---------- | --------- | -| symbol | SYMBOL | false | 0 | true | 256 | 42 | false | false | -| side | SYMBOL | false | 0 | true | 256 | 2 | false | false | -| price | DOUBLE | false | 0 | false | 0 | 0 | false | false | -| amount | DOUBLE | false | 0 | false | 0 | 0 | false | false | -| timestamp | TIMESTAMP | false | 0 | false | 0 | 0 | true | false | +| column | type | indexed | indexBlockCapacity | indexType | indexInclude | symbolCached | symbolCapacity | symbolTableSize | designated | upsertKey | +| --------- | --------- | ------- | ------------------ | --------- | ------------ | ------------ | -------------- | --------------- | ---------- | --------- | +| symbol | SYMBOL | false | 0 | | | true | 256 | 42 | false | false 
| +| side | SYMBOL | false | 0 | | | true | 256 | 2 | false | false | +| price | DOUBLE | false | 0 | | | false | 0 | 0 | false | false | +| amount | DOUBLE | false | 0 | | | false | 0 | 0 | false | false | +| timestamp | TIMESTAMP | false | 0 | | | false | 0 | 0 | true | false | + +The `indexType` column shows the index type (`POSTING`, `BITMAP`, or empty for +non-indexed columns). The `indexInclude` column lists the names of columns +included in a [posting index's](/docs/concepts/deep-dive/posting-index/) +covering sidecar, as a comma-separated string. ### SHOW CREATE TABLE @@ -88,6 +93,22 @@ CREATE TABLE trades ( WITH maxUncommittedRows=500000, o3MaxLag=600000000us; ``` +#### Posting index with covering columns + +When a symbol column has a posting index with `INCLUDE`, the DDL reflects +the index type and covered columns: + +```questdb-sql +CREATE TABLE trades ( + symbol SYMBOL CAPACITY 128 CACHE INDEX TYPE POSTING INCLUDE (price, exchange), + exchange SYMBOL CAPACITY 128 CACHE, + price DOUBLE, + amount DOUBLE, + timestamp TIMESTAMP +) timestamp(timestamp) PARTITION BY DAY WAL +WITH maxUncommittedRows=500000, o3MaxLag=600000000us; +``` + #### Per-column Parquet encoding When columns have per-column Parquet encoding or compression overrides, they diff --git a/documentation/schema-design-essentials.md b/documentation/schema-design-essentials.md index ea585a933..b88a6c929 100644 --- a/documentation/schema-design-essentials.md +++ b/documentation/schema-design-essentials.md @@ -91,10 +91,11 @@ CREATE TABLE trades ( -- Posting index with covering columns — best for read-heavy, selective queries CREATE TABLE trades ( ts TIMESTAMP, - symbol SYMBOL INDEX TYPE POSTING INCLUDE (price, ts), + symbol SYMBOL INDEX TYPE POSTING INCLUDE (price), price DOUBLE, raw_data VARCHAR -- not in INCLUDE, read from column files ) TIMESTAMP(ts) PARTITION BY DAY WAL; +-- The designated timestamp (ts) is automatically included in the covering index. 
``` **When to choose each:** @@ -107,8 +108,10 @@ CREATE TABLE trades ( | Wide table, queries select subset | Posting with `INCLUDE` — biggest win | The covering index (`INCLUDE`) lets queries that only select covered columns -read from compact sidecar files instead of full column files. Use `EXPLAIN` to -verify your queries use the `CoveringIndex` plan. +read from compact sidecar files instead of full column files. The designated +timestamp is automatically included, so timestamp-filtered queries benefit +without explicit listing. Use `EXPLAIN` to verify your queries use the +`CoveringIndex` plan. See [Indexes](/docs/concepts/deep-dive/indexes/) and [Posting index](/docs/concepts/deep-dive/posting-index/) for details. From 72b98cada696fbbcbc1ccfbc0ba14ab5c345b054 Mon Sep 17 00:00:00 2001 From: Nick Woolmer <29717167+nwoolmer@users.noreply.github.com> Date: Fri, 10 Apr 2026 14:36:29 +0100 Subject: [PATCH 03/19] Add posting index docs to explain, symbol, mat-view, config pages - explain.md: add CoveringIndex and PostingIndex plan node descriptions - symbol.md: add posting index example alongside bitmap in indexing section - alter-mat-view-alter-column-add-index.md: add TYPE POSTING syntax (INCLUDE not supported on materialized views) - _cairo.config.json: add cairo.posting.index.auto.include.timestamp and cairo.posting.index.row.id.encoding config keys; clarify bitmap-only scope of cairo.index.value.block.size and cairo.spin.lock.timeout Co-Authored-By: Claude Opus 4.6 --- documentation/concepts/symbol.md | 14 +++++++++++ .../configuration-utils/_cairo.config.json | 12 ++++++++-- .../alter-mat-view-alter-column-add-index.md | 24 ++++++++++++++++--- documentation/query/sql/explain.md | 6 +++++ 4 files changed, 51 insertions(+), 5 deletions(-) diff --git a/documentation/concepts/symbol.md b/documentation/concepts/symbol.md index 2e5bf740f..bfd7b0be5 100755 --- a/documentation/concepts/symbol.md +++ b/documentation/concepts/symbol.md @@ -117,6 +117,7 @@ ALTER TABLE 
trades ALTER COLUMN client_id CACHE; For columns frequently used in `WHERE` clauses, add an index: ```questdb-sql +-- Bitmap index (default) — low overhead, good for most cases CREATE TABLE trades ( timestamp TIMESTAMP, symbol SYMBOL INDEX, @@ -124,10 +125,23 @@ CREATE TABLE trades ( ) TIMESTAMP(timestamp) PARTITION BY DAY; ``` +For read-heavy workloads, a [posting index](/docs/concepts/deep-dive/posting-index/) +offers better compression and supports covering queries: + +```questdb-sql +-- Posting index with covering columns — reads from compact sidecar files +CREATE TABLE trades ( + timestamp TIMESTAMP, + symbol SYMBOL INDEX TYPE POSTING INCLUDE (price), + price DOUBLE +) TIMESTAMP(timestamp) PARTITION BY DAY WAL; +``` + Or add an index later: ```questdb-sql ALTER TABLE trades ALTER COLUMN symbol ADD INDEX; +-- or: ALTER TABLE trades ALTER COLUMN symbol ADD INDEX TYPE POSTING; ``` See [Indexes](/docs/concepts/deep-dive/indexes/) for more information. diff --git a/documentation/configuration/configuration-utils/_cairo.config.json b/documentation/configuration/configuration-utils/_cairo.config.json index 9fd0f8689..64f928008 100644 --- a/documentation/configuration/configuration-utils/_cairo.config.json +++ b/documentation/configuration/configuration-utils/_cairo.config.json @@ -81,7 +81,15 @@ }, "cairo.index.value.block.size": { "default": "256", - "description": "Approximation of number of rows for a single index key, must be power of 2." + "description": "Approximation of number of rows for a single index key, must be power of 2. Applies to bitmap indexes only; posting indexes manage their own block layout." + }, + "cairo.posting.index.auto.include.timestamp": { + "default": "true", + "description": "When `true`, the designated timestamp column is automatically added to the covering index when a [posting index](/docs/concepts/deep-dive/posting-index/) is created with an `INCLUDE` clause." 
+ }, + "cairo.posting.index.row.id.encoding": { + "default": "posting", + "description": "Default row ID encoding for posting indexes. Valid values: `posting` (adaptive delta/flat trial encoding) and `posting_delta` (delta-only encoding)." }, "cairo.max.swap.file.count": { "default": "30", @@ -105,7 +113,7 @@ }, "cairo.spin.lock.timeout": { "default": "1000", - "description": "Timeout when attempting to get BitmapIndexReaders in millisecond." + "description": "Timeout in milliseconds when attempting to acquire index readers (bitmap and posting)." }, "cairo.character.store.capacity": { "default": "1024", diff --git a/documentation/query/sql/alter-mat-view-alter-column-add-index.md b/documentation/query/sql/alter-mat-view-alter-column-add-index.md index d8b5d2b27..d866e788b 100644 --- a/documentation/query/sql/alter-mat-view-alter-column-add-index.md +++ b/documentation/query/sql/alter-mat-view-alter-column-add-index.md @@ -12,6 +12,7 @@ query performance for filtered lookups. ``` ALTER MATERIALIZED VIEW viewName ALTER COLUMN columnName ADD INDEX [ CAPACITY n ] +ALTER MATERIALIZED VIEW viewName ALTER COLUMN columnName ADD INDEX TYPE POSTING ``` ## Parameters @@ -20,7 +21,8 @@ ALTER MATERIALIZED VIEW viewName ALTER COLUMN columnName ADD INDEX [ CAPACITY n | --------- | ----------- | | `viewName` | Name of the materialized view | | `columnName` | Name of the `SYMBOL` column to index | -| `CAPACITY` | Optional index capacity (advanced; use default unless you understand implications) | +| `CAPACITY` | Optional index capacity for bitmap indexes (advanced; use default unless you understand implications) | +| `TYPE POSTING` | Use a [posting index](/docs/concepts/deep-dive/posting-index/) instead of the default bitmap index | ## When to use @@ -30,13 +32,29 @@ Add an index when: - The column has high cardinality (many distinct values) - Query performance on the materialized view needs improvement -## Example +## Examples -```questdb-sql title="Add index to symbol column" 
+### Adding a bitmap index (default) + +```questdb-sql title="Add bitmap index to symbol column" ALTER MATERIALIZED VIEW trades_hourly ALTER COLUMN symbol ADD INDEX; ``` +### Adding a posting index + +```questdb-sql title="Add posting index to symbol column" +ALTER MATERIALIZED VIEW trades_hourly + ALTER COLUMN symbol ADD INDEX TYPE POSTING; +``` + +:::note + +The `INCLUDE` clause for covering indexes is not supported on materialized +views. Use a posting index without `INCLUDE` for faster filtered lookups. + +::: + ## Behavior | Aspect | Description | diff --git a/documentation/query/sql/explain.md b/documentation/query/sql/explain.md index d3858d806..cabe99f72 100644 --- a/documentation/query/sql/explain.md +++ b/documentation/query/sql/explain.md @@ -76,6 +76,12 @@ The following list contains some plan node types: `INTERSECT`). - `Index forward/backward scan` - scans all row ids associated with a given `symbol` value from start to finish or vice versa. +- `CoveringIndex` - reads data from a + [posting index's](/docs/concepts/deep-dive/posting-index/) covering sidecar + files instead of main column files. Appears when all selected columns are + covered by the `INCLUDE` clause. +- `PostingIndex` - uses a posting index for accelerated operations such as + `DISTINCT` on a symbol column. - `Limit` - standalone node implementing the `LIMIT` keyword. Other nodes can implement `LIMIT` internally, e.g. the `Sort` node. - `Row forward/backward scan` - scans data frame (usually partitioned) records From 630384d8ed85cf5485443b06dbb7747aa9e585c5 Mon Sep 17 00:00:00 2001 From: Nick Woolmer <29717167+nwoolmer@users.noreply.github.com> Date: Fri, 10 Apr 2026 14:41:31 +0100 Subject: [PATCH 04/19] Clarify encoding trade-offs: delta vs adaptive modes Delta encoding compresses best for regular, evenly-distributed data and is faster for large scans. 
The adaptive (default) mode additionally trial-encodes a flat layout that compresses better for irregular distributions and is faster for point queries. Co-Authored-By: Claude Opus 4.6 --- .../concepts/deep-dive/posting-index.md | 47 +++++++++++++------ 1 file changed, 33 insertions(+), 14 deletions(-) diff --git a/documentation/concepts/deep-dive/posting-index.md b/documentation/concepts/deep-dive/posting-index.md index 55309dc82..9c13396d0 100644 --- a/documentation/concepts/deep-dive/posting-index.md +++ b/documentation/concepts/deep-dive/posting-index.md @@ -102,21 +102,36 @@ ALTER TABLE trades ### Encoding options -The posting index supports two internal row ID encoding strategies. In most -cases the default is optimal and no keyword is needed: - -| Syntax | Encoding | Description | -|--------|----------|-------------| -| `INDEX TYPE POSTING` | Adaptive (default) | Trial-encodes delta and flat modes per stride, picks the smaller | -| `INDEX TYPE POSTING EF` | Adaptive (explicit) | Same as above — `EF` makes the choice explicit | -| `INDEX TYPE POSTING DELTA` | Delta-only | Forces per-key delta encoding, skipping flat-mode trial | +The posting index supports two row ID encoding strategies with different +performance characteristics: + +| Syntax | Encoding | Best for | +|--------|----------|----------| +| `INDEX TYPE POSTING` | Adaptive (default) | General purpose — trial-encodes both modes per stride, picks the smaller | +| `INDEX TYPE POSTING DELTA` | Delta-only | Regular, evenly-distributed data — faster large scans | + +**Delta encoding** stores per-key deltas between consecutive row IDs with +Frame-of-Reference bitpacking. It compresses best when row IDs for each +symbol key are evenly spaced (e.g. round-robin or time-ordered ingestion +of a fixed set of symbols) and is faster for queries that scan large +ranges of matching rows. + +The **adaptive (default)** encoding additionally trial-encodes a +stride-wide flat layout and picks whichever is smaller. 
This mode +compresses better for irregular data distributions (e.g. bursty or +skewed symbol frequencies) and produces a layout that is faster for +point queries and selective lookups. + +For most workloads the default adaptive encoding is the best choice. +Use `DELTA` only when you know your data arrives in a regular pattern +and your queries predominantly scan large result sets. ```questdb-sql --- Default adaptive encoding (recommended) +-- Default adaptive encoding (recommended for most workloads) CREATE TABLE t1 (ts TIMESTAMP, s SYMBOL INDEX TYPE POSTING) TIMESTAMP(ts) PARTITION BY DAY WAL; --- Force delta-only encoding +-- Delta-only encoding (regular data, large scans) CREATE TABLE t2 (ts TIMESTAMP, s SYMBOL INDEX TYPE POSTING DELTA) TIMESTAMP(ts) PARTITION BY DAY WAL; ``` @@ -379,13 +394,17 @@ generations are **sealed** into a single dense generation with stride-indexed layout for optimal read performance. Sealing happens automatically when the generation count reaches the maximum -(125) or when the partition is closed. Sealed data uses two encoding modes -per stride (256 keys): +(125) or when the partition is closed. With the default adaptive encoding, +sealed data uses two encoding modes per stride (256 keys): -- **Delta mode**: per-key delta encoding with bitpacking -- **Flat mode**: stride-wide Frame-of-Reference with contiguous bitpacking +- **Delta mode**: per-key delta encoding with bitpacking — compresses best + for regular, evenly-distributed row IDs and is faster for large scans +- **Flat mode**: stride-wide Frame-of-Reference with contiguous bitpacking — + compresses better for irregular distributions and is faster for point + queries The encoder trial-encodes both modes and picks the smaller one per stride. +With `POSTING DELTA`, only delta mode is used. 
### FSST compression for strings From e187aac19b90b83cffb83097ddf1defe3d1bc7a3 Mon Sep 17 00:00:00 2001 From: Nick Woolmer <29717167+nwoolmer@users.noreply.github.com> Date: Fri, 10 Apr 2026 14:42:37 +0100 Subject: [PATCH 05/19] Add EF encoding as distinct option, clarify all three modes - POSTING DELTA: regular data, better compression for even distributions, faster for large sequential scans - POSTING EF: Elias-Fano encoding, better compression for irregular distributions, faster for point queries - POSTING (default): adaptive, trial-encodes both per stride, picks smaller Co-Authored-By: Claude Opus 4.6 --- .../concepts/deep-dive/posting-index.md | 54 ++++++++++--------- 1 file changed, 30 insertions(+), 24 deletions(-) diff --git a/documentation/concepts/deep-dive/posting-index.md b/documentation/concepts/deep-dive/posting-index.md index 9c13396d0..e6d53d900 100644 --- a/documentation/concepts/deep-dive/posting-index.md +++ b/documentation/concepts/deep-dive/posting-index.md @@ -102,13 +102,14 @@ ALTER TABLE trades ### Encoding options -The posting index supports two row ID encoding strategies with different -performance characteristics: +The posting index supports three row ID encoding options with different +compression and query performance characteristics: | Syntax | Encoding | Best for | |--------|----------|----------| -| `INDEX TYPE POSTING` | Adaptive (default) | General purpose — trial-encodes both modes per stride, picks the smaller | -| `INDEX TYPE POSTING DELTA` | Delta-only | Regular, evenly-distributed data — faster large scans | +| `INDEX TYPE POSTING` | Adaptive (default) | General purpose — trial-encodes both EF and delta per stride, picks the smaller | +| `INDEX TYPE POSTING EF` | Elias-Fano | Irregular data distributions, point queries and selective lookups | +| `INDEX TYPE POSTING DELTA` | Delta | Regular, evenly-distributed data, large sequential scans | **Delta encoding** stores per-key deltas between consecutive row IDs with 
Frame-of-Reference bitpacking. It compresses best when row IDs for each @@ -116,23 +117,27 @@ symbol key are evenly spaced (e.g. round-robin or time-ordered ingestion of a fixed set of symbols) and is faster for queries that scan large ranges of matching rows. -The **adaptive (default)** encoding additionally trial-encodes a -stride-wide flat layout and picks whichever is smaller. This mode -compresses better for irregular data distributions (e.g. bursty or -skewed symbol frequencies) and produces a layout that is faster for -point queries and selective lookups. +**Elias-Fano (EF) encoding** uses a stride-wide flat layout with +Frame-of-Reference bitpacking across all keys in a stride. It compresses +better for irregular data distributions (e.g. bursty or skewed symbol +frequencies) and is faster for point queries and selective lookups. -For most workloads the default adaptive encoding is the best choice. -Use `DELTA` only when you know your data arrives in a regular pattern -and your queries predominantly scan large result sets. +The **adaptive (default)** encoding trial-encodes both EF and delta modes +per stride and picks whichever produces the smaller output. This is the +best choice when you are unsure about your data distribution or have a +mixed query workload. ```questdb-sql -- Default adaptive encoding (recommended for most workloads) CREATE TABLE t1 (ts TIMESTAMP, s SYMBOL INDEX TYPE POSTING) TIMESTAMP(ts) PARTITION BY DAY WAL; +-- EF encoding (irregular data, point queries) +CREATE TABLE t2 (ts TIMESTAMP, s SYMBOL INDEX TYPE POSTING EF) + TIMESTAMP(ts) PARTITION BY DAY WAL; + -- Delta-only encoding (regular data, large scans) -CREATE TABLE t2 (ts TIMESTAMP, s SYMBOL INDEX TYPE POSTING DELTA) +CREATE TABLE t3 (ts TIMESTAMP, s SYMBOL INDEX TYPE POSTING DELTA) TIMESTAMP(ts) PARTITION BY DAY WAL; ``` @@ -394,17 +399,18 @@ generations are **sealed** into a single dense generation with stride-indexed layout for optimal read performance. 
Sealing happens automatically when the generation count reaches the maximum -(125) or when the partition is closed. With the default adaptive encoding, -sealed data uses two encoding modes per stride (256 keys): - -- **Delta mode**: per-key delta encoding with bitpacking — compresses best - for regular, evenly-distributed row IDs and is faster for large scans -- **Flat mode**: stride-wide Frame-of-Reference with contiguous bitpacking — - compresses better for irregular distributions and is faster for point - queries - -The encoder trial-encodes both modes and picks the smaller one per stride. -With `POSTING DELTA`, only delta mode is used. +(125) or when the partition is closed. Sealed data uses two encoding modes +per stride (256 keys): + +- **Delta mode** (`POSTING DELTA`): per-key delta encoding with bitpacking — + compresses best for regular, evenly-distributed row IDs and is faster for + large sequential scans +- **Elias-Fano mode** (`POSTING EF`): stride-wide Frame-of-Reference with + contiguous bitpacking — compresses better for irregular distributions and + is faster for point queries + +With the default adaptive encoding (`POSTING`), the encoder trial-encodes +both modes per stride and picks the smaller one. ### FSST compression for strings From 96b89305abb0cf47747cc1ffb3d81733ae13000f Mon Sep 17 00:00:00 2001 From: Nick Woolmer <29717167+nwoolmer@users.noreply.github.com> Date: Tue, 5 May 2026 08:40:52 +0100 Subject: [PATCH 06/19] Correct posting/covering index facts verified against live instance MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit posting-index.md: - Auto-include of designated timestamp applies to any posting index, not only when an INCLUDE clause is present (verified in source and EXPLAIN). Document SHOW CREATE TABLE round-trip with the expanded list. - Note that bare INDEX INCLUDE (...) auto-promotes to POSTING. 
- Replace "max 125 generations" with the actual seal threshold of 16 (cairo.posting.seal.gen.threshold). - Distinguish the two seal-time sub-layouts (Delta sub-layout and Flat sub-layout, both internal to delta+FoR) from the SQL DELTA / EF encoding variants — they were previously conflated. - Note the native AVX2 fast path for 8/16/32-bit widths. - Bitmap storage size: ~15 B/value (PR benchmark figure) instead of the older 8-16 B/value range. - Write-perf comparison baselined against bitmap (~9% slower for the index path itself) instead of vs. no-index. - FSST symbol table is ~2.3 KB and L1-resident; drop the unverified ~70 KB per-reader figure. - Generalise the SAMPLE BY limitation: covering needs a filter on the indexed symbol, otherwise unfiltered LATEST ON / SAMPLE BY / GROUP BY fall back to a regular page-frame scan. - Refresh EXPLAIN snippets to match real output: IN-list filter rendering, LATEST ON without SelectedRecord wrapper, DISTINCT as PostingIndex op: distinct, Async Filter layered on top of CoveringIndex for AND filters on covered columns. - Tighten architecture: .pv encoding depends on variant (delta+FoR or EF); .pcN sidecars carry txn-segment suffixes on disk and the auto-included timestamp gets its own sidecar. _cairo.config.json: - cairo.posting.index.row.id.encoding: default is `adaptive`, valid values are `adaptive`, `delta`, `ef` (not the previous `posting`/`posting_delta`). - cairo.posting.index.auto.include.timestamp: clarify that it applies to any posting index, including bare INDEX TYPE POSTING. - Add cairo.posting.seal.gen.threshold (default 16). 
Co-Authored-By: Claude Opus 4.7 (1M context) --- .../concepts/deep-dive/posting-index.md | 173 ++++++++++++------ .../configuration-utils/_cairo.config.json | 10 +- 2 files changed, 129 insertions(+), 54 deletions(-) diff --git a/documentation/concepts/deep-dive/posting-index.md b/documentation/concepts/deep-dive/posting-index.md index e6d53d900..653b136bc 100644 --- a/documentation/concepts/deep-dive/posting-index.md +++ b/documentation/concepts/deep-dive/posting-index.md @@ -79,9 +79,13 @@ can be served entirely from the index. :::tip The designated timestamp column is automatically included in the covering -index when an `INCLUDE` clause is present — you do not need to list it -explicitly. This means timestamp-filtered covering queries work out of the -box. +index — even when no explicit `INCLUDE` clause is given. So a bare +`INDEX TYPE POSTING` already covers `SELECT timestamp, sym FROM t WHERE +sym = 'X'`. The expanded list is what `SHOW CREATE TABLE` round-trips, so +`INCLUDE (exchange, price)` renders back as +`INCLUDE (exchange, price, timestamp)` after creation. Controlled by the +`cairo.posting.index.auto.include.timestamp` server property +(default `true`). ::: @@ -91,6 +95,10 @@ The `INCLUDE` clause is only supported with inline column syntax and `ALTER TABLE`. The out-of-line `INDEX(col TYPE POSTING)` syntax does not support `INCLUDE`. +Writing `INDEX INCLUDE (...)` (no explicit `TYPE`) is also accepted and +implicitly creates a posting index — `INCLUDE` is only valid with +`POSTING`, so the parser promotes the type for you. + ::: ### On an existing table @@ -243,22 +251,58 @@ SelectedRecord filter: symbol='AAPL' ``` +`IN`-list filters render as `filter: symbol IN ['AAPL','GOOGL','MSFT']`. 
+`LATEST ON` queries that hit the covering path show an `op: latest` +annotation and have no `SelectedRecord` wrapper: + +``` +CoveringIndex op: latest on: symbol with: timestamp, price + filter: symbol='AAPL' +``` + +`SELECT DISTINCT` does not need to read covered values, so it shows up as +`PostingIndex op: distinct` rather than `CoveringIndex`: + +``` +PostingIndex op: distinct on: symbol + Frame forward scan on: trades +``` + +When you add a filter on a covered column, an `Async Filter` is layered +above the covering index — the predicate values are read from the sidecar, +not the column file: + +``` +SelectedRecord + Async Filter workers: N + filter: 100 Date: Tue, 5 May 2026 08:47:57 +0100 Subject: [PATCH 07/19] Refine posting/covering index facts after deeper source review MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit posting-index.md: - Encoding-options: Elias-Fano is per-key Elias-Fano coding (low/high bit split), not "stride-wide flat layout with FoR". Rewrite the EF description to match the actual algorithm and recharacterise the three SQL variants as choices that pick the per-key encoding the writer uses, with the explicit DELTA/EF variants positioned for benchmarking rather than tied to vague data-distribution claims. - INCLUDE type table: BOOLEAN/BYTE/etc. use Frame-of-Reference bitpacking, not raw copies. Split FLOAT/DOUBLE into their own rows (both ALP) and TIMESTAMP into its own row (linear-prediction + FoR). BINARY/arrays are length-prefixed raw bytes, not "offset-based sidecar". - Trade-offs storage section: same correction — small fixed-width types use FoR bitpacking; only UUID / LONG256 / DECIMAL128/256 are raw copies. - SHOW COLUMNS example: column order now matches live output (indexType / indexInclude come last, after upsertKey), and adds the symbolTableSize column. The indexInclude value shows exchange,price,timestamp to reflect auto-include of the timestamp. 
meta.md: - table_columns(): description list adds symbolTableSize and reorders indexType / indexInclude to the end (where they actually appear). - Example table column order matches live output and includes symbolTableSize. show.md: - SHOW COLUMNS example: column order corrected (indexType / indexInclude at end, symbolCached / symbolCapacity / symbolTableSize before designated). Mention POSTING DELTA / POSTING EF as possible indexType values. Co-Authored-By: Claude Opus 4.7 (1M context) --- .../concepts/deep-dive/posting-index.md | 114 ++++++++++-------- documentation/query/functions/meta.md | 22 ++-- documentation/query/sql/show.md | 25 ++-- 3 files changed, 88 insertions(+), 73 deletions(-) diff --git a/documentation/concepts/deep-dive/posting-index.md b/documentation/concepts/deep-dive/posting-index.md index 653b136bc..4d44e1246 100644 --- a/documentation/concepts/deep-dive/posting-index.md +++ b/documentation/concepts/deep-dive/posting-index.md @@ -113,27 +113,30 @@ ALTER TABLE trades The posting index supports three row ID encoding options with different compression and query performance characteristics: -| Syntax | Encoding | Best for | -|--------|----------|----------| -| `INDEX TYPE POSTING` | Adaptive (default) | General purpose — trial-encodes both EF and delta per stride, picks the smaller | -| `INDEX TYPE POSTING EF` | Elias-Fano | Irregular data distributions, point queries and selective lookups | -| `INDEX TYPE POSTING DELTA` | Delta | Regular, evenly-distributed data, large sequential scans | - -**Delta encoding** stores per-key deltas between consecutive row IDs with -Frame-of-Reference bitpacking. It compresses best when row IDs for each -symbol key are evenly spaced (e.g. round-robin or time-ordered ingestion -of a fixed set of symbols) and is faster for queries that scan large -ranges of matching rows. - -**Elias-Fano (EF) encoding** uses a stride-wide flat layout with -Frame-of-Reference bitpacking across all keys in a stride. 
It compresses -better for irregular data distributions (e.g. bursty or skewed symbol -frequencies) and is faster for point queries and selective lookups. - -The **adaptive (default)** encoding trial-encodes both EF and delta modes -per stride and picks whichever produces the smaller output. This is the -best choice when you are unsure about your data distribution or have a -mixed query workload. +| Syntax | Encoding | Notes | +|--------|----------|-------| +| `INDEX TYPE POSTING` | Adaptive (default) | Trials delta + Frame-of-Reference and Elias-Fano per key per stride and keeps the smaller output | +| `INDEX TYPE POSTING EF` | Elias-Fano only | Forces Elias-Fano even when delta + FoR would be smaller — useful for benchmarking | +| `INDEX TYPE POSTING DELTA` | Delta + Frame-of-Reference only | Forces delta + FoR even when Elias-Fano would be smaller — useful for benchmarking | + +**Delta + Frame-of-Reference encoding** stores each key's row IDs as +per-key deltas, split into blocks of 64 with per-block Frame-of-Reference +bitpacking. Round-robin or periodic distributions produce constant +deltas (bitwidth 0), so this mode compresses them to near-zero. The +trade-off is a per-key block-header overhead that hurts low-cardinality +keys. + +**Elias-Fano (EF) encoding** is a classic monotonic-sequence encoding: +each key's sorted row IDs are split into low and high bit halves, with +the high half stored as a unary-coded bit array and the low half as a +fixed-width packed array. This typically produces denser output for +keys with few values per stride and avoids the block-header overhead. + +The **adaptive (default)** encoding trial-encodes each key with both +delta + Frame-of-Reference and Elias-Fano per stride and picks whichever +produces the smaller output. This is the right choice for almost all +workloads — the explicit `DELTA` / `EF` variants exist mainly for +benchmarking. 
```questdb-sql -- Default adaptive encoding (recommended for most workloads) @@ -174,16 +177,19 @@ All column types except the indexed symbol column itself can be included: | Type | Compression | Notes | |------|-------------|-------| -| BOOLEAN, BYTE, GEOBYTE, DECIMAL8 | Raw copy | 1 byte per value | -| SHORT, CHAR, GEOSHORT, DECIMAL16 | Frame-of-Reference | 2 bytes uncompressed | -| INT, FLOAT, IPv4, GEOINT, DECIMAL32 | FoR (int) / ALP (float) | 4 bytes uncompressed | -| LONG, DOUBLE, TIMESTAMP, DATE, GEOLONG, DECIMAL64 | FoR / ALP / linear prediction | 8 bytes uncompressed | -| SYMBOL | Frame-of-Reference | Stored as integer key, resolved at query time | +| BOOLEAN, BYTE, GEOBYTE, DECIMAL8 | Frame-of-Reference bitpacking | ≤1 byte per value (worst case) | +| SHORT, CHAR, GEOSHORT, DECIMAL16 | Frame-of-Reference bitpacking | ≤2 bytes per value | +| INT, IPv4, GEOINT, DECIMAL32 | Frame-of-Reference bitpacking | ≤4 bytes per value | +| FLOAT | ALP (Adaptive Lossless floating-Point) | Lossless float compression | +| LONG, DATE, GEOLONG, DECIMAL64 | Frame-of-Reference bitpacking | ≤8 bytes per value | +| TIMESTAMP | Linear-prediction + Frame-of-Reference | Designed for monotonic timestamps | +| DOUBLE | ALP (Adaptive Lossless floating-Point) | Lossless float compression | +| SYMBOL | Frame-of-Reference bitpacking | Stored as integer key, resolved at query time | | UUID, DECIMAL128 | Raw copy | 16 bytes per value | | LONG256, DECIMAL256 | Raw copy | 32 bytes per value | -| VARCHAR, STRING | FSST compressed | Variable-width, typically 2-5x compression | -| BINARY | Variable-width sidecar | Stored in offset-based format | -| Arrays (DOUBLE[], INT[], etc.) | Variable-width sidecar | Stored in offset-based format | +| VARCHAR, STRING | FSST compressed (≥4 KB strides) | Typically 2–5× compression on repetitive text | +| BINARY | Length-prefixed raw bytes | Variable-width, no compression | +| Arrays (DOUBLE[], INT[], etc.) 
| Length-prefixed raw bytes | Variable-width, no compression | ### How to choose INCLUDE columns @@ -224,16 +230,17 @@ type and covered columns: SHOW COLUMNS FROM trades; ``` -| column | type | indexed | indexBlockCapacity | indexType | indexInclude | symbolCached | symbolCapacity | designated | upsertKey | -|--------|------|---------|-------------------|-----------|-------------|-------------|----------------|------------|-----------| -| timestamp | TIMESTAMP | false | 0 | | | false | 0 | true | false | -| symbol | SYMBOL | true | 256 | POSTING | exchange,price | true | 128 | false | false | -| exchange | SYMBOL | false | 0 | | | true | 128 | false | false | -| price | DOUBLE | false | 0 | | | false | 0 | false | false | -| quantity | DOUBLE | false | 0 | | | false | 0 | false | false | +| column | type | indexed | indexBlockCapacity | symbolCached | symbolCapacity | symbolTableSize | designated | upsertKey | indexType | indexInclude | +|-----------|-----------|---------|--------------------|--------------|----------------|-----------------|------------|-----------|-----------|---------------------------| +| timestamp | TIMESTAMP | false | 0 | false | 0 | 0 | true | false | | | +| symbol | SYMBOL | true | 256 | true | 256 | 0 | false | false | POSTING | exchange,price,timestamp | +| exchange | SYMBOL | false | 256 | true | 256 | 0 | false | false | | | +| price | DOUBLE | false | 0 | false | 0 | 0 | false | false | | | +| quantity | DOUBLE | false | 0 | false | 0 | 0 | false | false | | | -The `indexType` column shows `POSTING`, `BITMAP`, or is empty for -non-indexed columns. The `indexInclude` column lists covered column names. +The `indexType` column shows `POSTING`, `POSTING DELTA`, `POSTING EF`, +`BITMAP`, or is empty for non-indexed columns. The `indexInclude` column +lists covered column names — note the auto-included designated timestamp. 
### Verifying covering index usage @@ -393,19 +400,24 @@ SELECT /*+ no_index */ price FROM trades WHERE symbol = 'AAPL'; ### Storage -The posting index itself is very compact (~1 byte per indexed value). -The covering sidecar adds storage proportional to the included columns: - -- **Numeric columns** (DOUBLE, FLOAT): compressed with ALP (Adaptive - Lossless floating-Point) and Frame-of-Reference bitpacking -- **Integer columns** (INT, LONG, etc.): Frame-of-Reference bitpacking; - TIMESTAMP additionally uses linear-prediction encoding -- **Small fixed-width types** (BYTE, BOOLEAN, etc.): stored as raw copies -- **Wide fixed-width types** (UUID, LONG256, DECIMAL128/256): stored as - raw copies with a count header -- **Variable-width columns** (VARCHAR, STRING): FSST compressed in sealed - partitions, typically 2-5x smaller than raw column data -- **BINARY and arrays**: stored in an offset-based variable-width sidecar +The posting index itself is very compact (~1 byte per indexed value, vs. +~15 bytes per value for the bitmap index). The covering sidecar adds +storage proportional to the included columns: + +- **DOUBLE, FLOAT**: ALP (Adaptive Lossless floating-Point), backed by + Frame-of-Reference bitpacking with an exception list for outliers. +- **TIMESTAMP**: linear-prediction header with Frame-of-Reference residual + bitpacking — designed for monotonic timestamp data. +- **Other fixed-width integer types** (BOOLEAN, BYTE, SHORT, CHAR, INT, + LONG, DATE, IPv4, GEO\*, DECIMAL8–DECIMAL64, SYMBOL keys): + Frame-of-Reference bitpacking sized to the column's natural width, so + the worst case is the column-file byte size and typical case is much + smaller. +- **UUID, LONG256, DECIMAL128, DECIMAL256**: stored raw at full width + with a small count header. +- **VARCHAR, STRING**: FSST-compressed once a stride exceeds 4 KB of raw + data; typically 2–5× smaller than the column file. +- **BINARY and arrays**: length-prefixed raw bytes (no compression). 
### Write performance diff --git a/documentation/query/functions/meta.md b/documentation/query/functions/meta.md index d978206ab..e02f30f6b 100644 --- a/documentation/query/functions/meta.md +++ b/documentation/query/functions/meta.md @@ -356,17 +356,19 @@ Returns a `table` with the following columns: - `indexed` - if indexing is applied to this column - `indexBlockCapacity` - how many row IDs to store in a single storage block on disk (bitmap indexes only) -- `indexType` - the [index type](/docs/concepts/deep-dive/indexes/) - (`POSTING`, `POSTING DELTA`, `POSTING EF`, `BITMAP`, or empty) -- `indexInclude` - comma-separated names of columns included in a - [posting index's](/docs/concepts/deep-dive/posting-index/) covering sidecar - `symbolCached` - whether this `symbol` column is cached - `symbolCapacity` - how many distinct values this column of `symbol` type is expected to have +- `symbolTableSize` - current number of distinct values stored in this + `symbol` column's table - `designated` - if this is set as the designated timestamp column for this table - `upsertKey` - if this column is a part of UPSERT KEYS list for table [deduplication](/docs/concepts/deduplication) +- `indexType` - the [index type](/docs/concepts/deep-dive/indexes/) + (`POSTING`, `POSTING DELTA`, `POSTING EF`, `BITMAP`, or empty) +- `indexInclude` - comma-separated names of columns included in a + [posting index's](/docs/concepts/deep-dive/posting-index/) covering sidecar For more details on the meaning and use of these values, see the [CREATE TABLE](/docs/query/sql/create-table/) documentation. 
@@ -377,12 +379,12 @@ For more details on the meaning and use of these values, see the table_columns('my_table'); ``` -| column | type | indexed | indexBlockCapacity | indexType | indexInclude | symbolCached | symbolCapacity | designated | upsertKey | -| ------ | --------- | ------- | ------------------ | --------- | ------------ | ------------ | -------------- | ---------- | --------- | -| symb | SYMBOL | true | 1048576 | BITMAP | | false | 256 | false | false | -| price | DOUBLE | false | 0 | | | false | 0 | false | false | -| ts | TIMESTAMP | false | 0 | | | false | 0 | true | false | -| s | VARCHAR | false | 0 | | | false | 0 | false | false | +| column | type | indexed | indexBlockCapacity | symbolCached | symbolCapacity | symbolTableSize | designated | upsertKey | indexType | indexInclude | +| ------ | --------- | ------- | ------------------ | ------------ | -------------- | --------------- | ---------- | --------- | --------- | ------------ | +| symb | SYMBOL | true | 1048576 | false | 256 | 0 | false | false | BITMAP | | +| price | DOUBLE | false | 0 | false | 0 | 0 | false | false | | | +| ts | TIMESTAMP | false | 0 | false | 0 | 0 | true | false | | | +| s | VARCHAR | false | 0 | false | 0 | 0 | false | false | | | ```questdb-sql title="Get designated timestamp column" SELECT "column", type, designated FROM table_columns('my_table') WHERE designated = true; diff --git a/documentation/query/sql/show.md b/documentation/query/sql/show.md index c1d59d0ae..77d432c1f 100644 --- a/documentation/query/sql/show.md +++ b/documentation/query/sql/show.md @@ -71,18 +71,19 @@ SHOW TABLES; SHOW COLUMNS FROM trades; ``` -| column | type | indexed | indexBlockCapacity | indexType | indexInclude | symbolCached | symbolCapacity | symbolTableSize | designated | upsertKey | -| --------- | --------- | ------- | ------------------ | --------- | ------------ | ------------ | -------------- | --------------- | ---------- | --------- | -| symbol | SYMBOL | false | 0 | | | true | 
256 | 42 | false | false | -| side | SYMBOL | false | 0 | | | true | 256 | 2 | false | false | -| price | DOUBLE | false | 0 | | | false | 0 | 0 | false | false | -| amount | DOUBLE | false | 0 | | | false | 0 | 0 | false | false | -| timestamp | TIMESTAMP | false | 0 | | | false | 0 | 0 | true | false | - -The `indexType` column shows the index type (`POSTING`, `BITMAP`, or empty for -non-indexed columns). The `indexInclude` column lists the names of columns -included in a [posting index's](/docs/concepts/deep-dive/posting-index/) -covering sidecar, as a comma-separated string. +| column | type | indexed | indexBlockCapacity | symbolCached | symbolCapacity | symbolTableSize | designated | upsertKey | indexType | indexInclude | +| --------- | --------- | ------- | ------------------ | ------------ | -------------- | --------------- | ---------- | --------- | --------- | ------------ | +| symbol | SYMBOL | false | 0 | true | 256 | 42 | false | false | | | +| side | SYMBOL | false | 0 | true | 256 | 2 | false | false | | | +| price | DOUBLE | false | 0 | false | 0 | 0 | false | false | | | +| amount | DOUBLE | false | 0 | false | 0 | 0 | false | false | | | +| timestamp | TIMESTAMP | false | 0 | false | 0 | 0 | true | false | | | + +The `indexType` column shows the index type (`POSTING`, `POSTING DELTA`, +`POSTING EF`, `BITMAP`, or empty for non-indexed columns). The +`indexInclude` column lists the names of columns included in a +[posting index's](/docs/concepts/deep-dive/posting-index/) covering +sidecar, as a comma-separated string. 
### SHOW CREATE TABLE From e18de5694d2912c62e3c4e0797432b929cb33be2 Mon Sep 17 00:00:00 2001 From: Nick Woolmer <29717167+nwoolmer@users.noreply.github.com> Date: Tue, 5 May 2026 08:51:25 +0100 Subject: [PATCH 08/19] Drop speculative data-distribution claims from encoding example block MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The table above the example block was already corrected to drop the unverified "irregular data, point queries" / "regular data, large scans" claims about EF and DELTA. Update the example block's inline comments to match — both explicit variants are positioned as benchmarking-only. Co-Authored-By: Claude Opus 4.7 (1M context) --- documentation/concepts/deep-dive/posting-index.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/documentation/concepts/deep-dive/posting-index.md b/documentation/concepts/deep-dive/posting-index.md index 4d44e1246..539ba4077 100644 --- a/documentation/concepts/deep-dive/posting-index.md +++ b/documentation/concepts/deep-dive/posting-index.md @@ -143,11 +143,11 @@ benchmarking. CREATE TABLE t1 (ts TIMESTAMP, s SYMBOL INDEX TYPE POSTING) TIMESTAMP(ts) PARTITION BY DAY WAL; --- EF encoding (irregular data, point queries) +-- Force Elias-Fano only (benchmarking) CREATE TABLE t2 (ts TIMESTAMP, s SYMBOL INDEX TYPE POSTING EF) TIMESTAMP(ts) PARTITION BY DAY WAL; --- Delta-only encoding (regular data, large scans) +-- Force delta + Frame-of-Reference only (benchmarking) CREATE TABLE t3 (ts TIMESTAMP, s SYMBOL INDEX TYPE POSTING DELTA) TIMESTAMP(ts) PARTITION BY DAY WAL; ``` From a3f9cfb770de39d40d7a3ffe3c94e5f05f00c838 Mon Sep 17 00:00:00 2001 From: Nick Woolmer <29717167+nwoolmer@users.noreply.github.com> Date: Tue, 5 May 2026 08:54:01 +0100 Subject: [PATCH 09/19] Clarify auto-include of timestamp on ALTER ADD INDEX alter-table-alter-column-add-index.md: - State explicitly that bare ALTER ... 
ADD INDEX TYPE POSTING (no INCLUDE clause) already covers timestamp + symbol queries because the designated timestamp is auto-included. - Add the EF variant alongside DELTA in the encoding-variant example. alter-mat-view-alter-column-add-index.md: - Replace the "INCLUDE not supported, use posting without INCLUDE" note with a more accurate explanation: the parser rejects an explicit INCLUDE clause on materialized views, but the view's designated timestamp is still auto-added, so the bare form produces a covering index over timestamp. Verified live via table_columns(). Co-Authored-By: Claude Opus 4.7 (1M context) --- .../alter-mat-view-alter-column-add-index.md | 8 +++++-- .../sql/alter-table-alter-column-add-index.md | 21 ++++++++++++------- 2 files changed, 19 insertions(+), 10 deletions(-) diff --git a/documentation/query/sql/alter-mat-view-alter-column-add-index.md b/documentation/query/sql/alter-mat-view-alter-column-add-index.md index d866e788b..eb98d3629 100644 --- a/documentation/query/sql/alter-mat-view-alter-column-add-index.md +++ b/documentation/query/sql/alter-mat-view-alter-column-add-index.md @@ -50,8 +50,12 @@ ALTER MATERIALIZED VIEW trades_hourly :::note -The `INCLUDE` clause for covering indexes is not supported on materialized -views. Use a posting index without `INCLUDE` for faster filtered lookups. +An explicit `INCLUDE` clause for covering indexes is not currently +accepted on materialized views — the parser rejects it. The view's +designated timestamp is still auto-added, so `INDEX TYPE POSTING` on a +view's symbol column produces a covering index over the timestamp, +which is enough to accelerate `WHERE symbol = … LATEST ON ts` and +similar timestamp-only covering queries. 
::: diff --git a/documentation/query/sql/alter-table-alter-column-add-index.md b/documentation/query/sql/alter-table-alter-column-add-index.md index 4c63811ca..2df5159f5 100644 --- a/documentation/query/sql/alter-table-alter-column-add-index.md +++ b/documentation/query/sql/alter-table-alter-column-add-index.md @@ -30,11 +30,18 @@ ALTER TABLE trades ALTER COLUMN side ADD INDEX; ALTER TABLE trades ALTER COLUMN instrument ADD INDEX TYPE POSTING; ``` -An encoding variant can be specified: +The designated timestamp is auto-included as a covered column even when +no explicit `INCLUDE` clause is given, so the bare form above already +covers `SELECT timestamp, instrument FROM trades WHERE instrument = 'X'`. + +An encoding variant can also be forced: ```questdb-sql --- Force delta-only encoding +-- Force delta + Frame-of-Reference (benchmarking) ALTER TABLE trades ALTER COLUMN instrument ADD INDEX TYPE POSTING DELTA; + +-- Force Elias-Fano (benchmarking) +ALTER TABLE trades ALTER COLUMN instrument ADD INDEX TYPE POSTING EF; ``` ### Adding a posting index with covering columns @@ -47,12 +54,10 @@ ALTER TABLE trades ALTER COLUMN symbol ADD INDEX TYPE POSTING INCLUDE (price, quantity); ``` -The designated timestamp column is automatically included in the covering -index — you do not need to list it explicitly. - -After this, queries that only select columns from the `INCLUDE` list (plus the -indexed symbol column and designated timestamp) are served from the index -sidecar: +The designated timestamp is appended to the `INCLUDE` list automatically. 
+After this, queries that only select columns from the `INCLUDE` list (plus +the indexed symbol column and designated timestamp) are served from the +index sidecar: ```questdb-sql -- This query reads from the index sidecar, not from column files From 3b0df234673bd5f79428d12b46f4e771644f89fd Mon Sep 17 00:00:00 2001 From: Nick Woolmer <29717167+nwoolmer@users.noreply.github.com> Date: Tue, 5 May 2026 08:58:33 +0100 Subject: [PATCH 10/19] Distinguish filtered vs unfiltered LATEST ON in bitmap/posting comparison Live verification: bitmap uses LatestByAllIndexed for unfiltered LATEST ON (index-accelerated), while posting falls back to LatestByDeferredListValuesFiltered for the unfiltered case. The previous "LATEST ON | Yes | Yes" row hid this difference. Split into two rows so readers see that bitmap retains the edge for unfiltered LATEST ON, while posting wins on filtered LATEST ON via the covering path. Co-Authored-By: Claude Opus 4.7 (1M context) --- documentation/concepts/deep-dive/posting-index.md | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/documentation/concepts/deep-dive/posting-index.md b/documentation/concepts/deep-dive/posting-index.md index 539ba4077..ee76081e4 100644 --- a/documentation/concepts/deep-dive/posting-index.md +++ b/documentation/concepts/deep-dive/posting-index.md @@ -300,7 +300,8 @@ doesn't filter on the indexed symbol. 
| Covering index (INCLUDE) | No | Yes | | DISTINCT acceleration | No | Yes | | Write overhead | Low | Low (without INCLUDE), moderate with INCLUDE | -| LATEST ON optimization | Yes | Yes | +| Filtered LATEST ON | Yes | Yes (covering) | +| Unfiltered LATEST ON | Yes (`LatestByAllIndexed`) | Falls back to deferred-list scan | | `CAPACITY` clause | Yes | No (parse error) | | Syntax | `INDEX` or `INDEX TYPE BITMAP` | `INDEX TYPE POSTING` | From 83f84f5e4ba6ccd80580485698dea727a4524522 Mon Sep 17 00:00:00 2001 From: Nick Woolmer <29717167+nwoolmer@users.noreply.github.com> Date: Tue, 5 May 2026 09:01:44 +0100 Subject: [PATCH 11/19] Tighten .pci description and COUNT example comment posting-index.md: - .pci was described as "per-column header" but it's a single index-level header listing all covered columns by writer index (PCI1 magic, count, writerIndex array). Reword accordingly. - COUNT example comment said "uses index" but the actual plan is Count over CoveringIndex with no column data read. Make the comment describe what the plan node actually says.
Co-Authored-By: Claude Opus 4.7 (1M context) --- documentation/concepts/deep-dive/posting-index.md | 13 +++++++------ 1 file changed, 7 insertions(+), 6 deletions(-) diff --git a/documentation/concepts/deep-dive/posting-index.md b/documentation/concepts/deep-dive/posting-index.md index ee76081e4..cb4580d07 100644 --- a/documentation/concepts/deep-dive/posting-index.md +++ b/documentation/concepts/deep-dive/posting-index.md @@ -361,7 +361,7 @@ SELECT DISTINCT symbol FROM trades WHERE timestamp > '2024-01-01'; ### COUNT queries ```questdb-sql --- Uses index to scan only matching rows instead of full table +-- Plan: Count over CoveringIndex, no column data read SELECT COUNT(*) FROM trades WHERE symbol = 'AAPL'; ``` @@ -455,11 +455,12 @@ The posting index stores data in three file types per partition: Frame-of-Reference bitpacking or Elias-Fano (depending on the index's encoding variant), organised into stride-indexed generations. - **`.pci` + `.pc0`, `.pc1`, …** — Sidecar files: covered column values - stored alongside the posting list. `.pci` holds the per-column header - (including the `coverCount`); each `.pcN` (with txn-segment suffix on - disk, e.g. `s.pc0.0.0`) holds the encoded data for one `INCLUDE` - column. The auto-included designated timestamp counts as one of the - covered columns and gets its own `.pcN` file. + stored alongside the posting list. The single `.pci` header lists the + covered columns by writer index (`PCI1` magic, plus the `coverCount` + used by readers to size their sidecar mappings). Each `.pcN` (with + txn-segment suffix on disk, e.g. `s.pc0.0.0`) holds the encoded data + for one `INCLUDE` column. The auto-included designated timestamp + counts as one of the covered columns and gets its own `.pcN` file. 
### Generations and sealing From 12abb60d69a44e0d3ed6a86415fb0a40076d7eb7 Mon Sep 17 00:00:00 2001 From: Nick Woolmer <29717167+nwoolmer@users.noreply.github.com> Date: Tue, 5 May 2026 09:53:24 +0100 Subject: [PATCH 12/19] Reflect auto-included timestamp in SHOW CREATE TABLE posting example Co-Authored-By: Claude Opus 4.7 (1M context) --- documentation/query/sql/show.md | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-) diff --git a/documentation/query/sql/show.md b/documentation/query/sql/show.md index 77d432c1f..b71252116 100644 --- a/documentation/query/sql/show.md +++ b/documentation/query/sql/show.md @@ -111,12 +111,15 @@ WITH maxUncommittedRows=500000, o3MaxLag=600000000us; #### Posting index with covering columns When a symbol column has a posting index with `INCLUDE`, the DDL reflects -the index type and covered columns: +the index type and covered columns. The designated timestamp is appended +to the `INCLUDE` list automatically, so a table created with +`INCLUDE (price, exchange)` round-trips as +`INCLUDE (price, exchange, timestamp)`: ```questdb-sql CREATE TABLE trades ( - symbol SYMBOL CAPACITY 128 CACHE INDEX TYPE POSTING INCLUDE (price, exchange), - exchange SYMBOL CAPACITY 128 CACHE, + symbol SYMBOL CAPACITY 256 CACHE INDEX TYPE POSTING INCLUDE (price, exchange, timestamp), + exchange SYMBOL CAPACITY 256 CACHE, price DOUBLE, amount DOUBLE, timestamp TIMESTAMP From 5d68b3a4fb720fc683348f7994eac142524f28d6 Mon Sep 17 00:00:00 2001 From: Nick Woolmer <29717167+nwoolmer@users.noreply.github.com> Date: Fri, 15 May 2026 14:39:21 +0100 Subject: [PATCH 13/19] Address @javier review comments on posting/covering index docs - alter-table-add-index: split syntax into bitmap + posting blocks with CAPACITY, INCLUDE, and DELTA/EF variants. - alter-mat-view-add-index: same syntax split, surface the INCLUDE-on-matview rejection inline, add DELTA/EF to syntax and parameters.
- create-table column indexes: drop the leftover indexDef wording, replace the bitmap-only top syntax block with inline + out-of-line forms for both bitmap and posting (covering inline-only), add an inline bitmap example, link the posting-index intro and the INCLUDE clause to the deep-dive page, and add a WAL/BYPASS-WAL note. - posting-index: clarify ALTER MATERIALIZED VIEW also rejects INCLUDE, drop the redundant CAPACITY-only-for-bitmap note, collapse the duplicated SHOW COLUMNS example into a pointer to show.md and table_columns(), move the bitmap-vs-posting comparison to indexes.md, replace the full no_covering / no_index section with a pointer to sql-optimizer-hints#index-hints, and drop the non-limitation bullet about INCLUDE not being valid on bitmap. - indexes: add a "Choosing an index type" section that receives the moved comparison table and a short prose guide. Co-Authored-By: Claude Opus 4.7 (1M context) --- documentation/concepts/deep-dive/indexes.md | 21 +++++ .../concepts/deep-dive/posting-index.md | 80 ++++++------------- .../alter-mat-view-alter-column-add-index.md | 24 +++++- .../sql/alter-table-alter-column-add-index.md | 12 ++- documentation/query/sql/create-table.md | 56 ++++++++++--- 5 files changed, 122 insertions(+), 71 deletions(-) diff --git a/documentation/concepts/deep-dive/indexes.md b/documentation/concepts/deep-dive/indexes.md index 7f2b7d861..154451648 100644 --- a/documentation/concepts/deep-dive/indexes.md +++ b/documentation/concepts/deep-dive/indexes.md @@ -24,6 +24,27 @@ QuestDB supports two index types: See [Posting index and covering index](/docs/concepts/deep-dive/posting-index/) for the detailed guide on the posting index and its covering query capabilities. 
+## Choosing an index type + +| Feature | Bitmap index | Posting index | +|---------|-------------|---------------| +| Storage size | ~15 bytes/value | ~1 byte/value | +| Covering index (`INCLUDE`) | No | Yes | +| `DISTINCT` acceleration | No | Yes | +| Write overhead | Low | Low (without `INCLUDE`), moderate with `INCLUDE` | +| Filtered `LATEST ON` | Yes | Yes (covering path) | +| Unfiltered `LATEST ON` | Yes (`LatestByAllIndexed`) | Falls back to deferred-list scan | +| `CAPACITY` clause | Yes | No (parse error) | +| Syntax | `INDEX` or `INDEX TYPE BITMAP` | `INDEX TYPE POSTING` | + +Use the **bitmap index** when you want a low-overhead general-purpose +index, or when your hottest query shape is unfiltered `LATEST ON … +PARTITION BY sym` (bitmap retains the edge there). + +Use the **posting index** when reads dominate writes, queries are +selective on the indexed symbol, and you can list the columns you +typically select alongside the symbol in `INCLUDE` for covering reads. + ## Index creation and deletion The following are ways to index a `symbol` column: diff --git a/documentation/concepts/deep-dive/posting-index.md b/documentation/concepts/deep-dive/posting-index.md index cb4580d07..932deafe6 100644 --- a/documentation/concepts/deep-dive/posting-index.md +++ b/documentation/concepts/deep-dive/posting-index.md @@ -91,9 +91,10 @@ sym = 'X'`. The expanded list is what `SHOW CREATE TABLE` round-trips, so :::note -The `INCLUDE` clause is only supported with inline column syntax and -`ALTER TABLE`. The out-of-line `INDEX(col TYPE POSTING)` syntax does not -support `INCLUDE`. +The `INCLUDE` clause is only supported with inline column syntax in +`CREATE TABLE` and with `ALTER TABLE`. The out-of-line +`INDEX(col TYPE POSTING)` syntax and `ALTER MATERIALIZED VIEW` both reject +`INCLUDE`. 
Writing `INDEX INCLUDE (...)` (no explicit `TYPE`) is also accepted and implicitly creates a posting index — `INCLUDE` is only valid with @@ -152,13 +153,6 @@ CREATE TABLE t3 (ts TIMESTAMP, s SYMBOL INDEX TYPE POSTING DELTA) TIMESTAMP(ts) PARTITION BY DAY WAL; ``` -:::note - -`CAPACITY` is only supported for bitmap indexes. Using `CAPACITY` with a -posting index will produce an error. - -::: - ## Covering index The covering index is the most powerful feature of the posting index. When all @@ -223,24 +217,16 @@ covering optimization and will be read from column files. ### Inspecting indexes with SHOW COLUMNS -`SHOW COLUMNS` displays index metadata for each column, including the index -type and covered columns: - -```questdb-sql -SHOW COLUMNS FROM trades; -``` +[`SHOW COLUMNS`](/docs/query/sql/show/#show-columns) reports the index type +and covered column list per column: -| column | type | indexed | indexBlockCapacity | symbolCached | symbolCapacity | symbolTableSize | designated | upsertKey | indexType | indexInclude | -|-----------|-----------|---------|--------------------|--------------|----------------|-----------------|------------|-----------|-----------|---------------------------| -| timestamp | TIMESTAMP | false | 0 | false | 0 | 0 | true | false | | | -| symbol | SYMBOL | true | 256 | true | 256 | 0 | false | false | POSTING | exchange,price,timestamp | -| exchange | SYMBOL | false | 256 | true | 256 | 0 | false | false | | | -| price | DOUBLE | false | 0 | false | 0 | 0 | false | false | | | -| quantity | DOUBLE | false | 0 | false | 0 | 0 | false | false | | | +- `indexType` is `POSTING`, `POSTING DELTA`, `POSTING EF`, `BITMAP`, or + empty for non-indexed columns. +- `indexInclude` lists covered column names — including the auto-included + designated timestamp. -The `indexType` column shows `POSTING`, `POSTING DELTA`, `POSTING EF`, -`BITMAP`, or is empty for non-indexed columns. 
The `indexInclude` column -lists covered column names — note the auto-included designated timestamp. +The [`table_columns()`](/docs/query/functions/meta/#table_columns) function +exposes the same fields programmatically. ### Verifying covering index usage @@ -294,21 +280,15 @@ doesn't filter on the indexed symbol. ## Comparison with bitmap index -| Feature | Bitmap index | Posting index | -|---------|-------------|---------------| -| Storage size | ~15 bytes/value | ~1 byte/value | -| Covering index (INCLUDE) | No | Yes | -| DISTINCT acceleration | No | Yes | -| Write overhead | Low | Low (without INCLUDE), moderate with INCLUDE | -| Filtered LATEST ON | Yes | Yes (covering) | -| Unfiltered LATEST ON | Yes (`LatestByAllIndexed`) | Falls back to deferred-list scan | -| `CAPACITY` clause | Yes | No (parse error) | -| Syntax | `INDEX` or `INDEX TYPE BITMAP` | `INDEX TYPE POSTING` | +For a side-by-side feature comparison and guidance on choosing between the +two index types, see +[Choosing an index type](/docs/concepts/deep-dive/indexes/#choosing-an-index-type) +on the indexes overview page. In end-to-end benchmarks (geomean across five workloads, sealed indexes), the posting index is roughly 13× smaller than the bitmap index and 1.3–1.5× faster on point, range, and full-scan reads. Writes are ~9% slower than the -bitmap index for the index part itself; sidecar writes add overhead +bitmap index for the index path itself; sidecar writes add overhead proportional to the number and type of `INCLUDE` columns. 
## Query patterns accelerated @@ -377,25 +357,14 @@ WHERE symbol = 'AAPL'; ## SQL optimizer hints -Two hints control index usage: +Two hints opt a query out of the covering and/or index paths for +benchmarking or troubleshooting: -### no_covering +- `no_covering` — read from column files instead of the covering sidecar +- `no_index` — disable index usage entirely (implies `no_covering`) -Forces the query to read from column files instead of the covering index -sidecar. Useful for benchmarking or when the covering path has an issue. - -```questdb-sql -SELECT /*+ no_covering */ price FROM trades WHERE symbol = 'AAPL'; -``` - -### no_index - -Completely disables index usage, falling back to a full table scan with -filter. Also implies `no_covering`. - -```questdb-sql -SELECT /*+ no_index */ price FROM trades WHERE symbol = 'AAPL'; -``` +See [Index hints](/docs/concepts/deep-dive/sql-optimizer-hints/#index-hints) +for syntax, semantics, and examples. ## Trade-offs @@ -506,9 +475,6 @@ file. Decompression is transparent to the query engine. :::warning -- `INCLUDE` is only supported for the posting index type (not bitmap). - Writing `INDEX TYPE BITMAP INCLUDE (...)` errors with - `INCLUDE is only supported for POSTING index type`. - `INCLUDE` cannot list the indexed symbol column itself. - `INCLUDE` is not supported with out-of-line `INDEX(col ...)` syntax — use inline column syntax or `ALTER TABLE` instead. diff --git a/documentation/query/sql/alter-mat-view-alter-column-add-index.md b/documentation/query/sql/alter-mat-view-alter-column-add-index.md index eb98d3629..6cf4dc44e 100644 --- a/documentation/query/sql/alter-mat-view-alter-column-add-index.md +++ b/documentation/query/sql/alter-mat-view-alter-column-add-index.md @@ -10,11 +10,26 @@ query performance for filtered lookups. 
## Syntax +Bitmap index (default): + +```questdb-sql +ALTER MATERIALIZED VIEW viewName ALTER COLUMN columnName ADD INDEX [CAPACITY n] ``` -ALTER MATERIALIZED VIEW viewName ALTER COLUMN columnName ADD INDEX [ CAPACITY n ] -ALTER MATERIALIZED VIEW viewName ALTER COLUMN columnName ADD INDEX TYPE POSTING + +[Posting index](/docs/concepts/deep-dive/posting-index/), with optional +encoding variant: + +```questdb-sql +ALTER MATERIALIZED VIEW viewName ALTER COLUMN columnName + ADD INDEX TYPE POSTING [DELTA | EF] ``` +An explicit `INCLUDE` clause is not accepted on materialized views — the +parser rejects it. The view's designated timestamp is still auto-included, +so the bare `INDEX TYPE POSTING` form produces a covering index over the +timestamp. See the [note below](#materialized-view-include-restriction) for +details. + ## Parameters | Parameter | Description | @@ -23,6 +38,7 @@ ALTER MATERIALIZED VIEW viewName ALTER COLUMN columnName ADD INDEX TYPE POSTING | `columnName` | Name of the `SYMBOL` column to index | | `CAPACITY` | Optional index capacity for bitmap indexes (advanced; use default unless you understand implications) | | `TYPE POSTING` | Use a [posting index](/docs/concepts/deep-dive/posting-index/) instead of the default bitmap index | +| `DELTA` / `EF` | Force a row-ID encoding variant — see [encoding options](/docs/concepts/deep-dive/posting-index/#encoding-options) | ## When to use @@ -48,6 +64,8 @@ ALTER MATERIALIZED VIEW trades_hourly ALTER COLUMN symbol ADD INDEX TYPE POSTING; ``` +### Materialized view INCLUDE restriction + :::note An explicit `INCLUDE` clause for covering indexes is not currently @@ -55,7 +73,7 @@ accepted on materialized views — the parser rejects it. The view's designated timestamp is still auto-added, so `INDEX TYPE POSTING` on a view's symbol column produces a covering index over the timestamp, which is enough to accelerate `WHERE symbol = … LATEST ON ts` and -similar timestamp-only covering queries. 
+similar timestamp-only covering queries against the view itself. ::: diff --git a/documentation/query/sql/alter-table-alter-column-add-index.md b/documentation/query/sql/alter-table-alter-column-add-index.md index 2df5159f5..77c71b71e 100644 --- a/documentation/query/sql/alter-table-alter-column-add-index.md +++ b/documentation/query/sql/alter-table-alter-column-add-index.md @@ -8,8 +8,18 @@ Indexes an existing [`symbol`](/docs/concepts/symbol/) column. ## Syntax +Bitmap index (default): + +```questdb-sql +ALTER TABLE tableName ALTER COLUMN columnName ADD INDEX [CAPACITY n] +``` + +[Posting index](/docs/concepts/deep-dive/posting-index/), with optional +covering columns and encoding variant: + ```questdb-sql -ALTER TABLE tableName ALTER COLUMN columnName ADD INDEX; +ALTER TABLE tableName ALTER COLUMN columnName + ADD INDEX TYPE POSTING [DELTA | EF] [INCLUDE (col, ...)] ``` Adding an [index](/docs/concepts/deep-dive/indexes/) is an atomic, non-blocking, and diff --git a/documentation/query/sql/create-table.md b/documentation/query/sql/create-table.md index a04398b27..d715f2805 100644 --- a/documentation/query/sql/create-table.md +++ b/documentation/query/sql/create-table.md @@ -633,16 +633,32 @@ CREATE TABLE test AS ( ## Column indexes -Index definitions (`indexDef`) are used to create an -[index](/docs/concepts/deep-dive/indexes/) for a table column. The referenced table column -must be of type [symbol](/docs/concepts/symbol/). +Index definitions are used to create an +[index](/docs/concepts/deep-dive/indexes/) for a table column. The +referenced column must be of type [symbol](/docs/concepts/symbol/). 
+ +Each index can be declared either **inline** (on the column itself) or +**out-of-line** (in a trailing `INDEX(...)` clause): ```questdb-sql -INDEX (columnRef [CAPACITY valueBlockSize]) +-- Bitmap (default) +columnRef SYMBOL INDEX [CAPACITY n] +INDEX (columnRef [CAPACITY n]) + +-- Posting (with optional covering and encoding variant) +columnRef SYMBOL INDEX TYPE POSTING [DELTA | EF] [INCLUDE (col, ...)] +INDEX (columnRef TYPE POSTING [DELTA | EF]) ``` +`INCLUDE` is only valid with the inline form — see +[Posting index with covering columns (INCLUDE)](#posting-index-with-covering-columns-include) +below. + ### Bitmap index (default) +Out-of-line syntax (one or more trailing `INDEX(...)` clauses after the +column list): + ```questdb-sql CREATE TABLE trades ( timestamp TIMESTAMP, @@ -652,11 +668,22 @@ CREATE TABLE trades ( ), INDEX(symbol) TIMESTAMP(timestamp); ``` +Inline syntax (declared on the column): + +```questdb-sql +CREATE TABLE trades ( + timestamp TIMESTAMP, + symbol SYMBOL INDEX, + price DOUBLE, + amount DOUBLE +) TIMESTAMP(timestamp); +``` + ### Posting index -The posting index offers better compression and read performance than the -default bitmap index. Use `INDEX TYPE POSTING` with either inline or -out-of-line syntax: +The [posting index](/docs/concepts/deep-dive/posting-index/) offers better +compression and read performance than the default bitmap index. Use +`INDEX TYPE POSTING` with either inline or out-of-line syntax: ```questdb-sql -- Inline syntax @@ -679,9 +706,10 @@ TIMESTAMP(timestamp) PARTITION BY DAY WAL; ### Posting index with covering columns (INCLUDE) -The `INCLUDE` clause stores additional column values in the index sidecar -files. Queries that only need these columns plus the indexed symbol can be -served entirely from the index, bypassing column files: +The [`INCLUDE` clause](/docs/concepts/deep-dive/posting-index/#covering-index) +stores additional column values in the index sidecar files. 
Queries that +only need these columns plus the indexed symbol can be served entirely +from the index, bypassing column files: ```questdb-sql CREATE TABLE trades ( @@ -709,6 +737,14 @@ table. ::: +:::tip + +Posting indexes (with or without `INCLUDE`) work on both WAL and `BYPASS WAL` +tables. The examples above use `WAL` because it is the recommended default, +but `BYPASS WAL` tables can declare posting indexes in exactly the same way. + +::: + See [Posting index and covering index](/docs/concepts/deep-dive/posting-index/) for a comprehensive guide including supported column types, query patterns, and performance characteristics. From dca778d353c7f7d579f397df15fd17789236b441 Mon Sep 17 00:00:00 2001 From: Nick Woolmer <29717167+nwoolmer@users.noreply.github.com> Date: Fri, 15 May 2026 14:45:02 +0100 Subject: [PATCH 14/19] Cover questdb master posting-index changes landed after the last PR push - cairo-engine: document cairo.mat.view.covering.index.enabled (default false), which disables the covering-index path for materialized view refresh queries (questdb#7065). Ad-hoc queries against the view still use covering when eligible. - alter-mat-view-add-index: add a warning under the existing INCLUDE restriction note so readers do not assume their TYPE POSTING on a view's symbol column accelerates the refresh. - cairo-engine: document cairo.posting.index.indexer.spill.bytes.max (default 256 MiB), the new indexer-phase back-pressure budget for one-shot build paths (questdb#7080). - posting-index limitations: note the new var-size INCLUDE hard-fail on ALTER ADD INDEX under tight RSS, with remediation steps. - posting-index generations/sealing: clarify that the cairo.posting.seal.gen.threshold of 16 mostly governs partition retirement; while a partition is active the WAL fast-lag path (questdb#7075) lets it accumulate up to the internal 143-generation cap before flushAllPending forces an inline seal. 
Co-Authored-By: Claude Opus 4.7 (1M context) --- .../concepts/deep-dive/posting-index.md | 28 +++++++++++++++---- documentation/configuration/cairo-engine.md | 28 +++++++++++++++++++ .../alter-mat-view-alter-column-add-index.md | 11 ++++++++ 3 files changed, 61 insertions(+), 6 deletions(-) diff --git a/documentation/concepts/deep-dive/posting-index.md b/documentation/concepts/deep-dive/posting-index.md index 932deafe6..530db4b5b 100644 --- a/documentation/concepts/deep-dive/posting-index.md +++ b/documentation/concepts/deep-dive/posting-index.md @@ -438,12 +438,19 @@ generation contains a sparse block of key→rowID mappings. Periodically, generations are **sealed** into a single dense generation with stride-indexed layout for optimal read performance. -Sealing happens automatically when the active generation count reaches a -threshold (`cairo.posting.seal.gen.threshold`, default 16) or when a -partition is closed. Sealed data is written stride-by-stride (256 keys per -stride). Within the delta + Frame-of-Reference family, the writer -trial-encodes each stride in two sub-layouts and keeps whichever produces -fewer bytes: +Sealing happens automatically in two cases. When a partition is **closed** +(retired by the next partition becoming active), it is compacted if it +carries more than `cairo.posting.seal.gen.threshold` (default 16) unsealed +generations. While a partition is still **active**, WAL fast-lag commits +append a new sparse generation to the live `.pv` rather than re-sealing, +so the active partition can accumulate up to the internal cap of 143 +generations before `flushAllPending` forces an inline seal. As a result, +the 16-generation threshold mostly governs partition retirement, and the +active partition typically only reaches it once it is closed. + +Sealed data is written stride-by-stride (256 keys per stride). 
Within the +delta + Frame-of-Reference family, the writer trial-encodes each stride in +two sub-layouts and keeps whichever produces fewer bytes: - **Delta sub-layout** — per-key delta encoding, then per-block Frame-of-Reference bitpacking. Wins when there are roughly ten or more @@ -486,5 +493,14 @@ file. Decompression is transparent to the query engine. regular page-frame scan. - `REINDEX` on WAL tables requires dropping and re-adding the index (this applies to all index types, not just posting). +- `ALTER TABLE … ADD INDEX TYPE POSTING INCLUDE (col, …)` may hard-fail + when the partition is large enough that the seal phase would need + more native memory than `RSS_MEM_LIMIT` allows **and** at least one + `INCLUDE` column is variable-width (`STRING`, `VARCHAR`, `BINARY`). + Fixed-width `INCLUDE` columns (numerics, `UUID`, `TIMESTAMP`, etc.) + stream through transparently. Workarounds: drop the variable-width + `INCLUDE` column from the index, reduce partition size, or raise + `RSS_MEM_LIMIT`. The indexer spill budget is tunable via + [`cairo.posting.index.indexer.spill.bytes.max`](/docs/configuration/cairo-engine/#cairopostingindexindexerspillbytesmax). ::: diff --git a/documentation/configuration/cairo-engine.md b/documentation/configuration/cairo-engine.md index 23a98b8e1..78948112c 100644 --- a/documentation/configuration/cairo-engine.md +++ b/documentation/configuration/cairo-engine.md @@ -278,6 +278,18 @@ Approximation of the number of rows for a single index key. Must be a power of 2. Applies to bitmap indexes only; posting indexes manage their own block layout. +### cairo.mat.view.covering.index.enabled + +- **Default**: `false` +- **Reloadable**: no + +When `false`, the SQL planner skips the covering-index path for +[materialized view](/docs/concepts/materialized-views/) refresh queries +and uses the regular plan instead. 
Set to `true` to opt the refresh +back into covering for setups where the covering path is faster (small, +highly selective filters with `INCLUDE` columns). Ad-hoc queries against +the view are unaffected and use covering when eligible. + ### cairo.parallel.index.threshold - **Default**: `100000` @@ -303,6 +315,22 @@ covering index for any [posting index](/docs/concepts/deep-dive/posting-index/), including bare `INDEX TYPE POSTING` declarations with no `INCLUDE` clause. +### cairo.posting.index.indexer.spill.bytes.max + +- **Default**: `268435456` (256 MiB) +- **Reloadable**: no + +Maximum bytes the per-key spill arena may hold during one-shot +[posting index](/docs/concepts/deep-dive/posting-index/) build paths +(`ALTER ADD INDEX`, `REINDEX`, snapshot restore, and O3 partition +rewrites on posting-indexed wide columns). When exceeded, the writer +drains pending state into a fresh sparse generation and frees the +arena. A seal-phase streaming fallback bounds peak heap to the largest +single key's row count. Set to 0 or a negative value to disable the +back-pressure entirely and recover the legacy "accumulate until seal" +behaviour. Steady-state WAL ingestion is unaffected — back-pressure +only kicks in on the one-shot build paths above. + ### cairo.posting.index.row.id.encoding - **Default**: `adaptive` diff --git a/documentation/query/sql/alter-mat-view-alter-column-add-index.md b/documentation/query/sql/alter-mat-view-alter-column-add-index.md index 6cf4dc44e..460c705e5 100644 --- a/documentation/query/sql/alter-mat-view-alter-column-add-index.md +++ b/documentation/query/sql/alter-mat-view-alter-column-add-index.md @@ -77,6 +77,17 @@ similar timestamp-only covering queries against the view itself. 
::: +:::warning + +**Covering index is disabled for view refresh by default.** Even when a +posting index on a view's symbol column produces a covering layout, the +SQL planner skips that path during the view's refresh queries unless +[`cairo.mat.view.covering.index.enabled`](/docs/configuration/cairo-engine/#cairomatviewcoveringindexenabled) +is set to `true`. Ad-hoc queries you issue against the materialized +view still use covering when eligible. + +::: + ## Behavior | Aspect | Description | From deb1b5ce778f0726c7ce778c8c9994c55106b1f3 Mon Sep 17 00:00:00 2001 From: Nick Woolmer <29717167+nwoolmer@users.noreply.github.com> Date: Fri, 15 May 2026 14:51:13 +0100 Subject: [PATCH 15/19] Tighten posting-index extras after re-reading @javier feedback patterns - alter-mat-view: split the previously-merged "INCLUDE restriction" section into two clearly-labelled headings (INCLUDE restriction; Covering index during refresh) so the new refresh-time warning sits under its own topic rather than under an INCLUDE-syntax heading. Update the inline pointer at the top of the page to link both. - posting-index sealing: drop the internal `.pv` / `flushAllPending` references and the literal 143-generation cap; lead with a two-bullet description and a one-line takeaway that the 16-threshold mostly governs partition retirement. - posting-index limitations: tighten the var-size INCLUDE hard-fail bullet to match the size of its siblings. - cairo-engine: tighten cairo.posting.index.indexer.spill.bytes.max to the level of detail of surrounding entries (drop the seal-phase streaming-fallback internal). 
Co-Authored-By: Claude Opus 4.7 (1M context) --- .../concepts/deep-dive/posting-index.md | 36 ++++++++++--------- documentation/configuration/cairo-engine.md | 17 ++++----- .../alter-mat-view-alter-column-add-index.md | 17 +++++---- 3 files changed, 36 insertions(+), 34 deletions(-) diff --git a/documentation/concepts/deep-dive/posting-index.md b/documentation/concepts/deep-dive/posting-index.md index 530db4b5b..41d8c38a7 100644 --- a/documentation/concepts/deep-dive/posting-index.md +++ b/documentation/concepts/deep-dive/posting-index.md @@ -438,15 +438,18 @@ generation contains a sparse block of key→rowID mappings. Periodically, generations are **sealed** into a single dense generation with stride-indexed layout for optimal read performance. -Sealing happens automatically in two cases. When a partition is **closed** -(retired by the next partition becoming active), it is compacted if it -carries more than `cairo.posting.seal.gen.threshold` (default 16) unsealed -generations. While a partition is still **active**, WAL fast-lag commits -append a new sparse generation to the live `.pv` rather than re-sealing, -so the active partition can accumulate up to the internal cap of 143 -generations before `flushAllPending` forces an inline seal. As a result, -the 16-generation threshold mostly governs partition retirement, and the -active partition typically only reaches it once it is closed. +Sealing happens automatically in two cases: + +- When a partition is **closed** — retired by the next partition becoming + active — it is compacted if it carries more than + `cairo.posting.seal.gen.threshold` (default 16) unsealed generations. +- While a partition is **active**, WAL fast-lag commits append a new + sparse generation in place rather than re-sealing. The active partition + can therefore carry many more generations than the threshold; an inline + seal is forced only when an internal generation cap is reached. 
+ +In practice the 16-generation threshold mostly governs partition +retirement. Sealed data is written stride-by-stride (256 keys per stride). Within the delta + Frame-of-Reference family, the writer trial-encodes each stride in @@ -493,14 +496,13 @@ file. Decompression is transparent to the query engine. regular page-frame scan. - `REINDEX` on WAL tables requires dropping and re-adding the index (this applies to all index types, not just posting). -- `ALTER TABLE … ADD INDEX TYPE POSTING INCLUDE (col, …)` may hard-fail - when the partition is large enough that the seal phase would need - more native memory than `RSS_MEM_LIMIT` allows **and** at least one - `INCLUDE` column is variable-width (`STRING`, `VARCHAR`, `BINARY`). - Fixed-width `INCLUDE` columns (numerics, `UUID`, `TIMESTAMP`, etc.) - stream through transparently. Workarounds: drop the variable-width - `INCLUDE` column from the index, reduce partition size, or raise - `RSS_MEM_LIMIT`. The indexer spill budget is tunable via +- `ALTER TABLE … ADD INDEX TYPE POSTING INCLUDE (col, …)` can fail on + very large partitions when at least one `INCLUDE` column is + variable-width (`STRING`, `VARCHAR`, `BINARY`) and the seal phase + would exceed `RSS_MEM_LIMIT`. Fixed-width `INCLUDE` columns stream + through transparently. Remedies: drop the variable-width column + from the `INCLUDE` list, reduce partition size, or raise + `RSS_MEM_LIMIT`. See [`cairo.posting.index.indexer.spill.bytes.max`](/docs/configuration/cairo-engine/#cairopostingindexindexerspillbytesmax). 
::: diff --git a/documentation/configuration/cairo-engine.md b/documentation/configuration/cairo-engine.md index 78948112c..1b30db6c0 100644 --- a/documentation/configuration/cairo-engine.md +++ b/documentation/configuration/cairo-engine.md @@ -320,16 +320,13 @@ covering index for any - **Default**: `268435456` (256 MiB) - **Reloadable**: no -Maximum bytes the per-key spill arena may hold during one-shot -[posting index](/docs/concepts/deep-dive/posting-index/) build paths -(`ALTER ADD INDEX`, `REINDEX`, snapshot restore, and O3 partition -rewrites on posting-indexed wide columns). When exceeded, the writer -drains pending state into a fresh sparse generation and frees the -arena. A seal-phase streaming fallback bounds peak heap to the largest -single key's row count. Set to 0 or a negative value to disable the -back-pressure entirely and recover the legacy "accumulate until seal" -behaviour. Steady-state WAL ingestion is unaffected — back-pressure -only kicks in on the one-shot build paths above. +Caps the per-key spill arena used by one-shot +[posting index](/docs/concepts/deep-dive/posting-index/) build paths — +`ALTER ADD INDEX`, `REINDEX`, snapshot restore, and O3 partition rewrites +on posting-indexed wide columns. When the cap is reached, the writer +drains pending state into a fresh sparse generation and continues. +Steady-state WAL ingestion is unaffected. Set to `0` or a negative value +to disable back-pressure entirely. ### cairo.posting.index.row.id.encoding diff --git a/documentation/query/sql/alter-mat-view-alter-column-add-index.md b/documentation/query/sql/alter-mat-view-alter-column-add-index.md index 460c705e5..2ba32305c 100644 --- a/documentation/query/sql/alter-mat-view-alter-column-add-index.md +++ b/documentation/query/sql/alter-mat-view-alter-column-add-index.md @@ -27,7 +27,8 @@ ALTER MATERIALIZED VIEW viewName ALTER COLUMN columnName An explicit `INCLUDE` clause is not accepted on materialized views — the parser rejects it. 
The view's designated timestamp is still auto-included, so the bare `INDEX TYPE POSTING` form produces a covering index over the -timestamp. See the [note below](#materialized-view-include-restriction) for +timestamp. See [INCLUDE restriction](#include-restriction) and +[Covering index during refresh](#covering-index-during-refresh) for details. ## Parameters @@ -64,7 +65,7 @@ ALTER MATERIALIZED VIEW trades_hourly ALTER COLUMN symbol ADD INDEX TYPE POSTING; ``` -### Materialized view INCLUDE restriction +### INCLUDE restriction :::note @@ -77,14 +78,16 @@ similar timestamp-only covering queries against the view itself. ::: +### Covering index during refresh + :::warning -**Covering index is disabled for view refresh by default.** Even when a -posting index on a view's symbol column produces a covering layout, the -SQL planner skips that path during the view's refresh queries unless +The covering-index path is disabled for view refresh queries by default +— even when the posting index on the view's symbol column produces a +covering layout, the planner skips it during refresh unless [`cairo.mat.view.covering.index.enabled`](/docs/configuration/cairo-engine/#cairomatviewcoveringindexenabled) -is set to `true`. Ad-hoc queries you issue against the materialized -view still use covering when eligible. +is set to `true`. Ad-hoc queries you issue against the view itself are +unaffected and use covering when eligible. ::: From c6c82330716264cfe18a1cbb57733058d7267b33 Mon Sep 17 00:00:00 2001 From: Nick Woolmer <29717167+nwoolmer@users.noreply.github.com> Date: Fri, 15 May 2026 15:35:21 +0100 Subject: [PATCH 16/19] Flag known async group-by / filter slowdown on the covering path The async group-by and filter code paths through the covering index are currently slower than the regular plan in some workloads. 
A follow-up release will close that gap; until then, users who see a query slow down after EXPLAIN shows it picking the covering path should opt out with /*+ no_covering */ or /*+ no_index */. - posting-index Covering index section: top-of-section caution admonition pointing readers at the optimizer-hint workaround. - sql-optimizer-hints Index hints intro: companion note so the remedy page surfaces the same context next to the no_covering / no_index entries. - alter-mat-view-add-index Covering index during refresh warning: augment the existing default-off explanation with the underlying why (same async group-by / filter perf gap), so the rationale is visible right where the operator chooses whether to flip cairo.mat.view.covering.index.enabled. Co-Authored-By: Claude Opus 4.7 (1M context) --- documentation/concepts/deep-dive/posting-index.md | 14 ++++++++++++++ .../concepts/deep-dive/sql-optimizer-hints.md | 11 +++++++++++ .../sql/alter-mat-view-alter-column-add-index.md | 7 +++++-- 3 files changed, 30 insertions(+), 2 deletions(-) diff --git a/documentation/concepts/deep-dive/posting-index.md b/documentation/concepts/deep-dive/posting-index.md index 41d8c38a7..b4f506885 100644 --- a/documentation/concepts/deep-dive/posting-index.md +++ b/documentation/concepts/deep-dive/posting-index.md @@ -165,6 +165,20 @@ columns in a query's `SELECT` list are either: the main column files entirely. This is significantly faster for selective queries on wide tables. +:::caution + +The async group-by and filter code paths through the covering index are +currently slower than the regular plan in some workloads. A follow-up +release will close this gap, and the optimizer will continue to improve +as more feedback comes in. 
+ +If you notice a query slowdown after [`EXPLAIN`](/docs/query/sql/explain/) +shows it has started picking the covering path, opt that query out with +[`/*+ no_covering */` or `/*+ no_index */`](/docs/concepts/deep-dive/sql-optimizer-hints/#index-hints) +while the optimizations land. + +::: + ### Supported column types in INCLUDE All column types except the indexed symbol column itself can be included: diff --git a/documentation/concepts/deep-dive/sql-optimizer-hints.md b/documentation/concepts/deep-dive/sql-optimizer-hints.md index 7a989a54b..33ae7f70b 100644 --- a/documentation/concepts/deep-dive/sql-optimizer-hints.md +++ b/documentation/concepts/deep-dive/sql-optimizer-hints.md @@ -366,6 +366,17 @@ your symbol set is high-cardinality. These hints control whether the query optimizer uses indexes (bitmap or posting) for symbol column lookups. +:::note + +The async group-by and filter code paths through the covering index are +currently slower than the regular plan in some workloads. A follow-up +release will close this gap. In the meantime, if +[`EXPLAIN`](/docs/query/sql/explain/) shows a query has started picking +the covering path and you observe a slowdown, apply `no_covering` (or +`no_index` to disable indexing entirely) on the affected query. + +::: + ### no_covering Disables the [covering index](/docs/concepts/deep-dive/posting-index/) diff --git a/documentation/query/sql/alter-mat-view-alter-column-add-index.md b/documentation/query/sql/alter-mat-view-alter-column-add-index.md index 2ba32305c..979e9d466 100644 --- a/documentation/query/sql/alter-mat-view-alter-column-add-index.md +++ b/documentation/query/sql/alter-mat-view-alter-column-add-index.md @@ -83,8 +83,11 @@ similar timestamp-only covering queries against the view itself. 
:::warning The covering-index path is disabled for view refresh queries by default -— even when the posting index on the view's symbol column produces a -covering layout, the planner skips it during refresh unless +because the async group-by and filter paths through the covering index +are currently slower than the regular plan in some workloads — opting +the refresh out preserves predictable refresh latency until that gap is +closed. Even when the posting index on the view's symbol column +produces a covering layout, the planner skips it during refresh unless [`cairo.mat.view.covering.index.enabled`](/docs/configuration/cairo-engine/#cairomatviewcoveringindexenabled) is set to `true`. Ad-hoc queries you issue against the view itself are unaffected and use covering when eligible. From 8dc3e1477f4f48530f044d04c361797930799793 Mon Sep 17 00:00:00 2001 From: Nick Woolmer <29717167+nwoolmer@users.noreply.github.com> Date: Fri, 15 May 2026 15:41:13 +0100 Subject: [PATCH 17/19] Add posting/INCLUDE options to top-of-page create-table syntax blocks The main schema syntax block had no INDEX clause at all, and the CREATE TABLE AS SELECT block only showed the bitmap form. The inline SYMBOL type definition also only documented bitmap. Update all three so the top-of-page syntax matches the per-variant examples added in the Column indexes section. - Main schema syntax: add [, INDEX (columnRef [CAPACITY n | TYPE POSTING [DELTA | EF]]) ...] with a pointer to Type definition and Column indexes; mention that inline indexes (including INCLUDE) live on the column itself in columnTypeDef. - CREATE TABLE AS SELECT: expand the existing INDEX line to cover the posting variants too. - SYMBOL columnTypeDef: expand the inline [INDEX ...] suffix to the full grammar including TYPE POSTING [DELTA | EF] [INCLUDE (col, ...)]. 
Co-Authored-By: Claude Opus 4.7 (1M context) --- documentation/query/sql/create-table.md | 10 ++++++++-- 1 file changed, 8 insertions(+), 2 deletions(-) diff --git a/documentation/query/sql/create-table.md b/documentation/query/sql/create-table.md index d715f2805..22048cb33 100644 --- a/documentation/query/sql/create-table.md +++ b/documentation/query/sql/create-table.md @@ -34,6 +34,7 @@ The first two modes accept the same set of optional clauses: CREATE [ATOMIC | BATCH n [o3MaxLag value]] TABLE [IF NOT EXISTS] tableName (columnName columnTypeDef [, columnName columnTypeDef ...]) -- see Type definition + [, INDEX (columnRef [CAPACITY n | TYPE POSTING [DELTA | EF]]) ...] -- see Column indexes [TIMESTAMP (columnName) [PARTITION BY { NONE | YEAR | MONTH | DAY | HOUR } [BYPASS WAL | WAL] @@ -45,12 +46,16 @@ TABLE [IF NOT EXISTS] tableName [OWNED BY ownerName]; ``` +Inline indexes (including covering indexes with `INCLUDE`) are declared +on the column itself in `columnTypeDef` — see [Type definition](#type-definition) +and [Column indexes](#column-indexes). + ```questdb-sql title="Create from a query (CREATE TABLE AS SELECT)" CREATE [ATOMIC | BATCH n [o3MaxLag value]] TABLE [IF NOT EXISTS] tableName AS (selectQuery) [, cast(columnRef AS columnTypeDef) ...] -- see Type definition - [, INDEX (columnRef [CAPACITY n]) ...] + [, INDEX (columnRef [CAPACITY n | TYPE POSTING [DELTA | EF]]) ...] -- see Column indexes [TIMESTAMP (columnName) [PARTITION BY { NONE | YEAR | MONTH | DAY | HOUR } [BYPASS WAL | WAL] @@ -427,7 +432,8 @@ columnTypeDef ::= | DOUBLE[][]... 
-- array: one [] pair per dimension | GEOHASH() | SYMBOL [CAPACITY distinctValueEstimate] [CACHE | NOCACHE] - [INDEX [CAPACITY valueBlockSize]] + [INDEX [ CAPACITY valueBlockSize + | TYPE POSTING [DELTA | EF] [INCLUDE (col, ...)] ]] -- Simple types | BINARY | BOOLEAN | BYTE | CHAR | DATE | DOUBLE | FLOAT | INT | IPV4 | LONG | LONG256 | SHORT | STRING From 46154b37e154f8745506cd1753e56f0d2655eb2d Mon Sep 17 00:00:00 2001 From: Nick Woolmer <29717167+nwoolmer@users.noreply.github.com> Date: Fri, 15 May 2026 15:45:57 +0100 Subject: [PATCH 18/19] Clarify timestamp auto-include is a no-op on tables without a designated timestamp MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The existing tip explained that the designated timestamp gets folded into the covering index for free, but did not say what happens when the table has no designated timestamp at all. Some BYPASS WAL tables omit the TIMESTAMP(...) clause, and readers might infer from the tip that posting + INCLUDE only works on tables that do have one. Add a sentence stating that posting indexes and INCLUDE work normally on such tables — the auto-include just becomes a no-op and the covered list contains only the explicitly listed columns. Co-Authored-By: Claude Opus 4.7 (1M context) --- documentation/concepts/deep-dive/posting-index.md | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/documentation/concepts/deep-dive/posting-index.md b/documentation/concepts/deep-dive/posting-index.md index b4f506885..7a7cabdb8 100644 --- a/documentation/concepts/deep-dive/posting-index.md +++ b/documentation/concepts/deep-dive/posting-index.md @@ -87,6 +87,12 @@ sym = 'X'`. The expanded list is what `SHOW CREATE TABLE` round-trips, so `cairo.posting.index.auto.include.timestamp` server property (default `true`). +This auto-include applies only when the table has a designated timestamp. 
+Tables without one (typically `BYPASS WAL` tables that omit the +`TIMESTAMP(...)` clause) can still use posting indexes and `INCLUDE` +normally — the auto-include simply becomes a no-op and the covered +list contains only the columns you list. + ::: :::note From 62b3d0986d0f92003ff3e88594b68acb701589e6 Mon Sep 17 00:00:00 2001 From: Nick Woolmer <29717167+nwoolmer@users.noreply.github.com> Date: Fri, 15 May 2026 15:55:16 +0100 Subject: [PATCH 19/19] Correct timestamp auto-include semantics: INCLUDE-only, no bare-form covering MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Earlier commits on this branch claimed that the designated timestamp is auto-included for any posting index, including bare INDEX TYPE POSTING declarations with no INCLUDE clause — that claim was wrong. A bare INDEX TYPE POSTING produces a posting-style index for fast symbol filtering only; it has no covering layer at all. The cairo.posting.index.auto.include.timestamp behavior only kicks in when the user already supplied an INCLUDE clause, appending the designated timestamp to that list if it was omitted. - cairo-engine: rewrite the cairo.posting.index.auto.include.timestamp description to make the INCLUDE-only scope explicit. - posting-index Covering with INCLUDE tip: drop the "bare INDEX TYPE POSTING already covers SELECT timestamp, sym" claim; reframe the tip around appending into a user-supplied INCLUDE list. Keep the no-designated-timestamp clarification, now wired to the corrected semantics. - alter-table ADD INDEX TYPE POSTING example: replace the "bare form already covers timestamp + symbol" paragraph with a direct statement that the bare form has no covering layer and a pointer to the INCLUDE variant. - alter-mat-view INCLUDE restriction note: matviews reject INCLUDE, so ADD INDEX TYPE POSTING on a view's symbol gives fast filtering only, not covering. Direct readers to the base-table option if they want a covering layout. 
- alter-mat-view Covering index during refresh warning: reframe as base-table covering being skipped during refresh, not view-level covering (the view never has one). Co-Authored-By: Claude Opus 4.7 (1M context) --- .../concepts/deep-dive/posting-index.md | 24 ++++++----- documentation/configuration/cairo-engine.md | 9 +++-- .../alter-mat-view-alter-column-add-index.md | 40 +++++++++++-------- .../sql/alter-table-alter-column-add-index.md | 7 ++-- 4 files changed, 45 insertions(+), 35 deletions(-) diff --git a/documentation/concepts/deep-dive/posting-index.md b/documentation/concepts/deep-dive/posting-index.md index 7a7cabdb8..4bc9ecc89 100644 --- a/documentation/concepts/deep-dive/posting-index.md +++ b/documentation/concepts/deep-dive/posting-index.md @@ -78,20 +78,22 @@ can be served entirely from the index. :::tip -The designated timestamp column is automatically included in the covering -index — even when no explicit `INCLUDE` clause is given. So a bare -`INDEX TYPE POSTING` already covers `SELECT timestamp, sym FROM t WHERE -sym = 'X'`. The expanded list is what `SHOW CREATE TABLE` round-trips, so -`INCLUDE (exchange, price)` renders back as -`INCLUDE (exchange, price, timestamp)` after creation. Controlled by the +When you supply an `INCLUDE` clause, the designated timestamp is +automatically appended to it if you did not already list it — you do +not need to type it explicitly. `INCLUDE (exchange, price)` renders +back as `INCLUDE (exchange, price, timestamp)` in `SHOW CREATE TABLE` +after creation. Controlled by the `cairo.posting.index.auto.include.timestamp` server property (default `true`). -This auto-include applies only when the table has a designated timestamp. -Tables without one (typically `BYPASS WAL` tables that omit the -`TIMESTAMP(...)` clause) can still use posting indexes and `INCLUDE` -normally — the auto-include simply becomes a no-op and the covered -list contains only the columns you list. 
+This auto-append only applies when an `INCLUDE` clause is given **and** +the table has a designated timestamp. A bare `INDEX TYPE POSTING` +(no `INCLUDE`) has no covering layer at all — `SELECT timestamp, sym +FROM t WHERE sym = 'X'` reads the timestamp from the column file in +that case. Tables without a designated timestamp (typically `BYPASS WAL` +tables that omit the `TIMESTAMP(...)` clause) still work normally with +posting indexes and `INCLUDE`; the auto-append simply has nothing to +add. ::: diff --git a/documentation/configuration/cairo-engine.md b/documentation/configuration/cairo-engine.md index 1b30db6c0..7112f46b6 100644 --- a/documentation/configuration/cairo-engine.md +++ b/documentation/configuration/cairo-engine.md @@ -310,10 +310,11 @@ Enables parallel indexation. Works in conjunction with - **Default**: `true` - **Reloadable**: no -When `true`, the designated timestamp column is automatically added to the -covering index for any -[posting index](/docs/concepts/deep-dive/posting-index/), including bare -`INDEX TYPE POSTING` declarations with no `INCLUDE` clause. +When `true` and the user supplies an `INCLUDE` clause on a +[posting index](/docs/concepts/deep-dive/posting-index/), the designated +timestamp is automatically appended to the `INCLUDE` list if not already +present. Has no effect on bare `INDEX TYPE POSTING` declarations — those +have no covering layer regardless of this setting. ### cairo.posting.index.indexer.spill.bytes.max diff --git a/documentation/query/sql/alter-mat-view-alter-column-add-index.md b/documentation/query/sql/alter-mat-view-alter-column-add-index.md index 979e9d466..6d4171f9a 100644 --- a/documentation/query/sql/alter-mat-view-alter-column-add-index.md +++ b/documentation/query/sql/alter-mat-view-alter-column-add-index.md @@ -25,9 +25,11 @@ ALTER MATERIALIZED VIEW viewName ALTER COLUMN columnName ``` An explicit `INCLUDE` clause is not accepted on materialized views — the -parser rejects it. 
The view's designated timestamp is still auto-included, -so the bare `INDEX TYPE POSTING` form produces a covering index over the -timestamp. See [INCLUDE restriction](#include-restriction) and +parser rejects it. `ADD INDEX TYPE POSTING` on a view's symbol column +therefore creates a posting index for fast symbol filtering, but the +view itself does not get a covering layer; selecting columns beyond the +indexed symbol still reads from the view's column files. See +[INCLUDE restriction](#include-restriction) and [Covering index during refresh](#covering-index-during-refresh) for details. @@ -69,12 +71,15 @@ ALTER MATERIALIZED VIEW trades_hourly :::note -An explicit `INCLUDE` clause for covering indexes is not currently -accepted on materialized views — the parser rejects it. The view's -designated timestamp is still auto-added, so `INDEX TYPE POSTING` on a -view's symbol column produces a covering index over the timestamp, -which is enough to accelerate `WHERE symbol = … LATEST ON ts` and -similar timestamp-only covering queries against the view itself. +An explicit `INCLUDE` clause is not currently accepted on materialized +views — the parser rejects it. Without `INCLUDE`, an `INDEX TYPE POSTING` +on a view's symbol column gives you fast filtering on that symbol but +**no covering layer**: any query that selects columns beyond the indexed +symbol still reads them from the view's column files. + +If a covering layout is what you want, build the covering posting index +on the **base table** instead and let the view inherit the acceleration +during refresh (subject to the gate described below). ::: @@ -82,15 +87,16 @@ similar timestamp-only covering queries against the view itself. 
:::warning -The covering-index path is disabled for view refresh queries by default -because the async group-by and filter paths through the covering index -are currently slower than the regular plan in some workloads — opting -the refresh out preserves predictable refresh latency until that gap is -closed. Even when the posting index on the view's symbol column -produces a covering layout, the planner skips it during refresh unless +If the *base table* feeding this view has a covering posting index +(declared with an `INCLUDE` clause), the SQL planner skips the covering +path during view refresh by default because the async group-by and +filter paths through the covering index are currently slower than the +regular plan in some workloads — opting the refresh out preserves +predictable refresh latency until that gap is closed. To re-enable +covering for refresh queries, set [`cairo.mat.view.covering.index.enabled`](/docs/configuration/cairo-engine/#cairomatviewcoveringindexenabled) -is set to `true`. Ad-hoc queries you issue against the view itself are -unaffected and use covering when eligible. +to `true`. Ad-hoc queries you issue against base tables that have +covering indexes are not affected by this flag. ::: diff --git a/documentation/query/sql/alter-table-alter-column-add-index.md b/documentation/query/sql/alter-table-alter-column-add-index.md index 77c71b71e..c55bbb641 100644 --- a/documentation/query/sql/alter-table-alter-column-add-index.md +++ b/documentation/query/sql/alter-table-alter-column-add-index.md @@ -40,9 +40,10 @@ ALTER TABLE trades ALTER COLUMN side ADD INDEX; ALTER TABLE trades ALTER COLUMN instrument ADD INDEX TYPE POSTING; ``` -The designated timestamp is auto-included as a covered column even when -no explicit `INCLUDE` clause is given, so the bare form above already -covers `SELECT timestamp, instrument FROM trades WHERE instrument = 'X'`. 
+The bare form has no covering layer — queries selecting columns other +than `instrument` still read from the column files. Add an +[`INCLUDE` clause](#adding-a-posting-index-with-covering-columns) to +build a covering index. An encoding variant can also be forced: