Skip to content

Support commit ts in TiFlash#10716

Merged
ti-chi-bot[bot] merged 7 commits intopingcap:feature/release-8.5-materialized-viewfrom
xzhangxian1008:commit-ts
Feb 14, 2026
Merged

Support commit ts in TiFlash#10716
ti-chi-bot[bot] merged 7 commits intopingcap:feature/release-8.5-materialized-viewfrom
xzhangxian1008:commit-ts

Conversation

@xzhangxian1008
Copy link
Contributor

@xzhangxian1008 xzhangxian1008 commented Feb 11, 2026

What problem does this PR solve?

Issue Number: close #10715

Problem Summary:

What is changed and how it works?


Check List

Tests

  • Unit test
  • Integration test
  • Manual test (add detailed scripts or steps below)
  • No code

Side effects

  • Performance regression: Consumes more CPU
  • Performance regression: Consumes more Memory
  • Breaking backward compatibility

Documentation

  • Affects user behaviors
  • Contains syntax changes
  • Contains variable changes
  • Contains experimental features
  • Changes MySQL compatibility

Release note

None

Summary by CodeRabbit

  • New Features
    • Treat the extra hidden commit-timestamp column as a version column across schema generation, remote requests, table-scan projection and filter resolution.
  • Bug Fixes
    • Ensure correct casting of the commit-timestamp column to the expected compute-layer type after table-scan and downstream projection.
  • Tests
    • Added unit tests covering aliasing, casting, and filter handling for the hidden commit-timestamp column.
  • Chores
    • Minor refactor to streamline projection logic.

@ti-chi-bot ti-chi-bot bot added release-note-none Denotes a PR that doesn't merit a release note. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Feb 11, 2026
@coderabbitai
Copy link

coderabbitai bot commented Feb 11, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

Adds support for a hidden commit timestamp column by introducing ExtraCommitTSColumnID = -5 and wiring its aliasing, schema emission, and casting across coprocessor, storage filter, and execution paths.

Changes

Cohort / File(s) Summary
Column ID Definition
dbms/src/Storages/KVStore/Types.h
Add static constexpr ColumnID ExtraCommitTSColumnID = -5 as a reserved column ID.
Coprocessor: schema & naming
dbms/src/Flash/Coprocessor/DAGStorageInterpreter.cpp, dbms/src/Flash/Coprocessor/GenSchemaAndColumn.cpp, dbms/src/Flash/Coprocessor/RemoteRequest.cpp
Map ExtraCommitTSColumnID to the internal version column (MutableSupport::version_column_name / VersionColumnID) and emit a version-column ColumnDefine or schema entry for the hidden commit_ts.
Coprocessor: casting logic
dbms/src/Flash/Coprocessor/DAGExpressionAnalyzer.cpp
Add post-table-scan cast handling for ExtraCommitTSColumnID: resolve computing-layer type, append cast when types differ, update source column name/type and mark cast applied.
Execution planning
dbms/src/Flash/Planner/Plans/PhysicalTableScan.cpp
Refactor to compute a local sample_block and pass it to buildTableScanProjectionCols in both DAGPipeline and PipelineExecGroupBuilder flows (no behavior change).
Storage: filter & attribute resolution
dbms/src/Storages/DeltaMerge/Filter/PushDownFilter.cpp, dbms/src/Storages/DeltaMerge/Filter/RSOperator.cpp, dbms/src/Storages/DeltaMerge/FilterParser/FilterParser.cpp
Alias ExtraCommitTSColumnIDVersionColumnID when building push-down filters, resolving operator attributes, and parsing filters so TiDB hidden commit_ts is normalized to the internal version column.
Tests
dbms/src/Storages/tests/gtest_hidden_commit_ts_column.cpp
Add tests exercising aliasing and casting for the hidden commit_ts across PushDownFilter, post-table-scan casting, and rough-set operator behavior.

Sequence Diagram(s)

sequenceDiagram
    participant Client as Query Request
    participant Coproc as Coprocessor
    participant Storage as Storage/Filter
    participant Exec as Executor

    Client->>Coproc: Request TableScan with ExtraCommitTSColumnID (-5)
    Coproc->>Coproc: Map -5 → VersionColumnID / version column name
    Coproc->>Coproc: Emit schema entry for version column
    Coproc->>Coproc: DAGExpressionAnalyzer adds cast to computing-layer type if needed
    Coproc->>Storage: Send schema and filters (version column represented)
    Storage->>Storage: PushDownFilter / RSOperator alias -5 → VersionColumnID
    Storage->>Exec: Return resolved column definitions
    Exec->>Exec: PhysicalTableScan applies projection (uses sample_block)
    Exec-->>Client: Return rows including commit_ts column
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Suggested labels

release-note

Suggested reviewers

  • kolafish
  • JinheLin
  • CalvinNeo

Poem

🐰 A tiny timestamp I did spy,
From -5 mapped neat and shy.
Cast and named, I hop the plan,
From storage up to execution span.
🥕

🚥 Pre-merge checks | ✅ 3 | ❌ 2
❌ Failed checks (2 warnings)
Check name Status Explanation Resolution
Description check ⚠️ Warning The PR description is largely incomplete. While it references issue #10715, it lacks a problem summary, implementation details (the commit-message section is empty), and does not check the required Unit test or Integration test boxes, only Manual test. Complete the 'Problem Summary' section, fill in the 'commit-message' section with implementation details, and either check Unit test or Integration test boxes, or document why tests were not feasible.
Docstring Coverage ⚠️ Warning Docstring coverage is 7.14% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (3 passed)
Check name Status Explanation
Title check ✅ Passed The title 'Support commit ts in TiFlash' clearly and concisely summarizes the main objective of the PR, which is to add commit timestamp support to TiFlash.
Linked Issues check ✅ Passed The code changes successfully implement commit timestamp support across multiple components (column mapping, casting, filtering, and schema definitions), directly addressing the objective to support commit ts in TiFlash.
Out of Scope Changes check ✅ Passed All code changes are directly related to supporting commit timestamp handling in TiFlash. Changes span column ID definition, schema generation, type casting, filtering, and testing, all aligned with the stated objective.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In `@dbms/src/Flash/Planner/Plans/PhysicalTableScan.cpp`:
- Line 205: Replace the use of the member sample_block when constructing the
DAGExpressionAnalyzer with the local header from the group_builder so the
analyzer uses storage-side column types; specifically, in PhysicalTableScan.cpp
change the DAGExpressionAnalyzer creation to pass header (the
group_builder/header variable) instead of sample_block, mirroring the
DAGPipeline path that uses header and ensuring correct cast generation for the
version column; after the change, rebuild and run relevant planner/execution
tests to verify behavior.

In `@dbms/src/Storages/KVStore/Types.h`:
- Line 63: Rename the inconsistent constant ExtraCommitTSID to
ExtraCommitTSColumnID to match the existing naming convention (e.g.,
TiDBPkColumnID, VersionColumnID); update the declaration in Types.h and replace
all references/usages of ExtraCommitTSID throughout the codebase (tests,
headers, sources) to the new symbol ExtraCommitTSColumnID to avoid build breaks
and keep naming consistent.
🧹 Nitpick comments (2)
dbms/src/Flash/Planner/Plans/PhysicalTableScan.cpp (2)

175-184: Note: sample_block variable shadowing on line 182.

Line 167 declares auto sample_block = ... (shadows member), then line 182 declares const auto & sample_block = ... (shadows the line-167 local). While the shadowing is intentional here — you need the post-cast sample block to extract the new column name — this nested shadowing of the same name makes the code fragile and hard to reason about.

Consider using a distinct name like cast_sample_block for the inner declaration.

Proposed rename
-        const auto & sample_block = schema_project->getSampleBlock();
-        schema_project_cols[version_col_idx].first = sample_block.getByPosition(sample_block.columns() - 1).name;
+        const auto & cast_sample_block = schema_project->getSampleBlock();
+        schema_project_cols[version_col_idx].first = cast_sample_block.getByPosition(cast_sample_block.columns() - 1).name;

207-208: Same shadowing issue here — use a distinct name for the post-cast sample block.

Same concern as in the DAGPipeline overload: sample_block on line 207 shadows the member variable. Using a distinct name improves clarity, especially given the bug on line 205 where sample_block (member) was used unintentionally.

Proposed rename
-        const auto & sample_block = schema_actions->getSampleBlock();
-        schema_project_cols[version_col_idx].first = sample_block.getByPosition(sample_block.columns() - 1).name;
+        const auto & cast_sample_block = schema_actions->getSampleBlock();
+        schema_project_cols[version_col_idx].first = cast_sample_block.getByPosition(cast_sample_block.columns() - 1).name;

// Convert version column to nullable when needed
if (version_col_idx != -1)
{
DAGExpressionAnalyzer analyzer(sample_block, context);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Bug: sample_block (member) used instead of header (current group_builder header) for DAGExpressionAnalyzer.

On line 205, DAGExpressionAnalyzer analyzer(sample_block, context) references the class member sample_block (constructed from the schema types at plan-build time). It should use the local header variable (line 194), which reflects the actual storage column types from the group builder — the same pattern as the DAGPipeline path on line 177.

The member sample_block has the TiDB-side types for the version column, but the analyzer needs the storage-side types to generate the correct cast expression.

Proposed fix
-        DAGExpressionAnalyzer analyzer(sample_block, context);
+        DAGExpressionAnalyzer analyzer(header, context);
🤖 Prompt for AI Agents
In `@dbms/src/Flash/Planner/Plans/PhysicalTableScan.cpp` at line 205, Replace the
use of the member sample_block when constructing the DAGExpressionAnalyzer with
the local header from the group_builder so the analyzer uses storage-side column
types; specifically, in PhysicalTableScan.cpp change the DAGExpressionAnalyzer
creation to pass header (the group_builder/header variable) instead of
sample_block, mirroring the DAGPipeline path that uses header and ensuring
correct cast generation for the version column; after the change, rebuild and
run relevant planner/execution tests to verify behavior.

@ti-chi-bot ti-chi-bot bot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Feb 12, 2026
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In `@dbms/src/Flash/Coprocessor/GenSchemaAndColumn.cpp`:
- Line 21: Replace the local-quote include for the KV types with the
project-style angle-bracket path: change the include statement referencing
"Storages/KVStore/Types.h" to use the angle-bracket form with the relative path
from dbms/src (i.e., use <Storages/KVStore/Types.h>), ensuring the include in
GenSchemaAndColumn.cpp follows the project's include convention.
- Around line 126-133: genNamesAndTypes currently falls through to the default
for ExtraCommitTSColumnID causing inconsistent schema entries; add a case for
ExtraCommitTSColumnID in genNamesAndTypes that mirrors
genColumnDefinesForDisaggregatedRead by mapping to VersionColumnID and using
MutableSupport::version_column_name and MutableSupport::version_column_type
instead of the generic column_info.name and
getDataTypeByColumnInfoForComputingLayer so the produced name and type match the
disaggregated read path.
🧹 Nitpick comments (1)
dbms/src/Flash/Planner/Plans/PhysicalTableScan.cpp (1)

158-167: Local sample_block shadows the member — consider renaming for clarity.

Line 160 introduces a local sample_block that shadows the class member of the same name. While functionally correct (the local correctly uses the storage-side header), shadowing can confuse future readers.

Suggested rename
-    auto sample_block = pipeline.firstStream()->getHeader();
-    auto schema_project_cols = buildTableScanProjectionCols(tidb_table_scan.getLogicalTableID(), schema, sample_block);
+    auto storage_header = pipeline.firstStream()->getHeader();
+    auto schema_project_cols = buildTableScanProjectionCols(tidb_table_scan.getLogicalTableID(), schema, storage_header);

#include <TiDB/Decode/TypeMapping.h>
#include <TiDB/Schema/TiDB.h>

#include "Storages/KVStore/Types.h"
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Use angle-bracket include path per project convention.

As per coding guidelines, includes should use relative paths from dbms/src with angle brackets.

Proposed fix
-#include "Storages/KVStore/Types.h"
+#include <Storages/KVStore/Types.h>
🤖 Prompt for AI Agents
In `@dbms/src/Flash/Coprocessor/GenSchemaAndColumn.cpp` at line 21, Replace the
local-quote include for the KV types with the project-style angle-bracket path:
change the include statement referencing "Storages/KVStore/Types.h" to use the
angle-bracket form with the relative path from dbms/src (i.e., use
<Storages/KVStore/Types.h>), ensuring the include in GenSchemaAndColumn.cpp
follows the project's include convention.

Comment on lines 126 to 133
case ExtraCommitTSColumnID:
{
column_defines->emplace_back(DM::ColumnDefine{
VersionColumnID,
output_name, // MutableSupport::version_column_name
MutableSupport::version_column_type});
break;
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Search for all callers of genNamesAndTypes and genNamesAndTypesForTableScan
rg -n 'genNamesAndTypes' dbms/src/Flash/Coprocessor/ --type=cpp -B2 -A5

Repository: pingcap/tiflash

Length of output: 5903


🏁 Script executed:

#!/bin/bash
# Read the full implementation of both functions
sed -n '66,145p' dbms/src/Flash/Coprocessor/GenSchemaAndColumn.cpp

Repository: pingcap/tiflash

Length of output: 3252


🏁 Script executed:

#!/bin/bash
# Search for where ExtraCommitTSColumnID is used/defined
rg -n 'ExtraCommitTSColumnID' dbms/src/Flash/Coprocessor/ --type=cpp -B2 -A2

Repository: pingcap/tiflash

Length of output: 2026


🏁 Script executed:

#!/bin/bash
# Search for where ExtraCommitTSColumnID gets added to the column list
rg -n 'ExtraCommitTSColumnID' dbms/src --type=cpp | head -20

echo "---"

# Check if there's any filtering in DAGStorageInterpreter before calling genNamesAndTypesForTableScan
rg -n 'genNamesAndTypesForTableScan\|ExtraCommitTSColumnID' dbms/src/Flash/Coprocessor/DAGStorageInterpreter.cpp -B5 -A5

Repository: pingcap/tiflash

Length of output: 962


🏁 Script executed:

#!/bin/bash
# Check DAGStorageInterpreter.cpp context around genNamesAndTypesForTableScan calls
sed -n '415,430p' dbms/src/Flash/Coprocessor/DAGStorageInterpreter.cpp
echo "---"
sed -n '520,540p' dbms/src/Flash/Coprocessor/DAGStorageInterpreter.cpp

echo "---"
# Check if table_scan.getColumns() ever includes ExtraCommitTSColumnID by looking at where it's populated
rg -n 'getColumns\|ExtraCommitTSColumnID' dbms/src/Flash/Coprocessor/DAGExpressionAnalyzer.cpp -B3 -A3 | head -40

Repository: pingcap/tiflash

Length of output: 1628


🏁 Script executed:

#!/bin/bash
# Check if ExtraCommitTSColumnID is ever added to table_scan columns
# Look at how TiDBTableScan gets its columns
rg -n 'class TiDBTableScan' dbms/src/Flash --type=cpp -A20 | head -40

echo "---"

# Check DAGExpressionAnalyzer context where ExtraCommitTSColumnID is checked
sed -n '1190,1210p' dbms/src/Flash/Coprocessor/DAGExpressionAnalyzer.cpp

Repository: pingcap/tiflash

Length of output: 3122


🏁 Script executed:

#!/bin/bash
# Check TiDBTableScan constructor to see how columns are populated
rg -n 'TiDBTableScan::TiDBTableScan' dbms/src/Flash/Coprocessor/ --type=cpp -A50 | head -80

echo "---"

# Check if there's any filtering of ExtraCommitTSColumnID before calling genNamesAndTypes
rg -n 'ExtraCommitTSColumnID' dbms/src/Flash/Coprocessor/DAGExpressionAnalyzer.cpp -B10 -A10

Repository: pingcap/tiflash

Length of output: 6820


Add case for ExtraCommitTSColumnID in genNamesAndTypes.

genNamesAndTypes (line 66) does not handle ExtraCommitTSColumnID, whereas genColumnDefinesForDisaggregatedRead (line 126) explicitly maps it to VersionColumnID with the correct type. When ExtraCommitTSColumnID appears in the column list (which it can, as evidenced by explicit checks in DAGExpressionAnalyzer.cpp:1194 and RemoteRequest.cpp:76), it falls through to the default case in genNamesAndTypes, using the generic column_info.name and getDataTypeByColumnInfoForComputingLayer. This produces an inconsistent schema entry. Add a matching case for ExtraCommitTSColumnID to use the correct name (MutableSupport::version_column_name) and type (MutableSupport::version_column_type).

🤖 Prompt for AI Agents
In `@dbms/src/Flash/Coprocessor/GenSchemaAndColumn.cpp` around lines 126 - 133,
genNamesAndTypes currently falls through to the default for
ExtraCommitTSColumnID causing inconsistent schema entries; add a case for
ExtraCommitTSColumnID in genNamesAndTypes that mirrors
genColumnDefinesForDisaggregatedRead by mapping to VersionColumnID and using
MutableSupport::version_column_name and MutableSupport::version_column_type
instead of the generic column_info.name and
getDataTypeByColumnInfoForComputingLayer so the produced name and type match the
disaggregated read path.

@ti-chi-bot ti-chi-bot bot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Feb 12, 2026
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@dbms/src/Storages/tests/gtest_hidden_commit_ts_column.cpp`:
- Line 33: Typo in the inline comment: change "registed" to "registered" in the
comment "// Maybe another test has already registed, ignore exception here." so
it reads "// Maybe another test has already registered, ignore exception
here."—locate this comment in gtest_hidden_commit_ts_column.cpp and update the
spelling.

}
catch (DB::Exception &)
{
// Maybe another test has already registed, ignore exception here.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Typo: "registed" → "registered".

-            // Maybe another test has already registed, ignore exception here.
+            // Maybe another test has already registered, ignore exception here.
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
// Maybe another test has already registed, ignore exception here.
// Maybe another test has already registered, ignore exception here.
🤖 Prompt for AI Agents
In `@dbms/src/Storages/tests/gtest_hidden_commit_ts_column.cpp` at line 33, Typo
in the inline comment: change "registed" to "registered" in the comment "//
Maybe another test has already registed, ignore exception here." so it reads "//
Maybe another test has already registered, ignore exception here."—locate this
comment in gtest_hidden_commit_ts_column.cpp and update the spelling.

Copy link
Contributor

@windtalker windtalker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@ti-chi-bot ti-chi-bot bot added needs-1-more-lgtm Indicates a PR needs 1 more LGTM. approved labels Feb 13, 2026
@xzhangxian1008
Copy link
Contributor Author

@gengliqi

@ti-chi-bot ti-chi-bot bot added lgtm and removed needs-1-more-lgtm Indicates a PR needs 1 more LGTM. labels Feb 14, 2026
@ti-chi-bot
Copy link
Contributor

ti-chi-bot bot commented Feb 14, 2026

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: gengliqi, windtalker

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:
  • OWNERS [gengliqi,windtalker]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@ti-chi-bot
Copy link
Contributor

ti-chi-bot bot commented Feb 14, 2026

[LGTM Timeline notifier]

Timeline:

  • 2026-02-13 08:36:40.833660507 +0000 UTC m=+180353.913651202: ☑️ agreed by windtalker.
  • 2026-02-14 05:20:54.058450674 +0000 UTC m=+73021.804844708: ☑️ agreed by gengliqi.

@ti-chi-bot ti-chi-bot bot merged commit f666956 into pingcap:feature/release-8.5-materialized-view Feb 14, 2026
5 checks passed
@xzhangxian1008 xzhangxian1008 deleted the commit-ts branch February 14, 2026 05:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved lgtm release-note-none Denotes a PR that doesn't merit a release note. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants

Comments