
test: fix ci on tidb v8.1.x/v7.5.x #4411

Open
lidezhu wants to merge 11 commits into master from ldz/fix-batch-add-table

Conversation


@lidezhu lidezhu commented Mar 10, 2026

What problem does this PR solve?

Issue Number: close #4406

What is changed and how it works?

Check List

Tests

  • Unit test
  • Integration test
  • Manual test (add detailed scripts or steps below)
  • No code

Questions

Will it cause performance regression or break compatibility?
Do you need to update user documentation, design documentation or monitoring documentation?

Release note

Please refer to [Release Notes Language Style Guide](https://pingcap.github.io/tidb-dev-guide/contribute-to-tidb/release-notes-style-guide.html) to write a quality release note.

If you don't think this PR needs a release note then fill it with `None`.

Summary by CodeRabbit

  • Bug Fixes

    • Improved error handling for CREATE TABLE DDL operations with query count mismatches, replacing panics with proper error propagation.
  • Tests

    • Added test coverage for CREATE TABLE query count mismatch scenarios.
  • Chores

    • Updated table creation configuration settings and added version compatibility check for TiDB v8.5.0+.

@ti-chi-bot ti-chi-bot bot added do-not-merge/needs-linked-issue do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. release-note Denotes a PR that will be considered when it comes time to generate release notes. labels Mar 10, 2026

ti-chi-bot bot commented Mar 10, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign wlwilliamx for approval. For more information see the Code Review Process.
Please ensure that each of them provides their approval before proceeding.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@gemini-code-assist

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request refactors the schema store's DDL event processing for CREATE TABLES operations. It introduces a more resilient method to parse and generate SQL queries, particularly when a single DDL statement might create multiple tables or when the provided query string is ambiguous. The changes ensure that the system can accurately derive individual CREATE TABLE statements from table metadata, thereby preventing potential panics and enhancing the reliability of DDL event propagation.

Highlights

  • Robust CREATE TABLES DDL Handling: Introduced a new function, splitCreateTablesQueries, to enhance the processing of CREATE TABLES DDL events. This function intelligently handles cases where the initial DDL query string might not perfectly align with the number of table information objects, by reconstructing individual CREATE TABLE statements from the available table metadata.
  • Error Prevention: Replaced log.Panic calls with a more graceful error handling and logging mechanism when the count of split queries does not match the number of table information objects, improving system stability.
  • New Unit Test: Added a dedicated unit test, TestBuildDDLEventForCreateTablesQueryCountMismatch, to validate the new query reconstruction logic, ensuring its correctness and robustness in edge cases.
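The fallback described in the highlights can be sketched as follows. This is a simplified, self-contained Go sketch, not the actual ticdc code: splitQueries and rebuildQuery are toy stand-ins for commonEvent.SplitQueries (which uses TiDB's SQL parser) and executor.ConstructResultOfShowCreateTable (which rebuilds DDL from the stored TableInfo). Only the control flow (split first, rebuild per table on mismatch, return an error instead of panicking) mirrors the change.

```go
package main

import (
	"fmt"
	"strings"
)

// tableInfo is a stand-in for TiDB's model.TableInfo; only the name matters here.
type tableInfo struct{ Name string }

// splitQueries is a naive stand-in for commonEvent.SplitQueries: it splits a
// multi-statement DDL on ';'. The real implementation uses TiDB's SQL parser.
func splitQueries(query string) []string {
	var out []string
	for _, q := range strings.Split(query, ";") {
		if q = strings.TrimSpace(q); q != "" {
			out = append(out, q+";")
		}
	}
	return out
}

// rebuildQuery stands in for executor.ConstructResultOfShowCreateTable, which
// reconstructs a CREATE TABLE statement from the persisted table metadata.
func rebuildQuery(info tableInfo) string {
	return fmt.Sprintf("CREATE TABLE `%s` (/* columns from TableInfo */);", info.Name)
}

// splitCreateTablesQueries mirrors the control flow described in the PR: prefer
// splitting the original query; if the count does not match the table infos,
// fall back to reconstructing one statement per table instead of panicking.
func splitCreateTablesQueries(query string, infos []tableInfo) ([]string, error) {
	queries := splitQueries(query)
	if len(queries) == len(infos) {
		return queries, nil
	}
	rebuilt := make([]string, 0, len(infos))
	for _, info := range infos {
		if info.Name == "" {
			// Defensive check, analogous to the nil-TableInfo guard in the PR.
			return nil, fmt.Errorf("empty table info in create tables ddl")
		}
		rebuilt = append(rebuilt, rebuildQuery(info))
	}
	return rebuilt, nil
}

func main() {
	// Mismatch: one raw query string, but two tables were created.
	qs, err := splitCreateTablesQueries(
		"CREATE TABLE t1 (a int)",
		[]tableInfo{{Name: "t1"}, {Name: "t2"}},
	)
	fmt.Println(len(qs), err)
}
```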


Changelog
  • logservice/schemastore/persist_storage_ddl_handlers.go
    • Refactored buildDDLEventForCreateTables to utilize a new helper function for splitting queries.
    • Added splitCreateTablesQueries function to reconstruct CREATE TABLE statements from TableInfo when the initial query split is mismatched or erroneous.
    • Imported necessary packages (bytes, executor, autoid, mock) to support query reconstruction.
  • logservice/schemastore/persist_storage_test.go
    • Added TestBuildDDLEventForCreateTablesQueryCountMismatch to verify the new logic for handling CREATE TABLES DDL events where the query count initially mismatches the table count.
Activity
  • The pull request is currently marked as '[WIP]', indicating it is a Work In Progress and may not yet be ready for a full review.
  • The description body is a standard template, suggesting no specific human-written comments or detailed context have been added by the author yet.

@ti-chi-bot ti-chi-bot bot added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label Mar 10, 2026

coderabbitai bot commented Mar 10, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

📝 Walkthrough

Changes introduce storage-aware query splitting for multi-table CREATE TABLE DDLs by replacing direct parsing with a new splitCreateTablesQueries helper function that validates query counts. Error handling is improved by replacing panic behavior with explicit error returns. Test infrastructure updates disable fast table creation globally, with version guards added for compatibility.

Changes

  • Core DDL Handler Logic (logservice/schemastore/persist_storage_ddl_handlers.go): Replaces direct query parsing with the storage-aware splitCreateTablesQueries function for multi-table CREATE TABLEs. Converts panic-based error handling to error returns when query/table counts mismatch. Updates buildDDLEventForRenameTables to use the new splitting helper.
  • DDL Handler Tests (logservice/schemastore/persist_storage_test.go): Adds TestBuildDDLEventForCreateTablesQueryCountMismatch to validate error handling when the query count diverges from the table count. Introduces the parser/mysql import for column state handling.
  • Test Infrastructure Settings (tests/integration_tests/_utils/run_sql, tests/integration_tests/_utils/run_sql_file, tests/integration_tests/_utils/run_sql_ignore_error): Consistently adds tidb_enable_fast_create_table = OFF to global prepare commands alongside the existing clustered index settings.
  • Version Guard (tests/integration_tests/batch_add_table/run.sh): Adds a TiDB version check (>= v8.5.0) to skip the fast-create-table test on earlier versions.
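The version guard in run.sh is a shell check; purely as an illustration of the comparison it performs, here is a hedged Go sketch. parseVersion and atLeast are hypothetical helpers, and real TiDB version strings can carry extra suffixes that would need stricter parsing.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseVersion turns a TiDB version tag like "v8.5.0" into its numeric parts.
func parseVersion(v string) []int {
	v = strings.TrimPrefix(v, "v")
	// Drop any pre-release/build suffix such as "-alpha" (illustrative only).
	if i := strings.IndexAny(v, "-+"); i >= 0 {
		v = v[:i]
	}
	parts := strings.Split(v, ".")
	nums := make([]int, len(parts))
	for i, p := range parts {
		nums[i], _ = strconv.Atoi(p)
	}
	return nums
}

// atLeast reports whether version a is >= version b, comparing part by part.
func atLeast(a, b string) bool {
	x, y := parseVersion(a), parseVersion(b)
	for i := 0; i < len(x) && i < len(y); i++ {
		if x[i] != y[i] {
			return x[i] > y[i]
		}
	}
	return len(x) >= len(y)
}

func main() {
	for _, v := range []string{"v8.1.2", "v7.5.4", "v8.5.0", "v9.0.0"} {
		if atLeast(v, "v8.5.0") {
			fmt.Println(v, "-> run fast-create-table case")
		} else {
			fmt.Println(v, "-> skip (fast create table unavailable)")
		}
	}
}
```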

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

Suggested reviewers

  • wk989898
  • wlwilliamx
  • tenfyzhong

Poem

🐰 A storage-wise splitting spree,
Where queries now match tables with glee!
Errors no longer panic and shout,
With version guards keeping bugs out,
Fast tables disabled, tests running free! ✨

🚥 Pre-merge checks | ✅ 1 | ❌ 2

❌ Failed checks (2 warnings)

  • Docstring Coverage ⚠️ Warning: Docstring coverage is 0.00%, which is below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them to satisfy the coverage threshold.
  • Description check ⚠️ Warning: The PR description is incomplete and largely follows the template with empty sections. While an issue number is referenced (close #4406), critical sections like 'What is changed and how it works?' are entirely empty, and the release note is not provided. Resolution: the author must fill in the 'What is changed and how it works?' section with concrete implementation details, answer the Questions section about performance/compatibility/documentation, and provide an appropriate release note.

✅ Passed checks (1 passed)

  • Title check ✅ Passed: The title accurately describes the main change: fixing CI issues on specific TiDB versions (v8.1.x/v7.5.x), which aligns with the code changes that disable fast_create_table and add version guards.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (1)
logservice/schemastore/persist_storage_ddl_handlers.go (1)

2797-2832: The query reconstruction logic is well-structured.

The splitCreateTablesQueries function handles the edge case where the original query cannot be properly split to match the table count. The approach of using ConstructResultOfShowCreateTable to rebuild queries from table info is reasonable.

A few observations:

  1. The variable name querys should conventionally be queries (proper English plural), though this is consistent with usage elsewhere in the file.
  2. Good defensive check for nil tableInfo at line 2816-2818.
  3. Proper error wrapping with cerror.WrapError.
Minor: Consider consistent naming with English plural
-func splitCreateTablesQueries(rawEvent *PersistedDDLEvent) ([]string, error) {
-	querys, err := commonEvent.SplitQueries(rawEvent.Query)
-	if err == nil && len(querys) == len(rawEvent.MultipleTableInfos) {
-		return querys, nil
+func splitCreateTablesQueries(rawEvent *PersistedDDLEvent) ([]string, error) {
+	queries, err := commonEvent.SplitQueries(rawEvent.Query)
+	if err == nil && len(queries) == len(rawEvent.MultipleTableInfos) {
+		return queries, nil
 	}
 
 	fields := []zap.Field{
-		zap.Int("queryCount", len(querys)),
+		zap.Int("queryCount", len(queries)),
 		zap.Int("tableCount", len(rawEvent.MultipleTableInfos)),
Inline comment on logservice/schemastore/persist_storage_ddl_handlers.go, line 2813: Replace the test-only mocked context usage. Remove mock.NewContext() and instead obtain a real session/context for production use (for example, create a session from the TiDB domain/session pool or use session.CreateSession and its returned context) so that ConstructResultOfShowCreateTable() can access real session state (e.g., ctx.GetSessionVars().SQLMode). Alternatively, execute SHOW CREATE TABLE as SQL and use that result rather than calling ConstructResultOfShowCreateTable() with a mocked ctx. Either way, properly close or return the borrowed session.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 45d1d5f3-fdc9-42bb-ae81-9207a76979d0

📥 Commits

Reviewing files that changed from the base of the PR and between 6423bf1 and a6cecb6.

📒 Files selected for processing (2)
  • logservice/schemastore/persist_storage_ddl_handlers.go
  • logservice/schemastore/persist_storage_test.go

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request improves the robustness of handling CREATE TABLES DDLs by introducing a fallback mechanism. When splitting the DDL query string fails or results in a mismatched number of queries, it now reconstructs the CREATE TABLE statements from the table metadata. This is a solid improvement that makes the system more resilient to unexpected DDL query formats. I've added one suggestion to make the fallback logic even more robust.

@lidezhu
Copy link
Collaborator Author

lidezhu commented Mar 12, 2026

/test all-v8-1

@ti-chi-bot ti-chi-bot bot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Mar 12, 2026
@coderabbitai coderabbitai bot left a comment

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
logservice/schemastore/persist_storage_ddl_handlers.go (2)

2729-2733: ⚠️ Potential issue | 🟡 Minor

Guard MultipleTableInfos before consulting the filter.

Line 2731 dereferences info.Name.O before the new nil-check in splitCreateTablesQueries(). A malformed persisted event with a nil entry and a non-nil tableFilter will panic here instead of returning ErrTiDBUnexpectedJobMeta.

🩹 Proposed fix
 	for _, info := range rawEvent.MultipleTableInfos {
+		if info == nil {
+			return commonEvent.DDLEvent{}, false, cerror.WrapError(
+				cerror.ErrTiDBUnexpectedJobMeta,
+				errors.New("nil table info in create tables ddl"),
+			)
+		}
 		if tableFilter != nil && tableFilter.ShouldDiscardDDL(
 			rawEvent.SchemaName, info.Name.O, model.ActionType(rawEvent.Type), common.WrapTableInfo(rawEvent.SchemaName, info)) {
 			continue
 		}

2745-2850: ⚠️ Potential issue | 🟠 Major

Keep query reconstruction off the replay path.

This code runs while fetchTableDDLEvents / fetchTableTriggerDDLEvents are replaying persisted events. Turning a query-count mismatch into session.CreateSession(storage) makes replay depend on live kv.Storage again, even though the needed MultipleTableInfos are already stored with the event. That weakens the schema-store “read from disk” path and couples dispatcher fetch latency/availability to upstream storage.

Rebuild and persist the per-table queries when the DDL is handled, or serialize the rebuilt queries alongside the persisted event.
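The suggestion above (persist the per-table queries alongside the event so replay never touches live storage) could look roughly like this. All names here are hypothetical: persistedDDLEvent, PerTableQueries, and queriesForReplay are illustrative stand-ins, and the fallback split is a naive substitute for commonEvent.SplitQueries.

```go
package main

import (
	"fmt"
	"strings"
)

// persistedDDLEvent is a simplified stand-in for the schema store's
// PersistedDDLEvent; PerTableQueries is the hypothetical field the review
// suggests populating at DDL-handling time, when live storage is available.
type persistedDDLEvent struct {
	Query           string
	PerTableQueries []string
}

// queriesForReplay prefers the queries persisted with the event, so the replay
// path never needs live kv.Storage; it only falls back to re-splitting the
// raw query for events written before the field existed.
func queriesForReplay(ev persistedDDLEvent) []string {
	if len(ev.PerTableQueries) > 0 {
		return ev.PerTableQueries
	}
	var out []string
	for _, q := range strings.Split(ev.Query, ";") {
		if q = strings.TrimSpace(q); q != "" {
			out = append(out, q+";")
		}
	}
	return out
}

func main() {
	ev := persistedDDLEvent{
		Query:           "CREATE TABLE t1 (a int); CREATE TABLE t2 (b int);",
		PerTableQueries: []string{"CREATE TABLE `t1` (a int);", "CREATE TABLE `t2` (b int);"},
	}
	// New events use the persisted per-table queries.
	fmt.Println(len(queriesForReplay(ev)))
	// Legacy events fall back to splitting the raw query.
	fmt.Println(len(queriesForReplay(persistedDDLEvent{Query: ev.Query})))
}
```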

logservice/schemastore/persist_storage.go (1)

523-541: ⚠️ Potential issue | 🟠 Major

Close storageSnap on the error paths here.

storageSnap is only closed at Line 541. If allTargetTs[0] < p.gcTs or p.buildDDLEvent() returns an error, this loop returns early and leaks the Pebble snapshot.

🩹 Proposed fix
 		// ensure the order: get target ts -> get storage snap -> check gc ts
 		storageSnap := p.db.NewSnapshot()
 		p.mu.RLock()
 		if allTargetTs[0] < p.gcTs {
 			p.mu.RUnlock()
+			storageSnap.Close()
 			return nil, fmt.Errorf("startTs %d is smaller than gcTs %d", allTargetTs[0], p.gcTs)
 		}
 		p.mu.RUnlock()
 		for _, ts := range allTargetTs {
 			rawEvent := readPersistedDDLEvent(storageSnap, ts)
 			// the tableID of buildDDLEvent is not used in this function, set it to 0
 			ddlEvent, ok, err := p.buildDDLEvent(&rawEvent, tableFilter, 0)
 			if err != nil {
+				storageSnap.Close()
 				return nil, errors.Trace(err)
 			}
 			if ok {
 				events = append(events, ddlEvent)
 			}
 		}
 		storageSnap.Close()
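An alternative to adding a Close() call on each return path is to defer the close immediately after acquiring the snapshot. The sketch below uses a toy snapshot type (Pebble's real Snapshot.Close returns an error, and fetchEvents is a hypothetical function); it only demonstrates that the deferred close runs on the early error return as well.

```go
package main

import "fmt"

// snapshot is a stand-in for a Pebble snapshot; Close must run on every path.
type snapshot struct{ closed *bool }

func (s snapshot) Close() { *s.closed = true }

// fetchEvents sketches the defer-based variant of the fix: deferring Close
// right after acquiring the snapshot means early error returns cannot leak it.
func fetchEvents(gcTs, startTs uint64, closed *bool) error {
	snap := snapshot{closed: closed}
	defer snap.Close() // runs on the early return below as well
	if startTs < gcTs {
		return fmt.Errorf("startTs %d is smaller than gcTs %d", startTs, gcTs)
	}
	// ... read persisted DDL events from snap ...
	return nil
}

func main() {
	var closed bool
	err := fetchEvents(100, 50, &closed)
	fmt.Println(err != nil, closed)
}
```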

ℹ️ Review info

Run ID: 9285bc7a-e913-44d0-82ba-2f0af1f3e2ad

📥 Commits

Reviewing files that changed from the base of the PR and between a6cecb6 and 861fbad.

📒 Files selected for processing (3)
  • logservice/schemastore/persist_storage.go
  • logservice/schemastore/persist_storage_ddl_handlers.go
  • logservice/schemastore/persist_storage_test.go

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (1)
logservice/schemastore/persist_storage_test.go (1)

3253-3321: Please cover the restart/round-trip case too.

This test only exercises the fresh build path with kvStorage available. The upgrade-risk is on the read path (buildDDLEventForCreateTables over an already-persisted raw event), so a disk round-trip or handcrafted legacy raw-event case is still needed to protect restart compatibility.

Inline comment on logservice/schemastore/persist_storage_ddl_handlers.go, lines 863-865: The code currently panics on rebuildCreateTablesQueries failures. Instead, log the error (include zap.Error(rebuildErr) and zap.String("query", event.Query)) and propagate it back through the normal DDL handling path by returning rebuildErr, so the caller can handle it rather than crashing.

ℹ️ Review info

Run ID: ccf3c6fc-1459-429a-baf7-cc5f11a90619

📥 Commits

Reviewing files that changed from the base of the PR and between 861fbad and 155b204.

📒 Files selected for processing (3)
  • logservice/schemastore/persist_storage.go
  • logservice/schemastore/persist_storage_ddl_handlers.go
  • logservice/schemastore/persist_storage_test.go
🚧 Files skipped from review as they are similar to previous changes (1)
  • logservice/schemastore/persist_storage.go

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

♻️ Duplicate comments (1)
logservice/schemastore/persist_storage_ddl_handlers.go (1)

862-869: ⚠️ Potential issue | 🟠 Major

Panic on reconstruction failures should be replaced with error propagation.

The log.Panic at line 866 will crash the capture on a single bad DDL instead of surfacing an error through the normal DDL handling path. Since this branch handles an expected mismatch case (query count vs table count), runtime failures during reconstruction should be propagated as errors.

🛠️ Proposed fix to return error instead of panic

The function signature would need to change to return an error. Consider:

-func buildPersistedDDLEventForCreateTables(args buildPersistedDDLEventFuncArgs) PersistedDDLEvent {
+func buildPersistedDDLEventForCreateTables(args buildPersistedDDLEventFuncArgs) (PersistedDDLEvent, error) {
     event := buildPersistedDDLEventCommon(args)
     event.SchemaName = getSchemaName(args.databaseMap, event.SchemaID)
     event.MultipleTableInfos = args.job.BinlogInfo.MultipleTableInfos
     querys, err := commonEvent.SplitQueries(event.Query)
     if err != nil || len(querys) != len(event.MultipleTableInfos) {
         rebuiltQuerys, rebuildErr := rebuildCreateTablesQueries(event.Query, event.MultipleTableInfos, args.kvStorage)
         if rebuildErr != nil {
-            log.Panic("rebuild create tables queries failed", zap.Error(rebuildErr), zap.String("query", event.Query))
+            log.Error("rebuild create tables queries failed", zap.Error(rebuildErr), zap.String("query", event.Query))
+            return PersistedDDLEvent{}, rebuildErr
         }
         event.Query = strings.Join(rebuiltQuerys, "")
     }
-    return event
+    return event, nil
 }

This would require updating the buildPersistedDDLEventFunc type signature and all callers accordingly.

Inline comment on logservice/schemastore/persist_storage_ddl_handlers.go, lines 2808-2814: splitCreateTablesQueries currently fails hard when SplitQueries returns an error or the counts mismatch, which breaks reads of old pre-fix events. Change it to attempt the same rebuild fallback used by buildPersistedDDLEventForCreateTables: when SplitQueries fails or the lengths mismatch, reconstruct per-table CREATE queries from rawEvent.MultipleTableInfos and kvStorage and return those results. Only return the wrapped ErrTiDBUnexpectedJobMeta if the rebuild attempt cannot run (e.g., kvStorage is nil) or also fails, so callers of buildDDLEventForCreateTables can still recover older persisted events.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 96aafe9a-7103-4788-8fa8-604914dc0eac

📥 Commits

Reviewing files that changed from the base of the PR and between 155b204 and 34945e6.

📒 Files selected for processing (1)
  • logservice/schemastore/persist_storage_ddl_handlers.go

@lidezhu lidezhu force-pushed the ldz/fix-batch-add-table branch from 34945e6 to cbb382e Compare March 12, 2026 09:31
@ti-chi-bot ti-chi-bot bot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Mar 12, 2026
@lidezhu lidezhu changed the title [WIP] test: fix ci on tidb v8.1.x/v7.5.x Mar 12, 2026
@ti-chi-bot ti-chi-bot bot removed do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. do-not-merge/needs-linked-issue labels Mar 12, 2026
Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

♻️ Duplicate comments (1)
logservice/schemastore/persist_storage_ddl_handlers.go (1)

2935-2950: ⚠️ Potential issue | 🟠 Major

This still turns mismatched CreateTables records into unrecoverable read failures.

buildDDLEventForCreateTables now hard-fails whenever the split count doesn't match MultipleTableInfos, and the caller in logservice/schemastore/persist_storage.go at Lines 462-468 propagates that error up the read path. That means older persisted events with mismatched counts are still unreadable. Also, when SplitQueries itself fails, this helper logs the original error but replaces it with a new generic one, so callers lose the actual parse/split cause. Please keep the read-path rebuild fallback here, and if parsing fails, return the original wrapped err instead of synthesizing a new one.

As per coding guidelines, "When an error comes from a third-party or library call in Go, wrap it immediately with errors.Trace(err) or errors.WrapError(...) to attach a stack trace; upstream callers should propagate wrapped errors without wrapping again".
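The two behaviors the comment asks for — wrap and return the original split error, and treat a count mismatch as non-fatal by falling back to the rebuild path — can be sketched with stand-in types. Everything here is illustrative (`rawDDLEvent`, `splitQueries`, `rebuildFromTableInfos` are not the real schemastore names, and `fmt.Errorf` with `%w` stands in for `errors.Trace`):

```go
package main

import (
	"fmt"
	"strings"
)

// Stand-in for the persisted event; names are illustrative only.
type rawDDLEvent struct {
	Query              string
	MultipleTableInfos []string
}

// splitQueries stands in for SplitQueries.
func splitQueries(q string) ([]string, error) {
	if !strings.HasSuffix(q, ";") {
		return nil, fmt.Errorf("query not terminated: %q", q)
	}
	parts := strings.Split(strings.TrimSuffix(q, ";"), ";")
	for i := range parts {
		parts[i] += ";"
	}
	return parts, nil
}

// rebuildFromTableInfos stands in for the rebuild fallback that reconstructs
// per-table CREATE queries from MultipleTableInfos.
func rebuildFromTableInfos(infos []string) []string {
	out := make([]string, 0, len(infos))
	for _, t := range infos {
		out = append(out, fmt.Sprintf("CREATE TABLE %s (...);", t))
	}
	return out
}

func splitCreateTablesQueries(ev rawDDLEvent) ([]string, error) {
	qs, err := splitQueries(ev.Query)
	if err != nil {
		// Wrap and return the ORIGINAL error (errors.Trace(err) in the real
		// code) instead of synthesizing a generic ErrTiDBUnexpectedJobMeta.
		return nil, fmt.Errorf("split create tables queries: %w", err)
	}
	if len(qs) != len(ev.MultipleTableInfos) {
		// Count mismatch is non-fatal on the read path: fall back to rebuild
		// so older persisted events remain readable.
		return rebuildFromTableInfos(ev.MultipleTableInfos), nil
	}
	return qs, nil
}

func main() {
	qs, err := splitCreateTablesQueries(rawDDLEvent{
		Query:              "CREATE TABLE a (x int);",
		MultipleTableInfos: []string{"a", "b"},
	})
	fmt.Println(len(qs), err) // 2 <nil>
}
```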

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@logservice/schemastore/persist_storage_ddl_handlers.go` around lines 2935 -
2950, splitCreateTablesQueries currently converts mismatched/split errors into a
new generic unrecoverable error; instead preserve the read-path fallback and the
original split error: in splitCreateTablesQueries (used by
buildDDLEventForCreateTables) do not synthesize a new ErrTiDBUnexpectedJobMeta
when SplitQueries returns an error—wrap and return the original error (use
errors.Trace(err) or cerror.WrapError(err) to attach a stack trace) so callers
can inspect the real cause; if SplitQueries succeeds but query count !=
len(rawEvent.MultipleTableInfos), log the mismatch and return a non-fatal result
that lets buildDDLEventForCreateTables perform its rebuild fallback (i.e., avoid
returning a hard failure here).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@tests/integration_tests/_utils/run_sql_file`:
- Line 8: The shared runner currently forces fast_create_table off by embedding
"set global tidb_enable_fast_create_table = OFF;" into the prepare variable;
remove that hard-coded setting from prepare in
tests/integration_tests/_utils/run_sql_file and instead make it opt-in (e.g.,
accept an optional parameter or environment flag like FAST_CREATE_TABLE or a
prepare_overrides argument to run_sql_file) so callers can choose to set
tidb_enable_fast_create_table on/off; update run_sql_file to append the
fast_create_table SET only when the opt-in flag indicates it, and update callers
(e.g., run_with_fast_create_table in
tests/integration_tests/batch_add_table/run.sh) to pass the flag when they want
the feature enabled.

In `@tests/integration_tests/batch_add_table/run.sh`:
- Line 69: The script currently calls skip_if_tidb_version_less_than which exits
the whole process, preventing later steps like run_with_fast_create_table and
check_logs from running; remove that global skip and instead guard only the
fast-create branch with a local version check that returns or skips that branch
(e.g., replace the call to skip_if_tidb_version_less_than "v8.5.0" with an
if-version check and only avoid calling run_with_fast_create_table when the TiDB
version is less than v8.5.0, while allowing the rest of the script (including
check_logs) to continue).

---

Duplicate comments:
In `@logservice/schemastore/persist_storage_ddl_handlers.go`:
- Around line 2935-2950: splitCreateTablesQueries currently converts
mismatched/split errors into a new generic unrecoverable error; instead preserve
the read-path fallback and the original split error: in splitCreateTablesQueries
(used by buildDDLEventForCreateTables) do not synthesize a new
ErrTiDBUnexpectedJobMeta when SplitQueries returns an error—wrap and return the
original error (use errors.Trace(err) or cerror.WrapError(err) to attach a stack
trace) so callers can inspect the real cause; if SplitQueries succeeds but query
count != len(rawEvent.MultipleTableInfos), log the mismatch and return a
non-fatal result that lets buildDDLEventForCreateTables perform its rebuild
fallback (i.e., avoid returning a hard failure here).
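The run.sh guard suggested above can be sketched as a local version check that skips only the fast-create branch while the rest of the script continues. This is a sketch under assumptions: `version_ge` is a helper defined here (not part of the existing test utils), and `tidb_version` is a placeholder where the real script would query the cluster.

```shell
#!/bin/bash
set -eu

# version_ge A B: succeeds if version A >= version B (vX.Y.Z strings).
version_ge() {
	[ "$(printf '%s\n%s\n' "${2#v}" "${1#v}" | sort -V | head -n1)" = "${2#v}" ]
}

tidb_version="v8.1.0" # placeholder; the real script would query the cluster

# Guard only the fast-create branch instead of exiting the whole script.
if version_ge "$tidb_version" "v8.5.0"; then
	echo "running run_with_fast_create_table"
else
	echo "TiDB < v8.5.0, skipping fast-create branch"
fi

# Remaining steps (e.g. check_logs) still run either way.
echo "running check_logs"
```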

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 44d27784-490b-42d2-b7e8-8b9465264a00

📥 Commits

Reviewing files that changed from the base of the PR and between 34945e6 and cbb382e.

📒 Files selected for processing (6)
  • logservice/schemastore/persist_storage_ddl_handlers.go
  • logservice/schemastore/persist_storage_test.go
  • tests/integration_tests/_utils/run_sql
  • tests/integration_tests/_utils/run_sql_file
  • tests/integration_tests/_utils/run_sql_ignore_error
  • tests/integration_tests/batch_add_table/run.sh

set -eu

-prepare="set global tidb_enable_clustered_index = 'int_only';"
+prepare="set global tidb_enable_clustered_index = 'int_only'; set global tidb_enable_fast_create_table = OFF;"
Contributor


⚠️ Potential issue | 🟠 Major

Don't force fast_create_table off in the shared SQL-file runner.

This helper now overrides caller-selected global state on every invocation. In tests/integration_tests/batch_add_table/run.sh, run_with_fast_create_table enables the feature at Line 16, but the later run_sql_file calls at Lines 52-53 flip it back off before those SQL files run. Please make this opt-in per test instead of hard-coding it in the shared runner.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/integration_tests/_utils/run_sql_file` at line 8, The shared runner
currently forces fast_create_table off by embedding "set global
tidb_enable_fast_create_table = OFF;" into the prepare variable; remove that
hard-coded setting from prepare in tests/integration_tests/_utils/run_sql_file
and instead make it opt-in (e.g., accept an optional parameter or environment
flag like FAST_CREATE_TABLE or a prepare_overrides argument to run_sql_file) so
callers can choose to set tidb_enable_fast_create_table on/off; update
run_sql_file to append the fast_create_table SET only when the opt-in flag
indicates it, and update callers (e.g., run_with_fast_create_table in
tests/integration_tests/batch_add_table/run.sh) to pass the flag when they want
the feature enabled.
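One way the opt-in could look is an environment flag consulted when building the prepare string. This is a sketch only: `FAST_CREATE_TABLE` is an assumed flag name, not part of the current harness, and the mysql invocation itself is elided.

```shell
#!/bin/bash
set -eu

# Base prepare statements, as in the current run_sql_file.
prepare="set global tidb_enable_clustered_index = 'int_only';"

# Append the fast_create_table override only when the caller opts in,
# instead of hard-coding OFF in the shared runner.
case "${FAST_CREATE_TABLE:-}" in
on)
	prepare="$prepare set global tidb_enable_fast_create_table = ON;"
	;;
off)
	prepare="$prepare set global tidb_enable_fast_create_table = OFF;"
	;;
esac

echo "$prepare"
```

A caller like run_with_fast_create_table would then invoke `FAST_CREATE_TABLE=on run_sql_file ...`, and tests that do not care leave the global untouched.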

@lidezhu
Collaborator Author

lidezhu commented Mar 12, 2026

/test all-v8-1

@lidezhu
Collaborator Author

lidezhu commented Mar 12, 2026

/test all-v8-1

@lidezhu
Collaborator Author

lidezhu commented Mar 12, 2026

/test all-v8-1

@ti-chi-bot ti-chi-bot bot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Mar 12, 2026
@lidezhu
Collaborator Author

lidezhu commented Mar 14, 2026

/test all-v8-1

@lidezhu
Collaborator Author

lidezhu commented Mar 14, 2026

/test all-v7-5

@ti-chi-bot

ti-chi-bot bot commented Mar 14, 2026

@lidezhu: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
pull-cdc-mysql-integration-light-v7-5 63803eb link false /test pull-cdc-mysql-integration-light-v7-5
pull-cdc-mysql-integration-heavy-v7-5 63803eb link false /test pull-cdc-mysql-integration-heavy-v7-5

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.


Labels

release-note Denotes a PR that will be considered when it comes time to generate release notes. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

fix integration test with tidb v8.1.x/v7.5.x

1 participant