Skip to content

Support full outer join#10778

Open
windtalker wants to merge 12 commits intopingcap:feature/release-8.5-materialized-viewfrom
windtalker:support_fullouter_join_only
Open

Support full outer join#10778
windtalker wants to merge 12 commits intopingcap:feature/release-8.5-materialized-viewfrom
windtalker:support_fullouter_join_only

Conversation

@windtalker
Copy link
Copy Markdown
Contributor

@windtalker windtalker commented Mar 27, 2026

What problem does this PR solve?

Issue Number: close #10777

Problem Summary:

Support full outer join in TiFlash, including protocol / join kind plumbing, nullable schema handling, execution support for non-equal other conditions, and targeted test coverage.

What is changed and how it works?

Support full outer join

- support full outer join protocol / join kind plumbing
- guard unsupported cartesian full outer join cases
- make full join output schemas nullable where needed
- support full join with non-equal other conditions
- fix full join other-condition execution path
- add targeted tests and design notes

Check List

Tests

  • Unit test
  • Integration test
  • Manual test (add detailed scripts or steps below)
  • No code

Unit tests:

  • ./cmake-build-debug/dbms/gtests_dbms --gtest_filter='JoinExecutorTestRunner.FullOuterJoinWithOtherCondition'
  • ./cmake-build-debug/dbms/gtests_dbms --gtest_filter='MockDAGRequestTest.SemiJoinColumnPruneKeepsJoinOutputSchema'

Side effects

  • Performance regression: Consumes more CPU
  • Performance regression: Consumes more Memory
  • Breaking backward compatibility

Documentation

  • Affects user behaviors
  • Contains syntax changes
  • Contains variable changes
  • Contains experimental features
  • Changes MySQL compatibility

Release note

Support full outer join in TiFlash.

@ti-chi-bot ti-chi-bot bot added the do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. label Mar 27, 2026
@ti-chi-bot
Copy link
Copy Markdown
Contributor

ti-chi-bot bot commented Mar 27, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign littlefall for approval. For more information see the Code Review Process.
Please ensure that each of them provides their approval before proceeding.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@ti-chi-bot ti-chi-bot bot added the size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. label Mar 27, 2026
@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Mar 27, 2026

📝 Walkthrough

Walkthrough

Implements Full Outer Join support: protocol mapping, nullable-schema handling, execution-path changes (row-flagged hash map for full+other-condition), probe/post-probe handling for unmatched rows, submodule update, extensive tests, and documentation updates.

Changes

Cohort / File(s) Summary
Documentation & Local Build
AGENTS.md, docs/note/fullouter_join.md
Added local build/test guidance and a design note describing Full Outer Join protocol mapping, semantics, constraints, and test plan.
Submodule & Protocol
contrib/tipb
Updated tipb submodule pointer to include TypeFullOuterJoin.
String Mapping
dbms/src/Flash/Coprocessor/DAGUtils.cpp
Recognize and stringify TypeFullOuterJoin as "FullOuterJoin".
Join Interpreter Helpers & Schema
dbms/src/Flash/Coprocessor/JoinInterpreterHelper.h, dbms/src/Flash/Coprocessor/JoinInterpreterHelper.cpp, dbms/src/Flash/Coprocessor/collectOutputFieldTypes.cpp, dbms/src/Debug/MockExecutor/JoinBinder.cpp
Introduce join-side nullability helpers, centralize nullability decisions, allow Full join in non-equal condition validation, and propagate nullable schema generation for outputs and other-conds.
Execution: Join Core
dbms/src/Interpreters/Join.cpp, dbms/src/Interpreters/JoinPartition.cpp, dbms/src/Interpreters/JoinUtils.h, dbms/src/DataStreams/ScanHashMapAfterProbeBlockInputStream.cpp
Route Full+other-condition to row-flagged hash map path, add full-specific handling for unmatched rows and post-probe scanning, extend row-flagged map selection to Full, and adjust scan-fill behavior for Full kind.
Mock & Test Executors / Unit Tests
dbms/src/Flash/tests/..., dbms/src/Flash/Coprocessor/tests/..., dbms/src/TestUtils/tests/gtest_mock_executors.cpp, dbms/src/Flash/tests/gtest_spill_join.cpp
Add/extend gtests for FullOuterJoin: kind/index mapping, output nullability, other-condition semantics, executor behavior, spill behavior, and schema-preservation tests.
Test Helpers
dbms/src/Debug/MockExecutor/JoinBinder.cpp
Refactor join schema construction using helper utilities to apply nullable rules consistently.
Other
contrib/tipb
Submodule pointer update (single-line change).

Sequence Diagram(s)

sequenceDiagram
    participant Probe as ProbeBlock
    participant Hash as RowFlaggedHashMap
    participant Filter as OtherCondition
    participant Scan as PostProbeScanner
    participant Output as OutputWriter

    Probe->>Hash: probe key(s)
    alt match found
        Hash->>Filter: provide matched build rows + probe row
        Filter->>Filter: evaluate other-condition
        alt filter passes
            Filter->>Output: emit matched row (both sides may be nullable)
            Filter->>Hash: mark build row as used
        else filter fails
            Filter->>Output: (do not emit this matched pair)
            Note over Hash,Scan: build row remains unused for later scan
        end
    else no match
        Hash->>Output: emit probe row with NULL build side
    end

    Note over Hash,Scan: after probing
    Hash->>Scan: scan unused build rows
    Scan->>Output: emit unmatched build rows with NULL probe side
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

Suggested labels

size/XL, release-note-none, lgtm

Suggested reviewers

  • zanmato1984
  • JinheLin
  • CalvinNeo

Poem

🐰 I hopped through joins both left and right,
Made nullable fields soft and light,
Marked unused rows for a later peek,
So full outer matches never hide or sneak,
A joyful hop — TiFlash joins unite!

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 12.20% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title 'Support full outer join' directly and clearly summarizes the main objective of the PR, which is to add full outer join support to TiFlash.
Linked Issues check ✅ Passed The PR addresses all six coding requirements from issue #10777: protocol/join-kind plumbing, cartesian full outer join guarding, nullable schema output, non-equal condition support, other-condition execution path fixing, and targeted tests/documentation.
Out of Scope Changes check ✅ Passed All changes are directly related to implementing full outer join support as defined in issue #10777. The PR includes protocol mapping, join handling, schema nullability, condition validation, test cases, and documentation without unrelated modifications.
Description check ✅ Passed The PR description follows the template structure with all required sections completed, including problem statement, solution details, test confirmation, and release notes.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@windtalker
Copy link
Copy Markdown
Contributor Author

@pantheon-ai review

@pantheon-ai
Copy link
Copy Markdown

pantheon-ai bot commented Mar 27, 2026

Review completed with partial results (1 of 3 review agents failed after retries).

Findings: 2 issues
Posted: 2
Duplicates/Skipped: 0

Note: review_agent_v1.1 failed after 3 attempts due to infrastructure issues. The other 2 agents completed successfully and found 1 P0 and 1 P1 issue.

ℹ️ Learn more details on Pantheon AI.

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🧹 Nitpick comments (2)
dbms/src/DataStreams/ScanHashMapAfterProbeBlockInputStream.cpp (1)

264-276: Consider consolidating duplicate case handling.

The Full case (lines 264-269) and RightAnti/RightOuter cases (lines 270-276) have identical logic. You could combine them:

case ASTTableJoin::Kind::Full:
case ASTTableJoin::Kind::RightAnti:
case ASTTableJoin::Kind::RightOuter:
    if (parent.has_other_condition)
        fillColumnsUsingCurrentPartition<true, false>(...);
    else
        fillColumnsUsingCurrentPartition<false, false>(...);
    break;

This is optional since the current code is correct and explicit handling may be preferred for clarity.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@dbms/src/DataStreams/ScanHashMapAfterProbeBlockInputStream.cpp` around lines
264 - 276, The three switch cases ASTTableJoin::Kind::Full,
ASTTableJoin::Kind::RightAnti, and ASTTableJoin::Kind::RightOuter all perform
identical logic: they check parent.has_other_condition and call
fillColumnsUsingCurrentPartition with either <true,false> or <false,false>;
consolidate these by merging the cases into a single grouped case label (Full,
RightAnti, RightOuter) and keep the existing conditional that calls
fillColumnsUsingCurrentPartition<true,false>(columns_left, columns_right,
row_counter_column) when parent.has_other_condition is true and
fillColumnsUsingCurrentPartition<false,false>(...) otherwise, followed by a
single break.
dbms/src/Flash/tests/gtest_spill_join.cpp (1)

588-619: Exercise the inner_idx = 0 spill path too.

TypeFullOuterJoin is the one equal-join family that keeps inner_idx as the build side. This test only covers inner_idx = 1, so a regression when the left side builds the hash table would still pass here.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@dbms/src/Flash/tests/gtest_spill_join.cpp` around lines 588 - 619, The test
only exercises the build-side = 1 path; add a second pair of requests that set
inner_idx = 0 so the left side becomes the build side for TypeFullOuterJoin.
Duplicate the existing request and request_column_prune blocks (which use
context.scan(...).join(..., tipb::JoinType::TypeFullOuterJoin, ..., 0, false, 1)
/ aggregation) but change the join call's inner_idx argument to 0 (e.g., replace
the trailing 1 with 0) so both the projection test (request) and the aggregation
prune test (request_column_prune) also run with inner_idx = 0; keep the same
scans, join keys, condition (lt(...)), project columns and aggregation.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@AGENTS.md`:
- Around line 1-8: Update AGENTS.md to clarify the build/test guidance: replace
the "Codex-owned" phrase with a general ccache recommendation and show how to
configure CCACHE_DIR and CCACHE_TEMPDIR for any user (reference
CCACHE_DIR/CCACHE_TEMPDIR), explain how the ./cmake-build-debug/dbms/gtests_dbms
path is produced (i.e., it’s the default CMake build output for a debug build or
created after running cmake --build) and suggest a discovery alternative (e.g.,
using find or cmake --build output) instead of assuming the path, and add a
short sentence describing what gtests_dbms is (the DBMS unit-test binary) and
when to run it (targeted gtest runs during development).

In `@dbms/src/Debug/MockExecutor/JoinBinder.cpp`:
- Around line 121-124: Rebuild output_schema using the same left/right helpers
used by compileJoin() instead of always appending both children: detect the join
type (TypeSemiJoin, TypeAntiSemiJoin, TypeLeftOuterSemiJoin,
TypeAntiLeftOuterSemiJoin) and for semi/anti-semi use the left-side schema plus
the right-side schema produced by buildRightSideJoinSchema() (preserving the
synthetic match-helper shape), and for other joins continue to use
appendJoinSchema with JoinInterpreterHelper::makeLeftJoinSideNullable(tp) and
makeRightJoinSideNullable(tp); update the logic that currently clears
output_schema and calls appendJoinSchema(children[0]...) and
appendJoinSchema(children[1]...) to call the appropriate helper functions so
output_schema matches compileJoin() behavior.

In `@dbms/src/Interpreters/JoinPartition.cpp`:
- Around line 1640-1654: Probe rows with NULL keys take the has_null_map branch
and currently call RowFlaggedHashMapAdder::addNotFound(), which emits nothing,
causing unmatched NULL probe keys to be dropped for FULL joins; update that
branch so when KIND == ASTTableJoin::Kind::Full it calls
RowFlaggedHashMapAdder::addNotFoundForFull(...) with the same arguments used in
the hash-miss path (num_columns_to_add, added_columns, i, current_offset,
offsets_to_replicate.get(), probe_process_info), otherwise call addNotFound(...)
as before so non-FULL behavior is unchanged.

---

Nitpick comments:
In `@dbms/src/DataStreams/ScanHashMapAfterProbeBlockInputStream.cpp`:
- Around line 264-276: The three switch cases ASTTableJoin::Kind::Full,
ASTTableJoin::Kind::RightAnti, and ASTTableJoin::Kind::RightOuter all perform
identical logic: they check parent.has_other_condition and call
fillColumnsUsingCurrentPartition with either <true,false> or <false,false>;
consolidate these by merging the cases into a single grouped case label (Full,
RightAnti, RightOuter) and keep the existing conditional that calls
fillColumnsUsingCurrentPartition<true,false>(columns_left, columns_right,
row_counter_column) when parent.has_other_condition is true and
fillColumnsUsingCurrentPartition<false,false>(...) otherwise, followed by a
single break.

In `@dbms/src/Flash/tests/gtest_spill_join.cpp`:
- Around line 588-619: The test only exercises the build-side = 1 path; add a
second pair of requests that set inner_idx = 0 so the left side becomes the
build side for TypeFullOuterJoin. Duplicate the existing request and
request_column_prune blocks (which use context.scan(...).join(...,
tipb::JoinType::TypeFullOuterJoin, ..., 0, false, 1) / aggregation) but change
the join call's inner_idx argument to 0 (e.g., replace the trailing 1 with 0) so
both the projection test (request) and the aggregation prune test
(request_column_prune) also run with inner_idx = 0; keep the same scans, join
keys, condition (lt(...)), project columns and aggregation.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 264452d6-dc3f-4650-a126-8b040e1b9e4d

📥 Commits

Reviewing files that changed from the base of the PR and between 1c34cbe and 7e5e0d0.

📒 Files selected for processing (16)
  • AGENTS.md
  • contrib/tipb
  • dbms/src/DataStreams/ScanHashMapAfterProbeBlockInputStream.cpp
  • dbms/src/Debug/MockExecutor/JoinBinder.cpp
  • dbms/src/Flash/Coprocessor/DAGUtils.cpp
  • dbms/src/Flash/Coprocessor/JoinInterpreterHelper.cpp
  • dbms/src/Flash/Coprocessor/JoinInterpreterHelper.h
  • dbms/src/Flash/Coprocessor/collectOutputFieldTypes.cpp
  • dbms/src/Flash/Coprocessor/tests/gtest_join_get_kind_and_build_index.cpp
  • dbms/src/Flash/tests/gtest_join_executor.cpp
  • dbms/src/Flash/tests/gtest_spill_join.cpp
  • dbms/src/Interpreters/Join.cpp
  • dbms/src/Interpreters/JoinPartition.cpp
  • dbms/src/Interpreters/JoinUtils.h
  • dbms/src/TestUtils/tests/gtest_mock_executors.cpp
  • docs/note/fullouter_join.md

contrib/tipb Outdated
@@ -1 +1 @@
Subproject commit fe7badb76b660abaedd8706accd809809c475280
Subproject commit 3f0090f4d459d6afefdf0862e582d3432c99c96c
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[P0] contrib/tipb submodule points to non-fetchable commit

Why: Breaks clean checkout/build/CI because git submodule update --init --recursive contrib/tipb fails to fetch the referenced commit from https://github.com/pingcap/tipb.git. The commit 3f0090f4d459d6afefdf0862e582d3432c99c96c is not available on the remote, and CMake unconditionally requires this submodule for builds.

Evidence: contrib/tipb:1 - Subproject commit 3f0090f4d459d6afefdf0862e582d3432c99c96c (fetch error: "remote: not our ref 3f0090f4d459d6afefdf0862e582d3432c99c96c")

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

Copy link
Copy Markdown

@pantheon-ai pantheon-ai bot Mar 28, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the fix! The submodule should now be fetchable for clean builds. ✅

ℹ️ Learn more details on Pantheon AI.

if constexpr (row_flagged_map)
{
block_full = RowFlaggedHashMapAdder<Map>::addNotFound(i, current_offset, offsets_to_replicate.get());
if constexpr (KIND == ASTTableJoin::Kind::Full)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[P1] Full join with other_condition silently drops probe rows with NULL join key or failed left filter

Why: Violates full outer join semantics. Probe rows flagged in null_map (due to NULL join key or failed left_filter_column) should appear in the result with NULL-filled right columns, but are silently dropped causing wrong results (missing rows).

Evidence: dbms/src/Interpreters/JoinPartition.cpp:1539 - When Kind == Full and row_flagged_map is true (other_condition path), the code calls RowFlaggedHashMapAdder::addNotFound(...) for rows with (*null_map)[i] set. However, addNotFound (lines 1450-1454) does not increment current_offset, causing 0-replication. The correct behavior requires addNotFoundForFull(...) (lines 1456-1475) which increments offset and fills right-side NULLs.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

Copy link
Copy Markdown

@pantheon-ai pantheon-ai bot Mar 28, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for addressing this! The fix looks good - now properly using addNotFoundForFull for KIND::Full to ensure probe rows with NULL join keys or failed left filters are output with right-side NULLs instead of being dropped. ✅

ℹ️ Learn more details on Pantheon AI.

Signed-off-by: xufei <xufeixw@mail.ustc.edu.cn>
@ti-chi-bot ti-chi-bot bot added release-note Denotes a PR that will be considered when it comes time to generate release notes. and removed do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. labels Mar 28, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

release-note Denotes a PR that will be considered when it comes time to generate release notes. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant