[ABA-32] test: repro for missing LIKE ESCAPE plumbing (vortex-array + vortex-datafusion) #21

Open: abnobdoss wants to merge 1 commit into develop from fix/aba-32-like-escape-missing

@abnobdoss
Owner

Summary

  • Adds two #[ignore]'d regression tests that document the gap described in ABA-32: LikeOptions has no escape_char field, making SQL LIKE … ESCAPE semantics unrepresentable in the Vortex expression layer.
  • vortex-array (scalar_fn/fns/like/mod.rs): issue_aba32_like_options_supports_escape_char — asserts LikeOptions Debug output contains escape_char; fails on develop.
  • vortex-datafusion (convert/exprs.rs): issue_aba32_datafusion_conversion_preserves_like_escape_char — converts a DataFusion LikeExpr via DefaultExpressionConvertor and asserts the resulting LikeOptions exposes escape information; fails on develop.
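As a rough illustration of the Debug-based assertion described above, the sketch below uses two hypothetical stand-in structs rather than the real vortex-array types: `LikeOptionsOnDevelop` mirrors the two fields the description says exist on develop, and `LikeOptionsWithEscape` shows the shape the test is waiting for. The helper name `debug_mentions_escape_char` is also an assumption made for illustration.

```rust
// Hypothetical stand-in for the struct as described on develop:
// only `negated` and `case_insensitive` are present.
#[derive(Debug, Default, Clone, Copy, PartialEq, Eq)]
pub struct LikeOptionsOnDevelop {
    pub negated: bool,
    pub case_insensitive: bool,
}

// Hypothetical stand-in for the struct after a fix adds the field.
#[derive(Debug, Default, Clone, Copy, PartialEq, Eq)]
pub struct LikeOptionsWithEscape {
    pub negated: bool,
    pub case_insensitive: bool,
    pub escape_char: Option<char>,
}

/// Returns true when the derived Debug output names an `escape_char` field.
/// This is the style of check the repro test is described as performing.
pub fn debug_mentions_escape_char<T: std::fmt::Debug>(value: &T) -> bool {
    format!("{:?}", value).contains("escape_char")
}
```

With these stand-ins, the check returns false for `LikeOptionsOnDevelop::default()` and true for `LikeOptionsWithEscape::default()`, which is why a test of this kind would fail until the field exists.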

This is a REPRO-ONLY PR — no production code is changed.

Gap analysis

LikeOptions (vortex-array) only carries negated and case_insensitive. There is no escape_char: Option<char> field, and the proto LikeOpts message likewise has no such field. As a consequence:

  1. Non-DataFusion frontends cannot express any escape char.
  2. The DataFusion physical LikeExpr (DF 53) also has no escape_char field; the logical layer drops it during create_physical_expr. So even if Vortex added the field, the DataFusion conversion path currently has nothing to plumb through.
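To make concrete what is lost without an escape character, here is a minimal, unoptimised sketch of LIKE matching with an optional escape. It is not the Vortex or DataFusion kernel; `like_match` and `match_chars` are hypothetical names, and rejecting a trailing escape character is an assumption (dialects differ on that case).

```rust
/// Matches `text` against a SQL LIKE `pattern`: `%` matches any run of
/// characters, `_` matches exactly one. If `escape` is `Some(c)`, then `c`
/// followed by any character matches that character literally.
pub fn like_match(text: &str, pattern: &str, escape: Option<char>) -> bool {
    let t: Vec<char> = text.chars().collect();
    let p: Vec<char> = pattern.chars().collect();
    match_chars(&t, &p, escape)
}

// Naive recursive matcher; fine for illustration, exponential in the worst case.
fn match_chars(t: &[char], p: &[char], escape: Option<char>) -> bool {
    match p.split_first() {
        // Pattern exhausted: match only if the text is exhausted too.
        None => t.is_empty(),
        // Escape character: the next pattern character is a literal.
        Some((&c, rest)) if Some(c) == escape => match rest.split_first() {
            Some((&lit, after)) => {
                t.first() == Some(&lit) && match_chars(&t[1..], after, escape)
            }
            // Assumption: a dangling escape at the end never matches.
            None => false,
        },
        // `%`: try every possible split point of the remaining text.
        Some((&'%', rest)) => (0..=t.len()).any(|i| match_chars(&t[i..], rest, escape)),
        // `_`: consume exactly one character.
        Some((&'_', rest)) => !t.is_empty() && match_chars(&t[1..], rest, escape),
        // Ordinary character: must match literally.
        Some((&c, rest)) => t.first() == Some(&c) && match_chars(&t[1..], rest, escape),
    }
}
```

For example, `like_match("100%", "100!%", Some('!'))` is true while `like_match("100x", "100!%", Some('!'))` is false. Without a way to carry the `'!'`, the same pattern would be read as a literal `!` followed by a wildcard.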

Relation to PR vortex-data#8038

PR vortex-data#8038 ("Fallback from fsst specialised like expression if there are escape characters in the like string") adds a pattern-byte inspection (pattern.contains(&b'\\')) in the FSST DFA fast-path so it safely bails out when the pattern contains a backslash. This is a correctness guard for the fast path, not an addition of escape-char support. The schema gap (missing escape_char on LikeOptions) is confirmed to still exist on develop.
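The byte-inspection guard described above can be sketched roughly as follows. Only the `pattern.contains(&b'\\')` check is taken from the description; the function name `pattern_needs_fallback` and its placement are hypothetical, and the real code in the FSST DFA path may be structured differently.

```rust
/// Hypothetical guard: returns true when the specialised fast path should be
/// skipped because the raw pattern bytes contain a backslash, which might be
/// acting as an escape character.
pub fn pattern_needs_fallback(pattern: &[u8]) -> bool {
    pattern.contains(&b'\\')
}
```

Note that the guard looks only at the pattern bytes. It has no knowledge of which escape character, if any, the query actually asked for.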

Open questions

  • Should LikeOptions gain escape_char: Option<char> now (with None meaning the conventional default \, as in PostgreSQL and MySQL; standard SQL itself defines no default escape character), or wait until DataFusion exposes the field on the physical LikeExpr?
  • Once added, the FSST fast-path bailout in dfa/mod.rs should be generalised to check options.escape_char (defaulting to Some('\\')) rather than scanning pattern bytes for \.
  • Proto schema (LikeOpts in vortex-proto/proto/expr.proto) will need a new optional field and a migration note.
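One possible shape for the first two points, sketched under assumptions: the struct below is a hypothetical stand-in for LikeOptions with the proposed field added, and `effective_escape` and `fast_path_must_bail` are illustrative names that do not exist in Vortex today. `None` is treated as the backslash default, as suggested above.

```rust
// Hypothetical stand-in for LikeOptions with the proposed field.
#[derive(Debug, Default, Clone, Copy, PartialEq, Eq)]
pub struct LikeOptions {
    pub negated: bool,
    pub case_insensitive: bool,
    /// Proposed field; `None` is assumed to mean the backslash default.
    pub escape_char: Option<char>,
}

impl LikeOptions {
    /// Resolves the escape character, falling back to backslash.
    pub fn effective_escape(&self) -> char {
        self.escape_char.unwrap_or('\\')
    }
}

/// Hypothetical generalised bailout: skip the fast path only when the pattern
/// contains the escape character that is actually in effect.
pub fn fast_path_must_bail(pattern: &str, options: &LikeOptions) -> bool {
    pattern.contains(options.effective_escape())
}
```

Under this sketch, a pattern such as `a\b` with `escape_char: Some('!')` would no longer force a fallback, because the backslash is an ordinary character there, whereas `100!%` would.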

🤖 Generated with Claude Code

Linear: https://linear.app/abanoubdoss/issue/ABA-32

…usion escape silently dropped

Add two #[ignore]'d regression tests that demonstrate the gap described in
https://linear.app/abanoubdoss/issue/ABA-32:

- vortex-array: `issue_aba32_like_options_supports_escape_char` — asserts
  `LikeOptions` exposes an `escape_char` field; fails on develop because the
  struct only has `negated` and `case_insensitive`.

- vortex-datafusion: `issue_aba32_datafusion_conversion_preserves_like_escape_char`
  — builds a DataFusion `LikeExpr`, converts it through the Vortex path, and
  asserts the resulting `LikeOptions` carries escape information; fails on develop
  for the same reason.

Both tests are REPRO-ONLY. The fix requires adding `escape_char: Option<char>` to
`LikeOptions`, extending the proto schema, and threading the field through the
kernel dispatch (including the FSST fast-path bailout already added by PR vortex-data#8038,
which inspects the pattern string for `\` rather than a `LikeOptions` field).

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: abnobdoss@proton.me <abnobdoss@proton.me>
Signed-off-by: Abanoub Doss <abanoub.doss@gmail.com>