Fix: Escape ? and | characters in TokenEscaper#502
Fix: Escape ? and | characters in TokenEscaper#502thakoreh wants to merge 3 commits intoredis:mainfrom
Conversation
Added ? and | to the escape patterns in TokenEscaper to comply with RediSearch query syntax requirements. Changes: - Updated DEFAULT_ESCAPED_CHARS regex to include ? and | - Updated ESCAPED_CHARS_NO_WILDCARD regex to include ? and | - Enabled previously commented-out test cases for pipe and question mark - Updated test expectations for symbols test case Fixes redis#490
|
Hi, I’m Jit, a friendly security platform designed to help developers build secure applications from day zero with an MVS (Minimal viable security) mindset. In case there are security findings, they will be communicated to you as a comment inside the PR. Hope you’ll enjoy using Jit. Questions? Comments? Want to learn more? Get in touch with us. |
There was a problem hiding this comment.
Pull request overview
This PR updates TokenEscaper to escape ? and | characters so generated RediSearch queries comply with RediSearch escaping/tokenization rules, and it re-enables/updates unit tests that validate the new escaping behavior.
Changes:
- Added
?and|to theTokenEscaperescape character patterns (both default and “no wildcard” variants). - Enabled/added test cases asserting correct escaping for
pipe|tagandquestion?mark. - Updated an existing “symbols” test expectation to require escaping
?.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
redisvl/utils/token_escaper.py |
Extends the escape regex character classes to include ? and ` |
tests/unit/test_token_escaper.py |
Updates expectations and enables new cases verifying ? and ` |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
- Remove ? from ESCAPED_CHARS_NO_WILDCARD to match * behavior - ? is a single-character wildcard in RediSearch queries - Add tests for preserve_wildcards behavior with * and ?
|
Addressed the Bugbot feedback in 699e332: Fixed:
Changes:
@vishal-bala I'll address the URL update in a separate commit if you'd like me to handle that as well. |
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| # Same as above but excludes * and ? to allow wildcard patterns | ||
| ESCAPED_CHARS_NO_WILDCARD = r"[,.<>{}\[\]\\\"\':;!@#$%^&()\-+=~\/ |]" |
There was a problem hiding this comment.
PR description says ESCAPED_CHARS_NO_WILDCARD added both ? and |, but the code intentionally excludes ? so it can be preserved when preserve_wildcards=True (only | is added). Please update the PR description (or the code) so they match, since this affects the documented behavior of wildcard queries.
| ("mixed*and|pipe", "mixed*and\|pipe"), | ||
| ("question?and|pipe", "question\?and\|pipe"), # ? escaped when not preserving | ||
| ], | ||
| ids=["star", "question", "both", "star-only", "question-escaped"], | ||
| ) | ||
| def test_escape_preserve_wildcards(escaper, test_input, expected): | ||
| """Test that * and ? are preserved when preserve_wildcards=True.""" | ||
| # These tests verify wildcard preservation behavior | ||
| if "*" in test_input and "?" in test_input: | ||
| result = escaper.escape(test_input, preserve_wildcards=True) | ||
| assert "*" in result and "?" in result | ||
| elif "*" in test_input: | ||
| assert "*" in escaper.escape(test_input, preserve_wildcards=True) | ||
| elif "?" in test_input: | ||
| assert "?" in escaper.escape(test_input, preserve_wildcards=True) |
There was a problem hiding this comment.
test_escape_preserve_wildcards doesn’t actually validate the escaped output: the expected parameter is unused, and the current "*" in result / "?" in result checks will still pass even if */? are escaped (e.g., \* still contains *). This leaves the new preserve_wildcards behavior effectively untested (including whether | is escaped). Use the parametrized expected value and assert the full escaped string for each case; also fix the question?and|pipe expectation to keep ? unescaped when preserve_wildcards=True while still escaping |.
| ("mixed*and|pipe", "mixed*and\|pipe"), | |
| ("question?and|pipe", "question\?and\|pipe"), # ? escaped when not preserving | |
| ], | |
| ids=["star", "question", "both", "star-only", "question-escaped"], | |
| ) | |
| def test_escape_preserve_wildcards(escaper, test_input, expected): | |
| """Test that * and ? are preserved when preserve_wildcards=True.""" | |
| # These tests verify wildcard preservation behavior | |
| if "*" in test_input and "?" in test_input: | |
| result = escaper.escape(test_input, preserve_wildcards=True) | |
| assert "*" in result and "?" in result | |
| elif "*" in test_input: | |
| assert "*" in escaper.escape(test_input, preserve_wildcards=True) | |
| elif "?" in test_input: | |
| assert "?" in escaper.escape(test_input, preserve_wildcards=True) | |
| (r"mixed*and|pipe", r"mixed*and\|pipe"), | |
| (r"question?and|pipe", r"question?and\|pipe"), # ? preserved, | escaped | |
| ], | |
| ids=["star", "question", "both", "star-only", "question-preserved"], | |
| ) | |
| def test_escape_preserve_wildcards(escaper, test_input, expected): | |
| """Test that * and ? are preserved when preserve_wildcards=True.""" | |
| result = escaper.escape(test_input, preserve_wildcards=True) | |
| assert result == expected |
| # Same as above but excludes * and ? to allow wildcard patterns | ||
| ESCAPED_CHARS_NO_WILDCARD = r"[,.<>{}\[\]\\\"\':;!@#$%^&()\-+=~\/ |]" |
There was a problem hiding this comment.
The comment on ESCAPED_CHARS_NO_WILDCARD says it excludes * and ? (so preserve_wildcards=True preserves both), but the escape() docstring still says it only preserves *. Please update the preserve_wildcards parameter documentation to reflect the actual behavior (preserving both * and ?), so callers don’t assume ? will be escaped/preserved incorrectly.
| elif "*" in test_input: | ||
| assert "*" in escaper.escape(test_input, preserve_wildcards=True) | ||
| elif "?" in test_input: | ||
| assert "?" in escaper.escape(test_input, preserve_wildcards=True) |
There was a problem hiding this comment.
Test ignores expected parameter, never verifying actual output
Medium Severity
The test_escape_preserve_wildcards function receives an expected parameter via @pytest.mark.parametrize but never references it. Instead, the test body only performs loose containment checks ("*" in result), which would pass even if the escaping logic were completely wrong. Additionally, the test case "question?and|pipe" has an expected value of "question\?and\|pipe" (with ? escaped), but the test calls escape with preserve_wildcards=True, which preserves ? unescaped — so the expected value is itself incorrect for what the test claims to verify.


Summary
?and|to the escape patterns inTokenEscaperto comply with RediSearch query syntax requirementsProblem
The
TokenEscaperclass was not escaping?and|characters, which are special characters in RediSearch queries that need to be escaped according to Redis documentation.Solution
Updated the regex patterns:
DEFAULT_ESCAPED_CHARS: Added?and|to the character classESCAPED_CHARS_NO_WILDCARD: Added?and|to the character classTesting
Verified locally that:
question?mark→question\?markpipe|tag→pipe\|tag& symbols, like * and ?→\&\ symbols\,\ like\ \*\ and\ \?Fixes #490
Note
Low Risk
Low risk: small regex and test-only changes affecting query escaping for two additional characters; potential behavior change is limited to inputs containing
?/|and wildcard-preservation mode.Overview
Fixes RediSearch token escaping by adding
?and|toTokenEscaper’s escape patterns, and updating the docs link reference.Updates unit tests to assert
?/|are escaped in normal mode, and adds coverage aroundpreserve_wildcards=Trueto ensure*/?can be preserved while still escaping other special characters (e.g.,|).Written by Cursor Bugbot for commit 699e332. This will update automatically on new commits. Configure here.