Fix ManageScript delimiter checking for C# string variants#745
Conversation
…iants
The ManageScript tool had 7 copy-pasted inline lexers that did not handle
verbatim strings (@""), interpolated strings ($""), or raw string literals,
causing false unbalanced braces errors on valid C# code like $"Score: {score}".
Extract a shared CSharpLexer struct that correctly handles all C# string/comment
types, rewrite all callers to use it, and remove redundant CheckScopedBalance
and ValidateBasicStructure methods. Fix IndexOfClassToken to skip matches inside
strings and comments. Fix anchor_insert to use depth-based best-match selection
instead of picking the first regex match (which selected method-level braces
over class-level ones).
On the Python side, add _is_in_string_context and _brace_depth_at_positions
helpers, and rewrite _find_best_closing_brace_match with depth-based scoring
that filters out matches inside strings/comments.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…arp_tokens generator _is_in_string_context and _brace_depth_at_positions duplicated ~150 lines of the same C# string/comment lexer. Extract a shared generator and rewrite both as thin wrappers (-59 net lines, same behavior). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
1. Add """...""" and $"""...""" (C# 11 interpolated raw string) support to both C# CSharpLexer and Python _iter_csharp_tokens. Dollar count determines interpolation threshold (N dollars = N consecutive braces to open an interpolation hole). 2. Make anchor_delete and anchor_replace use FindBestAnchorMatch (same as anchor_insert) instead of rx.Match for consistent target selection when multiple anchors match. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Reviewer's GuideRefactors C# and Python script editing logic to use a shared C# lexer/tokenizer that correctly skips all C# string/comment forms when checking delimiters and anchors, replaces heuristic brace matching with brace-depth analysis, and aligns anchor-based edit operations and validation on the new, more accurate infrastructure with accompanying tests. Class diagram for CSharpLexer-based delimiter and anchor utilitiesclassDiagram
class ManageScript {
+ApplyTextEdits(original, spans, validateMode) object
+CheckBalancedDelimiters(text, line, expected) bool
+TryComputeClassSpanBalanced(source, className, start, length, why) bool
+TryComputeMethodSpan(source, methodName, start, length, why) bool
+TryFindClassInsertionPoint(source, classStart, insertAt, why) bool
+FindBestAnchorMatch(matches, text, pattern) Match
+IndexOfClassToken(s, className) int
+ValidateScriptSyntax(contents, level, errors) bool
}
class CSharpLexer {
-string _text
-int _pos
-int _end
-int _line
-bool _inSingleComment
-bool _inMultiComment
+bool InNonCode
+int Position
+int Line
+CSharpLexer(text, start, end)
+Advance(c) bool
-SkipInterpolatedStringBody(isVerbatim) void
-SkipInterpolatedRawStringBody(dollarCount, quoteCount) void
}
ManageScript *-- CSharpLexer : uses
class ScriptApplyEditsPy {
+_iter_csharp_tokens(text) generator
+_is_in_string_context(text, position) bool
+_brace_depth_at_positions(text, positions) dict
+_find_best_anchor_match(pattern, text, flags, prefer_last)
+_find_best_closing_brace_match(matches, text)
+_apply_edits_locally(original_text, edits) str
}
ScriptApplyEditsPy ..> ManageScript : mirrored_lexing_logic
File-Level Changes
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
📝 WalkthroughWalkthroughRefactors delimiter and anchor handling across C# editor tooling and Python backend: adds token-aware lexers/tokenizers, brace-depth-aware anchor resolution, unconditional balance validation around edits, reworks edit flows and error handling, and adds comprehensive tests for string/brace edge cases. Changes
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related PRs
Suggested labels
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 2❌ Failed checks (2 warnings)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
🧪 Generate unit tests (beta)
⚔️ Resolve merge conflicts (beta)
No actionable comments were generated in the recent review. 🎉 🧹 Recent nitpick comments
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Hey - I've left some high level feedback:
- IndexOfClassToken currently instantiates a new CSharpLexer and scans from the start of the file up to each candidate index on every loop iteration; if this is used on larger files, consider refactoring to either precompute non-code ranges once or reuse a single lexer pass to avoid the O(N^2) worst-case behavior.
- The heuristic in FindBestAnchorMatch for detecting "closing brace" patterns (
pattern.Contains("}") && (pattern.Contains("$") || pattern.EndsWith(@"\s*"))) is fairly brittle; it may be worth either making the caller explicitly indicate when a pattern is intended to target closing braces, or tightening this detection to avoid misclassifying unrelated patterns that happen to match those string conditions. - In the Python path, _find_best_closing_brace_match calls _is_in_string_context for each regex match, and _is_in_string_context re-iterates from the start of the file each time; if you expect many candidate matches in large files, consider caching tokenization results or computing brace depths and string/comment ranges in a single pass to reduce repeated work.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- IndexOfClassToken currently instantiates a new CSharpLexer and scans from the start of the file up to each candidate index on every loop iteration; if this is used on larger files, consider refactoring to either precompute non-code ranges once or reuse a single lexer pass to avoid the O(N^2) worst-case behavior.
- The heuristic in FindBestAnchorMatch for detecting "closing brace" patterns (`pattern.Contains("}") && (pattern.Contains("$") || pattern.EndsWith(@"\s*"))`) is fairly brittle; it may be worth either making the caller explicitly indicate when a pattern is intended to target closing braces, or tightening this detection to avoid misclassifying unrelated patterns that happen to match those string conditions.
- In the Python path, _find_best_closing_brace_match calls _is_in_string_context for each regex match, and _is_in_string_context re-iterates from the start of the file each time; if you expect many candidate matches in large files, consider caching tokenization results or computing brace depths and string/comment ranges in a single pass to reduce repeated work.Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@MCPForUnity/Editor/Tools/ManageScript.cs`:
- Line 1052: The comment points out that when skipping C-style /* ... */
comments the code unconditionally does `_pos += 2` after the loop which can
overshoot if the closing `*/` was not found; update the logic in the
comment-skipping sites (the inline comment branch and the analogous place inside
SkipInterpolatedRawStringBody) to only advance by 2 when `_pos + 1 < _end` and
`_text[_pos] == '*' && _text[_pos + 1] == '/'` (i.e., detect the actual
terminator) — otherwise do not increment past the end and just exit so Position
remains within range. Ensure both occurrences use the same guarded check.
🧹 Nitpick comments (5)
Server/src/services/tools/script_apply_edits.py (2)
367-374:_is_in_string_contextperforms a full-text scan from position 0 each invocation.In
_find_best_closing_brace_match, this is called once per candidate match (line 543), and then_brace_depth_at_positionsdoes yet another full pass. For k matches in a file of length n, that's O(k·n + n) total tokenizer work. This is acceptable when k is small (typically single-digit), but for adversarial or very long files it could be noticeable.A single-pass approach (scanning once and collecting both "is-code" status at match starts and brace depths) would reduce this to O(n), but is only worth doing if profiling shows a bottleneck.
523-571:_find_best_closing_brace_match— string-context filter checksm.start(), not the}position.Line 543 filters matches where
m.start()is inside a string/comment. However, for a regex like^\s*}\s*$,m.start()is the beginning of the line (whitespace), while the}is later in the match. If a line starts in code but a}appears inside a string on that same line, the filter could miss it.In practice this is very unlikely because
^\s*}\s*$requires the line to contain only whitespace and a}, so quote characters can't be present. Still, a more robust approach would filter based on the actual}position.As a secondary defence, unrecognized
}positions fall back tofloat('inf')depth (line 565), so they're naturally de-prioritised — the impact is limited.Suggested: filter on the actual brace position instead of match start
- # Filter out matches inside strings/comments - matches = [m for m in matches if not _is_in_string_context(text, m.start())] - if not matches: - return None - # Find the position of the '}' character within each match brace_positions: dict[int, object] = {} # brace_pos → match for m in matches: - # The match may contain whitespace around '}'; find the actual '}' for offset in range(m.start(), m.end()): if offset < len(text) and text[offset] == '}': - brace_positions[offset] = m + if not _is_in_string_context(text, offset): + brace_positions[offset] = m break if not brace_positions: - return matches[-1] # fallback + return NoneServer/tests/integration/test_script_apply_edits_local.py (1)
213-258: Async test setup is correct, but consider adding a negative test for_apply_edits_locally.The current tests verify happy-path behavior. A test that expects
RuntimeErrorfor an unknownopor out-of-boundsreplace_rangewould strengthen regression coverage for error paths.TestProjects/UnityMCPTests/Assets/Tests/EditMode/Tools/ManageScriptDelimiterTests.cs (2)
171-201:IndexOfClassTokentests cover the key scenarios — consider adding an edge-case test.The existing tests verify normal code, comment skipping, string skipping, and fallthrough to later matches. One missing scenario worth adding: a class token that appears immediately after a closing string quote (e.g.,
string s = "text";class Foo { }), which exercises the boundary detection discussed in theIndexOfClassTokenreview forManageScript.cs.
205-228: Reflection helpers duplicate those inManageScriptValidationTests.cs.
CallCheckBalancedDelimitersandCallIndexOfClassTokenappear in both test files. Consider extracting them into a shared test utility class to avoid drift if the private method signatures change.Suggested shared helper class
// In a shared file like TestHelpers/ManageScriptReflectionHelpers.cs internal static class ManageScriptReflectionHelpers { public static bool CallCheckBalancedDelimiters(string text, out int line, out char expected) { line = 0; expected = '\0'; var method = typeof(ManageScript).GetMethod("CheckBalancedDelimiters", BindingFlags.NonPublic | BindingFlags.Static); var parameters = new object[] { text, 0, '\0' }; var result = (bool)method.Invoke(null, parameters); line = (int)parameters[1]; expected = (char)parameters[2]; return result; } public static int CallIndexOfClassToken(string source, string className) { var method = typeof(ManageScript).GetMethod("IndexOfClassToken", BindingFlags.NonPublic | BindingFlags.Static); return (int)method.Invoke(null, new object[] { source, className }); } }
1. Guard _pos += 2 after /* comment scanning in both SkipInterpolatedStringBody and SkipInterpolatedRawStringBody — only advance past */ when the terminator was actually found, preventing overshoot past _end on unterminated comments. 2. Filter _find_best_closing_brace_match on the actual } position rather than m.start() (which points at leading whitespace for patterns like ^\s*}\s*$). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Summary
CSharpLexerstruct in C# and_iter_csharp_tokensgenerator in Python, replacing 7 duplicated inline lexer copies that all had the same bugs@""), interpolated ($""), verbatim-interpolated ($@""/@$""), raw ("""), and interpolated raw ($""",$$""") string literals — braces/delimiters inside these are no longer miscountedIndexOfClassTokento skip matches inside strings and commentsCheckScopedBalanceandValidateBasicStructure— redundant with the now-correctCheckBalancedDelimitersanchor_deleteandanchor_replaceuseFindBestAnchorMatch— consistent withanchor_insertinstead of using first-match_find_best_closing_brace_matchin Python to use brace-depth analysis instead of indentation heuristicsTest plan
test_script_apply_edits_local.py) — all passuv run pytest tests/ -v)ManageScriptDelimiterTests.cs) forCheckBalancedDelimitersandIndexOfClassToken$"",@"",$@"",$""",$$"""patterns — zero false positive validation errorsinsert_method,replace_method,anchor_insert,anchor_replace,anchor_delete) all work correctly on scripts containing tricky string patternsTestProjects/UnityMCPTests🤖 Generated with Claude Code
Summary by Sourcery
Improve C# script editing robustness by using a shared lexer for delimiter-aware parsing, tightening delimiter validation and class detection, and making anchor-based edits brace-depth aware across C# and Python tooling.
New Features:
Bug Fixes:
Enhancements:
Tests:
Summary by CodeRabbit
Bug Fixes
Tests