Open
Conversation
…rsing Key optimizations: - Rewrite block string parser: direct string processing instead of sub-lexer + Vec<Vec<Token>> (~10%) - Compact Span: u32 start+len (8 bytes) instead of Range<usize> (16 bytes), add Copy (~3%) - Field: consume-then-check for alias instead of peek(1) (~2%) - Optimize next_if_* methods: peek+consume in single buffer operation (~1%) - Lazy depth_limiter.bump(): only bump when optional elements exist - Add Copy to DepthLimiter + preallocate Vec capacity in DefinitionDocument Also adds benchmarks for ExecutableDocument parsing with a large fixture, and updates downstream crates to use Copy semantics on Span (clone → deref). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
adampetro
reviewed
Mar 20, 2026
Comment on lines
+35
to
+42
| let arguments = if VariableArguments::is_match(tokens) { | ||
| Some(VariableArguments::from_tokens( | ||
| tokens, | ||
| depth_limiter.bump()?, | ||
| )?) | ||
| } else { | ||
| None | ||
| }; |
Contributor
There was a problem hiding this comment.
Should we just change the return type of TryFromTokens::try_from_tokens to return the transposed type and have this be the implementation? I think we call transpose almost every time we call this
Comment on lines
+41
to
+42
| // Check if there are any escaped block quotes | ||
| let has_escapes = raw.contains("\\\"\"\""); |
Contributor
There was a problem hiding this comment.
I wonder if it would be more efficient to set a bool in our scan above?
Comment on lines
+49
to
68
| // Count lines first for pre-allocation | ||
| let mut line_count = 1usize; | ||
| { | ||
| let mut j = 0; | ||
| while j < raw_len { | ||
| if raw_bytes[j] == b'\r' { | ||
| line_count += 1; | ||
| if j + 1 < raw_len && raw_bytes[j + 1] == b'\n' { | ||
| j += 2; | ||
| } else { | ||
| j += 1; | ||
| } | ||
| } else if raw_bytes[j] == b'\n' { | ||
| line_count += 1; | ||
| j += 1; | ||
| } else { | ||
| j += 1; | ||
| } | ||
| Self::Newline => lines.push(Vec::new()), | ||
| } | ||
| } |
Contributor
There was a problem hiding this comment.
should we pull this out into a helper?
| .position(|token| !matches!(token, Self::Whitespace(_))) | ||
| .filter_map(|&(start, end)| { | ||
| let line = &raw[start..end]; | ||
| let indent = line.len() - line.trim_start_matches([' ', '\t']).len(); |
Contributor
There was a problem hiding this comment.
I wonder if it would be faster to do something like
line.as_bytes().iter().position(|b| b != b' ' && b != b'\t')| if let Some((front_offset, end_offset)) = front_offset.zip(end_offset) { | ||
| let start = front_offset; | ||
| let end = lines.len() - end_offset; | ||
| if !has_escapes && first + 1 == last && first == 0 { |
Contributor
There was a problem hiding this comment.
Suggested change
| if !has_escapes && first + 1 == last && first == 0 { | |
| if !has_escapes && first == 0 && last == 1 { |
Comment on lines
+7
to
+8
| start: u32, | ||
| len: u32, |
Contributor
There was a problem hiding this comment.
Why u32 instead of usize?
adampetro
reviewed
Mar 20, 2026
|
|
||
| /// A depth limiter is used to limit the depth of the AST. This is useful to prevent stack overflows. | ||
| /// This intentionally does not implement `Clone` or `Copy` to prevent passing this down the call stack without bumping. | ||
| #[derive(Clone, Copy)] |
Contributor
There was a problem hiding this comment.
actually not implementing these was an explicit design decision, see the comment above
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
~16% faster schema parsing, ~12% faster executable parsing
Key optimizations:
Also adds benchmarks for ExecutableDocument parsing with a large fixture, and updates downstream crates to use Copy semantics on Span (clone → deref).
Note: this is a manually curated (with the help of Claude) and cleaned up version of an
/autoresearchrun