feat(core): Go callee extraction + Rust language support #32
Merged
Step 0 of Phase 5: validate tree-sitter-rust grammar node names before building the scanner. All 14 tests confirm expected node types: function_item, struct_item, enum_item, trait_item, impl_item (with type field for both inherent and trait impls), use_declaration, call_expression (for both bare and method calls), macro_invocation, visibility_modifier, line_comment for doc comments, and generic impl blocks.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
RustScanner extracts functions, structs, enums, traits, impl methods, imports, callees, and doc comments from Rust source files.
- impl_item.type field for method naming (works for both impl Type and impl Trait for Type)
- Generic type param stripping: Container<T>.show → Container.show
- Callee extraction via recursive AST walk (call_expression only, macro_invocation intentionally excluded)
- Doc comments via /// prefix (line_comment nodes)
- Visibility: pub, pub(crate), pub(super) → exported: true
- Async detection via text inspection before fn keyword
- Generated file skipping: target/ directory
- Malformed file resilience: returns empty, no crash
37 tests: 14 grammar validation + 23 scanner tests covering both fixtures (simple + complex), impl patterns, generics, closures, macros, malformed files, and generated file detection.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Go pattern rules: if err != nil, defer, goroutine, channel send
- Rust pattern rules: try operator, match expression, unsafe block, impl block, trait definition
- Add .go and .rs to EXTENSION_TO_LANGUAGE (fixes: Go patterns never fired)
- Add 'rust' to supportedLanguages in wasm-matcher
- Add go/rust to QUERIES_BY_LANGUAGE map in pattern-analysis-service
- Add 'rust' to SUPPORTED_LANGUAGES in copy-wasm.js
- Add Rust test file detection (tests/ dir, _test.rs) to test-utils
- Fix tests that used .rs as unsupported extension example
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
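The wiring fix above can be pictured with a small lookup sketch. The names EXTENSION_TO_LANGUAGE and resolveLanguage come from this PR, but the table contents and the function body below are assumptions made for illustration, not the project's actual code. The point is that an extension missing from the table resolves to undefined, so any language-keyed pattern rules are silently skipped.

```typescript
// Illustrative sketch only: the entries and the lookup logic are assumed.
const EXTENSION_TO_LANGUAGE: Record<string, string> = {
  ".ts": "typescript", // assumed pre-existing entry
  ".go": "go", // per the commit above, this entry was missing before
  ".rs": "rust",
};

// Accepts either a bare extension (".go") or a file path ("src/main.rs").
function resolveLanguage(fileOrExt: string): string | undefined {
  const dot = fileOrExt.lastIndexOf(".");
  if (dot === -1) return undefined; // no extension at all
  return EXTENSION_TO_LANGUAGE[fileOrExt.slice(dot).toLowerCase()];
}
```

A caller that receives undefined has no language to key pattern queries on, which matches the symptom described in the commit: Go rules existed but never fired.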
Add walkCallNodes to GoScanner for function and method callee extraction.
Uses full selector text ("fmt.Println" not "Println") matching TS scanner.
6 new tests: callee extraction, full selector names, methods, line numbers,
deduplication, no callees on structs.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
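As a rough illustration of the callee walk described in this commit, here is a minimal sketch over a hand-rolled node shape. The SyntaxNode interface is a stand-in for a tree-sitter node (functionNode plays the role of the call's function field), and deduplicating on name plus line is an assumption; the real GoScanner may key duplicates differently.

```typescript
// Stand-in for a tree-sitter node; only the fields this sketch needs.
interface SyntaxNode {
  type: string;
  text: string;
  startRow: number; // 0-based row, as tree-sitter reports positions
  children: SyntaxNode[];
  functionNode?: SyntaxNode; // hypothetical stand-in for the call's function field
}

interface Callee {
  name: string;
  line: number; // 1-based
}

// Recursively collect callees, keeping full selector text ("fmt.Println").
function walkCallNodes(
  node: SyntaxNode,
  out: Callee[] = [],
  seen: Record<string, boolean> = {},
): Callee[] {
  if (node.type === "call_expression" && node.functionNode) {
    const name = node.functionNode.text;
    const line = node.startRow + 1;
    const key = name + ":" + line; // assumed dedup key: name plus line
    if (!seen[key]) {
      seen[key] = true;
      out.push({ name, line });
    }
  }
  // Descend into every child so nested calls such as f(g()) are found too.
  for (const child of node.children) walkCallNodes(child, out, seen);
  return out;
}
```

Struct declarations contain no call_expression nodes, so a walk over them returns an empty list, consistent with the "no callees on structs" test mentioned above.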
Add Rust to CLAUDE.md scanner description, website Multi-Language feature list, release notes (v0.12.0), and changeset.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Actually skip macro_invocation in Rust walkCallNodes (prevents capturing calls inside macros like vec![foo()])
- Fix orphaned JSDoc comment in go.ts (was between walkCallNodes and isExported after insertion)
- Add comment explaining attribute skip in doc comment extraction
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
W1: Remove source_file anchoring from functions query so functions
inside mod blocks are captured. Filter impl methods by checking
parent chain (declaration_list > impl_item), not just declaration_list.
W3: Fix greedy generic stripping — use split('<')[0] instead of
regex replace. Handles nested generics like Wrapper<Option<T>>.
2 new tests: functions inside mod blocks (pub + private), nested
generic type param stripping.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
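The W3 fix can be shown in a few lines. This is a minimal sketch of the split('<')[0] approach named in the commit; the helper names stripGenerics and methodName are hypothetical and chosen only for this example.

```typescript
// Keep everything before the first '<', so nesting depth no longer matters.
function stripGenerics(typeName: string): string {
  return typeName.split("<")[0].trim();
}

// Hypothetical helper: builds the "Type.method" name used for impl methods.
function methodName(implType: string, method: string): string {
  return stripGenerics(implType) + "." + method;
}
```

Because the cut happens at the first opening bracket, `Container<T>`, `HashMap<String, Vec<u8>>`, and `Wrapper<Option<T>>` all reduce to their bare type names without any bracket matching.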
Track two limitations found during manual verification:
- Antfly Linear Merge fails at ~6k docs (blocks large repo indexing)
- Rust/Go callees don't resolve target files (no cross-file graph edges)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Antfly's merge endpoint fails on large JSON payloads (~6k+ docs). Split documents into chunks of 3,000 before sending.
- Extract chunk() utility as a pure function in utils/chunking.ts
- AntflyVectorStore.linearMerge splits sorted docs into chunks, runs linearMergeChunk per batch, accumulates results
- Progress callbacks report across all chunks
- 10 new tests for chunk() (even/uneven splits, edge cases, large arrays)
- Update scratchpad with Antfly batch limit and callee file resolution
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
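For reference, a pure chunking helper of the kind described here might look like the sketch below. The signature and the choice to throw on a non-positive or fractional size are assumptions; the actual utils/chunking.ts may handle edge cases differently.

```typescript
// Split an array into consecutive slices of at most `size` items.
// The input is not mutated; the last slice may be shorter than `size`.
function chunk<T>(items: readonly T[], size: number): T[][] {
  if (size <= 0 || Math.floor(size) !== size) {
    // Assumed behavior: reject sizes that could never make progress.
    throw new RangeError("chunk size must be a positive integer");
  }
  const out: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    out.push(items.slice(i, i + size));
  }
  return out;
}
```

With a size of 3,000, an input of 7,000 documents would yield slices of 3,000, 3,000, and 1,000 items, and an empty input yields no slices.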
Chunking Linear Merge causes each chunk to delete the previous chunk's records (server thinks each subset is the full dataset). Reverted to single-call approach. The Antfly payload size limit (~6k docs) is an Antfly-side issue that needs a fix in the server (raise JSON body limit or support streaming). Tracked in scratchpad. chunk() utility kept; useful elsewhere.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
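The failure mode behind this revert can be reproduced with a toy model. The class below is not Antfly's API; it is an in-memory stand-in that assumes only the semantics stated in the commit message, namely that each merge call treats its payload as the complete dataset and drops everything else.

```typescript
interface MergeDoc {
  id: string;
  body: string;
}

// Toy model of "full-dataset" merge semantics (assumed, not Antfly's code).
class LinearMergeModel {
  private records: Record<string, string> = {};

  // Every call replaces the stored set: ids absent from `docs` are deleted.
  linearMerge(docs: MergeDoc[]): void {
    const next: Record<string, string> = {};
    for (const d of docs) next[d.id] = d.body;
    this.records = next;
  }

  ids(): string[] {
    return Object.keys(this.records).sort();
  }
}

const allDocs: MergeDoc[] = ["a", "b", "c", "d"].map((id) => ({ id, body: "doc " + id }));

// Single call: all four records survive.
const single = new LinearMergeModel();
single.linearMerge(allDocs);

// Two chunked calls: the second call deletes what the first one wrote.
const chunked = new LinearMergeModel();
chunked.linearMerge(allDocs.slice(0, 2));
chunked.linearMerge(allDocs.slice(2));
```

Under these semantics only the final chunk remains after a chunked run, which is why client-side batching alone cannot work around the payload limit.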
Summary
- Go callee extraction: `walkCallNodes` for functions and methods; `dev_refs` now traces Go call chains
- Go pattern rules: `if err != nil`, goroutines, defer, channels
- Wiring: `.go`/`.rs` in EXTENSION_TO_LANGUAGE, `rust` in supportedLanguages, copy-wasm, test-utils
- Bug fix: `.go` was missing from EXTENSION_TO_LANGUAGE, so Go patterns never fired

Rust scanner highlights
- `impl_item.type` field for method naming (works for both `impl Type` and `impl Trait for Type`)
- Functions inside `mod` blocks captured (query matches at any depth, parent-chain filter excludes only impl methods)
- Generic type param stripping via `split('<')[0]`: handles `Container<T>`, `HashMap<String, Vec<u8>>`, `Wrapper<Option<T>>`
- Calls inside macros skipped (`macro_invocation` early return in `walkCallNodes`)
- Doc comments via `///` prefix; attributes (`#[derive]`) skipped during backward walk

Code review history
- `mod` block support missing (CRITICAL) → fixed (removed `source_file` anchoring, added parent-chain filter).
- Greedy generic stripping → fixed (`split('<')[0]`).

Both fixes have dedicated tests.

Test plan
Automated (1758 tests, all passing)
- `resolveLanguage('.go')` and `resolveLanguage('.rs')`

Manual verification: Rust (BurntSushi/ripgrep)
Cloned with `--depth 1`, ran the local build against it.
- `dev index`
- `dev map --depth 2`: `crates/searcher/`, `crates/printer/`, `crates/core/`, etc.
- `dev refs "Searcher"`: `Searcher.new` at `crates/searcher/src/searcher/mod.rs:632`. Callees: `SearcherBuilder::new().build`, `SearcherBuilder::new`.
- `dev search "grep pattern matching"`: `GlobStrategic.is_match`, `grep-regex` README, `Glob.compile_matcher`. Semantic search works on Rust code.

Note: Hot paths show 0 refs because tree-sitter callees don't resolve target files (no `file` field). This is a known limitation: the dependency graph only has edges when callees include file paths. Cross-file resolution for tree-sitter languages is tracked as future work.

Manual verification: Go (cli/cli)
Cloned with `--depth 1`, ran the local build against it.
- `dev index`: fails with `decoding request: json: string unexpected end of JSON input`. Scanner works, Antfly batch size is the bottleneck.
- `NewCmdRoot` callees: `f.Config`, `fmt.Errorf`, `heredoc.Doc`, `versionCmd.Format`, `cmdutil.IsAuthCheckEnabled`. Full selector text preserved.

Note: Antfly Linear Merge fails at ~6k docs. Tracked in scratchpad. Fix options: batch into chunks, raise Antfly limit, or stream. This blocks full indexing of medium-large repos but does not affect the scanner itself.
Known limitations (documented in scratchpad)
- `//!` inner doc comments and `/** */` block doc comments not extracted (v1)
- `#[cfg(test)]` inline test modules not detected as test files

🤖 Generated with Claude Code