
chore: correctly parse and differentiate between indirect & direct dependencies in go.mod files #416

Open
Strum355 wants to merge 1 commit into guacsec:main from Strum355:nsc/gomod-indirect

Conversation

@Strum355
Member

Description

This PR addresses a number of issues in the handling of go.mod manifests:

  1. The exhortignore marker format caused indirect (aka transitive) dependencies in go.mod files to be considered direct dependencies by the Go tooling. Specifically, // indirect exhortignore should be // indirect; exhortignore instead. The README, test fixtures and code have been updated for the correct formatting.
  2. The code previously assumed that everything mentioned in go.mod files was a direct dependency, and as such included everything listed in the go.mod file as a direct dependency in the SBOM. This is incorrect and has been updated in the code.
  3. The code previously did basic regex-based parsing of the go.mod file. This kind of "parsing" is brittle and difficult to debug, so it has been replaced with a proper tree-sitter-based LR parser, as is already done for requirements.txt.
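
A minimal sketch of why the semicolon in point 1 matters, assuming the Go tooling's rule is the one in golang.org/x/mod/modfile as I understand it: a trailing comment counts as indirect only if it is exactly `indirect`, or if its first whitespace-separated field is `indirect;`. The function name `isIndirectComment` is hypothetical and not part of this PR.

```javascript
// Hypothetical helper mirroring (as an assumption) how Go tooling reads the
// trailing comment of a require line. Not code from this PR.
function isIndirectComment(comment) {
	// Drop the leading "//" and split on whitespace, ignoring empty fields.
	const fields = comment.replace(/^\/\//, '').trim().split(/\s+/).filter(Boolean);
	return (fields.length === 1 && fields[0] === 'indirect')
		|| (fields.length > 1 && fields[0] === 'indirect;');
}

console.log(isIndirectComment('// indirect'));               // plain marker
console.log(isIndirectComment('// indirect exhortignore'));  // old format
console.log(isIndirectComment('// indirect; exhortignore')); // corrected format
```

Under that assumption, the old `// indirect exhortignore` form is not recognized as indirect, which is the misclassification described in point 1.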

All SBOM fixtures have been updated. The diffs are large, mostly because the array of components has been re-arranged. The only material difference is a shorter dependsOn section for the root module, due to the fixed indirect dependency handling.

Checklist

  • I have followed this repository's contributing guidelines.
  • I will adhere to the project's code of conduct.

@Strum355 requested a review from ruromero on March 18, 2026, 12:52
@qodo-code-review

Review Summary by Qodo

Replace regex parsing with tree-sitter for robust go.mod dependency handling and fix indirect dependency classification

✨ Enhancement 🐞 Bug fix 🧪 Tests


Walkthroughs

Description
• Replaced regex-based go.mod parsing with tree-sitter LR-parser for more robust and maintainable
  dependency extraction
• Fixed indirect dependency handling by properly differentiating between direct and indirect
  dependencies in SBOM output
• Corrected exhortignore marker format from // indirect exhortignore to // indirect; exhortignore
  for proper Go tooling compatibility
• Made provideStack() and provideComponent() async functions to support parser initialization
• Added new gomod_parser.js module providing tree-sitter parser initialization with
  WebAssembly-based Go language grammar
• Updated all Go module SBOM test fixtures to reflect correct indirect dependency classification
• Added tree-sitter-gomod dependency and updated build scripts to include WebAssembly parser files
• Updated test cases to use async/await syntax for async provider functions
• Fixed JSDoc type annotations in Python pip provider
Diagram
flowchart LR
  A["go.mod file"] -->|"tree-sitter parser"| B["Parsed AST"]
  B -->|"extract requires"| C["All dependencies"]
  C -->|"go mod edit -json"| D["Direct vs Indirect"]
  D -->|"filter indirect"| E["SBOM with correct classification"]
  F["exhortignore marker"] -->|"semicolon format"| G["Proper Go tooling parsing"]
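
The "Direct vs Indirect" step of the diagram can be sketched as follows. `go mod edit -json` prints the manifest as JSON in which each `Require` entry carries an `Indirect` flag for lines marked `// indirect`. The helper name `splitRequires` and the sample module paths are illustrative assumptions, not the PR's code.

```javascript
// Sample shaped like `go mod edit -json` output; the module paths are made up.
const goModEditOutput = {
	Module: { Path: 'example.com/app' },
	Require: [
		{ Path: 'github.com/gin-gonic/gin', Version: 'v1.9.1' },
		{ Path: 'gopkg.in/yaml.v3', Version: 'v3.0.1', Indirect: true },
	],
};

// Hypothetical helper: partition require entries by the Indirect flag.
// A missing flag is treated as "direct"; a missing Require list as empty.
function splitRequires(modJson) {
	const direct = [];
	const indirect = [];
	for (const req of modJson.Require ?? []) {
		(req.Indirect ? indirect : direct).push(`${req.Path}@${req.Version}`);
	}
	return { direct, indirect };
}

const { direct, indirect } = splitRequires(goModEditOutput);
console.log('direct:', direct);
console.log('indirect:', indirect);
```

Only the `direct` list would then feed the root module's dependsOn entries; indirect modules still appear in the SBOM through the dependency graph of their parents.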


File Changes

1. src/providers/golang_gomodules.js ✨ Enhancement +103/-124

Replace regex parsing with tree-sitter for go.mod handling

• Replaced regex-based go.mod parsing with tree-sitter LR-parser for more robust dependency
 extraction
• Fixed exhortignore marker format validation to use semicolon separator for indirect dependencies
 (// indirect; exhortignore)
• Implemented proper direct vs indirect dependency differentiation using go mod edit -json output
• Made provideStack() and provideComponent() async functions to support parser initialization
• Refactored getIgnoredDeps() and collectAllDepsFromManifest() to use tree-sitter queries
 instead of line-based parsing
• Added filtering logic to exclude indirect dependencies from SBOM when appropriate

src/providers/golang_gomodules.js


2. src/providers/gomod_parser.js ✨ Enhancement +21/-0

New tree-sitter parser module for go.mod files

• New file providing tree-sitter parser initialization for Go module files
• Exports getParser() function to create a configured Parser instance
• Exports getRequireQuery() function with tree-sitter query for extracting require specifications
• Loads WebAssembly-based Go language grammar from tree-sitter-gomod.wasm

src/providers/gomod_parser.js


3. test/providers/golang_gomodules.test.js 🧪 Tests +6/-6

Update tests for async provider functions

• Updated test cases to use async/await syntax for provideStack() and provideComponent() calls
• Added async keyword to test function declarations
• Updated test invocations to await the provider functions

test/providers/golang_gomodules.test.js


4. src/providers/python_pip.js 📝 Documentation +1/-1

Fix JSDoc return type annotation

• Updated JSDoc return type annotation for getIgnoredDependencies() from PackageURL[] to
 Promise<PackageURL[]>

src/providers/python_pip.js


5. test/providers/tst_manifests/golang/go_mod_with_ignore/expected_sbom_stack_analysis.json 🧪 Tests +653/-689

Update SBOM fixture to reflect correct indirect dependency handling

• Removed 39 indirect dependencies from the root module's dependsOn array
• Reordered components array to reflect proper dependency hierarchy
• Moved indirect dependencies to appear only as transitive dependencies of their direct dependents
• Updated SBOM to correctly distinguish between direct and indirect dependencies

test/providers/tst_manifests/golang/go_mod_with_ignore/expected_sbom_stack_analysis.json


6. test/providers/tst_manifests/golang/go_mod_with_all_ignore/expected_sbom_stack_analysis.json Formatting +1/-1

Fix file formatting

• Fixed trailing newline in JSON file

test/providers/tst_manifests/golang/go_mod_with_all_ignore/expected_sbom_stack_analysis.json


7. test/providers/tst_manifests/golang/go_mod_empty/expected_sbom_stack_analysis.json Formatting +1/-1

Fix file formatting

• Fixed trailing newline in JSON file

test/providers/tst_manifests/golang/go_mod_empty/expected_sbom_stack_analysis.json


8. test/providers/tst_manifests/golang/go_mod_no_ignore/expected_sbom_stack_analysis.json 🐞 Bug fix +725/-763

Fix indirect dependency classification in Go SBOM output

• Reorganized component list by moving indirect dependencies to later positions in the array
• Significantly reduced root module's dependsOn array from 43 to 7 direct dependencies
• Removed transitive/indirect dependencies from direct dependency list of root module
• Reordered dependency graph entries to reflect proper direct vs indirect classification

test/providers/tst_manifests/golang/go_mod_no_ignore/expected_sbom_stack_analysis.json


9. test/providers/tst_manifests/golang/go_mod_light_no_ignore/expected_sbom_stack_analysis.json 🐞 Bug fix +16/-17

Correct indirect dependency handling in light Go SBOM

• Removed gopkg.in/yaml.v3@v3.0.1 from root module's direct dependencies
• Moved gopkg.in/yaml.v3 component to later position in components array
• Updated dependency graph to reflect correct indirect dependency handling

test/providers/tst_manifests/golang/go_mod_light_no_ignore/expected_sbom_stack_analysis.json


10. package.json Dependencies +4/-3

Add tree-sitter-gomod parser and update build scripts

• Added tree-sitter-gomod dependency from GitHub with specific commit hash
• Updated pretest script to copy tree-sitter-gomod.wasm file alongside requirements parser
• Updated postcompile script to copy tree-sitter-gomod.wasm to dist directory
• Upgraded web-tree-sitter from ^0.26.6 to ^0.26.7

package.json


11. README.md 📝 Documentation +17/-6

Update Go.mod ignore marker documentation and formatting

• Corrected exhortignore marker format for indirect dependencies from // indirect exhortignore
 to // indirect; exhortignore
• Added note explaining importance of proper semicolon formatting for indirect dependencies
• Improved code formatting and spacing in Go module example
• Added HTML list item tags for better documentation structure

README.md


12. test/providers/tst_manifests/golang/go_mod_with_ignore/go.mod 🧪 Tests +2/-2

Fix exhortignore marker format in Go module test fixture

• Changed // indirect exhortignore to // indirect; exhortignore for proper Go tooling parsing
• Updated two indirect dependencies with corrected marker format using semicolon separator

test/providers/tst_manifests/golang/go_mod_with_ignore/go.mod


13. test/providers/tst_manifests/golang/go_mod_mvs_versions/expected_sbom_stack_analysis.json 🐞 Bug fix +724/-762

Fix indirect dependency handling in Go SBOM fixture

• Removed numerous indirect dependencies from the components array that were incorrectly marked as
 direct dependencies
• Reorganized the components array to properly distinguish between direct and indirect
 dependencies
• Significantly reduced the dependsOn array for the root module
 pkg:golang/github.com/RHEcosystemAppEng/SaaSi/deployer@v0.0.0 from 43 to 7 direct dependencies
• Updated dependency graph to only include actual direct dependencies, moving transitive
 dependencies to their proper parent modules

test/providers/tst_manifests/golang/go_mod_mvs_versions/expected_sbom_stack_analysis.json


14. test/providers/tst_manifests/golang/go_mod_light_no_ignore/expected_sbom_component_analysis.json 🐞 Bug fix +56/-69

Remove indirect dependency from Go SBOM fixture

• Removed indirect dependency gopkg.in/yaml.v3@v3.0.1 from the components array
• Reduced root module dependsOn array from 4 to 3 direct dependencies
• Reformatted JSON with consistent indentation (4 spaces)
• Updated dependency graph to reflect only direct dependencies

test/providers/tst_manifests/golang/go_mod_light_no_ignore/expected_sbom_component_analysis.json


15. test/providers/tst_manifests/golang/go_mod_with_all_ignore/expected_sbom_component_analysis.json Formatting +25/-25

Reformat JSON with consistent indentation

• Reformatted JSON file with consistent indentation (4 spaces instead of tabs)
• No functional changes to the SBOM structure or dependencies

test/providers/tst_manifests/golang/go_mod_with_all_ignore/expected_sbom_component_analysis.json


16. test/providers/tst_manifests/golang/go_mod_no_ignore/expected_sbom_component_analysis.json Additional files +122/-616

...

test/providers/tst_manifests/golang/go_mod_no_ignore/expected_sbom_component_analysis.json


17. test/providers/tst_manifests/golang/go_mod_test_ignore/expected_sbom_component_analysis.json Additional files +135/-460

...

test/providers/tst_manifests/golang/go_mod_test_ignore/expected_sbom_component_analysis.json


18. test/providers/tst_manifests/golang/go_mod_test_ignore/expected_sbom_stack_analysis.json Additional files +397/-422

...

test/providers/tst_manifests/golang/go_mod_test_ignore/expected_sbom_stack_analysis.json


19. test/providers/tst_manifests/golang/go_mod_with_ignore/expected_sbom_component_analysis.json Additional files +109/-577

...

test/providers/tst_manifests/golang/go_mod_with_ignore/expected_sbom_component_analysis.json



@qodo-code-review

qodo-code-review bot commented Mar 18, 2026

Code Review by Qodo

🐞 Bugs (1) 📘 Rule violations (0) 📎 Requirement gaps (0) 📐 Spec deviations (0)



Remediation recommended

1. Duplicate gomod parser init 🐞 Bug ➹ Performance
Description
getSBOM() calls getParser() and getRequireQuery() in parallel, but both functions
independently run init() which calls Parser.init() and reloads tree-sitter-gomod.wasm. This
duplicates wasm I/O and language initialization per SBOM generation, increasing analysis latency and
memory churn.
Code

src/providers/gomod_parser.js[R7-20]

+async function init() {
+	await Parser.init();
+	const wasmBytes = new Uint8Array(await readFile(wasmUrl));
+	return await Language.load(wasmBytes);
+}
+
+export async function getParser() {
+	const language = await init();
+	return new Parser().setLanguage(language);
+}
+
+export async function getRequireQuery() {
+	const language = await init();
+	return new Query(language, '(require_spec (module_path) @name (version) @version) @spec');
Evidence
gomod_parser.init() performs the expensive initialization work (Parser.init + wasm read/load).
Both getParser() and getRequireQuery() call init(), and golang_gomodules.getSBOM() awaits
both via Promise.all, guaranteeing two initializations per SBOM.

src/providers/gomod_parser.js[7-21]
src/providers/golang_gomodules.js[244-246]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
`src/providers/gomod_parser.js` loads/initializes the tree-sitter language separately in `getParser()` and `getRequireQuery()`. Since `getSBOM()` calls both, the wasm is read/loaded twice per run.

### Issue Context
This is on the hot path for every Go analysis and is deterministic overhead.

### Fix Focus Areas
- src/providers/gomod_parser.js[7-21]
- src/providers/golang_gomodules.js[244-246]

### Implementation notes
- Introduce a module-level cached promise/value for the initialized `Language` (e.g., `let languagePromise`), so `init()` runs once per process.
- Optionally also cache the `Query` instance (since the query string is constant), and/or expose a single helper that returns `{ parser, requireQuery }` built from the same cached `Language`.
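
One way to realize the first implementation note is a module-level cached promise. The sketch below is only an illustration under stated assumptions: `loadLanguage` is a hypothetical stand-in for the `Parser.init()` plus `Language.load()` work, so that the example runs without web-tree-sitter or the wasm file.

```javascript
// Hypothetical stand-in for the expensive Parser.init() + wasm load.
let loadCount = 0;
async function loadLanguage() {
	loadCount += 1;
	return { name: 'gomod' };
}

// Module-level cache: the promise itself is stored, so concurrent callers
// share one in-flight load instead of each starting their own.
let languagePromise;

function init() {
	if (!languagePromise) {
		languagePromise = loadLanguage().catch((err) => {
			languagePromise = undefined; // allow a retry after a failed load
			throw err;
		});
	}
	return languagePromise;
}

// Both calls resolve to the same object and trigger a single load.
Promise.all([init(), init()]).then(([a, b]) => {
	console.log(a === b, loadCount);
});
```

Caching the promise rather than the resolved value is what makes the `Promise.all` case in `getSBOM()` safe, since the second caller arrives before the first load has finished.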

Collaborator

@ruromero left a comment


Thanks for the fix. Only a couple of optional comments.

let comments = specNode.children.filter(c => c.type === 'comment')
for (let comment of comments) {
	let text = comment.text
	if (/^\/\/\s*indirect;\s*exhortignore/.test(text)) {

Be aware that this pattern also matches //indirect;exhortignore (without a space).
In the docs you say that it must be // indirect; exhortignore. If you want to enforce the space, use \s+ instead.
If not, that's fine; I don't think it's a problem.
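
The difference between the two quantifiers can be checked directly. Here `lenient` is the pattern from the diff above and `strict` is the suggested `\s+` variant; both variable names are illustrative.

```javascript
// Pattern as written in the diff: zero or more spaces after the semicolon.
const lenient = /^\/\/\s*indirect;\s*exhortignore/;
// Suggested variant: at least one whitespace character after the semicolon.
const strict = /^\/\/\s*indirect;\s+exhortignore/;

for (const text of ['// indirect; exhortignore', '//indirect;exhortignore']) {
	console.log(JSON.stringify(text), lenient.test(text), strict.test(text));
}
```

Both patterns accept the documented form; only `lenient` accepts the form without a space.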

let allIgnoredDeps = ignoredDeps.map((dep) => dep.toString())
let sbom = new Sbom();
let rows = goGraphOutput.split(getLineSeparatorGolang()).filter(line => !line.includes(' go@'));
let root = getParentVertexFromEdge(goModEditOutput['Module']['Path'])

I noticed that getParentVertexFromEdge is not used as intended here: it receives a single path with no space rather than an edge.
The result is just goModEditOutput['Module']['Path'] unchanged. Do you mind replacing it?
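
To illustrate the point, here is a minimal sketch. It assumes the helper simply returns the text before the first space of a `go mod graph` edge (a line of the form `parent child`). The repository's actual implementation is not shown in this conversation, and `parentVertexOf` is a made-up name standing in for it.

```javascript
// Assumed behaviour, not the repository's code: return the part of the
// edge before the first space.
function parentVertexOf(edge) {
	return edge.split(' ')[0];
}

// A `go mod graph` style edge versus a bare module path (sample values).
const edge = 'example.com/app github.com/gin-gonic/gin@v1.9.1';
const rootPath = 'example.com/app';

console.log(parentVertexOf(edge));     // the parent side of the edge
console.log(parentVertexOf(rootPath)); // no space, so the input comes back as-is
```

Under that assumption the call on a bare path is a no-op, so using the path directly would express the intent more clearly.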
