
feat: implement license resolution and identification#356

Open
soul2zimate wants to merge 2 commits into guacsec:main from soul2zimate:license

Conversation

@soul2zimate
Contributor

feat: implement license resolution and identification

Add license analysis features that detect the project license, check dependency license compatibility, and include license information in generated SBOMs. This mirrors the JavaScript client implementation.

resolve #355

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@qodo-code-review
Contributor

Review Summary by Qodo

Implement license resolution and identification with compatibility checking

✨ Enhancement 🧪 Tests 📝 Documentation


Walkthroughs

Description
• Implement comprehensive license resolution and identification features for Java client
• Add componentAnalysisWithLicense() API method that performs component analysis with automatic
  license checking
• Implement license detection from ecosystem-specific manifests (pom.xml, package.json, Cargo.toml)
  with LICENSE file fallback
• Add license compatibility checking based on restrictiveness hierarchy (permissive < weak copyleft
  < strong copyleft)
• Integrate license information into generated SBOMs with SPDX normalization
• Add CLI license command to display project license information
• Implement SPDX pattern matching for common licenses (Apache, MIT, GPL, LGPL, AGPL, BSD)
• Create LicenseUtils, LicenseCheck, and ProjectLicense utility classes for license operations
• Update all providers (Maven, JavaScript, Cargo, Go, Python, Gradle) to extract and include license
  information
• Add comprehensive documentation and CLI help for license features
• Update test fixtures and add new test cases for license extraction across ecosystems
Diagram
flowchart LR
  A["Component Analysis"] -->|includes| B["License Detection"]
  B -->|from manifest| C["Ecosystem Providers"]
  B -->|fallback| D["LICENSE File"]
  C -->|Maven| E["pom.xml"]
  C -->|JavaScript| F["package.json"]
  C -->|Cargo| G["Cargo.toml"]
  B -->|backend| H["License Identification"]
  H -->|SPDX| I["Normalized License"]
  I -->|compatibility check| J["LicenseCheck"]
  J -->|result| K["ComponentAnalysisResult"]
  K -->|include in| L["SBOM with License"]
  M["CLI license command"] -->|display| N["Project License Info"]


File Changes

1. src/main/java/io/github/guacsec/trustifyda/impl/ExhortApi.java ✨ Enhancement +174/-9

License resolution API endpoints and component analysis with license checking

• Added license-related imports and API endpoint constants for license resolution
• Implemented componentAnalysisWithLicense() method that runs license checks after component
 analysis
• Added getLicenseDetails() and identifyLicense() methods to fetch license information from
 backend
• Refactored HTTP request building into reusable buildGetRequest() and buildPostRequest()
 methods with common header application
• Removed commented-out code and cleaned up formatting

src/main/java/io/github/guacsec/trustifyda/impl/ExhortApi.java


2. src/main/java/io/github/guacsec/trustifyda/license/LicenseUtils.java ✨ Enhancement +316/-0

License detection and compatibility checking utilities

• Utility class for license detection, normalization, and compatibility checking
• Implements SPDX pattern matching for common licenses (Apache, MIT, GPL, LGPL, AGPL, BSD)
• Provides license file discovery and content-based SPDX detection
• Includes compatibility checking based on license restrictiveness hierarchy (permissive < weak
 copyleft < strong copyleft)
• Extracts license information from analysis reports and normalizes PURL strings

src/main/java/io/github/guacsec/trustifyda/license/LicenseUtils.java
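
The restrictiveness hierarchy described above (permissive < weak copyleft < strong copyleft) can be illustrated with a minimal sketch. The class, enum, and method names below are hypothetical and are not taken from `LicenseUtils`; the sketch also assumes that a dependency counts as incompatible when its category is more restrictive than the project's, which the summary implies but does not spell out.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

/** Hypothetical sketch of a restrictiveness-based compatibility check. */
final class LicenseCompatibilitySketch {

  /** Declared from least to most restrictive; UNKNOWN is handled separately. */
  enum Category { PERMISSIVE, WEAK_COPYLEFT, STRONG_COPYLEFT, UNKNOWN }

  enum Verdict { COMPATIBLE, INCOMPATIBLE, UNKNOWN }

  /** Assumption: a dependency more restrictive than the project is incompatible. */
  static Verdict check(Category project, Category dependency) {
    if (project == Category.UNKNOWN || dependency == Category.UNKNOWN) {
      return Verdict.UNKNOWN;
    }
    return dependency.ordinal() > project.ordinal()
        ? Verdict.INCOMPATIBLE
        : Verdict.COMPATIBLE;
  }

  /** Returns the sorted keys of all dependencies judged incompatible. */
  static List<String> incompatible(Category project, Map<String, Category> dependencies) {
    return dependencies.entrySet().stream()
        .filter(e -> check(project, e.getValue()) == Verdict.INCOMPATIBLE)
        .map(Map.Entry::getKey)
        .sorted()
        .collect(Collectors.toList());
  }
}
```

Keeping `UNKNOWN` as a separate verdict, rather than folding it into either outcome, leaves it to the caller to decide whether unclassified licenses should block or merely warn.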


3. src/main/java/io/github/guacsec/trustifyda/cli/App.java ✨ Enhancement +102/-69

CLI support for license command and license-aware component analysis

• Added LICENSE command to the Command enum for license information display
• Implemented executeLicenseCheck() method to show project license from manifest and LICENSE file
• Updated executeComponentAnalysis() to use componentAnalysisWithLicense() for automatic license
 checking
• Refactored switch statements to modern switch expressions throughout the file
• Added helper method buildLicenseInfo() to fetch and format license details from backend

src/main/java/io/github/guacsec/trustifyda/cli/App.java


4. src/main/java/io/github/guacsec/trustifyda/license/LicenseCheck.java ✨ Enhancement +213/-0

License compatibility checking orchestration and incompatibility detection

• Orchestrates full license check workflow after component analysis
• Resolves project license from manifest and LICENSE file with backend identification support
• Extracts dependency licenses from analysis report and compares with project license
• Identifies incompatible dependencies based on license category restrictiveness
• Handles mismatch detection between manifest and LICENSE file licenses

src/main/java/io/github/guacsec/trustifyda/license/LicenseCheck.java


5. src/main/java/io/github/guacsec/trustifyda/providers/JavaMavenProvider.java ✨ Enhancement +51/-2

Maven provider license extraction from pom.xml

• Added readLicenseFromManifest() method to parse license from pom.xml <licenses><license><name>
 element
• Implemented XML stream parsing to extract first license name from manifest
• Updated SBOM creation to include license information in root component
• Integrated with LicenseUtils.getLicense() for fallback to LICENSE file detection

src/main/java/io/github/guacsec/trustifyda/providers/JavaMavenProvider.java


6. src/test/java/io/github/guacsec/trustifyda/providers/LicenseFallbackTest.java 🧪 Tests +87/-0

Tests for LICENSE file fallback detection across ecosystems

• Tests LICENSE file fallback for providers without manifest license support
• Verifies Gradle, Go, and Python providers correctly detect licenses from LICENSE files
• Tests SPDX pattern matching for Apache-2.0, MIT, and BSD-3-Clause licenses
• Validates null return when no LICENSE file exists

src/test/java/io/github/guacsec/trustifyda/providers/LicenseFallbackTest.java
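
As an illustration of content-based SPDX detection for a LICENSE file, the following is a rough sketch only. The class name, the `detectSpdx` method, and the regular expressions are assumptions chosen for this example; the actual patterns in `LicenseUtils` are not shown in this PR summary and may differ. The sketch returns an `Optional`, whereas the tests above describe a null return when nothing is found.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;
import java.util.regex.Pattern;

/** Hypothetical sketch: guess an SPDX identifier from license text. */
final class LicenseTextSketch {

  // Illustrative patterns only; insertion order decides which match wins.
  private static final Map<String, Pattern> PATTERNS = new LinkedHashMap<>();

  static {
    PATTERNS.put("Apache-2.0",
        Pattern.compile("Apache License,?\\s+Version 2\\.0", Pattern.CASE_INSENSITIVE));
    PATTERNS.put("MIT",
        Pattern.compile("\\bMIT License\\b|Permission is hereby granted, free of charge",
            Pattern.CASE_INSENSITIVE));
    PATTERNS.put("BSD-3-Clause",
        Pattern.compile("Neither the name of .{0,200}? nor the names of (its|any) contributors",
            Pattern.CASE_INSENSITIVE));
  }

  /** Returns the first SPDX id whose pattern matches, or empty if none does. */
  static Optional<String> detectSpdx(String licenseText) {
    if (licenseText == null || licenseText.isBlank()) {
      return Optional.empty();
    }
    // Collapse line breaks and indentation so patterns can span wrapped lines.
    String normalized = licenseText.replaceAll("\\s+", " ");
    for (Map.Entry<String, Pattern> entry : PATTERNS.entrySet()) {
      if (entry.getValue().matcher(normalized).find()) {
        return Optional.of(entry.getKey());
      }
    }
    return Optional.empty();
  }
}
```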


7. src/main/java/io/github/guacsec/trustifyda/providers/CargoProvider.java ✨ Enhancement +21/-1

Cargo provider license extraction from Cargo.toml

• Added readLicenseFromManifest() method to extract license from Cargo.toml package.license
 field
• Implemented TOML parsing with fallback to LICENSE file detection
• Updated SBOM creation to include license information in root component

src/main/java/io/github/guacsec/trustifyda/providers/CargoProvider.java


8. src/main/java/io/github/guacsec/trustifyda/providers/JavaScriptProvider.java ✨ Enhancement +8/-2

JavaScript provider license extraction from package.json

• Added readLicenseFromManifest() method to extract license from package.json license field
• Supports both modern npm format and legacy licenses array format
• Updated SBOM creation to include license information in root component

src/main/java/io/github/guacsec/trustifyda/providers/JavaScriptProvider.java


9. src/test/java/io/github/guacsec/trustifyda/providers/JavaMavenProviderLicenseTest.java 🧪 Tests +71/-0

Tests for Maven pom.xml license extraction

• Tests Maven provider license extraction from pom.xml
• Verifies handling of single and multiple licenses (returns first)
• Tests null return when no licenses section or empty license name

src/test/java/io/github/guacsec/trustifyda/providers/JavaMavenProviderLicenseTest.java


10. src/test/java/io/github/guacsec/trustifyda/cli/AppTest.java 🧪 Tests +6/-3

CLI tests updated for license-aware component analysis

• Updated component analysis test to use componentAnalysisWithLicense() instead of
 componentAnalysis()
• Changed mock return type from AnalysisReport to ComponentAnalysisResult
• Added tests for new LICENSE command enum value
• Verified command enum has 4 values (added LICENSE)

src/test/java/io/github/guacsec/trustifyda/cli/AppTest.java


11. src/test/java/io/github/guacsec/trustifyda/providers/JavaScriptProviderLicenseTest.java 🧪 Tests +65/-0

Tests for JavaScript package.json license extraction

• Tests JavaScript provider license extraction from package.json
• Verifies modern license field and legacy licenses array handling
• Tests null return when no license field present

src/test/java/io/github/guacsec/trustifyda/providers/JavaScriptProviderLicenseTest.java


12. src/main/java/io/github/guacsec/trustifyda/providers/javascript/model/Manifest.java ✨ Enhancement +35/-0

Manifest model extended with license field support

• Added license field to Manifest class
• Implemented loadLicense() method supporting modern npm license field and legacy licenses
 array
• Handles both string and object formats for license field

src/main/java/io/github/guacsec/trustifyda/providers/javascript/model/Manifest.java
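
To show how the modern `license` field and the legacy `licenses` array might be reconciled, here is a small sketch that works on an already-parsed `package.json` represented as a `Map`. It is not the `Manifest.loadLicense()` implementation: the class name is hypothetical, JSON parsing is left out, and returning the first usable entry of the legacy array is an assumption.

```java
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: pick a license name from parsed package.json content. */
final class NpmLicenseSketch {

  /** Prefers the "license" field, then falls back to the legacy "licenses" array. */
  static String loadLicense(Map<String, ?> manifest) {
    String fromField = asLicenseName(manifest.get("license"));
    if (fromField != null) {
      return fromField;
    }
    Object legacy = manifest.get("licenses");
    if (legacy instanceof List<?>) {
      for (Object entry : (List<?>) legacy) {
        String name = asLicenseName(entry);
        if (name != null) {
          return name; // assumption: the first usable entry wins
        }
      }
    }
    return null;
  }

  /** Accepts "MIT" as well as the object form {"type": "MIT", "url": "..."}. */
  private static String asLicenseName(Object value) {
    if (value instanceof String) {
      String trimmed = ((String) value).trim();
      return trimmed.isEmpty() ? null : trimmed;
    }
    if (value instanceof Map<?, ?>) {
      return asLicenseName(((Map<?, ?>) value).get("type"));
    }
    return null;
  }
}
```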


13. src/test/java/io/github/guacsec/trustifyda/providers/CargoProviderLicenseTest.java 🧪 Tests +57/-0

Tests for Cargo.toml license extraction

• Tests Cargo provider license extraction from Cargo.toml
• Verifies correct license parsing and null return when no license field

src/test/java/io/github/guacsec/trustifyda/providers/CargoProviderLicenseTest.java


14. src/main/java/io/github/guacsec/trustifyda/Provider.java ✨ Enhancement +13/-0

Base provider license extraction interface with LICENSE file fallback

• Added readLicenseFromManifest() method as default implementation in base Provider class
• Default implementation falls back to LICENSE file detection via LicenseUtils.readLicenseFile()
• Allows subclasses to override for ecosystem-specific manifest license extraction

src/main/java/io/github/guacsec/trustifyda/Provider.java


15. src/main/java/io/github/guacsec/trustifyda/license/ProjectLicense.java ✨ Enhancement +45/-0

Project license resolution from manifest and LICENSE file

• Utility class to resolve project license from manifest and LICENSE file
• Detects mismatch between manifest and file licenses using normalized comparison
• Returns ProjectLicenseInfo record with both sources and mismatch flag

src/main/java/io/github/guacsec/trustifyda/license/ProjectLicense.java
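
The mismatch flag can be sketched as below. The record name `ProjectLicenseInfo` and its three components come from the summary above, and the null-guarded comparison mirrors the `LicenseCheck` snippet quoted in the review; the enclosing class and the `normalize` helper are hypothetical stand-ins, since the behavior of `LicenseUtils.normalizeSpdx` is not shown here.

```java
import java.util.Locale;

/** Hypothetical sketch of manifest vs. LICENSE file mismatch detection. */
final class ProjectLicenseSketch {

  record ProjectLicenseInfo(String fromManifest, String fromFile, boolean mismatch) {}

  /** Assumed normalization: trim and lower-case. The real normalizeSpdx may do more. */
  static String normalize(String id) {
    return id == null ? null : id.trim().toLowerCase(Locale.ROOT);
  }

  /** Flags a mismatch only when both sources are present and differ after normalization. */
  static ProjectLicenseInfo resolve(String fromManifest, String fromFile) {
    boolean mismatch =
        fromManifest != null
            && fromFile != null
            && !normalize(fromManifest).equals(normalize(fromFile));
    return new ProjectLicenseInfo(fromManifest, fromFile, mismatch);
  }
}
```

With this shape, a project that declares a license in only one place is never reported as mismatched, which matches the null guards in the quoted `LicenseCheck` code.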


16. src/main/java/io/github/guacsec/trustifyda/sbom/CycloneDXSbom.java ✨ Enhancement +12/-0

SBOM root component license information support

• Extended addRoot() method to accept optional license parameter
• Implemented license resolution using CycloneDX LicenseResolver for SPDX normalization
• Sets resolved license on root component when provided

src/main/java/io/github/guacsec/trustifyda/sbom/CycloneDXSbom.java


17. src/main/java/io/github/guacsec/trustifyda/providers/JavaScriptProviderFactory.java ✨ Enhancement +2/-3

JavaScript provider factory type safety improvement

• Changed return type from generic Provider to specific JavaScriptProvider
• Updated factory map type signature for better type safety

src/main/java/io/github/guacsec/trustifyda/providers/JavaScriptProviderFactory.java


18. src/main/java/io/github/guacsec/trustifyda/providers/GoModulesProvider.java ✨ Enhancement +2/-2

Go modules provider SBOM license information integration

• Updated SBOM creation calls to include license information via readLicenseFromManifest()
• Applied to both graph-based and list-based SBOM building methods

src/main/java/io/github/guacsec/trustifyda/providers/GoModulesProvider.java


19. src/main/java/io/github/guacsec/trustifyda/providers/PythonPipProvider.java ✨ Enhancement +6/-2

Python pip provider SBOM license information integration

• Updated SBOM creation calls to include license information via readLicenseFromManifest()
• Applied to both stack and component analysis SBOM building

src/main/java/io/github/guacsec/trustifyda/providers/PythonPipProvider.java


20. src/main/java/io/github/guacsec/trustifyda/ComponentAnalysisResult.java ✨ Enhancement +22/-0

Component analysis result wrapper with license summary

• New record class to wrap component analysis results with license information
• Contains AnalysisReport and optional LicenseSummary fields

src/main/java/io/github/guacsec/trustifyda/ComponentAnalysisResult.java


21. src/main/java/module-info.java ✨ Enhancement +1/-0

Module exports for license package

• Added export of io.github.guacsec.trustifyda.license package for public API access

src/main/java/module-info.java


22. src/main/java/io/github/guacsec/trustifyda/Api.java ✨ Enhancement +10/-0

API interface extended with license-aware component analysis

• Added componentAnalysisWithLicense() method signature to API interface
• Returns CompletableFuture<ComponentAnalysisResult> with license compatibility checking
• Includes documentation for the new method

src/main/java/io/github/guacsec/trustifyda/Api.java


23. src/main/java/io/github/guacsec/trustifyda/providers/GradleProvider.java ✨ Enhancement +1/-1

Gradle provider SBOM license information integration

• Updated SBOM creation to include license information via readLicenseFromManifest()

src/main/java/io/github/guacsec/trustifyda/providers/GradleProvider.java


24. src/main/java/io/github/guacsec/trustifyda/cli/Command.java ✨ Enhancement +2/-1

CLI command enum extended with LICENSE command

• Added LICENSE enum value to Command enum

src/main/java/io/github/guacsec/trustifyda/cli/Command.java


25. src/main/java/io/github/guacsec/trustifyda/sbom/Sbom.java ✨ Enhancement +2/-0

SBOM interface extended with license parameter support

• Added overloaded addRoot() method signature accepting optional license parameter

src/main/java/io/github/guacsec/trustifyda/sbom/Sbom.java


26. docs/license-resolution-and-compliance.md 📝 Documentation +249/-0

License resolution and compliance documentation

• Comprehensive documentation for license analysis features
• Explains project license detection with manifest and LICENSE file fallback
• Documents compatibility checking algorithm and restrictiveness hierarchy
• Includes CLI usage examples and programmatic API documentation
• Provides configuration options and common scenarios with examples

docs/license-resolution-and-compliance.md


27. src/test/resources/tst_manifests/yarn-berry/deps_with_no_ignore/expected_stack_sbom.json 🧪 Tests +22/-0

Test fixture SBOM updated with license information

• Updated expected SBOM to include license information on root component
• Added ISC license with full details including text and URL

src/test/resources/tst_manifests/yarn-berry/deps_with_no_ignore/expected_stack_sbom.json


28. src/test/resources/tst_manifests/yarn-classic/deps_with_no_ignore/expected_stack_sbom.json 🧪 Tests +22/-0

Test fixture SBOM updated with license information

• Updated expected SBOM to include license information on root component
• Added ISC license with full details including text and URL

src/test/resources/tst_manifests/yarn-classic/deps_with_no_ignore/expected_stack_sbom.json


29. src/test/resources/tst_manifests/yarn-berry/deps_with_ignore/expected_stack_sbom.json 🧪 Tests +22/-0

Test fixture SBOM updated with license information

• Updated expected SBOM to include license information on root component
• Added ISC license with full details including text and URL

src/test/resources/tst_manifests/yarn-berry/deps_with_ignore/expected_stack_sbom.json


30. src/test/resources/tst_manifests/yarn-classic/deps_with_ignore/expected_stack_sbom.json 🧪 Tests +22/-0

Test fixture SBOM updated with license information

• Updated expected SBOM to include license information on root component
• Added ISC license with full details including text and URL

src/test/resources/tst_manifests/yarn-classic/deps_with_ignore/expected_stack_sbom.json


31. src/test/resources/tst_manifests/npm/common/deps_with_no_ignore/expected_component_sbom.json 🧪 Tests +22/-0

Test fixture SBOM updated with license information

• Updated expected SBOM to include license information on root component
• Added ISC license with full details including text and URL

src/test/resources/tst_manifests/npm/common/deps_with_no_ignore/expected_component_sbom.json


32. README.md 📝 Documentation +30/-4

README updated with license analysis features and examples

• Added ComponentAnalysisResult import to code example
• Documented componentAnalysisWithLicense() method usage with license summary access
• Added new "License Resolution and Compliance" section explaining features and configuration
• Updated CLI documentation with new license command and examples
• Fixed import statements to use correct package paths

README.md


33. src/main/resources/cli_help.txt 📝 Documentation +11/-3

CLI help text updated with license command documentation

• Added license command documentation with usage and description
• Updated component command help to mention license checking inclusion
• Added TRUSTIFY_DA_LICENSE_CHECK environment variable documentation
• Added license command example to EXAMPLES section

src/main/resources/cli_help.txt


34. src/test/resources/tst_manifests/maven/license/pom_with_empty_license/pom.xml 🧪 Tests +24/-0

Test fixture for Maven empty license handling

• New test fixture for Maven pom.xml with empty license name element

src/test/resources/tst_manifests/maven/license/pom_with_empty_license/pom.xml


35. src/test/resources/tst_manifests/pnpm/deps_with_no_ignore/expected_stack_sbom.json 🧪 Tests +22/-0

Add ISC license information to SBOM components

• Added licenses field to package components containing ISC license information
• Includes license ID, base64-encoded license text, content type, and URL reference
• License information added to multiple package entries in the SBOM structure

src/test/resources/tst_manifests/pnpm/deps_with_no_ignore/expected_stack_sbom.json


36. src/test/resources/tst_manifests/npm/common/deps_with_ignore/expected_component_sbom.json 🧪 Tests +22/-0

Add ISC license details to expected SBOM output

• Added licenses field with ISC license details to package components
• Includes license ID, base64-encoded text content, and license URL
• Updates expected SBOM output to reflect license resolution feature

src/test/resources/tst_manifests/npm/common/deps_with_ignore/expected_component_sbom.json


37. src/test/resources/tst_manifests/npm/deps_with_no_ignore/expected_stack_sbom.json 🧪 Tests +22/-0

Include ISC license data in expected SBOM structure

• Added licenses array with ISC license information to package entries
• License includes ID, base64-encoded text, content type, and reference URL
• Updates test expectations for license inclusion in generated SBOMs

src/test/resources/tst_manifests/npm/deps_with_no_ignore/expected_stack_sbom.json


38. src/test/resources/tst_manifests/pnpm/deps_with_ignore/expected_stack_sbom.json 🧪 Tests +22/-0

Add ISC license metadata to SBOM package components

• Added licenses field containing ISC license metadata to components
• License data includes identifier, encoded text content, and URL reference
• Reflects license resolution implementation in SBOM generation

src/test/resources/tst_manifests/pnpm/deps_with_ignore/expected_stack_sbom.json


39. src/test/resources/tst_manifests/npm/deps_with_ignore/expected_stack_sbom.json 🧪 Tests +22/-0

Add ISC license information to expected SBOM output

• Added licenses field with complete ISC license information to packages
• Includes license ID, base64-encoded license text, content type, and URL
• Updates expected output to match new license resolution feature

src/test/resources/tst_manifests/npm/deps_with_ignore/expected_stack_sbom.json


40. src/test/resources/tst_manifests/maven/license/pom_with_multiple_licenses/pom.xml 🧪 Tests +29/-0

Add Maven POM with multiple licenses test fixture

• Created new Maven POM test file with multiple licenses (MIT and Apache-2.0)
• Includes project metadata and a log4j dependency
• Tests license resolution for projects with multiple license declarations

src/test/resources/tst_manifests/maven/license/pom_with_multiple_licenses/pom.xml


41. src/test/resources/tst_manifests/maven/license/pom_with_license/pom.xml 🧪 Tests +25/-0

Add Maven POM with single license test fixture

• Created new Maven POM test file with single Apache-2.0 license
• Includes basic project configuration and log4j dependency
• Tests license resolution for Maven projects with single license

src/test/resources/tst_manifests/maven/license/pom_with_license/pom.xml


42. src/test/resources/tst_manifests/maven/license/pom_without_license/pom.xml 🧪 Tests +18/-0

Add Maven POM without license test fixture

• Created new Maven POM test file without license declaration
• Contains minimal project configuration with log4j dependency
• Tests license resolution behavior for projects without licenses

src/test/resources/tst_manifests/maven/license/pom_without_license/pom.xml


43. src/test/resources/tst_manifests/npm/license/package_with_legacy_licenses/package.json 🧪 Tests +11/-0

Add npm package with legacy licenses format test fixture

• Created test fixture with legacy licenses array format (Apache-2.0 and MIT)
• Tests backward compatibility with older npm license declaration style
• Includes express dependency for realistic test scenario

src/test/resources/tst_manifests/npm/license/package_with_legacy_licenses/package.json


44. src/test/resources/tst_manifests/npm/license/package_with_license/package.json 🧪 Tests +8/-0

Add npm package with license field test fixture

• Created test fixture with standard license field (MIT)
• Tests license resolution for modern npm package format
• Includes express dependency for realistic test scenario

src/test/resources/tst_manifests/npm/license/package_with_license/package.json


45. src/test/resources/tst_manifests/npm/license/package_without_license/package.json 🧪 Tests +7/-0

Add npm package without license test fixture

• Created test fixture without license declaration
• Tests license resolution behavior for packages without licenses
• Includes express dependency for realistic test scenario

src/test/resources/tst_manifests/npm/license/package_without_license/package.json


46. src/test/resources/tst_manifests/cargo/license/cargo_with_license/Cargo.toml 🧪 Tests +7/-0

Add Cargo.toml with license test fixture

• Created test fixture with MIT license in Cargo.toml
• Tests license resolution for Rust projects with license declaration
• Includes serde dependency for realistic test scenario

src/test/resources/tst_manifests/cargo/license/cargo_with_license/Cargo.toml


47. src/test/resources/tst_manifests/cargo/license/cargo_without_license/Cargo.toml 🧪 Tests +6/-0

Add Cargo.toml without license test fixture

• Created test fixture without license declaration in Cargo.toml
• Tests license resolution behavior for Rust projects without licenses
• Includes serde dependency for realistic test scenario

src/test/resources/tst_manifests/cargo/license/cargo_without_license/Cargo.toml



@qodo-code-review
Contributor

qodo-code-review bot commented Mar 19, 2026

Code Review by Qodo

🐞 Bugs (3) 📘 Rule violations (0) 📎 Requirement gaps (0) 📐 Spec deviations (0)



Action required

1. Non-SPDX breaks license check 🐞 Bug ✓ Correctness
Description
LicenseCheck passes the raw manifest license string into ExhortApi.getLicenseDetails(), but
getLicenseDetails is explicitly keyed by SPDX identifiers; common manifest values like POM license
names (not SPDX IDs) will return null and can cause mismatch/compatibility evaluation to silently
degrade.
Code

src/main/java/io/github/guacsec/trustifyda/license/LicenseCheck.java[R86-97]

+          String manifestSpdx = projectLicense.fromManifest();
+          String fileSpdx = backendFileId != null ? backendFileId : projectLicense.fromFile();
+          boolean mismatch =
+              manifestSpdx != null
+                  && fileSpdx != null
+                  && !LicenseUtils.normalizeSpdx(manifestSpdx)
+                      .equals(LicenseUtils.normalizeSpdx(fileSpdx));
+
+          CompletableFuture<JsonNode> manifestDetailsFuture =
+              manifestSpdx != null
+                  ? api.getLicenseDetails(manifestSpdx)
+                  : CompletableFuture.completedFuture(null);
Evidence
JavaMavenProvider extracts the POM <licenses><license><name> value verbatim, and
LicenseUtils.getLicense returns that string as-is; LicenseCheck then treats it as an SPDX id and
calls getLicenseDetails(manifestSpdx). ExhortApi.getLicenseDetails is documented as fetching by SPDX
identifier (path /api/v5/licenses/{spdx}), so non‑SPDX names/expressions will typically not resolve,
leading to null details/categories and reduced/incorrect results.

src/main/java/io/github/guacsec/trustifyda/license/LicenseCheck.java[86-97]
src/main/java/io/github/guacsec/trustifyda/impl/ExhortApi.java[598-606]
src/main/java/io/github/guacsec/trustifyda/providers/JavaMavenProvider.java[71-101]
src/main/java/io/github/guacsec/trustifyda/license/LicenseUtils.java[160-165]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
`LicenseCheck`/CLI assume the project license string is already an SPDX identifier and pass it to `ExhortApi.getLicenseDetails()`, but manifest extraction often yields non‑SPDX names/expressions (e.g., Maven `<license><name>Apache License 2.0</name>`). This leads to null backend lookups and can silently disable or miscompute the mismatch and compatibility checks.

### Issue Context
- `ExhortApi.getLicenseDetails` is keyed by SPDX id (and calls `/api/v5/licenses/{spdx}`), so callers must supply a real SPDX id.
- Today, providers return raw manifest license strings and `LicenseUtils.getLicense` does no SPDX normalization.

### Fix Focus Areas
- src/main/java/io/github/guacsec/trustifyda/license/LicenseCheck.java[86-110]
- src/main/java/io/github/guacsec/trustifyda/license/LicenseUtils.java[160-179]
- src/main/java/io/github/guacsec/trustifyda/providers/JavaMavenProvider.java[71-116]
- src/main/java/io/github/guacsec/trustifyda/cli/App.java[251-270]

### What to change
- Introduce a single normalization step that converts a manifest/license-file string into a best-effort SPDX identifier **before**:
 - calling `getLicenseDetails(...)`
 - comparing for mismatch
- If normalization fails, propagate a clear error message into the license summary / CLI output (instead of silently returning null details), and treat category as UNKNOWN.
- For Maven specifically, consider preferring SPDX-like values when present (or mapping common POM license names to SPDX) so typical POMs work out of the box.
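
A best-effort normalization step of the kind suggested above could look roughly like this. Everything here is an illustrative assumption: the class and method names are hypothetical, and the small name table is a hand-picked sample of common POM license names, not an authoritative mapping.

```java
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

/** Hypothetical sketch: map free-form manifest license names to SPDX ids. */
final class SpdxNormalizerSketch {

  // Sample entries only; keys are lower-cased with collapsed whitespace.
  private static final Map<String, String> KNOWN_NAMES = Map.ofEntries(
      Map.entry("apache-2.0", "Apache-2.0"),
      Map.entry("apache license 2.0", "Apache-2.0"),
      Map.entry("apache license, version 2.0", "Apache-2.0"),
      Map.entry("the apache software license, version 2.0", "Apache-2.0"),
      Map.entry("mit", "MIT"),
      Map.entry("mit license", "MIT"),
      Map.entry("the mit license", "MIT"),
      Map.entry("bsd-3-clause", "BSD-3-Clause"));

  /** Returns an SPDX id when the name is recognized, or empty so callers can report it. */
  static Optional<String> toSpdxId(String rawLicense) {
    if (rawLicense == null) {
      return Optional.empty();
    }
    String key = rawLicense.trim().replaceAll("\\s+", " ").toLowerCase(Locale.ROOT);
    return Optional.ofNullable(KNOWN_NAMES.get(key));
  }
}
```

Returning an empty `Optional` for unrecognized names lets the caller surface a clear message and treat the category as UNKNOWN, instead of passing a non-SPDX string to the backend.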




Remediation recommended

2. XML parsing not hardened 🐞 Bug ⛨ Security
Description
JavaMavenProvider.readLicenseFromPom creates an XMLInputFactory with default settings and parses
user-supplied pom.xml content, without explicitly disabling DTD/external entity processing. This is
avoidable risk and should be hardened even if typical StAX defaults are safer than DOM.
Code

src/main/java/io/github/guacsec/trustifyda/providers/JavaMavenProvider.java[R83-87]

+  private String readLicenseFromPom(Path pomPath) {
+    XMLInputFactory factory = XMLInputFactory.newInstance();
+    try (InputStream is = Files.newInputStream(pomPath)) {
+      XMLStreamReader reader = factory.createXMLStreamReader(is);
+      boolean insideLicenses = false;
Evidence
The CLI accepts an arbitrary manifest path (only existence/regular-file validated) and routes it
into provider logic; JavaMavenProvider then initializes XMLInputFactory with no secure properties
before parsing the POM. Explicitly disabling DTD/external entity support is standard hardening for
untrusted XML inputs.

src/main/java/io/github/guacsec/trustifyda/providers/JavaMavenProvider.java[83-87]
src/main/java/io/github/guacsec/trustifyda/cli/App.java[186-205]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
`JavaMavenProvider.readLicenseFromPom` parses `pom.xml` via `XMLInputFactory` without explicitly disabling DTD/external entities. Even if the default StAX implementation is typically safer, leaving this unspecified is unnecessary risk when parsing potentially untrusted repository content.

### Issue Context
The manifest path comes from CLI/user input and only basic file checks are performed.

### Fix Focus Areas
- src/main/java/io/github/guacsec/trustifyda/providers/JavaMavenProvider.java[83-116]

### What to change
- Configure `XMLInputFactory` securely before creating the reader, e.g.:
 - disable DTD support (`XMLInputFactory.SUPPORT_DTD` -> false)
 - disable external entities (`javax.xml.stream.isSupportingExternalEntities` -> false)
 - (if applicable) set JAXP external access properties to empty (e.g., `XMLConstants.ACCESS_EXTERNAL_DTD`, `XMLConstants.ACCESS_EXTERNAL_SCHEMA`).
- Keep behavior identical otherwise (still streaming parse, still return first `<license><name>`).
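
The hardening described above might be sketched as follows. This is not the PR's `readLicenseFromPom`: the class and method names are hypothetical, the input is an `InputStream` rather than a `Path`, and wrapping `XMLStreamException` in an unchecked exception is a choice made for this example. The two factory properties are the standard StAX ones; the JAXP `XMLConstants.ACCESS_EXTERNAL_*` properties are left out because not every StAX implementation accepts them.

```java
import java.io.InputStream;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

/** Hypothetical sketch: read the first POM license name with a hardened StAX factory. */
final class HardenedPomLicenseSketch {

  static XMLInputFactory secureFactory() {
    XMLInputFactory factory = XMLInputFactory.newInstance();
    // Turn off DTD support and external entity resolution before parsing.
    factory.setProperty(XMLInputFactory.SUPPORT_DTD, false);
    factory.setProperty(XMLInputFactory.IS_SUPPORTING_EXTERNAL_ENTITIES, false);
    return factory;
  }

  /** Returns the first licenses/license/name text, or null if absent or empty. */
  static String firstLicenseName(InputStream pom) {
    try {
      XMLStreamReader reader = secureFactory().createXMLStreamReader(pom);
      try {
        boolean insideLicenses = false;
        boolean insideLicense = false;
        while (reader.hasNext()) {
          int event = reader.next();
          if (event == XMLStreamConstants.START_ELEMENT) {
            String name = reader.getLocalName();
            if ("licenses".equals(name)) {
              insideLicenses = true;
            } else if (insideLicenses && "license".equals(name)) {
              insideLicense = true;
            } else if (insideLicense && "name".equals(name)) {
              String text = reader.getElementText().trim();
              return text.isEmpty() ? null : text;
            }
          } else if (event == XMLStreamConstants.END_ELEMENT) {
            String name = reader.getLocalName();
            if ("license".equals(name)) {
              insideLicense = false;
            } else if ("licenses".equals(name)) {
              return null; // section closed without a name element
            }
          }
        }
        return null;
      } finally {
        reader.close();
      }
    } catch (XMLStreamException e) {
      // Sketch-only choice: surface parse failures as an unchecked exception.
      throw new IllegalStateException("Failed to parse pom.xml", e);
    }
  }
}
```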



3. Ignores other license providers 🐞 Bug ✓ Correctness
Description
LicenseUtils.licensesFromReport stops after the first LicenseProviderResult with any data, so it can
miss additional packages/licenses present in subsequent provider results. This can under-report
incompatible dependencies when multiple license providers contribute partial data.
Code

src/main/java/io/github/guacsec/trustifyda/license/LicenseUtils.java[R285-287]

+      if (!result.isEmpty()) {
+        break;
+      }
Evidence
The method iterates analysisReport.getLicenses() (a list), but breaks immediately once any results
are collected, preventing merging/aggregation across multiple provider results.

src/main/java/io/github/guacsec/trustifyda/license/LicenseUtils.java[257-287]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
`licensesFromReport(...)` breaks out of the outer loop after the first `LicenseProviderResult` that yields any entries. If the backend returns multiple provider results where each contributes different subsets, the current logic will drop data.

### Issue Context
The code structure implies multiple license providers may exist (`analysisReport.getLicenses()` is a list).

### Fix Focus Areas
- src/main/java/io/github/guacsec/trustifyda/license/LicenseUtils.java[245-291]

### What to change
- Remove the early `break` and aggregate results across all `LicenseProviderResult` entries.
- If duplicates occur, define a deterministic merge strategy (e.g., prefer entries with non-null category/identifiers; or prefer a specific provider by name if exposed).
- Add/extend tests to cover multiple-provider scenarios so incompatibility detection is stable.
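
One possible shape for aggregating across all provider results is sketched below. The record types are hypothetical stand-ins for the report model (the real `LicenseProviderResult` structure is not shown in this review), and the merge rule, which keeps the first entry unless a later one supplies a missing category, is just one deterministic strategy among those suggested above.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: merge package licenses across several provider results. */
final class LicenseAggregationSketch {

  // Stand-in types; field names are assumptions, not the real report model.
  record PackageLicense(String purl, String spdxId, String category) {}

  record ProviderResult(String provider, List<PackageLicense> packages) {}

  /** Visits every provider (no early break) and merges entries by package URL. */
  static Map<String, PackageLicense> licensesFromReport(List<ProviderResult> results) {
    Map<String, PackageLicense> merged = new LinkedHashMap<>();
    for (ProviderResult result : results) {
      for (PackageLicense candidate : result.packages()) {
        merged.merge(
            candidate.purl(),
            candidate,
            // Keep the earlier entry unless it lacks a category that the later one has.
            (existing, incoming) ->
                existing.category() == null && incoming.category() != null
                    ? incoming
                    : existing);
      }
    }
    return merged;
  }
}
```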




Comment on lines +86 to +97
String manifestSpdx = projectLicense.fromManifest();
String fileSpdx = backendFileId != null ? backendFileId : projectLicense.fromFile();
boolean mismatch =
manifestSpdx != null
&& fileSpdx != null
&& !LicenseUtils.normalizeSpdx(manifestSpdx)
.equals(LicenseUtils.normalizeSpdx(fileSpdx));

CompletableFuture<JsonNode> manifestDetailsFuture =
manifestSpdx != null
? api.getLicenseDetails(manifestSpdx)
: CompletableFuture.completedFuture(null);
Action required

1. Non-SPDX breaks license check 🐞 Bug ✓ Correctness

LicenseCheck passes the raw manifest license string into ExhortApi.getLicenseDetails(), but
getLicenseDetails is explicitly keyed by SPDX identifiers; common manifest values like POM license
names (not SPDX IDs) will return null and can cause mismatch/compatibility evaluation to silently
degrade.
Contributor Author

@soul2zimate soul2zimate Mar 19, 2026

This is a known limitation for the moment. Both the JS and Java clients currently pass the same raw manifest license strings when fetching license details.

@soul2zimate soul2zimate requested a review from ruromero March 19, 2026 03:39