feat: implement license resolution and identification #356
soul2zimate wants to merge 2 commits into guacsec:main
Add license analysis features that detect the project license, check dependency license compatibility, and include license information in generated SBOMs. This mirrors the JavaScript client implementation.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Review Summary by Qodo
Implement license resolution and identification with compatibility checking

Walkthroughs
Description
- Implement comprehensive license resolution and identification features for the Java client
- Add a `componentAnalysisWithLicense()` API method that performs component analysis with automatic license checking
- Implement license detection from ecosystem-specific manifests (pom.xml, package.json, Cargo.toml) with a LICENSE file fallback
- Add license compatibility checking based on a restrictiveness hierarchy (permissive < weak copyleft < strong copyleft)
- Integrate license information into generated SBOMs with SPDX normalization
- Add a CLI `license` command to display project license information
- Implement SPDX pattern matching for common licenses (Apache, MIT, GPL, LGPL, AGPL, BSD)
- Create `LicenseUtils`, `LicenseCheck`, and `ProjectLicense` utility classes for license operations
- Update all providers (Maven, JavaScript, Cargo, Go, Python, Gradle) to extract and include license information
- Add comprehensive documentation and CLI help for license features
- Update test fixtures and add new test cases for license extraction across ecosystems

Diagram
flowchart LR
A["Component Analysis"] -->|includes| B["License Detection"]
B -->|from manifest| C["Ecosystem Providers"]
B -->|fallback| D["LICENSE File"]
C -->|Maven| E["pom.xml"]
C -->|JavaScript| F["package.json"]
C -->|Cargo| G["Cargo.toml"]
B -->|backend| H["License Identification"]
H -->|SPDX| I["Normalized License"]
I -->|compatibility check| J["LicenseCheck"]
J -->|result| K["ComponentAnalysisResult"]
K -->|include in| L["SBOM with License"]
M["CLI license command"] -->|display| N["Project License Info"]
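
The restrictiveness hierarchy mentioned in the description (permissive < weak copyleft < strong copyleft) can be illustrated with a minimal sketch. The class, enum, and method names below are hypothetical and are not taken from the PR. The rule that a dependency is flagged when it is more restrictive than the project is an assumption made for illustration; the actual `LicenseCheck` logic may differ.

```java
// Illustrative sketch only: names and the comparison rule are assumptions,
// not the PR's actual LicenseCheck implementation.
final class LicenseCompatibilitySketch {

    // Categories ordered from least to most restrictive; UNKNOWN is kept last
    // and is never compared by ordinal.
    enum Category { PERMISSIVE, WEAK_COPYLEFT, STRONG_COPYLEFT, UNKNOWN }

    enum Verdict { COMPATIBLE, INCOMPATIBLE, UNKNOWN }

    // Assumed rule: a dependency that is more restrictive than the project
    // license is reported as incompatible.
    static Verdict check(Category project, Category dependency) {
        if (project == Category.UNKNOWN || dependency == Category.UNKNOWN) {
            return Verdict.UNKNOWN; // cannot decide without both categories
        }
        return dependency.ordinal() > project.ordinal()
                ? Verdict.INCOMPATIBLE
                : Verdict.COMPATIBLE;
    }

    public static void main(String[] args) {
        // A permissive project pulling in a strong-copyleft dependency.
        System.out.println(check(Category.PERMISSIVE, Category.STRONG_COPYLEFT));
        // A strong-copyleft project pulling in a permissive dependency.
        System.out.println(check(Category.STRONG_COPYLEFT, Category.PERMISSIVE));
    }
}
```

Returning an explicit `UNKNOWN` verdict, rather than defaulting to compatible, keeps an unresolved license visible to the caller.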
File Changes
1. src/main/java/io/github/guacsec/trustifyda/impl/ExhortApi.java
Code Review by Qodo
1. Non-SPDX breaks license check

    String manifestSpdx = projectLicense.fromManifest();
    String fileSpdx = backendFileId != null ? backendFileId : projectLicense.fromFile();
    boolean mismatch =
        manifestSpdx != null
            && fileSpdx != null
            && !LicenseUtils.normalizeSpdx(manifestSpdx)
                .equals(LicenseUtils.normalizeSpdx(fileSpdx));

    CompletableFuture<JsonNode> manifestDetailsFuture =
        manifestSpdx != null
            ? api.getLicenseDetails(manifestSpdx)
            : CompletableFuture.completedFuture(null);

1. Non-SPDX breaks license check 🐞 Bug ✓ Correctness
`LicenseCheck` passes the raw manifest license string to `ExhortApi.getLicenseDetails()`, which is keyed by SPDX identifier. Common manifest values such as POM license names (which are not SPDX IDs) therefore return null, and the mismatch and compatibility evaluation can silently degrade.
Agent Prompt
### Issue description
`LicenseCheck`/CLI assume the project license string is already an SPDX identifier and pass it to `ExhortApi.getLicenseDetails()`, but manifest extraction often yields non‑SPDX names/expressions (e.g., Maven `<license><name>Apache License 2.0</name>`). This leads to null backend lookups and can silently disable/incorrectly compute mismatch + compatibility.
### Issue Context
- `ExhortApi.getLicenseDetails` is keyed by SPDX id (and calls `/api/v5/licenses/{spdx}`), so callers must supply a real SPDX id.
- Today, providers return raw manifest license strings and `LicenseUtils.getLicense` does no SPDX normalization.
### Fix Focus Areas
- src/main/java/io/github/guacsec/trustifyda/license/LicenseCheck.java[86-110]
- src/main/java/io/github/guacsec/trustifyda/license/LicenseUtils.java[160-179]
- src/main/java/io/github/guacsec/trustifyda/providers/JavaMavenProvider.java[71-116]
- src/main/java/io/github/guacsec/trustifyda/cli/App.java[251-270]
### What to change
- Introduce a single normalization step that converts a manifest/license-file string into a best-effort SPDX identifier **before**:
  - calling `getLicenseDetails(...)`
  - comparing for mismatch
- If normalization fails, propagate a clear error message into the license summary / CLI output (instead of silently returning null details), and treat category as UNKNOWN.
- For Maven specifically, consider preferring SPDX-like values when present (or mapping common POM license names to SPDX) so typical POMs work out of the box.
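
The normalization step suggested above could look roughly like the following sketch. Everything in it is hypothetical: the class name, the tiny name-to-SPDX table, and the `UNKNOWN` summary text are illustrative assumptions and are not part of the PR or of `LicenseUtils`. A real implementation would need a far larger mapping and support for SPDX expressions.

```java
import java.util.Locale;
import java.util.Map;
import java.util.Optional;
import java.util.Set;

// Illustrative sketch only: a best-effort mapping from raw manifest strings
// to SPDX identifiers. The tables below are a small, assumed subset.
final class SpdxNormalizerSketch {

    // A few SPDX identifiers accepted as-is (matched case-insensitively).
    private static final Set<String> KNOWN_IDS =
            Set.of("Apache-2.0", "MIT", "BSD-3-Clause", "GPL-3.0-only");

    // Hypothetical mapping of common POM <name> values (lower-cased) to SPDX ids.
    private static final Map<String, String> NAME_TO_ID = Map.of(
            "apache license 2.0", "Apache-2.0",
            "apache license, version 2.0", "Apache-2.0",
            "the apache software license, version 2.0", "Apache-2.0",
            "mit license", "MIT",
            "the mit license", "MIT");

    // Returns the SPDX id when the raw string can be resolved, otherwise empty.
    static Optional<String> toSpdx(String raw) {
        if (raw == null || raw.trim().isEmpty()) {
            return Optional.empty();
        }
        String trimmed = raw.trim();
        for (String id : KNOWN_IDS) {
            if (id.equalsIgnoreCase(trimmed)) {
                return Optional.of(id);
            }
        }
        return Optional.ofNullable(NAME_TO_ID.get(trimmed.toLowerCase(Locale.ROOT)));
    }

    // Compares manifest and file licenses only after both normalize;
    // empty means "cannot tell" instead of a silent false or true.
    static Optional<Boolean> mismatch(String manifestRaw, String fileRaw) {
        Optional<String> manifest = toSpdx(manifestRaw);
        Optional<String> file = toSpdx(fileRaw);
        if (!manifest.isPresent() || !file.isPresent()) {
            return Optional.empty();
        }
        return Optional.of(!manifest.get().equals(file.get()));
    }

    // Produces a summary string that surfaces normalization failures.
    static String summarize(String raw) {
        return toSpdx(raw).orElse("UNKNOWN (could not normalize '" + raw + "' to SPDX)");
    }
}
```

With this shape, a caller would pass only the resolved identifier to an SPDX-keyed lookup such as `getLicenseDetails(...)` and would report the `UNKNOWN` summary when resolution fails.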
This is a known limitation for now. Both the JS and Java clients use the same raw manifest license strings to fetch license details.
feat: implement license resolution and identification
Add license analysis features that detect the project license, check dependency license compatibility, and include license information in generated SBOMs. This mirrors the JavaScript client implementation.
resolve #355