From a34a5b9a31b70ffefde7dbd59412d20921b3e678 Mon Sep 17 00:00:00 2001
From: "copilot-swe-agent[bot]" <198982749+Copilot@users.noreply.github.com>
Date: Thu, 4 Jun 2026 09:49:21 +0000
Subject: [PATCH 1/6] Initial plan

From da7f6c99f6c9bb923fce718aaffd8d195f36649e Mon Sep 17 00:00:00 2001
From: "copilot-swe-agent[bot]" <198982749+Copilot@users.noreply.github.com>
Date: Thu, 4 Jun 2026 09:51:56 +0000
Subject: [PATCH 2/6] docs: add GitLab CI/CD snippet to usage docs and README

---
 README.md     | 25 ++++++++++++++++++++++++-
 docs/usage.md | 46 +++++++++++++++++++++++++++++++++++++++++++++-
 2 files changed, 69 insertions(+), 2 deletions(-)

diff --git a/README.md b/README.md
index db8218d..7ef68a1 100644
--- a/README.md
+++ b/README.md
@@ -74,10 +74,33 @@ pip install git+https://github.com/SoftwareUnderstanding/RsMetaCheck.git

 ### GitHub Action

-RsMetaCheck can be easily integrated into your CI/CD pipelines as a GitHub Action. We have set it up in GitHub Action in the following repository: [rs-metacheck-action](https://github.com/SoftwareUnderstanding/rs-metacheck-action) and is up in GitHub MarketPlace at [rsmetacheck actions](https://github.com/marketplace/actions/rsmetacheck).
+RSMetaCheck can be easily integrated into your CI/CD pipelines as a GitHub Action. We have set it up in GitHub Action in the following repository: [rs-metacheck-action](https://github.com/SoftwareUnderstanding/rs-metacheck-action) and is up in GitHub MarketPlace at [rsmetacheck actions](https://github.com/marketplace/actions/rsmetacheck).

 The action will generate `all_pitfalls_results.json`, along with the `pitfalls/` and `somef_outputs/` directories directly in your workflow workspace.

+### GitLab CI/CD
+
+Add the following snippet to your `.gitlab-ci.yml` to run RSMetaCheck on your GitLab repository:
+
+```yaml
+rsmetacheck:
+  image: python:3.11
+  stage: test
+  script:
+    - pip install rsmetacheck
+    - somef configure -a
+    - rsmetacheck --input $CI_PROJECT_URL
+  artifacts:
+    paths:
+      - pitfalls_outputs/
+      - somef_outputs/
+      - analysis_results.json
+    when: always
+    expire_in: 1 week
+```
+
+`$CI_PROJECT_URL` is a built-in GitLab CI/CD variable that automatically resolves to your repository's URL. To avoid GitHub API rate limits when SoMEF fetches metadata, store your GitHub personal access token as a GitLab CI/CD variable named `GITHUB_TOKEN` and pass it via `somef configure -a -t $GITHUB_TOKEN`.
+
 ### Run the Detection Tool locally

 #### Analyze a Single Repository
diff --git a/docs/usage.md b/docs/usage.md
index fcbfa29..66952ef 100644
--- a/docs/usage.md
+++ b/docs/usage.md
@@ -1,6 +1,6 @@
 # Usage

-RSMetaCheck can be used as a local command-line tool or integrated into your CI/CD pipeline as a GitHub Action.
+RSMetaCheck can be used as a local command-line tool or integrated into your CI/CD pipeline as a GitHub Action or a GitLab CI/CD job.

 ## Command Line Interface

@@ -85,3 +85,47 @@ jobs:
         env:
           GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
 ```
+
+## GitLab CI/CD
+
+You can integrate RSMetaCheck into your GitLab pipelines by adding the following snippet to your `.gitlab-ci.yml` file:
+
+```yaml
+rsmetacheck:
+  image: python:3.11
+  stage: test
+  script:
+    - pip install rsmetacheck
+    - somef configure -a
+    - rsmetacheck --input $CI_PROJECT_URL
+  artifacts:
+    paths:
+      - pitfalls_outputs/
+      - somef_outputs/
+      - analysis_results.json
+    when: always
+    expire_in: 1 week
+```
+
+`$CI_PROJECT_URL` is a [built-in GitLab CI/CD variable](https://docs.gitlab.com/ee/ci/variables/predefined_variables.html) that automatically resolves to your repository's URL.
+
+### Providing a GitHub Token (recommended)
+
+SoMEF fetches repository metadata from GitHub's API. Without a token, anonymous requests are subject to low rate limits. To avoid this, store your GitHub personal access token as a [GitLab CI/CD variable](https://docs.gitlab.com/ee/ci/variables/) named `GITHUB_TOKEN` and pass it to `somef configure`:
+
+```yaml
+rsmetacheck:
+  image: python:3.11
+  stage: test
+  script:
+    - pip install rsmetacheck
+    - somef configure -a -t $GITHUB_TOKEN
+    - rsmetacheck --input $CI_PROJECT_URL
+  artifacts:
+    paths:
+      - pitfalls_outputs/
+      - somef_outputs/
+      - analysis_results.json
+    when: always
+    expire_in: 1 week
+```
From 82820f352ae51a1e69a39b87fad0a6c2d8237805 Mon Sep 17 00:00:00 2001
From: Thomas Vuillaume
Date: Thu, 4 Jun 2026 12:14:25 +0200
Subject: [PATCH 3/6] Refine GitHub Action integration instructions

Updated the GitHub Action section to provide a more concise integration guide and removed unnecessary code snippets.
---
 docs/usage.md | 39 ++-------------------------------------
 1 file changed, 2 insertions(+), 37 deletions(-)

diff --git a/docs/usage.md b/docs/usage.md
index 66952ef..2d64af1 100644
--- a/docs/usage.md
+++ b/docs/usage.md
@@ -67,24 +67,9 @@ poetry run rsmetacheck --input https://github.com/example/repo --config ./ci/rsm

 ## GitHub Action

-You can integrate RSMetaCheck into your GitHub workflows:
+You can integrate RSMetaCheck into your GitHub workflow to test your own repository and detect issues automatically.
+Please refer to our action in the GitHub MarketPlace at [rsmetacheck actions](https://github.com/marketplace/actions/rsmetacheck) for more information.

-```yaml
-name: RsMetaCheck
-
-on: [push, pull_request]
-
-jobs:
-  check-metadata:
-    runs-on: ubuntu-latest
-    steps:
-      - uses: actions/checkout@v4
-      - uses: SoftwareUnderstanding/RsMetaCheck@v0.2.1
-        with:
-          verbose: "false"
-        env:
-          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
-```

 ## GitLab CI/CD

@@ -109,23 +94,3 @@ rsmetacheck:

 `$CI_PROJECT_URL` is a [built-in GitLab CI/CD variable](https://docs.gitlab.com/ee/ci/variables/predefined_variables.html) that automatically resolves to your repository's URL.

-### Providing a GitHub Token (recommended)
-
-SoMEF fetches repository metadata from GitHub's API. Without a token, anonymous requests are subject to low rate limits. To avoid this, store your GitHub personal access token as a [GitLab CI/CD variable](https://docs.gitlab.com/ee/ci/variables/) named `GITHUB_TOKEN` and pass it to `somef configure`:
-
-```yaml
-rsmetacheck:
-  image: python:3.11
-  stage: test
-  script:
-    - pip install rsmetacheck
-    - somef configure -a -t $GITHUB_TOKEN
-    - rsmetacheck --input $CI_PROJECT_URL
-  artifacts:
-    paths:
-      - pitfalls_outputs/
-      - somef_outputs/
-      - analysis_results.json
-    when: always
-    expire_in: 1 week
-```
From d5e221c64963850004d9f8cc44f16f1e40cdf056 Mon Sep 17 00:00:00 2001
From: "copilot-swe-agent[bot]" <198982749+Copilot@users.noreply.github.com>
Date: Thu, 4 Jun 2026 10:19:45 +0000
Subject: [PATCH 4/6] docs: replace README Usage section with link to usage documentation

---
 README.md | 186 +-----------------------------------------------------
 1 file changed, 1 insertion(+), 185 deletions(-)

diff --git a/README.md b/README.md
index 7ef68a1..108e4f8 100644
--- a/README.md
+++ b/README.md
@@ -72,191 +72,7 @@ pip install git+https://github.com/SoftwareUnderstanding/RsMetaCheck.git

 ## Usage

-### GitHub Action
-
-RSMetaCheck can be easily integrated into your CI/CD pipelines as a GitHub Action. We have set it up in GitHub Action in the following repository: [rs-metacheck-action](https://github.com/SoftwareUnderstanding/rs-metacheck-action) and is up in GitHub MarketPlace at [rsmetacheck actions](https://github.com/marketplace/actions/rsmetacheck).
-
-The action will generate `all_pitfalls_results.json`, along with the `pitfalls/` and `somef_outputs/` directories directly in your workflow workspace.
-
-### GitLab CI/CD
-
-Add the following snippet to your `.gitlab-ci.yml` to run RSMetaCheck on your GitLab repository:
-
-```yaml
-rsmetacheck:
-  image: python:3.11
-  stage: test
-  script:
-    - pip install rsmetacheck
-    - somef configure -a
-    - rsmetacheck --input $CI_PROJECT_URL
-  artifacts:
-    paths:
-      - pitfalls_outputs/
-      - somef_outputs/
-      - analysis_results.json
-    when: always
-    expire_in: 1 week
-```
-
-`$CI_PROJECT_URL` is a built-in GitLab CI/CD variable that automatically resolves to your repository's URL. To avoid GitHub API rate limits when SoMEF fetches metadata, store your GitHub personal access token as a GitLab CI/CD variable named `GITHUB_TOKEN` and pass it via `somef configure -a -t $GITHUB_TOKEN`.
-
-### Run the Detection Tool locally
-
-#### Analyze a Single Repository
-
-```bash
-poetry run rsmetacheck --input https://github.com/tidyverse/tidyverse
-```
-
-#### Analyze a Specific Branch
-
-You can analyze a specific branch of a repository by using the `--branch` or `-b` flag:
-
-```bash
-poetry run rsmetacheck --input https://github.com/tidyverse/tidyverse --branch develop
-```
-
-#### Analyze Multiple Repositories from a JSON File
-
-```bash
-poetry run rsmetacheck --input repositories.json
-```
-
-The `repositories.json` file should be structured as follows:
-
-```json
-{
-  "repositories": [
-    "https://gitlab.com/example/example_repo_1",
-    "https://gitlab.com/example/example_repo_2",
-    "https://github.com/example/example_repo_3"
-  ]
-}
-```
-
-#### Customize Output Paths
-
-```bash
-poetry run rsmetacheck --input repositories.json \
-  --somef-output ./results/somef \
-  --pitfalls-output ./results/pitfalls \
-  --analysis-output ./results/summary.json \
-  --notes-output ./results/notes.json
-```
-
-#### Version Discrepancy Notes
-
-When a metadata version differs from the release version by a small margin (all version components differ by less than 2, e.g., `0.4.3.dev1` vs `0.4.2`), MetaCheck records a **note** rather than a full pitfall. To capture these observations, use the `--notes-output` flag:
-
-```bash
-poetry run rsmetacheck --input https://github.com/example/repo --notes-output ./notes.json
-```
-
-The notes file is only created when there are observations to report and the `--notes-output` path is specified. Its structure is:
-
-```json
-{
-  "total_notes": 1,
-  "notes": [
-    {
-      "repository": "example/repo",
-      "file_name": "repo_output.json",
-      "code": "P001",
-      "note": "Version discrepancy: metadata '0.4.3.dev1' vs release '0.4.2'"
-    }
-  ]
-}
-```
-
-If the version difference is significant (any component differs by 2 or more, e.g., `0.12.4` vs `0.12.1`), it is still flagged as a pitfall.
-
-#### Skip SoMEF and Analyze Existing Outputs
-
-If you've already run SoMEF separately:
-
-```bash
-poetry run rsmetacheck --skip-somef --input somef_outputs/*.json
-```
-
-Or for multiple paths:
-
-```bash
-poetry run rsmetacheck --skip-somef --input my_somef_outputs_1/*.json my_somef_outputs_2/*.json
-```
-
-#### Verbose Output for Passed Checks
-
-By default, the JSON-LD files generated by RsMetaCheck will only contain information about pitfalls and warnings that were actually detected. If you want to include all tests in the final JSON-LD, even tests that the repository successfully passed, use the `--verbose` flag:
-
-```bash
-poetry run rsmetacheck --input https://github.com/tidyverse/tidyverse --verbose
-```
-
-#### Configure Analysis with a Root Config File
-
-You can configure RsMetaCheck with a TOML file at the repository root named `.rsmetacheck.toml` (auto-detected), or pass a custom path with `--config`.
-
-Supported options:
-
-- `ignore`: warnings/pitfalls to ignore (e.g. `P001`, `W002`)
-- `exclude_files`: metadata sources to ignore (glob, filename, or substring match)
-- `parameters`: per-check parameters for configurable checks
-- `profiles`: alternate configurations such as `unstable` or `prerelease`
-
-Example:
-
-```toml
-ignore = ["W002"]
-exclude_files = ["**/generated/**", "tmp_metadata.json"]
-
-[parameters.P001]
-ahead_significant_diff = 2
-
-[parameters.W002]
-stale_after_days = 3
-
-[profiles.unstable]
-ignore = ["W002", "P017"]
-
-[profiles.unstable.parameters.P001]
-ahead_significant_diff = 10
-
-[profiles.prerelease]
-ignore = []
-
-[profiles.prerelease.parameters.P001]
-ahead_significant_diff = 1
-```
-
-Use a specific profile:
-
-```bash
-poetry run rsmetacheck --input https://github.com/example/repo --config-profile unstable
-```
-
-Use a custom config path:
-
-```bash
-poetry run rsmetacheck --input https://github.com/example/repo --config ./ci/rsmetacheck.toml
-```
-
-### Output
-
-The tool will:
-
-- Process all JSON files in the SoMEF output directory (by default `somef_outputs` created by the tool)
-- Display progress messages showing detected pitfalls
-- Generate JSON-LD files of detailed Pitfalls and Warnings detected by the tool in `output_1_pitfalls.jsonld`,
-  `output_2_pitfalls.jsonld`, etc... in `pitfalls` (by default created by the tool) directory
-- Generate a comprehensive report in `all_pitfalls_results.json`
-
-The output file contains:
-
-- EVERSE standardized JSON-LD output of each repository
-- Summary statistics of analyzed repositories
-- Count and percentage for each pitfall type
-- Language-specific breakdown for repositories with target languages
+For full usage instructions — including CLI options, GitHub Action integration, GitLab CI/CD setup, output format, and configuration — please refer to the [usage documentation](https://rsmetacheck.readthedocs.io/en/latest/usage/).

 ## Troubleshooting

From 3534d07d402cc4898c2005f1f27d1b80509311ac Mon Sep 17 00:00:00 2001
From: "copilot-swe-agent[bot]" <198982749+Copilot@users.noreply.github.com>
Date: Thu, 4 Jun 2026 10:25:12 +0000
Subject: [PATCH 5/6] =?UTF-8?q?docs:=20make=20usage.md=20exhaustive=20?=
 =?UTF-8?q?=E2=80=94=20add=20all=20CLI=20flags,=20config=20keys,=20and=20G?=
 =?UTF-8?q?itLab=20token=20snippet?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

---
 docs/usage.md | 161 ++++++++++++++++++++++++++++++++++++++++++++------
 1 file changed, 144 insertions(+), 17 deletions(-)

diff --git a/docs/usage.md b/docs/usage.md
index 2d64af1..ca646eb 100644
--- a/docs/usage.md
+++ b/docs/usage.md
@@ -12,13 +12,21 @@ poetry run rsmetacheck --input https://github.com/tidyverse/tidyverse

 ### Analyze a Specific Branch

+Use `--branch` (short: `-b`) to target a non-default branch:
+
 ```bash
 poetry run rsmetacheck --input https://github.com/tidyverse/tidyverse --branch develop
 ```

-### Analyze Multiple Repositories from a JSON File
+### Analyze Multiple Repositories
+
+Pass several URLs directly on the command line:
+
+```bash
+poetry run rsmetacheck --input https://github.com/example/repo_1 https://github.com/example/repo_2
+```

-Create a `repositories.json` file:
+Or create a `repositories.json` file and pass it as the input:

 ```json
 {
@@ -29,48 +37,146 @@ Create a `repositories.json` file:
 }
 ```

-Run the analysis:
-
 ```bash
 poetry run rsmetacheck --input repositories.json
 ```

+### Customize Output Paths
+
+By default RSMetaCheck writes its output to the current working directory. Use the flags below to redirect any of the outputs:
+
+| Flag | Default | Description |
+|------|---------|-------------|
+| `--somef-output` | `./somef_outputs` | Directory for raw SoMEF JSON files |
+| `--pitfalls-output` | `./pitfalls_outputs` | Directory for per-repository pitfall JSON-LD files |
+| `--analysis-output` | `./analysis_results.json` | File for the overall summary report |
+| `--notes-output` | *(not created)* | File for minor version-discrepancy notes (see below) |
+
+```bash
+poetry run rsmetacheck --input repositories.json \
+  --somef-output ./results/somef \
+  --pitfalls-output ./results/pitfalls \
+  --analysis-output ./results/summary.json \
+  --notes-output ./results/notes.json
+```
+
+### Version Discrepancy Notes
+
+When a metadata version differs from the release version only slightly (every component differs by less than 2, e.g. `0.4.3.dev1` vs `0.4.2`), RSMetaCheck records a **note** instead of a full pitfall. Notes are only written when `--notes-output` is provided:
+
+```bash
+poetry run rsmetacheck --input https://github.com/example/repo --notes-output ./notes.json
+```
+
+Example notes file:
+
+```json
+{
+  "total_notes": 1,
+  "notes": [
+    {
+      "repository": "example/repo",
+      "file_name": "repo_output.json",
+      "code": "P001",
+      "note": "Version discrepancy: metadata '0.4.3.dev1' vs release '0.4.2'"
+    }
+  ]
+}
+```
+
+When the difference is significant (any component differs by 2 or more, e.g. `0.12.4` vs `0.12.1`), the issue is still reported as a pitfall regardless.
+
+### Skip SoMEF and Analyze Existing Outputs
+
+If you have already run SoMEF separately, pass `--skip-somef` and point `--input` at the existing JSON files to avoid re-running SoMEF:
+
+```bash
+poetry run rsmetacheck --skip-somef --input somef_outputs/*.json
+```
+
+Multiple glob patterns are supported:
+
+```bash
+poetry run rsmetacheck --skip-somef --input my_somef_outputs_1/*.json my_somef_outputs_2/*.json
+```
+
+### Verbose Output
+
+By default, only detected pitfalls and warnings appear in the output JSON-LD files. Use `--verbose` to also include checks that passed:
+
+```bash
+poetry run rsmetacheck --input https://github.com/tidyverse/tidyverse --verbose
+```
+
+### SoMEF Confidence Threshold
+
+Use `--threshold` to control how confident SoMEF must be before including a metadata field (default: `0.8`):
+
+```bash
+poetry run rsmetacheck --input https://github.com/example/repo --threshold 0.6
+```
+
+### Generate CodeMeta Files
+
+Use `-c` / `--generate-codemeta` to instruct SoMEF to also produce a `codemeta.json` file for each repository:
+
+```bash
+poetry run rsmetacheck --input https://github.com/example/repo --generate-codemeta
+```
+
 ### Configure Analysis Rules

-RsMetaCheck can load a root-level `.rsmetacheck.toml` file to customize analysis behavior.
+RSMetaCheck automatically detects a `.rsmetacheck.toml` (or `rsmetacheck.toml`) file at the working directory. Alternatively, supply a custom path with `--config`:
+
+```bash
+poetry run rsmetacheck --input https://github.com/example/repo --config ./ci/rsmetacheck.toml
+```
+
+Supported configuration keys:
+
+- `ignore` — list of pitfall/warning codes to skip (e.g. `"P001"`, `"W002"`)
+- `exclude_files` — glob patterns, filenames, or substrings of metadata sources to ignore
+- `parameters` — per-check tunable parameters
+- `active_profile` — name of the profile to activate automatically when no `--config-profile` flag is passed
+- `profiles` — named groups of overrides that can be selected at runtime
+
+Full example:

 ```toml
 ignore = ["W002"]
-exclude_files = ["tmp_metadata.json"]
+exclude_files = ["**/generated/**", "tmp_metadata.json"]
+active_profile = "unstable"

 [parameters.P001]
-ahead_significant_diff = 10
+ahead_significant_diff = 2

-[profiles.prerelease]
-ignore = []
+[parameters.W002]
+stale_after_days = 3

 [profiles.unstable]
 ignore = ["W002", "P017"]
-```

-Use a profile:
+[profiles.unstable.parameters.P001]
+ahead_significant_diff = 10

-```bash
-poetry run rsmetacheck --input https://github.com/example/repo --config-profile unstable
+[profiles.prerelease]
+ignore = []
+
+[profiles.prerelease.parameters.P001]
+ahead_significant_diff = 1
 ```

-Use an explicit config path:
+Activate a profile from the command line (overrides `active_profile`):

 ```bash
-poetry run rsmetacheck --input https://github.com/example/repo --config ./ci/rsmetacheck.toml
+poetry run rsmetacheck --input https://github.com/example/repo --config-profile unstable
 ```

 ## GitHub Action

-You can integrate RSMetaCheck into your GitHub workflow to test your own repository and detect issues automatically.
+You can integrate RSMetaCheck into your GitHub workflow to test your own repository and detect issues automatically.
 Please refer to our action in the GitHub MarketPlace at [rsmetacheck actions](https://github.com/marketplace/actions/rsmetacheck) for more information.

-
 ## GitLab CI/CD

 You can integrate RSMetaCheck into your GitLab pipelines by adding the following snippet to your `.gitlab-ci.yml` file:
@@ -94,3 +200,24 @@ rsmetacheck:

 `$CI_PROJECT_URL` is a [built-in GitLab CI/CD variable](https://docs.gitlab.com/ee/ci/variables/predefined_variables.html) that automatically resolves to your repository's URL.

+### Providing a GitHub Token (recommended)
+
+SoMEF fetches repository metadata from GitHub's API. Without a token, anonymous requests are subject to low rate limits. Store your GitHub personal access token as a [GitLab CI/CD variable](https://docs.gitlab.com/ee/ci/variables/) named `GITHUB_TOKEN` and pass it to `somef configure`:
+
+```yaml
+rsmetacheck:
+  image: python:3.11
+  stage: test
+  script:
+    - pip install rsmetacheck
+    - somef configure -a -t $GITHUB_TOKEN
+    - rsmetacheck --input $CI_PROJECT_URL
+  artifacts:
+    paths:
+      - pitfalls_outputs/
+      - somef_outputs/
+      - analysis_results.json
+    when: always
+    expire_in: 1 week
+```
+
From 53bb20de91f0007cb36b046723d096c02f49396b Mon Sep 17 00:00:00 2001
From: "copilot-swe-agent[bot]" <198982749+Copilot@users.noreply.github.com>
Date: Thu, 4 Jun 2026 10:26:04 +0000
Subject: [PATCH 6/6] docs: clarify version discrepancy note/pitfall threshold examples

---
 docs/usage.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/docs/usage.md b/docs/usage.md
index ca646eb..16a7aa4 100644
--- a/docs/usage.md
+++ b/docs/usage.md
@@ -62,7 +62,7 @@ poetry run rsmetacheck --input repositories.json \

 ### Version Discrepancy Notes

-When a metadata version differs from the release version only slightly (every component differs by less than 2, e.g. `0.4.3.dev1` vs `0.4.2`), RSMetaCheck records a **note** instead of a full pitfall. Notes are only written when `--notes-output` is provided:
+When a metadata version differs from the release version only slightly (every component differs by less than 2, e.g. `0.4.3.dev1` vs `0.4.2` — the pre-release suffix means it is numerically close), RSMetaCheck records a **note** instead of a full pitfall. Notes are only written when `--notes-output` is provided:

 ```bash
 poetry run rsmetacheck --input https://github.com/example/repo --notes-output ./notes.json
@@ -84,7 +84,7 @@ Example notes file:
 }
 ```

-When the difference is significant (any component differs by 2 or more, e.g. `0.12.4` vs `0.12.1`), the issue is still reported as a pitfall regardless.
+When the difference is significant (any component differs by 2 or more, e.g. `0.12.3` vs `0.12.1`), the issue is still reported as a pitfall regardless.

 ### Skip SoMEF and Analyze Existing Outputs
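
Reviewer note: the note/pitfall threshold documented in the last patch can be sketched in Python as follows. This is only an illustration of the rule as the docs word it, not RSMetaCheck's actual implementation. The function names `numeric_components` and `classify_version_gap`, the `significant_diff` parameter (loosely mirroring `ahead_significant_diff` from the config example), and the choice to drop pre-release suffixes such as `.dev1` before comparing are all assumptions made for the example.

```python
import re
from itertools import zip_longest


def numeric_components(version: str) -> list[int]:
    """Return the leading numeric components of a version string.

    Hypothetical helper: parsing stops at the first component that does not
    start with a digit, so '0.4.3.dev1' yields [0, 4, 3].
    """
    parts = []
    for piece in version.lstrip("vV").split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return parts


def classify_version_gap(metadata_version: str, release_version: str,
                         significant_diff: int = 2) -> str:
    """Classify a version mismatch as 'match', 'note', or 'pitfall'.

    Assumed rule, taken from the wording in the docs: if any component
    differs by `significant_diff` or more the mismatch is a pitfall,
    otherwise it is only a note.
    """
    if metadata_version == release_version:
        return "match"
    meta = numeric_components(metadata_version)
    release = numeric_components(release_version)
    # Pad the shorter version with zeros so '1.2' compares like '1.2.0'.
    diffs = [abs(a - b) for a, b in zip_longest(meta, release, fillvalue=0)]
    if any(diff >= significant_diff for diff in diffs):
        return "pitfall"
    return "note"


if __name__ == "__main__":
    print(classify_version_gap("0.4.3.dev1", "0.4.2"))
    print(classify_version_gap("0.12.3", "0.12.1"))
```

Under these assumptions the two documented examples fall on opposite sides of the threshold: `0.4.3.dev1` vs `0.4.2` differs by 1 in the last component, while `0.12.3` vs `0.12.1` differs by 2.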