diff --git a/README.md b/README.md
index db8218d..108e4f8 100644
--- a/README.md
+++ b/README.md
@@ -72,168 +72,7 @@ pip install git+https://github.com/SoftwareUnderstanding/RsMetaCheck.git
 
 ## Usage
 
-### GitHub Action
-
-RsMetaCheck can be easily integrated into your CI/CD pipelines as a GitHub Action. We have set it up in GitHub Action in the following repository: [rs-metacheck-action](https://github.com/SoftwareUnderstanding/rs-metacheck-action) and is up in GitHub MarketPlace at [rsmetacheck actions](https://github.com/marketplace/actions/rsmetacheck).
-
-The action will generate `all_pitfalls_results.json`, along with the `pitfalls/` and `somef_outputs/` directories directly in your workflow workspace.
-
-### Run the Detection Tool locally
-
-#### Analyze a Single Repository
-
-```bash
-poetry run rsmetacheck --input https://github.com/tidyverse/tidyverse
-```
-
-#### Analyze a Specific Branch
-
-You can analyze a specific branch of a repository by using the `--branch` or `-b` flag:
-
-```bash
-poetry run rsmetacheck --input https://github.com/tidyverse/tidyverse --branch develop
-```
-
-#### Analyze Multiple Repositories from a JSON File
-
-```bash
-poetry run rsmetacheck --input repositories.json
-```
-
-The `repositories.json` file should be structured as follows:
-
-```json
-{
-  "repositories": [
-    "https://gitlab.com/example/example_repo_1",
-    "https://gitlab.com/example/example_repo_2",
-    "https://github.com/example/example_repo_3"
-  ]
-}
-```
-
-#### Customize Output Paths
-
-```bash
-poetry run rsmetacheck --input repositories.json \
-  --somef-output ./results/somef \
-  --pitfalls-output ./results/pitfalls \
-  --analysis-output ./results/summary.json \
-  --notes-output ./results/notes.json
-```
-
-#### Version Discrepancy Notes
-
-When a metadata version differs from the release version by a small margin (all version components differ by less than 2, e.g., `0.4.3.dev1` vs `0.4.2`), MetaCheck records a **note** rather than a full pitfall. To capture these observations, use the `--notes-output` flag:
-
-```bash
-poetry run rsmetacheck --input https://github.com/example/repo --notes-output ./notes.json
-```
-
-The notes file is only created when there are observations to report and the `--notes-output` path is specified. Its structure is:
-
-```json
-{
-  "total_notes": 1,
-  "notes": [
-    {
-      "repository": "example/repo",
-      "file_name": "repo_output.json",
-      "code": "P001",
-      "note": "Version discrepancy: metadata '0.4.3.dev1' vs release '0.4.2'"
-    }
-  ]
-}
-```
-
-If the version difference is significant (any component differs by 2 or more, e.g., `0.12.4` vs `0.12.1`), it is still flagged as a pitfall.
-
-#### Skip SoMEF and Analyze Existing Outputs
-
-If you've already run SoMEF separately:
-
-```bash
-poetry run rsmetacheck --skip-somef --input somef_outputs/*.json
-```
-
-Or for multiple paths:
-
-```bash
-poetry run rsmetacheck --skip-somef --input my_somef_outputs_1/*.json my_somef_outputs_2/*.json
-```
-
-#### Verbose Output for Passed Checks
-
-By default, the JSON-LD files generated by RsMetaCheck will only contain information about pitfalls and warnings that were actually detected. If you want to include all tests in the final JSON-LD, even tests that the repository successfully passed, use the `--verbose` flag:
-
-```bash
-poetry run rsmetacheck --input https://github.com/tidyverse/tidyverse --verbose
-```
-
-#### Configure Analysis with a Root Config File
-
-You can configure RsMetaCheck with a TOML file at the repository root named `.rsmetacheck.toml` (auto-detected), or pass a custom path with `--config`.
-
-Supported options:
-
-- `ignore`: warnings/pitfalls to ignore (e.g. `P001`, `W002`)
-- `exclude_files`: metadata sources to ignore (glob, filename, or substring match)
-- `parameters`: per-check parameters for configurable checks
-- `profiles`: alternate configurations such as `unstable` or `prerelease`
-
-Example:
-
-```toml
-ignore = ["W002"]
-exclude_files = ["**/generated/**", "tmp_metadata.json"]
-
-[parameters.P001]
-ahead_significant_diff = 2
-
-[parameters.W002]
-stale_after_days = 3
-
-[profiles.unstable]
-ignore = ["W002", "P017"]
-
-[profiles.unstable.parameters.P001]
-ahead_significant_diff = 10
-
-[profiles.prerelease]
-ignore = []
-
-[profiles.prerelease.parameters.P001]
-ahead_significant_diff = 1
-```
-
-Use a specific profile:
-
-```bash
-poetry run rsmetacheck --input https://github.com/example/repo --config-profile unstable
-```
-
-Use a custom config path:
-
-```bash
-poetry run rsmetacheck --input https://github.com/example/repo --config ./ci/rsmetacheck.toml
-```
-
-### Output
-
-The tool will:
-
-- Process all JSON files in the SoMEF output directory (by default `somef_outputs` created by the tool)
-- Display progress messages showing detected pitfalls
-- Generate JSON-LD files of detailed Pitfalls and Warnings detected by the tool in `output_1_pitfalls.jsonld`,
-  `output_2_pitfalls.jsonld`, etc... in `pitfalls` (by default created by the tool) directory
-- Generate a comprehensive report in `all_pitfalls_results.json`
-
-The output file contains:
-
-- EVERSE standardized JSON-LD output of each repository
-- Summary statistics of analyzed repositories
-- Count and percentage for each pitfall type
-- Language-specific breakdown for repositories with target languages
+For full usage instructions — including CLI options, GitHub Action integration, GitLab CI/CD setup, output format, and configuration — please refer to the [usage documentation](https://rsmetacheck.readthedocs.io/en/latest/usage/).
 
 ## Troubleshooting
 
diff --git a/docs/usage.md b/docs/usage.md
index fcbfa29..16a7aa4 100644
--- a/docs/usage.md
+++ b/docs/usage.md
@@ -1,6 +1,6 @@
 # Usage
 
-RSMetaCheck can be used as a local command-line tool or integrated into your CI/CD pipeline as a GitHub Action.
+RSMetaCheck can be used as a local command-line tool or integrated into your CI/CD pipeline as a GitHub Action or a GitLab CI/CD job.
 
 ## Command Line Interface
 
@@ -12,13 +12,21 @@ poetry run rsmetacheck --input https://github.com/tidyverse/tidyverse
 
 ### Analyze a Specific Branch
 
+Use `--branch` (short: `-b`) to target a non-default branch:
+
 ```bash
 poetry run rsmetacheck --input https://github.com/tidyverse/tidyverse --branch develop
 ```
 
-### Analyze Multiple Repositories from a JSON File
+### Analyze Multiple Repositories
+
+Pass several URLs directly on the command line:
+
+```bash
+poetry run rsmetacheck --input https://github.com/example/repo_1 https://github.com/example/repo_2
+```
 
-Create a `repositories.json` file:
+Or create a `repositories.json` file and pass it as the input:
 
 ```json
 {
@@ -29,59 +37,187 @@ Create a `repositories.json` file:
 }
 ```
 
-Run the analysis:
-
 ```bash
 poetry run rsmetacheck --input repositories.json
 ```
 
+### Customize Output Paths
+
+By default RSMetaCheck writes its output to the current working directory. Use the flags below to redirect any of the outputs:
+
+| Flag | Default | Description |
+|------|---------|-------------|
+| `--somef-output` | `./somef_outputs` | Directory for raw SoMEF JSON files |
+| `--pitfalls-output` | `./pitfalls_outputs` | Directory for per-repository pitfall JSON-LD files |
+| `--analysis-output` | `./analysis_results.json` | File for the overall summary report |
+| `--notes-output` | *(not created)* | File for minor version-discrepancy notes (see below) |
+
+```bash
+poetry run rsmetacheck --input repositories.json \
+  --somef-output ./results/somef \
+  --pitfalls-output ./results/pitfalls \
+  --analysis-output ./results/summary.json \
+  --notes-output ./results/notes.json
+```
+
+### Version Discrepancy Notes
+
+When a metadata version differs from the release version only slightly (every component differs by less than 2, e.g. `0.4.3.dev1` vs `0.4.2` — the pre-release suffix means it is numerically close), RSMetaCheck records a **note** instead of a full pitfall. Notes are only written when `--notes-output` is provided:
+
+```bash
+poetry run rsmetacheck --input https://github.com/example/repo --notes-output ./notes.json
+```
+
+Example notes file:
+
+```json
+{
+  "total_notes": 1,
+  "notes": [
+    {
+      "repository": "example/repo",
+      "file_name": "repo_output.json",
+      "code": "P001",
+      "note": "Version discrepancy: metadata '0.4.3.dev1' vs release '0.4.2'"
+    }
+  ]
+}
+```
+
+When the difference is significant (any component differs by 2 or more, e.g. `0.12.3` vs `0.12.1`), the issue is still reported as a pitfall regardless.
+
+### Skip SoMEF and Analyze Existing Outputs
+
+If you have already run SoMEF separately, pass `--skip-somef` and point `--input` at the existing JSON files to avoid re-running SoMEF:
+
+```bash
+poetry run rsmetacheck --skip-somef --input somef_outputs/*.json
+```
+
+Multiple glob patterns are supported:
+
+```bash
+poetry run rsmetacheck --skip-somef --input my_somef_outputs_1/*.json my_somef_outputs_2/*.json
+```
+
+### Verbose Output
+
+By default, only detected pitfalls and warnings appear in the output JSON-LD files. Use `--verbose` to also include checks that passed:
+
+```bash
+poetry run rsmetacheck --input https://github.com/tidyverse/tidyverse --verbose
+```
+
+### SoMEF Confidence Threshold
+
+Use `--threshold` to control how confident SoMEF must be before including a metadata field (default: `0.8`):
+
+```bash
+poetry run rsmetacheck --input https://github.com/example/repo --threshold 0.6
+```
+
+### Generate CodeMeta Files
+
+Use `-c` / `--generate-codemeta` to instruct SoMEF to also produce a `codemeta.json` file for each repository:
+
+```bash
+poetry run rsmetacheck --input https://github.com/example/repo --generate-codemeta
+```
+
 ### Configure Analysis Rules
 
-RsMetaCheck can load a root-level `.rsmetacheck.toml` file to customize analysis behavior.
+RSMetaCheck automatically detects a `.rsmetacheck.toml` (or `rsmetacheck.toml`) file at the working directory. Alternatively, supply a custom path with `--config`:
+
+```bash
+poetry run rsmetacheck --input https://github.com/example/repo --config ./ci/rsmetacheck.toml
+```
+
+Supported configuration keys:
+
+- `ignore` — list of pitfall/warning codes to skip (e.g. `"P001"`, `"W002"`)
+- `exclude_files` — glob patterns, filenames, or substrings of metadata sources to ignore
+- `parameters` — per-check tunable parameters
+- `active_profile` — name of the profile to activate automatically when no `--config-profile` flag is passed
+- `profiles` — named groups of overrides that can be selected at runtime
+
+Full example:
 
 ```toml
 ignore = ["W002"]
-exclude_files = ["tmp_metadata.json"]
+exclude_files = ["**/generated/**", "tmp_metadata.json"]
+active_profile = "unstable"
 
 [parameters.P001]
+ahead_significant_diff = 2
+
+[parameters.W002]
+stale_after_days = 3
+
+[profiles.unstable]
+ignore = ["W002", "P017"]
+
+[profiles.unstable.parameters.P001]
 ahead_significant_diff = 10
 
 [profiles.prerelease]
 ignore = []
 
-[profiles.unstable]
-ignore = ["W002", "P017"]
+[profiles.prerelease.parameters.P001]
+ahead_significant_diff = 1
 ```
 
-Use a profile:
+Activate a profile from the command line (overrides `active_profile`):
 
 ```bash
 poetry run rsmetacheck --input https://github.com/example/repo --config-profile unstable
 ```
 
-Use an explicit config path:
+## GitHub Action
 
-```bash
-poetry run rsmetacheck --input https://github.com/example/repo --config ./ci/rsmetacheck.toml
-```
+You can integrate RSMetaCheck into your GitHub workflow to test your own repository and detect issues automatically.
+Please refer to our action in the GitHub MarketPlace at [rsmetacheck actions](https://github.com/marketplace/actions/rsmetacheck) for more information.
 
-## GitHub Action
+## GitLab CI/CD
 
-You can integrate RSMetaCheck into your GitHub workflows:
+You can integrate RSMetaCheck into your GitLab pipelines by adding the following snippet to your `.gitlab-ci.yml` file:
 
 ```yaml
-name: RsMetaCheck
+rsmetacheck:
+  image: python:3.11
+  stage: test
+  script:
+    - pip install rsmetacheck
+    - somef configure -a
+    - rsmetacheck --input $CI_PROJECT_URL
+  artifacts:
+    paths:
+      - pitfalls_outputs/
+      - somef_outputs/
+      - analysis_results.json
+    when: always
+    expire_in: 1 week
+```
 
-on: [push, pull_request]
+`$CI_PROJECT_URL` is a [built-in GitLab CI/CD variable](https://docs.gitlab.com/ee/ci/variables/predefined_variables.html) that automatically resolves to your repository's URL.
 
-jobs:
-  check-metadata:
-    runs-on: ubuntu-latest
-    steps:
-      - uses: actions/checkout@v4
-      - uses: SoftwareUnderstanding/RsMetaCheck@v0.2.1
-        with:
-          verbose: "false"
-        env:
-          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
+### Providing a GitHub Token (recommended)
+
+SoMEF fetches repository metadata from GitHub's API. Without a token, anonymous requests are subject to low rate limits. Store your GitHub personal access token as a [GitLab CI/CD variable](https://docs.gitlab.com/ee/ci/variables/) named `GITHUB_TOKEN` and pass it to `somef configure`:
+
+```yaml
+rsmetacheck:
+  image: python:3.11
+  stage: test
+  script:
+    - pip install rsmetacheck
+    - somef configure -a -t $GITHUB_TOKEN
+    - rsmetacheck --input $CI_PROJECT_URL
+  artifacts:
+    paths:
+      - pitfalls_outputs/
+      - somef_outputs/
+      - analysis_results.json
+    when: always
+    expire_in: 1 week
 ```
+
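
Review note on the "Version Discrepancy Notes" section added to `docs/usage.md`: the rule it states (a note when every component differs by less than 2, a pitfall when any component differs by 2 or more) can be illustrated with a minimal sketch. This is not RSMetaCheck's implementation. The function names, the `significant_diff` parameter, and the choice to ignore everything from the first non-numeric component onward (such as `dev1`) are assumptions made only for illustration.

```python
def numeric_components(version: str) -> list[int]:
    """Leading dot-separated integers, e.g. '0.4.3.dev1' -> [0, 4, 3].

    Assumption: suffixes such as 'dev1' are ignored; the real tool may
    parse them differently.
    """
    parts = []
    for token in version.lstrip("vV").split("."):
        if not token.isdigit():
            break
        parts.append(int(token))
    return parts


def classify_discrepancy(metadata_version: str, release_version: str,
                         significant_diff: int = 2) -> str:
    """Return 'match', 'note', or 'pitfall' (hypothetical labels)."""
    if metadata_version == release_version:
        return "match"
    a = numeric_components(metadata_version)
    b = numeric_components(release_version)
    # Pad the shorter version with zeros so components line up.
    width = max(len(a), len(b))
    a += [0] * (width - len(a))
    b += [0] * (width - len(b))
    largest_gap = max((abs(x - y) for x, y in zip(a, b)), default=0)
    return "pitfall" if largest_gap >= significant_diff else "note"


print(classify_discrepancy("0.4.3.dev1", "0.4.2"))
print(classify_discrepancy("0.12.3", "0.12.1"))
```

With the two example pairs quoted in the documentation, the first call falls under the note rule and the second under the pitfall rule. Whether the `ahead_significant_diff` parameter in `.rsmetacheck.toml` maps onto this threshold in exactly this way is not confirmed by the diff.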