Skip to content
Merged
60 changes: 32 additions & 28 deletions .github/workflows/ci_lint.yml
Original file line number Diff line number Diff line change
Expand Up @@ -3,10 +3,10 @@ on:
pull_request:
branches: [master, 'stabilization*']
permissions:
contents: read
contents: read
jobs:
yamllint:
name: Yaml Lint on Changed Controls and Profiles Files
name: Yaml Lint on Changed yaml files
runs-on: ubuntu-latest
steps:
- name: Install Git
Expand All @@ -27,37 +27,41 @@ jobs:
url="repos/$repo/pulls/$pr_number/files"
response=$(gh api "$url" --paginate)
echo "$response" | jq -r '.[].filename' > filenames.txt
cat filenames.txt

if grep -q "controls/" filenames.txt; then
echo "CONTROLS_CHANGES=true" >> $GITHUB_ENV
else
echo "CONTROLS_CHANGES=false" >> $GITHUB_ENV
fi
if grep -q "\.profile" filenames.txt; then
echo "PROFILES_CHANGES=true" >> $GITHUB_ENV
else
echo "PROFILES_CHANGES=false" >> $GITHUB_ENV
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You have removed this insertion of PROFILES_CHANGES and CONTROL_CHANGES environment variables to GITHUB_ENV. But, the "Install yamllint" phase still uses these environment variables. I suspect it doesn't work.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed the leftover conditional.

fi
env:
GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}

- name: Install yamllint
if: ${{ env.CONTROLS_CHANGES == 'true' || env.PROFILES_CHANGES == 'true' }}
run: pip install yamllint

- name: Run yamllint in Control Files Modified by PR
if: ${{ env.CONTROLS_CHANGES == 'true' }}
run: |
for control_file in $(cat filenames.txt | grep "controls/"); do
echo "Running yamllint on $control_file..."
yamllint "$control_file"
done

- name: Run yamllint in Profile Files Modified by PR
if: ${{ env.PROFILES_CHANGES == 'true' }}
- name: Run yamllint on files modified by the PR
run: |
for profile_file in $(cat filenames.txt | grep "\.profile"); do
echo "Running yamllint on $profile_file..."
yamllint "$profile_file"
exit_code=0
for file in $(grep -E '\.(yml|yaml|fmf|profile)$' filenames.txt); do
if [[ ! -f "$file" ]]; then
continue
fi
echo "Running yamllint on $file..."
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will this work on YAML files that contain Jinja macros?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the last commit I added a script which strips Jinja and still allows us to lint files.

if grep -qP '\{\{[%{#]' "$file"; then
# File contains Jinja2 constructs — strip them before linting.
# yamllint -s exits: 0 = clean, 1 = errors, 2 = warnings only.
output=$(python3 utils/strip_jinja_for_yamllint.py "$file" \
| yamllint -s -c .yamllint - 2>&1)
rc=$?
# Show all output (warnings and errors), replacing "stdin"
# with the actual filename since yamllint reads from a pipe.
if [ -n "$output" ]; then
echo "$output" | sed "s|^stdin|$file|"
fi
# Fail only on errors (exit code 1), not warnings (exit code 2).
if [ "$rc" -eq 1 ]; then
exit_code=1
fi
else
yamllint -s -c .yamllint "$file"
rc=$?
if [ "$rc" -eq 1 ]; then
exit_code=1
fi
fi
done
exit $exit_code
19 changes: 14 additions & 5 deletions .yamllint
Original file line number Diff line number Diff line change
@@ -1,13 +1,22 @@
---
extends: default
locale: en_US.UTF-8
yaml-files:
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like the yaml-files option is somehow ignored by yamllint.

In the CI run, we can see that the job failed on utils/strip_jinja_for_yamllint.py. Unfortunately, the CI output doesn't show any details.

When I try to reproduce it locally, I run yamllint -s -c .yamllint utils/strip_jinja_for_yamllint.py and from the output it looks like it tried to evaluate the Python file as YAML file.

jcerny@fedora:~/work/git/scap-security-guide (pr/14601)$ yamllint -s -c .yamllint  utils/strip_jinja_for_yamllint.py
utils/strip_jinja_for_yamllint.py
  2:3       error    syntax error: expected '<document start>', but found '<scalar>' (syntax)

Please check if the filtering by glob really works. Also, please investigate whether we can have a detailed output in the GitHub CI output.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ahaaa, I see what is the problem. The yaml-files block (according to man page) applies in case yamllint processes a directory. It does not apply when it is passed a particular file.
If we would like to stick with usage of this configuration option to limit set of analyzed files, we would need to somehow create directory structure containing only files changed in the analyzed PR... that seems actually quite strange to me. I think I will go back to some kind of parsing list of changed files and passing only those relevant files to yamllint. Do you agree?

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aha, thanks for discovering this. I agree with your proposal.

- "*.yaml"
- "*.yml"
- "*.fmf"
- "*.profile"

# https://yamllint.readthedocs.io/en/stable/rules.html
rules:
comments: disable
comments-indentation: disable
document-start: disable
truthy: disable # do not check for strict true / false boolean values
comments: disable # disable syntax checking of comments
comments-indentation: disable # disable indentation checks for comments
document-start: disable # do not require the document start marker
empty-lines:
level: warning
level: warning # only warn about empty lines
indentation:
# pass if spaces are used for indentation and number of spaces is consistent through a file
spaces: consistent
line-length: disable
line-length:
max: 99 # allow lines up to 99 chars
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do all our YAML files meet the line lenghth limit of 99 lines?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They probably don't, but this should run only on newly modified files. At the same time, we impose this limit on our selves... so with this approach, it will be enforced gradually. Do you think it is a problem?

78 changes: 78 additions & 0 deletions utils/strip_jinja_for_yamllint.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,78 @@
#!/usr/bin/env python3
"""Strip Jinja2 constructs from YAML files to make them yamllint-safe.

This project uses Jinja2 templating ({{% %}}, {{{ }}}, {{# #}}) inside YAML
files. yamllint cannot parse these constructs, so this script removes them
while preserving line numbers (replaced regions become blank lines) so that
yamllint error messages still point to the correct source lines.

Usage:
python3 utils/strip_jinja_for_yamllint.py FILE

The cleaned content is written to stdout.
"""

import re
import sys


def _replace_with_blanks(match):
"""Replace a match with the same number of newlines to preserve line numbers."""
return "\n" * match.group(0).count("\n")


def strip_jinja(content):
# 1. Remove whole-line Jinja block tags: {{% ... %}} on their own line(s).
# Match the entire line (including leading whitespace) to avoid leaving
# trailing spaces behind.
content = re.sub(
r"^[ \t]*\{\{%.*?%\}\}[ \t]*$",
_replace_with_blanks,
content,
flags=re.MULTILINE | re.DOTALL,
)
# Remove any remaining inline block tags (rare).
content = re.sub(r"\{\{%.*?%\}\}", _replace_with_blanks, content, flags=re.DOTALL)

# 2. Remove whole-line Jinja comments: {{# ... #}}
content = re.sub(
r"^[ \t]*\{\{#.*?#\}\}[ \t]*$",
_replace_with_blanks,
content,
flags=re.MULTILINE | re.DOTALL,
)
# Remove any remaining inline comments.
content = re.sub(r"\{\{#.*?#\}\}", "", content, flags=re.DOTALL)

# 3a. Standalone Jinja expressions occupying entire lines — these typically
# expand to top-level YAML keys (e.g. ocil/ocil_clause macros) or
# Ansible tasks, so replacing them with a placeholder string would
# produce invalid YAML. Replace with a YAML-safe comment placeholder
# to avoid trailing whitespace on otherwise blank lines.
content = re.sub(
r"^([ \t]*)\{\{\{.*?\}\}\}[ \t]*$",
lambda m: m.group(1) + "# jinja" + "\n" * (m.group(0).count("\n") - 1)
if m.group(0).count("\n") > 0
else m.group(1) + "# jinja",
content,
flags=re.MULTILINE | re.DOTALL,
)

# 3b. Inline Jinja expressions embedded inside a YAML value — replace
# with a short placeholder so the surrounding YAML stays valid.
content = re.sub(r"\{\{\{.*?\}\}\}", "JINJA_EXPRESSION", content)

return content


def main():
if len(sys.argv) != 2:
print(f"Usage: {sys.argv[0]} FILE", file=sys.stderr)
sys.exit(2)

with open(sys.argv[1]) as f:
sys.stdout.write(strip_jinja(f.read()))


if __name__ == "__main__":
main()
Loading