Handle filenames with spaces and batch files in check-file-format check #24
Description
Instead of invoking editorconfig once, we use xargs to invoke editorconfig multiple times with batches of up to 1000 files. In the case where there are between 1 and 1000 files to be checked, we will continue to invoke editorconfig exactly once. In the case where there are no files to be checked, we will not invoke editorconfig at all; therefore we remove the /dev/null hack.
There are two consequences of using xargs.
First, we cannot run editorconfig via a shell function, as xargs cannot run shell functions. Instead we must run it via a script. I have taken the approach of having xargs call the same check-file-format.sh script that calls it, but with different flags. I am aware that some people strongly prefer having separate helper scripts for xargs to call. I am happy to rewrite this script in that fashion if that is your preference.
Second, instead of providing the list of files via the command line, we provide it via a pipe. This allows us to use the -0 flag of xargs and the -z flag of the git tooling to pass the filenames as NUL-terminated instead of LF-terminated, and prevents spaces in filenames from being interpreted as delimiters.
In the case where we run editorconfig via docker, we further need to shell-escape each filename so that filenames with spaces survive being parsed by the inner shell: hence the printf '%q ' nonsense.
My intent is to squash when I merge.
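The batching and escaping described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the script from this PR: find -print0 stands in for the git tooling's -z output, the inline sh -c command stands in for the real checker, and the scratch directory and filenames are hypothetical. It assumes bash, since printf '%q' is a bash feature.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Scratch directory with two hypothetical files, one with a space in its name.
workdir="$(mktemp -d)"
trap 'rm -rf "$workdir"' EXIT
touch "$workdir/plain.txt" "$workdir/with space.txt"

# find -print0 (a stand-in for the git tooling's -z flag) ends each name with NUL.
# xargs -0 splits only on NUL; -n 1000 caps each invocation at 1000 names.
# The sh -c command is a stand-in checker that reports how many names it received.
batch_output="$(
  cd "$workdir" &&
    find . -type f -print0 |
    xargs -0 -n 1000 sh -c 'printf "batch of %d file(s)\n" "$#"' _
)"
echo "$batch_output"

# Docker-style case: the names are re-parsed by an inner shell, so each one is
# shell-escaped with %q first. The inner bash -c stands in for that inner shell.
name='with space.txt'
escaped="$(printf '%q ' "$name")"
inner_count="$(bash -c "set -- $escaped; echo \$#")"
echo "inner shell saw $inner_count argument(s)"
```

With only two files, xargs makes a single call, and the inner shell sees the escaped name as one argument rather than two.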
Context
The BCSS team is migrating its codebase from GitLab to GitHub. As part of that, we have added the contents of this template repository to our own repositories.
We encountered two bugs:
Filenames containing spaces are not handled correctly, because the spaces are interpreted as delimiters between filenames.
With check=all, this script invokes editorconfig (possibly via docker) with the names of every file in the repository on its command line. This means that with a large enough repository, we hit the limit for the size of a process's command line and environment combined (typically 2MB on Linux but only 256KB on macOS).
This PR fixes both bugs.
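Both failure modes can be observed in a few lines of shell. The sketch below is illustrative only: the filename is hypothetical, and the limit printed by getconf ARG_MAX depends on the system it runs on.

```shell
#!/usr/bin/env bash
set -euo pipefail

# The system's limit on the combined size of arguments and environment, in bytes.
arg_max="$(getconf ARG_MAX)"
echo "ARG_MAX: $arg_max bytes"

# A hypothetical filename containing a space.
name='my file.txt'

# Unquoted expansion: the shell word-splits on the space, yielding two arguments.
# shellcheck disable=SC2086
set -- $name
unquoted_count=$#

# Quoted expansion keeps the name as a single argument.
set -- "$name"
quoted_count=$#

echo "unquoted: $unquoted_count argument(s), quoted: $quoted_count argument(s)"
```

A list of every file in a large repository can exceed the printed limit, which is why passing names through a pipe in batches avoids the problem.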
The same bugs are present in the check-english-usage and check-markdown-format scripts. I'm happy to extend this PR to fix both of those scripts, or to do so in separate PRs.
Type of changes
Checklist
Sensitive Information Declaration
To ensure the utmost confidentiality and protect your and others' privacy, we kindly ask you NOT to include PII (Personal Identifiable Information) / PID (Personal Identifiable Data) or any other sensitive data in this PR (Pull Request) and the codebase changes. We will remove any PR that contains sensitive information. We really appreciate your cooperation in this matter.