5 changes: 5 additions & 0 deletions psalm.xml
@@ -32,5 +32,10 @@
<directory name="src/" />
</errorLevel>
</ClassMustBeFinal>
<DeprecatedMethod>
<errorLevel type="suppress">
<directory name="src/" />
</errorLevel>
</DeprecatedMethod>
</issueHandlers>
</psalm>
140 changes: 48 additions & 92 deletions src/Collection/FilesCollection.php
@@ -7,26 +7,17 @@
use Closure;
use Doctrine\Common\Collections\ArrayCollection;
use GrumPHP\Util\Regex;
use Symfony\Component\Finder\Comparator;
use Symfony\Component\Finder\Iterator;
use SplFileInfo;
use Symfony\Component\Finder\SplFileInfo as SymfonySplFileInfo;
use Traversable;

/**
* @extends ArrayCollection<array-key, \SplFileInfo>
* @extends ArrayCollection<array-key, string>
*/
class FilesCollection extends ArrayCollection
{
/**
* @param \Traversable<array-key, \SplFileInfo> $iterator
*/
public static function fromTraversable(\Traversable $iterator): FilesCollection
{
return new self(array_values(iterator_to_array($iterator)));
}

/**
* @deprecated use FilesCollectionFilter::name directly
*
* Adds a rule that files must match.
*
* You can use a pattern (delimited with / sign), a glob or a simple string.
@@ -39,10 +30,12 @@ public static function fromTraversable(\Traversable $iterator): FilesCollection
*/
public function name($pattern): self
{
return $this->names([$pattern]);
return FilesCollectionFilter::name($this, $pattern);
}

/**
* @deprecated use FilesCollectionFilter::names directly
*
* Adds rules that files must match.
*
* You can use patterns (delimited with / sign), globs or simple strings.
@@ -53,12 +46,12 @@ public function name($pattern): self
*/
public function names(array $patterns): self
{
$filter = new Iterator\FilenameFilterIterator($this->getIterator(), $patterns, []);

return self::fromTraversable($filter);
return FilesCollectionFilter::names($this, $patterns);
}

/**
* @deprecated use FilesCollectionFilter::notName directly
*
* Adds rules that files must match.
*
* You can use patterns (delimited with / sign), globs or simple strings.
@@ -69,36 +62,36 @@ public function names(array $patterns): self
*/
public function notName(string $pattern): self
{
$filter = new Iterator\FilenameFilterIterator($this->getIterator(), [], [$pattern]);

return self::fromTraversable($filter);
return FilesCollectionFilter::notName($this, $pattern);
}

/**
* @deprecated use FilesCollectionFilter::path directly
*
* Filter by path.
*
* $collection->path('/^spec\/')
*/
public function path(string $pattern): self
{
return $this->paths([$pattern]);
return FilesCollectionFilter::path($this, $pattern);
}

/**
* @deprecated use FilesCollectionFilter::paths directly
*
* Filter by paths.
*
* $collection->paths(['/^spec\/','/^src\/'])
*
* @psalm-suppress ArgumentTypeCoercion - Works on int, \SplFileInfo as well
*/
public function paths(array $patterns): self
{
$filter = new Iterator\PathFilterIterator($this->getIterator(), $patterns, []);

return self::fromTraversable($filter);
return FilesCollectionFilter::paths($this, $patterns);
}

/**
* @deprecated use FilesCollectionFilter::notPath directly
*
* Adds rules that filenames must not match.
*
* You can use patterns (delimited with / sign) or simple strings.
@@ -107,54 +100,50 @@ public function paths(array $patterns): self
*/
public function notPath(string $pattern): self
{
return $this->notPaths([$pattern]);
return FilesCollectionFilter::notPath($this, $pattern);
}

/**
* @deprecated use FilesCollectionFilter::notPaths directly
*
* Adds rules that filenames must not match.
*
* You can use patterns (delimited with / sign) or simple strings.
*
* $collection->notPaths(['/^spec\/','/^src\/'])
*
* @psalm-suppress ArgumentTypeCoercion - Works on int, \SplFileInfo as well
*/
public function notPaths(array $pattern): self
{
$filter = new Iterator\PathFilterIterator($this->getIterator(), [], $pattern);

return self::fromTraversable($filter);
return FilesCollectionFilter::notPaths($this, $pattern);
}

/**
* @deprecated use FilesCollectionFilter::extensions directly
*/
public function extensions(array $extensions): self
{
if (!\count($extensions)) {
return new self();
}

return $this->name(sprintf('/\.(%s)$/i', implode('|', $extensions)));
return FilesCollectionFilter::extensions($this, $extensions);
}

/**
* @deprecated use FilesCollectionFilter::size directly
*
* Adds tests for file sizes.
*
* $collection->filterBySize('> 10K');
* $collection->filterBySize('<= 1Ki');
* $collection->filterBySize(4);
*
*
*
* @see NumberComparator
*/
public function size(string $size): self
{
$comparator = new Comparator\NumberComparator($size);
$filter = new Iterator\SizeRangeFilterIterator($this->getIterator(), [$comparator]);

return self::fromTraversable($filter);
return FilesCollectionFilter::size($this, $size);
}

/**
* @deprecated use FilesCollectionFilter::date directly
*
* Adds tests for file dates (last modified).
*
* The date must be something that strtotime() is able to parse:
@@ -164,26 +153,21 @@ public function size(string $size): self
* $collection->filterByDate('> now - 2 hours');
* $collection->filterByDate('>= 2005-10-15');
*
*
*
* @see DateComparator
*/
public function date(string $date): self
{
$comparator = new Comparator\DateComparator($date);
$filter = new Iterator\DateRangeFilterIterator($this->getIterator(), [$comparator]);

return self::fromTraversable($filter);
return FilesCollectionFilter::date($this, $date);
}

/**
* @deprecated use FilesCollectionFilter::filter directly
*
* Filters the iterator with an anonymous function.
*
* The anonymous function receives a \SplFileInfo and must return false
* to remove files.
*
*
*
* @see CustomFilterIterator
*
* @psalm-suppress LessSpecificImplementedReturnType
@@ -192,26 +176,20 @@ public function date(string $date): self
*/
public function filter(Closure $p): self
{
$filter = new Iterator\CustomFilterIterator($this->getIterator(), [$p]);

return self::fromTraversable($filter);
return FilesCollectionFilter::filter($this, $p);
}

/**
* @deprecated use FilesCollectionFilter::filterByFileList directly
*
* @param Traversable<array-key, SplFileInfo> $fileList
*/
public function filterByFileList(Traversable $fileList): FilesCollection
public function filterByFileList(Traversable $fileList): self
{
$allowedFiles = array_map(function (SplFileInfo $file) {
return $file->getPathname();
}, iterator_to_array($fileList));

return $this->filter(function (SplFileInfo $file) use ($allowedFiles) {
return \in_array($file->getPathname(), $allowedFiles, true);
});
return FilesCollectionFilter::filterByFileList($this, $fileList);
}

public function ensureFiles(self $files): FilesCollection
public function ensureFiles(self $files): self
{
$newFiles = new self($this->toArray());

@@ -224,49 +202,27 @@ public function ensureFiles(self $files): FilesCollection
return $newFiles;
}

public function ignoreSymlinks(): FilesCollection
/**
* @deprecated use FilesCollectionFilter::ignoreSymlinks directly
*/
public function ignoreSymlinks(): self
{
return $this->filter(function (SplFileInfo $file) {
return !$file->isLink();
});
return FilesCollectionFilter::ignoreSymlinks($this);
}

/**
* SplFileInfo cannot be serialized. Therefor, we help PHP a bit.
* This stuff is used for running tasks in parallel.
*/
public function __serialize(): array
{
return $this->map(function (SplFileInfo $fileInfo): string {
return $fileInfo instanceof SymfonySplFileInfo
? $fileInfo->getRelativePathname()
: $fileInfo->getPathname();
})->toArray();
return $this->toArray();
}

/**
* SplFileInfo cannot be serialized. Therefor, we help PHP a bit.
* This stuff is used for running tasks in parallel.
*/
public function __unserialize(array $data): void
{
$files = $data;
$this->clear();
foreach ($files as $file) {
$this->add(new SymfonySplFileInfo($file, dirname($file), $file));
foreach ($data as $path) {
$this->add($path);

Contributor

The signature however is ArrayCollection<array-key, \SplFileInfo>.
Now it also contains strings, which breaks the public API contract (see e.g. getFirst() vs first(), and possibly similar methods such as last(), current(), ...).

I'm also wondering: now the files will be hydrated every time an iterator goes over the collection.
So the more tasks (or file filters) you have, the more memory it will use.

This makes me wonder: what is the ACTUAL memory issue that we are trying to solve?
Can we find a better solution than keeping the files as intermediate strings?

It has also been a while, but: what benefit does SymfonySplFileInfo give over regular SplFileInfo here? Which one adds the overhead?

Author

Thank you for your comment. It allowed me to rethink the solution and actually understand where the problem is. As you already indicated, FilesCollection was holding SplFileInfo objects, and I broke the contract by adding pure paths there. I understood that the main problem is that FilesCollection has two responsibilities: holding the array of files and filtering the collection.

I have now separated those responsibilities into FilesCollection, which holds the array of strings, and FilesCollectionFilter, which filters a FilesCollection. The filter converts the paths from strings to SymfonySplFileInfo objects, does the filtering, and returns a FilesCollection of strings. The SymfonySplFileInfo objects can be garbage-collected immediately after filtering. Note that the filtering has to use SymfonySplFileInfo rather than \SplFileInfo.
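
A minimal sketch of the hydrate, filter, discard idea described above. This is not the actual FilesCollectionFilter code from this PR: `filterPaths` is a hypothetical helper name, and plain `\SplFileInfo` stands in for SymfonySplFileInfo only so that the snippet runs without Symfony installed.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper (assumption, not part of GrumPHP): keeps only the
 * paths whose temporary file object passes the given predicate.
 *
 * @param list<string> $paths
 * @param callable(\SplFileInfo): bool $keep
 *
 * @return list<string>
 */
function filterPaths(array $paths, callable $keep): array
{
    $kept = [];
    foreach ($paths as $path) {
        // Hydrate one object per path; only the string is stored in the result,
        // so the object is no longer referenced once the next iteration starts.
        $file = new \SplFileInfo($path);
        if ($keep($file)) {
            $kept[] = $path;
        }
    }

    return $kept;
}

// Usage: keep only PHP files. The files do not need to exist on disk,
// because getExtension() works on the path string alone.
$phpOnly = filterPaths(
    ['src/Foo.php', 'README.md', 'spec/FooSpec.php'],
    static fn (\SplFileInfo $file): bool => $file->getExtension() === 'php'
);
```

The point of the sketch is only that the collection itself never holds file objects; whether that matches the memory profile of the real filter iterators is something the PR would have to confirm.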

This solution is fully backwards-compatible, but I marked the filtering methods in FilesCollection as deprecated, and they should be removed in v3. The result? Previously, my app peaked at 126 MB of memory usage. Now it's only 50 MB for the same suite. I had to migrate a few places that worked directly on SplFileInfo objects to work with file paths instead. I haven't fixed phpspec or the tasks themselves yet, but that can be covered later. Waiting for your opinion on that.

Contributor

I consider this a massive breaking change in the contract of the FilesCollection, especially since it's not final. You also added a new 'filter' collection class that isn't really a filter, but rather another type of files collection.

To be fair, I don't think this is the way forward.

@ckuran ckuran Jun 12, 2026

Author

I agree it's not a small change in FilesCollection, but the whole solution is in the final state. The contract for the public methods in FilesCollection was not changed, which is why I consider this solution backwards-compatible. The new 'filter' class can be renamed, but basically it takes a FilesCollection as an argument and returns a filtered FilesCollection. That was the easiest way to make it backwards-compatible. Running bin/grumphp run reports only the failing phpspec, which is a to-do for after we agree on what comes next. If you have a better idea for how to organize this change, please let me know.

}
}

/**
* Help Psalm out a bit:
*
* @return \ArrayIterator<array-key, SplFileInfo>
*/
public function getIterator(): \ArrayIterator
{
return new \ArrayIterator($this->toArray());
}

public function toFileList(): string
{
return \implode(PHP_EOL, $this->toArray());