Skip to content

Fix stack overflow on self-referential protocol overload filtering#3895

Open
mikeleppane wants to merge 1 commit into
facebook:mainfrom
mikeleppane:fix/protocol-overload-self-recursion-3893
Open

Fix stack overflow on self-referential protocol overload filtering#3895
mikeleppane wants to merge 1 commit into
facebook:mainfrom
mikeleppane:fix/protocol-overload-self-recursion-3893

Conversation

@mikeleppane

Copy link
Copy Markdown
Contributor

Fixes #3893

What

pyrefly aborts with a native stack overflow when it checks certain self-referential protocols. The smallest reproduction is a protocol whose overloaded method annotates self with the protocol itself:

from typing import Protocol, TypeVar, overload

S = TypeVar("S", covariant=True)
R = TypeVar("R", covariant=True)

class Lens(Protocol[S, R]):
    @overload
    def __call__[S2, R2](self: Lens[S2, R2], state: S2, /, value: R2) -> S2: ...
    @overload
    def __call__[S2, R2](self: Lens[S2, R2], state: S2, /) -> R2: ...

class BaseLens(Lens[S, R], Protocol):
    def at(self) -> Lens[S, R]:
        return self

Before this change pyrefly prints thread '<unknown>' has overflowed its stack / fatal runtime error: stack overflow and aborts. After it, the file type-checks with no crash. mypy, pyright, and ty all terminate on this input, so the crash was a pyrefly bug rather than bad code.

It shows up on real projects. pyrefly check src/ on the kUPS library (https://github.com/cusp-ai-oss/kUPS) crashed partway through; it now runs to completion. The crash is a regression since 1.1.1.

Why

When pyrefly checks a value against a protocol, it filters the candidate's overloaded methods by their explicit self: annotations, so an overload whose self: cannot match the receiver does not get to bind type variables it should not. That filter landed in 1.1.1 (#3819).

The filter tests each overload with is_subset_eq(self_type, self_param). When self_param is itself a protocol, that check re-enters protocol conformance, looks up the same overloaded member again, and re-filters the same (self_type, self_param) pair. Nothing breaks the loop, so it recurses until the stack is exhausted.

pyrefly already has two cycle guards for protocol subtyping, and neither catches this one. Both live on Subset, and Solver::is_subset_eq builds a fresh Subset on every call. The recursion passes through is_subset_eq at each turn, so the (got, want) subset cache and the class_protocol_assumptions coinductive set are both empty again by the time the same pair comes back around.

How

The guard goes where the recursion is actually observable. It lives on ThreadState, which persists across the is_subset_eq boundary that resets Subset.

  • pyrefly/lib/alt/answers_solver.rs: a per-thread overload_self_filter_stack of in-progress (self_type, self_param) pairs, plus enter_overload_self_filter and exit_overload_self_filter.
  • pyrefly/lib/alt/class/class_field.rs, in filter_overloads_by_self_type: before testing a protocol-typed self:, record the pair; on re-entry of the same pair, assume the overload applies and return. That is the greatest-fixpoint reading of a recursive protocol, which matches mypy and pyright.

A non-protocol self: cannot start this recursion, so it skips the guard and runs the plain check. That keeps the common path free of the extra bookkeeping, including hot cases like str methods with self: LiteralString overloads. The guard only changes the filter's decision on a real cycle, where keeping the overload is the safe direction. The conformance verdict still comes from the attribute-subset check that runs afterward, so a class that does not satisfy the protocol is still rejected.

This leaves the subtyping solver, the existing Subset cycle guards, and the unbound class-access caller of the same filter unchanged.

Test plan

Three tests in pyrefly/lib/test/protocol.rs:

Test What it pins
test_protocol_overloaded_generic_self_referencing_protocol_terminates the issue repro: a protocol with a self-typed overloaded __call__ type-checks instead of crashing
test_protocol_overloaded_generic_self_mutual_recursion_terminates two protocols whose self: annotations reference each other also terminate
test_protocol_overloaded_generic_self_non_conforming_still_rejected a class that matches __call__ but lacks another required member is still reported as not assignable, so the guard does not hide a real failure

Each terminate test aborts the test runner without the fix. The existing test_protocol_overloaded_method_filtered_by_self (#3819) still passes, so filtering by concrete protocol self-types is unchanged.

  • cargo test -p pyrefly --lib: 5746 passed, 0 failed.
  • Formatting and linting clean.
  • mypy_primer over 69 protocol-heavy and overload-heavy projects (pydantic, pandas-stubs, pandera, sympy, xarray, scipy, attrs, strawberry, kornia, jax, and more): no diagnostic changes.

Summary:
Checking a value against a protocol filters the candidate's overloaded
methods by their explicit `self:` annotations (added in 1.1.1 for facebook#3819).
When an overload's `self:` is itself a protocol -- e.g. `Lens` with
`def __call__[S2, R2](self: Lens[S2, R2], ...)` -- the filter's
`is_subset_eq(self_type, self_param)` re-enters protocol conformance,
re-looks-up the same overloaded member, and re-filters the same
`(self_type, self_param)` pair, recursing without bound. The recursion
crosses `is_subset_eq`, which builds a fresh `Subset` on every call, so
neither the `(got, want)` subset cache nor the `class_protocol_assumptions`
coinductive guard -- both scoped to a single `Subset` -- can observe the
cycle. Real code triggers it: pyrefly aborts with a native stack overflow
on the kUPS library, a regression since 1.1.1.

Add a coinductive guard at the filtering site, stored on `ThreadState` so
it survives the `is_subset_eq`/fresh-`Subset` boundary. Before testing a
protocol-typed `self:`, record the `(self_type, self_param)` pair; on
re-entry of the same pair, assume the overload applies -- the
greatest-fixpoint answer for a self-referential protocol, matching mypy
and pyright. Non-protocol `self:` annotations cannot recurse and skip the
guard entirely, so the common path is unchanged, and the real conformance
verdict is still computed by the following attribute-subset check, so
non-conforming classes are still rejected.

Fixes facebook#3893
@yangdanny97

yangdanny97 commented Jun 22, 2026

Copy link
Copy Markdown
Contributor

Thanks for investigating this, we did change soem stuff about the solver last version so this seems like a plausible root cause. will review in more detail soon

@meta-codesync

meta-codesync Bot commented Jun 22, 2026

Copy link
Copy Markdown
Contributor

@yangdanny97 has imported this pull request. If you are a Meta employee, you can view this in D109305359.

@github-actions

Copy link
Copy Markdown

According to mypy_primer, this change doesn't affect type check results on a corpus of open source code. ✅

@yangdanny97 yangdanny97 self-assigned this Jun 22, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Stack overflow

2 participants