[pypi] Module extension fails if an extra index returns 404 on root path, even if simpleapi_skip is set #3769

@udaya2899

🐞 bug report

Affected Rule

The issue is caused by the rule: Module extension pip.parse (specifically the internal simpleapi_download logic used during extension evaluation).

Is this a regression?

Yes, the previous version in which this bug was not present was: 1.9.0 (we are upgrading from 1.9.0 to 2.0.0).

Description

In rules_python v2.0.0, the module extension introduced a step that fetches the list of available packages from each index to build a mapping and optimize downloads.

This is implemented in python/private/pypi/simpleapi_download.bzl, in _get_dist_urls. It unconditionally attempts to download the root page of every index in index_urls (with parse_index = True) to parse the available packages here:

for index_url in index_urls:
    download = read_simpleapi(
        ctx = ctx,
        attr = attr,
        url = _normalize_url("{index_url}/".format(index_url = index_url)),
        parse_index = True,
        versions = {pkg: None for pkg in sources},
        block = block,
        **kwargs
    )

If an extra index (specified via --extra-index-url in requirements or parsed from the lock file) does not support root listing (e.g., it is just a file server hosting wheels and returns 404 Not Found for the root path), the build fails with an IOException.

This failure happens even if the packages expected from that index are listed in simpleapi_skip, because the loop over index_urls runs regardless of what remains in the filtered sources.

🔬 Minimal Reproduction

  1. Set up a repository with rules_python v2.0.0.
  2. Add an extra index URL that returns 404 on the root path (e.g., a file server without a directory listing or index.html).
  3. Add a package that should be fetched from that index to requirements.txt.
  4. Add the package to simpleapi_skip in pip.parse.
  5. Run any bazel command that triggers module resolution (e.g. bazel build ...). It will fail trying to fetch the root of the extra index.
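
For step 2, a throwaway local server can stand in for the extra index. The following is only a sketch using the Python standard library: the handler name, the helper functions, and the placeholder response body are all made up for illustration and are not part of rules_python. It answers 404 on the root path and 200 on every other path, which mimics a plain wheel file server without a directory listing.

```python
import http.server
import threading
import urllib.error
import urllib.request


class NoRootListingHandler(http.server.BaseHTTPRequestHandler):
    """Hypothetical handler: 404 on "/", placeholder bytes elsewhere."""

    def do_GET(self):
        if self.path == "/":
            # The root path has no listing, like the index in this report.
            self.send_error(404, "Not Found")
            return
        body = b"placeholder wheel bytes"  # not a real wheel
        self.send_response(200)
        self.send_header("Content-Type", "application/octet-stream")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        # Keep the console quiet.
        pass


def start_server():
    """Start the server on a free loopback port in a daemon thread."""
    server = http.server.ThreadingHTTPServer(("127.0.0.1", 0), NoRootListingHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def status_of(url):
    """Return the HTTP status code for a GET on url, bypassing any proxy."""
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(url) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code
```

Calling start_server() and pointing the extra index URL at http://127.0.0.1:<port> (the port is available as server.server_address[1]) should give an index whose root returns 404. The served bytes are not a valid wheel, so this is only enough to exercise the root probe, not an actual install.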

🔥 Exception or Error

404 Not Found

🌍 Your Environment

Operating System: linux

Output of bazel version: 8.5.0

Rules_python version: 2.0.0

Anything else relevant?
The issue seems to be that _get_dist_urls in simpleapi_download.bzl assumes all indexes are Simple API compliant and support root listing, and doesn't check if all packages for an index are skipped before probing it.
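
One possible shape for a fix, written as plain Python rather than Starlark so it can be run in isolation. This is a sketch under assumptions, not the actual rules_python code: plan_root_probes, collect_listings, and the fetch_root callable are hypothetical names, and how the real extension associates packages with indexes may differ. It shows two ideas: do not probe any index root when every requested package is in the skip list, and treat a missing root listing as "this index cannot be enumerated" instead of a hard failure.

```python
def plan_root_probes(index_urls, requested, skipped):
    """Return the index root URLs worth probing.

    Hypothetical model: `requested` holds the package names from the lock
    file and `skipped` stands in for the simpleapi_skip list.
    """
    skipped = set(skipped)
    remaining = [pkg for pkg in requested if pkg not in skipped]
    if not remaining:
        # Every package is skipped, so no index root needs to be fetched.
        return []
    # Normalize each URL to end in exactly one trailing slash.
    return [url.rstrip("/") + "/" for url in index_urls]


def collect_listings(index_urls, fetch_root):
    """Fetch each root, tolerating indexes without a root listing.

    `fetch_root` is a hypothetical callable that returns a list of package
    names, or None when the root path is unavailable (for example a 404).
    """
    listings = {}
    unlisted = []
    for url in index_urls:
        packages = fetch_root(url)
        if packages is None:
            # Record the index instead of failing the whole evaluation.
            unlisted.append(url)
            continue
        listings[url] = packages
    return listings, unlisted
```

With this model, an index whose root returns 404 ends up in unlisted, and the caller could fall back to per-package lookups for it or ignore it when nothing is expected from it.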
