test_itertools: non-ASCII temp dir name corrupted under Windows mbcs codepage and not removed, causing FileNotFoundError in subsequent tests #146202

@Aasyaco

Windows CI failure: test_itertools corrupts Unicode temp directory name under mbcs encoding, breaks subsequent tests

Summary

On Windows (official Azure CI), the CPython 3.15 test suite produces cascading failures that originate in test_itertools. The test creates a temporary directory whose name contains a non-ASCII character. The filesystem encoding reported on the runner is mbcs, so the UTF-8 byte sequence of the name is misinterpreted and the directory name is corrupted on disk. The directory is never cleaned up, which causes all subsequent tests that create temporary directories to fail with FileNotFoundError.


CPython Version

Python 3.15.0a7 (main branch)

Operating System / Platform

Windows (Azure Pipelines Microsoft-hosted runner)

Filesystem Encoding

>>> import sys
>>> sys.getfilesystemencoding()
'mbcs'
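
To help confirm where the 'mbcs' value comes from on a given machine, the sketch below collects the interpreter settings that influence the filesystem encoding. The helper name describe_fs_encoding is hypothetical and not part of CPython; it relies only on documented sys and os calls.

```python
import os
import sys


def describe_fs_encoding():
    """Collect settings that influence how str paths are encoded.

    Hypothetical diagnostic helper, not part of the CPython test suite.
    """
    return {
        "platform": sys.platform,
        "filesystem_encoding": sys.getfilesystemencoding(),
        "filesystem_errors": sys.getfilesystemencodeerrors(),
        "utf8_mode": sys.flags.utf8_mode,
        # Documented environment variable that switches Windows back to
        # the legacy 'mbcs' filesystem encoding when set.
        "legacy_fs_env": os.environ.get("PYTHONLEGACYWINDOWSFSENCODING"),
    }


if __name__ == "__main__":
    for key, value in describe_fs_encoding().items():
        print(f"{key}: {value!r}")
```

Running this on the CI runner and on a local Windows build would show whether the two environments differ in any of these settings.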

CI Pipeline Context

Build step:

PCbuild\build.bat -e $(buildOpt)

Test step:

PCbuild\rt.bat -q -uall -u-cpu -rwW --slowest --timeout=1200 -j0 ^
  --junit-xml="$(Build.BinariesDirectory)\test-results.xml" ^
  --tempdir="$(Build.BinariesDirectory)\test"

Azure DevOps build log:
https://dev.azure.com/Python/cpython/_build/results?buildId=168864&view=logs&j=c8a71634-e5ec-54a0-3958-760f4148b765&t=ddcdae4e-111a-5c2a-2289-6b784c553924


Steps to Reproduce

  1. Check out the CPython main branch.
  2. Build on Windows:
    PCbuild\build.bat -e <buildOpt>
    
  3. Run the test suite:
    PCbuild\rt.bat -q -uall -u-cpu -rwW --slowest --timeout=1200 ^
        -j0 --tempdir="%TEMP%"
    
  4. Observe a directory with a corrupted name (e.g. packageæ/) left behind in the temp directory.
  5. Observe that subsequent tests fail with FileNotFoundError.

Expected Behavior

  • All tests pass cleanly on Windows during CI builds.
  • No test leaves behind artifacts that interfere with other tests.
  • Test isolation is preserved — temporary directories are created and removed correctly.
  • Unicode filenames are handled consistently regardless of platform filesystem encoding.

Actual Behavior

  • test_itertools creates a directory whose name contains a non-ASCII character (e.g. æ).
  • With the filesystem encoding reported as mbcs, the Unicode name is misinterpreted: the UTF-8 bytes of æ (U+00E6), 0xC3 0xA6, are stored as two characters (Ã¦ under code page 1252).
  • The corrupted directory is not removed after test_itertools completes.
  • Subsequent tests (e.g. test_class, test_pdb) fail when attempting to create temporary directories because the corrupted entry causes os_helper.temp_dir to raise FileNotFoundError.
  • The runner reports a warning for test_itertools and marks unrelated tests as failed, significantly increasing triage cost.
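
The "files was modified" warning corresponds to a before/after comparison of a directory's contents. A minimal sketch of that idea, assuming a hypothetical helper named report_leftovers (this is not the runner's actual implementation), could be used to narrow down which test leaves the entry behind:

```python
import contextlib
import os
import tempfile


@contextlib.contextmanager
def report_leftovers(directory, leftovers):
    """Append to `leftovers` every entry created in `directory` and not removed.

    Hypothetical helper for illustration only.
    """
    before = set(os.listdir(directory))
    try:
        yield
    finally:
        after = set(os.listdir(directory))
        leftovers.extend(sorted(after - before))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as base:
        found = []
        with report_leftovers(base, found):
            # Simulates a test that creates a directory and forgets it.
            os.mkdir(os.path.join(base, "package\u00e6"))
        print(found)
```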

Relevant Log Excerpt

From Azure build 168864:

Warning -- files was modified by test_itertools
After: ... 'packageæ/'

test_class failed (env changed)
FileNotFoundError: [WinError 3] unable to create temporary directory:
'D:\a\1\s\build\test_python_9140æ'

Root Cause (Analysis)

Two independent problems combine to produce this failure:

1. Encoding mismatch in test_itertools

The test constructs or receives a directory name containing a non-ASCII character. In this environment, sys.getfilesystemencoding() returns 'mbcs' rather than 'utf-8'. The UTF-8 byte sequence for æ (0xC3 0xA6) is reinterpreted under the ANSI code page as two separate characters, producing the visible corruption Ã¦ (as rendered under code page 1252). This violates cross-platform Unicode path handling requirements.
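
The byte-level reinterpretation described above can be reproduced on any platform. This sketch assumes the ANSI code page is 1252 (the report only says "ANSI code page"); the function name misdecode is hypothetical.

```python
def misdecode(name, ansi_codepage="cp1252"):
    """Encode `name` as UTF-8, then decode the bytes with an ANSI code page.

    Illustrates the kind of mix-up described in the analysis; cp1252 is an
    assumption about the runner's ANSI code page.
    """
    return name.encode("utf-8").decode(ansi_codepage)


if __name__ == "__main__":
    original = "package\u00e6"           # one non-ASCII character (U+00E6)
    corrupted = misdecode(original)      # 0xC3 0xA6 read as two characters
    print(ascii(original), "->", ascii(corrupted))
    print(len(original), "->", len(corrupted))
```

If the name on disk matches this two-character pattern, a UTF-8/ANSI mix-up is the likely cause; if the name still contains a single æ, the encoding step may not be where the problem lies.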

2. Missing cleanup in test_itertools

The test creates the directory but does not register it for cleanup (e.g. via self.addCleanup or a context manager such as tempfile.TemporaryDirectory). The corrupted directory persists in the shared --tempdir location for the entire test run, causing every subsequent call to os_helper.temp_dir to fail with WinError 3.
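
A minimal sketch of the cleanup pattern mentioned above, assuming a hypothetical test class (this is not code from Lib/test/test_itertools.py): the temporary tree is registered with self.addCleanup as soon as it exists, so it is removed even if the test body fails.

```python
import os
import tempfile
import unittest


class NonAsciiDirTest(unittest.TestCase):
    """Hypothetical test, not taken from Lib/test/test_itertools.py."""

    def setUp(self):
        tmp = tempfile.TemporaryDirectory()
        # Register removal immediately; cleanups run even when the test fails.
        self.addCleanup(tmp.cleanup)
        self.base = tmp.name

    def test_create_non_ascii_dir(self):
        # Pass the name as str so the OS receives the Unicode path directly.
        path = os.path.join(self.base, "package\u00e6")
        os.mkdir(path)
        self.assertTrue(os.path.isdir(path))
```

Saved to a file, the class can be run with python -m unittest; after the run, nothing should remain under the temporary base directory.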


Impact

Area                    Description                                                  Severity
CI pipeline stability   All Windows builds fail after test_itertools runs            High
Test isolation          Subsequent tests inherit corrupted filesystem state          High
Unicode correctness     Non-ASCII filenames silently corrupted on Windows            Medium
Triage cost             Unrelated failures obscure the root cause                    Medium
Local Windows dev       Developers running the full suite locally may be affected    Low

Files Involved

  • Lib/test/test_itertools.py — primary location of the encoding issue and missing cleanup
  • Lib/test/support/os_helper.py — secondary impact point (temp_dir context manager)

Labels

  • OS-windows
  • tests: Tests in the Lib/test dir
  • type-bug: An unexpected behavior, bug, or error
