
Fix a Windows issue where Python codepage would be reverted from unicode to cp1252 #26972

Open
juj wants to merge 1 commit into emscripten-core:main from juj:fix_windows_python_cp1252

juj (Collaborator) commented May 16, 2026

Fix a Windows issue where Python codepage would be reverted from unicode to cp1252, if stdout/stderr was being redirected to a file.

To fix this issue, pass the -X utf8 command line parameter whenever python -E flag is being used.

Fixes test other.test_wasm_sourcemap_relative_paths on Windows when build is driven by Buildbot CI. See buildbot/buildbot#9047 for related info.

juj enabled auto-merge (squash) May 16, 2026 21:15
sbc100 (Collaborator) commented May 17, 2026

Is this because you are setting PYTHONUTF8 and -E is then ignoring it?

juj (Collaborator, Author) commented May 17, 2026

No. I do not set PYTHONUTF8=1 on my CI.

I first tried setting PYTHONUTF8=1 as the fix, but it did nothing, which is expected since -E is being passed.
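The interaction described here can be checked with a small probe. This is only a sketch: the `probe` helper and the `PROBE` snippet are hypothetical names, not part of Emscripten. It launches a child interpreter with captured (non-console) stdout and reports `sys.flags.utf8_mode` and `sys.stdout.encoding`. Under `-E` the `PYTHONUTF8` variable is documented to be ignored, while `-X utf8` still takes effect. What the first call prints depends on the platform, locale, and Python version, so no output is claimed for it.

```python
import os
import subprocess
import sys

# Code executed by the child interpreter: report UTF-8 mode and stdout encoding.
PROBE = "import sys; print(sys.flags.utf8_mode, sys.stdout.encoding)"


def probe(extra_args, env_overrides=None):
    """Run a child interpreter with stdout captured, so it is not a console."""
    env = dict(os.environ)
    env.update(env_overrides or {})
    out = subprocess.run(
        [sys.executable, *extra_args, "-c", PROBE],
        env=env,
        stdout=subprocess.PIPE,
        check=True,
    ).stdout
    return out.decode("ascii").lower().split()


# -E makes the interpreter ignore PYTHON* environment variables.
env_only = probe(["-E"], {"PYTHONUTF8": "1"})
# -X utf8 is a command-line option, so -E does not suppress it.
flag = probe(["-E", "-X", "utf8"])

print("PYTHONUTF8=1 with -E:", env_only)
print("-X utf8 with -E:     ", flag)
```

On a Windows machine with a cp1252 ANSI code page, the first line would be expected to show the legacy encoding and the second `utf-8`, which matches the behaviour reported in this thread.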

sbc100 (Collaborator) commented May 17, 2026

Is there any reason somebody might want to write something other than UTF-8 to stdout/stderr?

What kind of output are we generating that is non-ASCII? i.e. which test fails?

I'm a little worried that we could break some other use case here, because -X utf8 also ignores the system encoding. As well as affecting stdout/stderr, it apparently also affects sys.getfilesystemencoding() and locale.getpreferredencoding().
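To see how far the flag reaches, one could compare the three values mentioned above with and without `-X utf8`. The sketch below uses a hypothetical `encodings` helper; it runs a child interpreter and lower-cases the reported names, since older Python versions spell the locale result as `UTF-8`. The values printed for the default case vary by system.

```python
import subprocess
import sys

# Code executed by the child: the three encodings discussed above.
REPORT = (
    "import locale, sys; "
    "print(sys.getfilesystemencoding(), "
    "locale.getpreferredencoding(False), "
    "sys.stdout.encoding)"
)


def encodings(*interpreter_args):
    """Return [filesystem, preferred, stdout] encodings of a child interpreter."""
    out = subprocess.run(
        [sys.executable, *interpreter_args, "-c", REPORT],
        stdout=subprocess.PIPE,
        check=True,
    ).stdout
    return [name.lower() for name in out.decode("ascii").split()]


print("-E only      :", encodings("-E"))
print("-E -X utf8   :", encodings("-E", "-X", "utf8"))
```

With `-X utf8` all three values are expected to be `utf-8`, so any code that relies on `locale.getpreferredencoding()` to discover the system code page would see a different answer.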

On the other hand, we go out of our way to always write files explicitly in UTF-8 in almost all cases, so maybe this is fine?

There is one place where we specifically do something different: expand_response_file in tools/response_file.py. However, looking at that function, it looks like the attached comment regarding locale.getpreferredencoding might be out of date?

sbc100 (Collaborator) commented May 17, 2026

Yup, it looks like the comment in response_file.py #15426 was maybe never accurate?

juj (Collaborator, Author) commented May 18, 2026

> Is there any reason somebody might want to write something other than UTF-8 to stdout/stderr?

I don't know the answer to that. I'm not currently aware of such a use case.

> What kind of output are we generating that is non-ASCII? i.e. which test fails?

Test other.test_wasm_sourcemap_relative_paths fails on Windows when build is driven by Buildbot CI. It attempts to print the name of a file during the test to stdout:

test('A ä☃ö Z.cpp')

which in this test is 'A ä☃ö Z.cpp'.

> I'm a little worried that we could break some other use case here because -X utf8 also ignores the system encoding.

If the system encoding is CP437 or CP1252 and one attempts to print() a character that is not part of that encoding, Python's print() function will throw (a UnicodeEncodeError under the default strict error handling). That exception could be raised from anywhere in the internals of Emscripten that happens to print text containing such a Unicode character, for example as part of a filename or of a source file.

See e.g. http://clbri.com:8010/api/v2/logs/444660/raw_inline where it happened in the test.
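The failure mode can be reproduced without a Windows machine. In the sketch below, a `TextIOWrapper` over an in-memory buffer stands in for a redirected stdout; that substitution, and the `can_print` helper, are assumptions made for illustration. The filename is the one from the test. Its `ä` and `ö` exist in both code pages, but the snowman `☃` (U+2603) does not.

```python
import io

FILENAME = "A ä☃ö Z.cpp"  # the filename used by the failing test


def can_print(text, encoding):
    """Return True if print() succeeds on a strict stream with this encoding."""
    stream = io.TextIOWrapper(
        io.BytesIO(), encoding=encoding, errors="strict", newline="\n"
    )
    try:
        print(text, file=stream)
        stream.flush()
    except UnicodeEncodeError:
        return False
    return True


for codec in ("utf-8", "cp1252", "cp437"):
    print(codec, "->", "ok" if can_print(FILENAME, codec) else "UnicodeEncodeError")
```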

> Yup, it looks like the comment in response_file.py #15426 was maybe never accurate?

Not sure which comment?

IIUC, the encoding of response files is somewhat orthogonal to the encoding of the stdout/stderr streams?

I think the options we have here are to either run as -X utf8, or alternatively monkeypatch Python's print() so that it doesn't throw when attempting to print a character that cannot be encoded in the current stdout/stderr codepage.
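As a rough illustration of the second option, here is a variant that does not monkeypatch print() itself but instead relaxes the error handler of the stream via `TextIOWrapper.reconfigure()` (available since Python 3.7). This is a swapped-in technique, not what this PR does. The `make_lenient` helper is hypothetical, and the in-memory cp1252 stream is only a stand-in for a redirected stdout.

```python
import io


def make_lenient(stream):
    """Keep the stream's encoding, but escape unencodable characters."""
    stream.reconfigure(errors="backslashreplace")
    return stream


# Stand-in for a stdout that was redirected to a file on a cp1252 system.
raw = io.BytesIO()
fake_stdout = io.TextIOWrapper(raw, encoding="cp1252", errors="strict", newline="\n")

make_lenient(fake_stdout)
print("A ä☃ö Z.cpp", file=fake_stdout)  # no exception; the snowman is escaped
fake_stdout.flush()

escaped = raw.getvalue()
print(escaped)
```

In a real process the same call would be applied to `sys.stdout` and `sys.stderr`. The trade-off is that the output stays in the legacy code page and unencodable characters appear as escapes such as `\u2603`, whereas `-X utf8` changes the encoding itself.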
