Encoding decoder tests don't test codepoints

E.g. `new TextDecoder('shift-jis').decode(Uint8Array.of(0x5C))` is `\`, not `¥` per spec and in all major browsers.
But test data has `<span data-cp="A5" data-bytes="5C">`

The mismatch is also noticeable in the test runner, on the very second row -- symbols are different + assert shows not a `¥` sign:

<img width="1660" height="520" alt="Image" src="https://github.com/user-attachments/assets/3878fc0d-26ce-4e2b-9246-6d03d7b4343d" />

Also e.g. `new TextDecoder('iso-2022-jp').decode(Uint8Array.of(0x1B, 0x24, 0x42, 0x21, 0x5D, 0x1B, 0x28, 0x42)).codePointAt().toString(16)` is `ff0d` and not `2212` as in tests:

https://github.com/web-platform-tests/wpt/blob/4f21b1109a9393d1da22d24c87c5da30167d1b11/encoding/legacy-mb-japanese/iso-2022-jp/iso2022jp_chars.html#L177

Those don't cause failures because the test runner doesn't test bytes -> charcode decoding.
What it tests here is that html decoding of bytes matches decoder decoding of the same bytes, but nothing compares them to the codepoint in the test data:

https://github.com/web-platform-tests/wpt/blob/2a1fdfaaeadea7bea70a551785aa5c8e68f69f7c/encoding/resources/decode-common.js#L32-L35

(cc @domenic?)

	assert_equals(
	nodes[i].textContent,
	decoder(nodes[i].dataset.bytes)
	);

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Encoding decoder tests don't test codepoints #56748

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Encoding decoder tests don't test codepoints #56748

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions