Core: Fix namespace URL encoding to use %20 instead of + for spaces#15948
Core: Fix namespace URL encoding to use %20 instead of + for spaces#15948osscm wants to merge 1 commit intoapache:mainfrom
Conversation
RESTUtil.encodeNamespace() called encodeString() which uses Java URLEncoder. This follows application/x-www-form-urlencoded rules and encodes spaces as +. While correct for form data and OAuth2, URL path segments require RFC 3986 percent-encoding where spaces must be %20. Servers receiving /v1/namespaces/a+b treat + as a literal character, not a space, causing namespace lookup failures in catalogs such as Polaris. Fix by replacing + with %20 in encodeNamespace() only, leaving encodeString() unchanged so form data and OAuth2 encoding is unaffected. URLDecoder.decode() already handles both + and %20 as spaces, so decoding remains backward compatible. Fixes apache#14263
| String separator = "%2E"; | ||
|
|
||
| // single-level namespace with a space | ||
| Namespace singleLevel = Namespace.of("a b"); |
There was a problem hiding this comment.
| Namespace singleLevel = Namespace.of("a b"); | |
| Namespace singleLevel = Namespace.of("my namespace"); |
There was a problem hiding this comment.
we should also verify encoding/decoding Namespace.of("my+namespace with+spaces") in a separate test to check that the non-encoded namespace with space and a + in its name properly can be encoded/decoded
| .as("space must be encoded as %%20, not +") | ||
| .isEqualTo("a%20b") | ||
| .doesNotContain("+"); | ||
| assertThat(RESTUtil.decodeNamespace(encoded, separator)).isEqualTo(singleLevel); |
There was a problem hiding this comment.
we also need to test that decoding the previous way can be properly decoded:
String legacyEncoded = "my+namespace";
assertThat(RESTUtil.decodeNamespace(legacyEncoded, separator)).isEqualTo(singleLevel);
| .as("spaces in every level must be encoded as %%20, not +") | ||
| .isEqualTo("my%20namespace%2Emy%20schema") | ||
| .doesNotContain("+"); | ||
| assertThat(RESTUtil.decodeNamespace(encodedMulti, separator)).isEqualTo(multiLevel); |
There was a problem hiding this comment.
same as above, we need to make sure that a legacy encoded namespaces can still be decoded properly
|
|
||
| for (int i = 0; i < levels.length; i++) { | ||
| encodedLevels[i] = encodeString(levels[i]); | ||
| encodedLevels[i] = encodeString(levels[i]).replace("+", "%20"); |
There was a problem hiding this comment.
We also need a test for encodeFormData() to verify application/x-www-form-urlencoded is properly used. Additionally, we need some more tests for all paths being constructed via ResourcePaths
|
we should maybe have a separate that is used for namespace and other path encoding from ResourcePaths |
RESTUtil.encodeNamespace()calledencodeString()which uses Java URLEncoder.This follows
application/x-www-form-urlencodedrules and encodes spaces as+.While correct for form data and OAuth2, URL path segments require RFC 3986 percent-encoding where spaces must be
%20. Servers receiving/v1/namespaces/a+btreat+as a literal character, not a space, causing namespace lookup failures in catalogs such as Polaris.Fix by replacing
+with%20in encodeNamespace() only, leaving encodeString() unchanged so form data and OAuth2 encoding is unaffected. URLDecoder.decode() already handles both+and%20as spaces, so decoding remains backward compatible.Fixes #14263