Core: Fix namespace URL encoding to use %20 instead of + for spaces #15948

Open
osscm wants to merge 1 commit into apache:main from osscm:fix-namespace-url-encoding

osscm commented Apr 12, 2026

RESTUtil.encodeNamespace() called encodeString(), which uses Java's URLEncoder. URLEncoder follows application/x-www-form-urlencoded rules and encodes spaces as +. While that is correct for form data and OAuth2, URL path segments require RFC 3986 percent-encoding, where a space must be %20. Servers receiving /v1/namespaces/a+b treat + as a literal character, not a space, which causes namespace lookup failures in catalogs such as Polaris.

Fix by replacing + with %20 in encodeNamespace() only, leaving encodeString() unchanged so that form data and OAuth2 encoding are unaffected. The replacement is safe because URLEncoder emits a literal + as %2B, so any + in its output stands for a space. URLDecoder.decode() already handles both + and %20 as spaces, so decoding remains backward compatible.

Fixes #14263
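
The difference between the two encodings can be reproduced with the JDK alone. The sketch below is illustrative only: the class and method names are hypothetical and are not part of RESTUtil; it assumes Java 10 or later for the Charset overload of URLEncoder.encode.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; it only mirrors the behavior described in the PR text.
final class SpaceEncodingSketch {

  // Form-style encoding: java.net.URLEncoder writes a space as '+'.
  static String formEncode(String value) {
    return URLEncoder.encode(value, StandardCharsets.UTF_8);
  }

  // Path-style encoding as proposed in this PR: spaces become %20 instead.
  static String pathEncode(String value) {
    return formEncode(value).replace("+", "%20");
  }

  public static void main(String[] args) {
    System.out.println("/v1/namespaces/" + formEncode("a b")); // /v1/namespaces/a+b
    System.out.println("/v1/namespaces/" + pathEncode("a b")); // /v1/namespaces/a%20b
  }
}
```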

github-actions bot added the core label Apr 12, 2026
osscm commented Apr 12, 2026

cc @nastra @omarsmak please review.

String separator = "%2E";

// single-level namespace with a space
Namespace singleLevel = Namespace.of("a b");
Suggested change
- Namespace singleLevel = Namespace.of("a b");
+ Namespace singleLevel = Namespace.of("my namespace");

nastra commented Apr 13, 2026

we should also verify encoding/decoding of Namespace.of("my+namespace with+spaces") in a separate test, to check that a namespace whose raw name contains both a space and a literal + round-trips correctly
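
A JDK-only sketch of that round trip is shown here, as an illustration rather than the requested Iceberg test. The class and its methods are hypothetical stand-ins for the RESTUtil calls; the point is that URLEncoder writes a literal + as %2B, so the later replace of + with %20 only touches spaces.

```java
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for the encode/decode pair under review.
final class PlusAndSpaceRoundTripSketch {

  // URLEncoder turns a literal '+' into %2B and a space into '+',
  // so every '+' left in its output stands for a space.
  static String encodePathSegment(String value) {
    return URLEncoder.encode(value, StandardCharsets.UTF_8).replace("+", "%20");
  }

  static String decode(String encoded) {
    return URLDecoder.decode(encoded, StandardCharsets.UTF_8);
  }

  public static void main(String[] args) {
    String original = "my+namespace with+spaces";
    String encoded = encodePathSegment(original);
    System.out.println(encoded); // my%2Bnamespace%20with%2Bspaces
    System.out.println(decode(encoded).equals(original)); // true
  }
}
```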

.as("space must be encoded as %%20, not +")
.isEqualTo("a%20b")
.doesNotContain("+");
assertThat(RESTUtil.decodeNamespace(encoded, separator)).isEqualTo(singleLevel);
we also need to test that a namespace encoded the previous way can still be decoded properly:

String legacyEncoded = "my+namespace";
assertThat(RESTUtil.decodeNamespace(legacyEncoded, separator)).isEqualTo(singleLevel);

.as("spaces in every level must be encoded as %%20, not +")
.isEqualTo("my%20namespace%2Emy%20schema")
.doesNotContain("+");
assertThat(RESTUtil.decodeNamespace(encodedMulti, separator)).isEqualTo(multiLevel);
same as above, we need to make sure that legacy-encoded namespaces can still be decoded properly
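
The backward-compatibility property being asked for can be sketched with the JDK alone. The helper below is hypothetical and only approximates a multi-level decode (split on the still-encoded separator, then decode each level); it is not the RESTUtil.decodeNamespace implementation.

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical sketch of decoding a multi-level namespace string.
final class LegacyDecodeSketch {

  // Splits on the still-encoded separator first, then decodes each level.
  static List<String> decodeLevels(String encoded, String separator) {
    List<String> levels = new ArrayList<>();
    for (String level : encoded.split(Pattern.quote(separator))) {
      levels.add(URLDecoder.decode(level, StandardCharsets.UTF_8));
    }
    return levels;
  }

  public static void main(String[] args) {
    String separator = "%2E";
    List<String> legacy = decodeLevels("my+namespace%2Emy+schema", separator);
    List<String> current = decodeLevels("my%20namespace%2Emy%20schema", separator);
    System.out.println(legacy); // [my namespace, my schema]
    System.out.println(legacy.equals(current)); // true
  }
}
```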


for (int i = 0; i < levels.length; i++) {
-  encodedLevels[i] = encodeString(levels[i]);
+  encodedLevels[i] = encodeString(levels[i]).replace("+", "%20");
We also need a test for encodeFormData() to verify application/x-www-form-urlencoded is properly used. Additionally, we need some more tests for all paths being constructed via ResourcePaths
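
For contrast, form bodies should keep the + for spaces. The following is a minimal sketch of application/x-www-form-urlencoded output built directly on URLEncoder; the class, the formEncode helper, and the field names are assumptions made for illustration and do not reproduce RESTUtil.encodeFormData().

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical form encoder, used only to show that spaces stay '+' in form data.
final class FormDataSketch {

  // Joins key=value pairs with '&', encoding both sides with URLEncoder.
  static String formEncode(Map<String, String> fields) {
    StringJoiner joined = new StringJoiner("&");
    for (Map.Entry<String, String> field : fields.entrySet()) {
      joined.add(
          URLEncoder.encode(field.getKey(), StandardCharsets.UTF_8)
              + "="
              + URLEncoder.encode(field.getValue(), StandardCharsets.UTF_8));
    }
    return joined.toString();
  }

  public static void main(String[] args) {
    Map<String, String> fields = new LinkedHashMap<>(); // keeps insertion order
    fields.put("grant_type", "client_credentials");
    fields.put("scope", "catalog read");
    System.out.println(formEncode(fields));
    // grant_type=client_credentials&scope=catalog+read
  }
}
```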

nastra commented Apr 13, 2026

we should maybe have a separate

/** Encodes a value for use in a URL path segment (RFC 3986: spaces as %20, not +). */
public static String encodePathSegment(String toEncode) {
  return encodeString(toEncode).replace("+", "%20");
}

that is used for namespace and other path encoding from ResourcePaths
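
A rough sketch of how such a helper might be shared by every path segment follows. The tablePath method and its path layout are hypothetical and simplified; they are not the actual ResourcePaths API, and URLEncoder stands in for encodeString() so the example runs on the JDK alone.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical sketch; the path layout is simplified for illustration.
final class ResourcePathSketch {

  // Same idea as the proposed helper: form-encode, then turn '+' into %20.
  static String encodePathSegment(String toEncode) {
    return URLEncoder.encode(toEncode, StandardCharsets.UTF_8).replace("+", "%20");
  }

  // Encodes each namespace level and the table name with the same helper.
  static String tablePath(List<String> levels, String separator, String table) {
    String namespace =
        levels.stream()
            .map(ResourcePathSketch::encodePathSegment)
            .collect(Collectors.joining(separator));
    return "v1/namespaces/" + namespace + "/tables/" + encodePathSegment(table);
  }

  public static void main(String[] args) {
    System.out.println(tablePath(List.of("my namespace", "my schema"), "%2E", "my table"));
    // v1/namespaces/my%20namespace%2Emy%20schema/tables/my%20table
  }
}
```

Routing every segment through one helper keeps the form-encoding path untouched while making the %20 rule apply uniformly to namespaces and other path components.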

Development

Successfully merging this pull request may close these issues.

Encoding of Namespace having space to +
