Skip to content

Commit 8b5ca89

Browse files
committed
fix: ifemp round-trip + stale docstrings from linear-scan refactor
One logic fix and a sweep of stale references left over from the regex-to-scan rewrite. ifemp round-trip (_scan_prefix): the name-continuation guard rejected the empty-value case when the template's next literal started with a non-stop-char. api{;key}X{+rest} with key='' expands to api;keyX/tail but matched None because 'X' after ;key was treated as a name continuation. Now checks whether the next literal starts at the current position before rejecting. Doc/style cleanups: - match() docstring: 'regex derived from the template' -> 'linear scan' - _split_query_tail: 'strict regex' -> 'strict scan' - test comments: 5x 'regex' -> 'scan' - DEFAULT_RESOURCE_SECURITY: docstring now mentions null-byte rejection - migration.md: describe client-visible 'Unknown resource' error rather than the internal ResourceSecurityError type - _Atom type alias: remove unnecessary string quoting - UriTemplate fields: list[...] not tuple[..., ...] — arbitrary-sized tuples are not a defence worth having
1 parent f9aa92b commit 8b5ca89

File tree

4 files changed

+54
-35
lines changed

4 files changed

+54
-35
lines changed

docs/migration.md

Lines changed: 7 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -552,13 +552,13 @@ Several behaviors have changed:
552552

553553
**Path-safety checks applied by default.** Extracted parameter values
554554
containing `..` as a path component, a null byte, or looking like an
555-
absolute path (`/etc/passwd`, `C:\Windows`) now raise
556-
`ResourceSecurityError` and halt template iteration — a strict
557-
template's rejection no longer falls through to a later permissive
558-
template. This is checked on the decoded value, so `..%2Fetc`,
559-
`%2E%2E`, and `%00` are caught too. Note that `..` is only flagged as
560-
a standalone path component, so values like `v1.0..v2.0` or
561-
`HEAD~3..HEAD` are unaffected.
555+
absolute path (`/etc/passwd`, `C:\Windows`) now cause the read to
556+
fail — the client receives an "Unknown resource" error and template
557+
iteration stops, so a strict template's rejection does not fall
558+
through to a later permissive template. This is checked on the
559+
decoded value, so `..%2Fetc`, `%2E%2E`, and `%00` are caught too.
560+
Note that `..` is only flagged as a standalone path component, so
561+
values like `v1.0..v2.0` or `HEAD~3..HEAD` are unaffected.
562562

563563
If a parameter legitimately needs to receive absolute paths or
564564
traversal sequences, exempt it:

src/mcp/server/mcpserver/resources/templates.py

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -80,7 +80,7 @@ def validate(self, params: Mapping[str, str | list[str]]) -> str | None:
8080

8181

8282
DEFAULT_RESOURCE_SECURITY = ResourceSecurity()
83-
"""Secure-by-default policy: traversal and absolute paths rejected."""
83+
"""Secure-by-default policy: traversal, absolute paths, and null bytes rejected."""
8484

8585

8686
class ResourceSecurityError(ValueError):

src/mcp/shared/uri_template.py

Lines changed: 23 additions & 20 deletions
Original file line numberDiff line numberDiff line change
@@ -169,7 +169,7 @@ class _Cap:
169169
ifemp: bool = False
170170

171171

172-
_Atom: TypeAlias = "_Lit | _Cap"
172+
_Atom: TypeAlias = _Lit | _Cap
173173

174174

175175
def _is_greedy(var: Variable) -> bool:
@@ -286,12 +286,12 @@ class UriTemplate:
286286
"""
287287

288288
template: str
289-
_parts: tuple[_Part, ...] = field(repr=False, compare=False)
290-
_variables: tuple[Variable, ...] = field(repr=False, compare=False)
291-
_prefix: tuple[_Atom, ...] = field(repr=False, compare=False)
289+
_parts: list[_Part] = field(repr=False, compare=False)
290+
_variables: list[Variable] = field(repr=False, compare=False)
291+
_prefix: list[_Atom] = field(repr=False, compare=False)
292292
_greedy: Variable | None = field(repr=False, compare=False)
293-
_suffix: tuple[_Atom, ...] = field(repr=False, compare=False)
294-
_query_variables: tuple[Variable, ...] = field(repr=False, compare=False)
293+
_suffix: list[_Atom] = field(repr=False, compare=False)
294+
_query_variables: list[Variable] = field(repr=False, compare=False)
295295

296296
@staticmethod
297297
def is_template(value: str) -> bool:
@@ -356,12 +356,12 @@ def parse(
356356

357357
return cls(
358358
template=template,
359-
_parts=tuple(parts),
360-
_variables=tuple(variables),
361-
_prefix=tuple(prefix),
359+
_parts=parts,
360+
_variables=variables,
361+
_prefix=prefix,
362362
_greedy=greedy,
363-
_suffix=tuple(suffix),
364-
_query_variables=tuple(query_vars),
363+
_suffix=suffix,
364+
_query_variables=query_vars,
365365
)
366366

367367
@property
@@ -436,8 +436,8 @@ def expand(self, variables: Mapping[str, str | Sequence[str]]) -> str:
436436
def match(self, uri: str, *, max_uri_length: int = DEFAULT_MAX_URI_LENGTH) -> dict[str, str | list[str]] | None:
437437
"""Match a concrete URI against this template and extract variables.
438438
439-
This is the inverse of :meth:`expand`. The URI is matched against
440-
a regex derived from the template and captured values are
439+
This is the inverse of :meth:`expand`. The URI is matched via a
440+
linear scan of the template and captured values are
441441
percent-decoded. The round-trip ``match(expand({k: v})) == {k: v}``
442442
holds when ``v`` does not contain its operator's separator
443443
unencoded: ``{.ext}`` with ``ext="tar.gz"`` expands to
@@ -446,7 +446,7 @@ def match(self, uri: str, *, max_uri_length: int = DEFAULT_MAX_URI_LENGTH) -> di
446446
inherent reversal limitation.
447447
448448
Matching is structural at the URI level only: a simple ``{name}``
449-
will not match across a literal ``/`` in the URI (the regex stops
449+
will not match across a literal ``/`` in the URI (the scan stops
450450
there), but a percent-encoded ``%2F`` that decodes to ``/`` is
451451
accepted as part of the value. Path-safety validation belongs at
452452
a higher layer; see :mod:`mcp.shared.path_security`.
@@ -483,7 +483,7 @@ def match(self, uri: str, *, max_uri_length: int = DEFAULT_MAX_URI_LENGTH) -> di
483483
Args:
484484
uri: A concrete URI string.
485485
max_uri_length: Maximum permitted length of the input URI.
486-
Oversized inputs return ``None`` without regex evaluation,
486+
Oversized inputs return ``None`` without scanning,
487487
guarding against resource exhaustion.
488488
489489
Returns:
@@ -643,7 +643,7 @@ def _split_query_tail(parts: list[_Part]) -> tuple[list[_Part], list[Variable]]:
643643
expressions and the preceding path portion contains no literal
644644
``?``. If the path has a literal ``?`` (e.g., ``?fixed=1{&page}``),
645645
the URI's ``?`` split won't align with the template's expression
646-
boundary, so strict regex matching is used instead.
646+
boundary, so the strict scan is used instead.
647647
648648
Returns:
649649
A pair ``(path_parts, query_vars)``. If lenient matching does
@@ -671,8 +671,8 @@ def _split_query_tail(parts: list[_Part]) -> tuple[list[_Part], list[Variable]]:
671671

672672
# If the path portion contains a literal ?/# or a {?...}/{#...}
673673
# expression, lenient matching's partition("#") then partition("?")
674-
# would strip content the path regex expects to see. Fall back to
675-
# strict regex.
674+
# would strip content the path scan expects to see. Fall back to
675+
# the strict scan.
676676
for part in parts[:split]:
677677
if isinstance(part, str):
678678
if "?" in part or "#" in part:
@@ -1024,11 +1024,14 @@ def _scan_prefix(
10241024

10251025
if atom.ifemp:
10261026
# Optional = after ;name. A non-= non-delimiter here means
1027-
# the name continued (e.g. ;keys vs ;key) — reject.
1027+
# the name continued (e.g. ;keys vs ;key) — reject, unless
1028+
# the template's next literal starts right here, in which
1029+
# case the value is legitimately empty.
10281030
if pos < limit and uri[pos] == "=":
10291031
pos += 1
10301032
elif pos < limit and uri[pos] not in stops:
1031-
return None
1033+
if not (isinstance(nxt, _Lit) and uri.startswith(nxt.text, pos)):
1034+
return None
10321035

10331036
# Latest valid end: the var stops at the first stop-char or
10341037
# the scan limit, whichever comes first.

tests/shared/test_uri_template.py

Lines changed: 23 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -444,15 +444,15 @@ def test_expand_rejects_invalid_value_types(value: object):
444444
("search{?q}", "search#frag", {}),
445445
# Multiple ?/& expressions collected together
446446
("api{?v}{&page,limit}", "api?limit=10&v=2", {"v": "2", "limit": "10"}),
447-
# Standalone {&var} falls through to strict regex (expands with
448-
# & prefix, no ? for lenient matching to split on)
447+
# Standalone {&var} falls through to the strict scan (expands
448+
# with & prefix, no ? for lenient matching to split on)
449449
("api{&page}", "api&page=2", {"page": "2"}),
450-
# Literal ? in path portion falls through to strict regex
450+
# Literal ? in path portion falls through to the strict scan
451451
("api?x{?page}", "api?x?page=2", {"page": "2"}),
452452
# {?...} expression in path portion also falls through
453453
("api{?q}x{?page}", "api?q=1x?page=2", {"q": "1", "page": "2"}),
454454
# {#...} or literal # in path portion falls through: lenient
455-
# matching would strip the fragment before the path regex sees it
455+
# matching would strip the fragment before the path scan sees it
456456
("page{#section}{?q}", "page#intro?q=x", {"section": "intro", "q": "x"}),
457457
("page#lit{?q}", "page#lit?q=x", {"q": "x"}),
458458
# Empty & segments in query are skipped
@@ -465,7 +465,7 @@ def test_expand_rejects_invalid_value_types(value: object):
465465
("api://x{?token}", "api://x?%74oken=evil&token=real", {"token": "real"}),
466466
("api://x{?token}", "api://x?%74oken=evil", {}),
467467
# Level 3: query continuation with literal ? falls back to
468-
# strict regex (template-order, all-present required)
468+
# the strict scan (template-order, all-present required)
469469
("?a=1{&b}", "?a=1&b=2", {"b": "2"}),
470470
# Explode: path segments as list
471471
("/files{/path*}", "/files/a/b/c", {"path": ["a", "b", "c"]}),
@@ -512,8 +512,8 @@ def test_match_no_match(template: str, uri: str):
512512

513513
def test_match_adjacent_vars_with_prefix_names():
514514
# Two adjacent simple vars where one name is a prefix of the other.
515-
# We use positional capture groups, so names only affect the result
516-
# dict keys, not the regex. Adjacent unrestricted vars are inherently
515+
# Capture positions are ordinal, so names only affect the result
516+
# dict keys, not the scan. Adjacent unrestricted vars are inherently
517517
# ambiguous; greedy * resolution means the first takes everything.
518518
t = UriTemplate.parse("{var}{vara}")
519519
assert t.match("ab") == {"var": "ab", "vara": ""}
@@ -654,6 +654,22 @@ def test_match_prefix_ifemp_rejects_name_continuation():
654654
assert t.match("api;keys/xyz") is None
655655

656656

657+
def test_match_prefix_ifemp_empty_before_non_stop_literal():
658+
# Regression: _scan_prefix rejected the empty-value case when the
659+
# following template literal starts with a non-stop-char. The
660+
# name-continuation guard saw 'X' after ';key' and assumed the
661+
# name continued, but 'X' is the template's next literal.
662+
t = UriTemplate.parse("api{;key}X{+rest}")
663+
# Non-empty round-trips fine:
664+
assert t.match(t.expand({"key": "abc", "rest": "/tail"})) == {"key": "abc", "rest": "/tail"}
665+
# Empty value (ifemp → bare ;key, then X) must also round-trip:
666+
uri = t.expand({"key": "", "rest": "/tail"})
667+
assert uri == "api;keyX/tail"
668+
assert t.match(uri) == {"key": "", "rest": "/tail"}
669+
# But an actual name continuation still rejects:
670+
assert t.match("api;keyZX/tail") is None
671+
672+
657673
def test_match_large_uri_against_greedy_template():
658674
# Large payload against a greedy template — the scan visits each
659675
# character once for the suffix anchor and once for the greedy

0 commit comments

Comments
 (0)