fix(core): encode reserved path chars in file URIs#151
Conversation
aceae78 to
07196a4
Compare
bug-ops
left a comment
There was a problem hiding this comment.
Review: changes requested
Critical (C1): is the enum discriminant , not the path byte offset. The test only passes because the path is long — fails silently for short paths like .
Minor (M1): Test covers only a long path; a short-path case would catch the bug.
Minor (M2): and backtick are RFC 3986 §2.2 reserved chars omitted from encoding. Unlikely to matter with WHATWG consumers, but worth tracking.
| } | ||
|
|
||
| fn encode_rfc3986_path_chars(uri: &str) -> String { | ||
| let path_start = url::Position::BeforePath as usize; |
There was a problem hiding this comment.
[C1 — critical] url::Position::BeforePath is an enum discriminant (10), not a byte offset in the URI string. For file:///[a].ts the [ sits at byte 8 — it falls in the "prefix" slice and is never encoded. The stated motivation for this fix (Next.js [slug] files) silently fails.
Use the Index trait that the url crate actually provides for this:
fn encode_rfc3986_path_chars(uri: &str) -> String {
let url = Url::parse(uri).expect("encode called with invalid URI");
let prefix = url[..url::Position::BeforePath].to_owned();
let encoded = url[url::Position::BeforePath..]
.replace('[', "%5B")
.replace(']', "%5D")
.replace('^', "%5E")
.replace('|', "%7C");
format!("{prefix}{encoded}")
}| #[cfg(windows)] | ||
| let path = Path::new(r"C:\home\user\routes\api\[...]^|.ts"); | ||
| #[cfg(not(windows))] | ||
| let path = Path::new("/home/user/routes/api/[...]^|.ts"); |
There was a problem hiding this comment.
[M1 — minor] Only a long path is tested; the split_at(10) bug does not surface here. Add a short case:
let path = Path::new("/[a].ts");
let uri = path_to_uri(path);
assert!(uri.as_str().ends_with("%5Ba%5D.ts"));| fn encode_rfc3986_path_chars(uri: &str) -> String { | ||
| let path_start = url::Position::BeforePath as usize; | ||
| let (prefix, path) = uri.split_at(path_start); | ||
| let encoded_path = path |
There was a problem hiding this comment.
[M2 — minor] {, }, and backtick are also RFC 3986 §2.2 reserved characters omitted here. Low risk for WHATWG-compliant consumers (rust-analyzer, pyright), but silently wrong for strict RFC 3986 clients. Tracking in a follow-up issue.
Summary
Fixes #150.
path_to_uriformatted file URIs asfile://{path}. Valid filesystem paths containing URI-reserved characters, such as bracketed route filenames, produced invalid raw URI paths.lsp_types::Uri::from_strthen returnedUnexpectedChar; because this path usedexpect, diagnostics requests could panic and abort the MCP stdio server.This builds platform-aware file URLs with
Url::from_file_path, preserves the existing Windows-rooted synthetic test-path behavior, encodes RFC3986-invalid path characters that WHATWG URL leaves raw in the path ([,],^,|), and adds a regression test for a route-style filename withuri_to_pathround-trip coverage.Validation
cargo +nightly fmt --checkcargo +1.95 test -p mcpls-core path_to_uricargo +1.95 test -p mcpls-corecargo +1.95 clippy -p mcpls-core --all-targets --all-features -- -D warningsrustup target add --toolchain 1.95 x86_64-pc-windows-msvc && cargo +1.95 check -p mcpls-core --target x86_64-pc-windows-msvc --all-targets --all-featuresget_diagnosticson a bracketed route path aborted withfailed to create URI from path: ParseError { ... UnexpectedChar }; after the patch, the same call returned a normal tool response.