Skip to content

Conversation

@smowton
Copy link
Contributor

@smowton smowton commented Feb 11, 2026

This has been in use by autofix for a couple of years, aiming to give the AI better advice-- and in my view, this is better advice for humans too.

Expanded recommendations for validating user input when constructing file paths, including normalization and using allowlists.
@smowton smowton requested a review from a team as a code owner February 11, 2026 12:11
Copilot AI review requested due to automatic review settings February 11, 2026 12:11
@github-actions
Copy link
Contributor

QHelp previews:

python/ql/src/Security/CWE-022/PathInjection.qhelp

Uncontrolled data used in path expression

Accessing files using paths constructed from user-controlled data can allow an attacker to access unexpected resources. This can result in sensitive information being revealed or deleted, or an attacker being able to influence behavior by modifying unexpected files.

Recommendation

Validate paths constructed from untrusted user input before using them to access files.

The choice of validation depends on the use case.

If you want to allow paths spanning multiple folders, a common strategy is to make sure that the constructed file path is contained within a safe root folder. First, normalize the path using os.path.normpath or os.path.realpath to remove any ".." segments. Then check that the normalized path starts with the root folder. Note that the normalization step is important, since otherwise even a path that starts with the root folder could be used to access files outside the root folder.

More restrictive options include using a library function like werkzeug.utils.secure_filename to eliminate any special characters from the file path, or restricting the path to an allow list of safe paths. These options are safe, but can only be used in particular circumstances.

Example

In the first example, a file name is read from an HTTP request and then used to access a file. However, a malicious user could enter a file name that is an absolute path, such as "/etc/passwd".

In the second example, it appears that the user is restricted to opening a file within the "user" home directory. However, a malicious user could enter a file name containing special characters. For example, the string "../../../etc/passwd" will result in the code reading the file located at "/server/static/images/../../../etc/passwd", which is the system's password file. This file would then be sent back to the user, giving them access to all the system's passwords. Note that a user could also use an absolute path here, since the result of os.path.join("/server/static/images/", "/etc/passwd") is "/etc/passwd".

In the third example, the path used to access the file system is normalized before being checked against a known prefix. This ensures that regardless of the user input, the resulting path is safe.

import os.path
from flask import Flask, request, abort

app = Flask(__name__)

@app.route("/user_picture1")
def user_picture1():
    filename = request.args.get('p')
    # BAD: This could read any file on the file system
    data = open(filename, 'rb').read()
    return data

@app.route("/user_picture2")
def user_picture2():
    base_path = '/server/static/images'
    filename = request.args.get('p')
    # BAD: This could still read any file on the file system
    data = open(os.path.join(base_path, filename), 'rb').read()
    return data

@app.route("/user_picture3")
def user_picture3():
    base_path = '/server/static/images'
    filename = request.args.get('p')
    #GOOD -- Verify with normalised version of path
    fullpath = os.path.normpath(os.path.join(base_path, filename))
    if not fullpath.startswith(base_path):
        raise Exception("not allowed")
    data = open(fullpath, 'rb').read()
    return data

References

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Updates the Python Path Injection query help to use the newer “safe root folder + normalization” guidance that’s been used by autofix, aiming to provide clearer remediation advice in the CWE-022 documentation.

Changes:

  • Replaces the previous filename-validation bullet list with root-folder containment guidance.
  • Adds discussion of different validation strategies (root containment vs stricter sanitization/allowlisting).
  • Keeps references to werkzeug.utils.secure_filename as a restrictive option.
Comments suppressed due to low confidence (1)

python/ql/src/Security/CWE-022/PathInjection.qhelp:28

  • The recommendation to verify containment by checking that the normalized path "starts with" the root folder is not reliable (e.g., a path like /safe/root2/file starts with /safe/root but is outside the intended directory; on Windows, case/separator differences also make string prefix checks error-prone). Prefer a path-aware containment check such as comparing os.path.commonpath([root, candidate]) == root, or using pathlib.Path(...).resolve() and is_relative_to() (and ensure root is normalized the same way).
If you want to allow paths spanning multiple folders, a common strategy is to make sure that the constructed
file path is contained within a safe root folder. First, normalize the path using <code>os.path.normpath</code> or
<code>os.path.realpath</code> to remove any ".." segments. Then check that the normalized path starts with the
root folder. Note that the normalization step is important, since otherwise even a path that starts with the root
folder could be used to access files outside the root folder.

Comment on lines +25 to +28
file path is contained within a safe root folder. First, normalize the path using <code>os.path.normpath</code> or
<code>os.path.realpath</code> to remove any ".." segments. Then check that the normalized path starts with the
root folder. Note that the normalization step is important, since otherwise even a path that starts with the root
folder could be used to access files outside the root folder.
Copy link

Copilot AI Feb 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Saying that os.path.normpath or os.path.realpath can be used to "remove any '..' segments" is misleading from a security perspective: normpath is purely lexical (doesn't make the path absolute or resolve symlinks) and can give a false sense of safety against traversal via symlinks within the root. Consider recommending resolving both root and the constructed path (for example with realpath/Path.resolve) before performing the containment check, and clarifying the symlink caveat if you keep normpath as an option.

This issue also appears on line 24 of the same file.

Suggested change
file path is contained within a safe root folder. First, normalize the path using <code>os.path.normpath</code> or
<code>os.path.realpath</code> to remove any ".." segments. Then check that the normalized path starts with the
root folder. Note that the normalization step is important, since otherwise even a path that starts with the root
folder could be used to access files outside the root folder.
file path is contained within a safe root folder. To do this safely, resolve both the root folder and the
constructed path to absolute, symlink-resolved paths (for example, using <code>os.path.realpath</code> or
<code>pathlib.Path.resolve</code>), and then check that the resolved path is contained within the resolved root.
Note that <code>os.path.normpath</code> only performs lexical normalization (for example, collapsing ".."
segments) and does not resolve symlinks, so it is not sufficient on its own to prevent traversal via symlinks
inside the root, although it can be used as a preliminary cleanup step.

Copilot uses AI. Check for mistakes.
Comment on lines +32 to +34
More restrictive options include using a library function like <code>werkzeug.utils.secure_filename</code> to eliminate
any special characters from the file path, or restricting the path to an allow list of safe paths. These options are
safe, but can only be used in particular circumstances.
Copy link

Copilot AI Feb 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Terminology consistency: elsewhere in the Python qhelp docs the term “allowlist” is used (for example python/ql/src/Security/CWE-078/CommandInjection.qhelp and python/ql/src/Security/CWE-020/IncompleteUrlSubstringSanitization.qhelp). Consider using “allowlist” here as well instead of “allow list”.

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant