Sanitize bare angle brackets and quotes in markup stripping by assisted-by-ai · Pull Request #34 · Kicksecure/helper-scripts

assisted-by-ai · 2026-04-02T09:34:21Z

Summary

Enhanced the markup stripping and sanitization logic to handle additional security edge cases by sanitizing bare angle brackets (< and >) that are not part of valid HTML tags, and quote characters in malicious input patterns.

Key Changes

Expanded character sanitization in strip_markup_lib.py:
- Modified the malicious input detection branch to sanitize quote characters (" and ') in addition to <, >, and & to prevent attribute injection attacks
- Added a second sanitization pass after both strip attempts to catch bare < and > characters that survived the HTML parser (e.g., comparison operators like 2 < 3)
- Restructured the conditional logic for clarity and to apply both sanitization strategies
Added comprehensive test coverage:
- New _test_bare_angle_bracket_strings() test helper to verify sanitization of bare angle brackets in mathematical/comparison expressions
- New _test_malicious_markup_quote_strings() test helper to verify quote character sanitization in malicious input patterns
- Added corresponding test methods in both TestStripMarkup and TestSanitizeString test classes
- Updated existing test expectations in sanitize_string tests to reflect the new quote sanitization behavior

Implementation Details

The fix addresses two distinct security concerns:

Bare angle brackets: Characters like < and > in expressions such as 2 < 3 are not recognized as HTML tags by Python's HTMLParser, but could be misinterpreted if the output is later placed in an HTML context
Quote injection: Malicious inputs with quotes (e.g., <<b>b "onmouseover="alert(1)<</b>/b>) are now fully sanitized to prevent attribute injection attacks

Both sanitization strategies work together: the malicious input branch handles quotes, while the final pass handles any remaining bare angle brackets.

https://claude.ai/code/session_01Wjn2KfitiA5iTADcLLfdbR

…utput strip_markup had two security issues: 1. The malicious-input sanitization branch only replaced <, >, & with underscores but not " and ', which could allow HTML attribute injection if output is placed in an attribute context. 2. Bare < and > characters that HTMLParser doesn't treat as tags (e.g., "2 < 3") passed through both strip passes unchanged and were returned unsanitized, potentially dangerous in HTML contexts. Now quotes are sanitized in the malicious branch, and remaining < / > are always sanitized in the final output. https://claude.ai/code/session_01Wjn2KfitiA5iTADcLLfdbR

ArrayBolt3 · 2026-04-08T02:52:33Z

Rejected, these changes are out-of-scope. strip-markup is intended to ensure that a string won't be interpreted as HTML markup in isolation. It isn't intended to prevent multiple adjacent strings from forming something weird, and it isn't intended to prepare text for use in an HTML attribute.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Sanitize bare angle brackets and quotes in markup stripping#34

Sanitize bare angle brackets and quotes in markup stripping#34
assisted-by-ai wants to merge 1 commit intoKicksecure:masterfrom
assisted-by-ai:claude/audit-strip-markup-security-I6mmm

assisted-by-ai commented Apr 2, 2026

Uh oh!

ArrayBolt3 commented Apr 8, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

assisted-by-ai commented Apr 2, 2026

Summary

Key Changes

Implementation Details

Uh oh!

ArrayBolt3 commented Apr 8, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants