From 2f14d649438bd56cd0ec324fc4bb2548eedd76e9 Mon Sep 17 00:00:00 2001
From: Connor Shea <2977353+connorshea@users.noreply.github.com>
Date: Sun, 7 Jun 2026 13:05:43 -0600
Subject: [PATCH 1/2] fix: Improve the performance of the CONTENT_PATTERN regex in RSS parser to avoid backtracking issues.
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

This improves the performance of the XML stylesheet parsing a good amount.

`RSS::Parser.parse` on ``, Ruby 3.4.9:

| n (chars) | before (HEAD) | after (fix) | speedup  |
|-----------|--------------:|------------:|---------:|
| 10,000    |       0.131 s |    0.0004 s |    ~330× |
| 20,000    |       0.508 s |    0.0003 s |  ~1,600× |
| 40,000    |       1.966 s |    0.0004 s |  ~5,000× |
| 80,000    |       7.847 s |    0.0007 s | ~11,000× |
---
 lib/rss/parser.rb | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/rss/parser.rb b/lib/rss/parser.rb
index e1bcfc5..3c9eeb1 100644
--- a/lib/rss/parser.rb
+++ b/lib/rss/parser.rb
@@ -391,7 +391,7 @@ def _ns(ns, prefix)
       ns.fetch(prefix, "")
     end
 
-    CONTENT_PATTERN = /\s*([^=]+)=(["'])([^\2]+?)\2/
+    CONTENT_PATTERN = /\G\s*([^=]++)=(["'])([^\2]+?)\2/
     # Extract the first name="value" pair from content.
     # Works with single quotes according to the constant
     # CONTENT_PATTERN. Return a Hash.

From a2ee62a942c61900305264ddd74a337226ce3587 Mon Sep 17 00:00:00 2001
From: Connor Shea <2977353+connorshea@users.noreply.github.com>
Date: Tue, 9 Jun 2026 21:13:53 -0400
Subject: [PATCH 2/2] Update lib/rss/parser.rb

---
 lib/rss/parser.rb | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/rss/parser.rb b/lib/rss/parser.rb
index 3c9eeb1..1be4caa 100644
--- a/lib/rss/parser.rb
+++ b/lib/rss/parser.rb
@@ -391,7 +391,7 @@ def _ns(ns, prefix)
       ns.fetch(prefix, "")
     end
 
-    CONTENT_PATTERN = /\G\s*([^=]++)=(["'])([^\2]+?)\2/
+    CONTENT_PATTERN = /\G\s*([^=]+)=(["'])([^\2]+?)\2/
     # Extract the first name="value" pair from content.
     # Works with single quotes according to the constant
     # CONTENT_PATTERN. Return a Hash.
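
Reviewer note: a minimal sketch of how the pattern behaves before and after this series, assuming the constant is consumed through `String#scan`. The helper `extract_pairs` and both sample strings are hypothetical and only for illustration; they are not the method in `lib/rss/parser.rb` or the input used for the timings above. The sketch shows that well-formed content yields the same pairs either way, and that `\G` stops the engine from retrying a failed match at later offsets, which is presumably what removes the repeated backtracking.

```ruby
# Patterns copied from the diff: the original, and the final state after
# patch 2/2 (which keeps \G and drops the possessive quantifier).
OLD_PATTERN = /\s*([^=]+)=(["'])([^\2]+?)\2/
NEW_PATTERN = /\G\s*([^=]+)=(["'])([^\2]+?)\2/

# Hypothetical helper for illustration only: collect name="value" pairs
# into a Hash by scanning the content with the given pattern.
def extract_pairs(content, pattern)
  params = {}
  content.scan(pattern) { |name, _quote, value| params[name] = value }
  params
end

# Well-formed content: both patterns find the same two pairs.
well_formed = %q(href="style.xsl" type='text/xsl')
old_ok = extract_pairs(well_formed, OLD_PATTERN)
new_ok = extract_pairs(well_formed, NEW_PATTERN)

# Malformed content with a stray "=" between pairs: the old pattern skips
# ahead and retries at the next offset, while \G requires each match to
# start exactly where the previous one ended, so scanning stops.
malformed = %q(a="1"=b="2")
old_bad = extract_pairs(malformed, OLD_PATTERN)
new_bad = extract_pairs(malformed, NEW_PATTERN)

p old_ok == new_ok   # same pairs on well-formed input
p old_bad.keys       # old pattern recovers both names
p new_bad.keys       # anchored pattern stops after the first pair
```

The second case is a behavior change worth keeping in mind when reviewing: with `\G`, pairs that follow unparseable content are no longer picked up.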