[LLM Classification] Skip duplicate LLM calls on no-op feed updates #470

Open
tcx4c70 wants to merge 1 commit into FreshRSS:main from tcx4c70:llm-duplicate-update

tcx4c70 commented Jun 1, 2026

Some RSS sources re-publish existing articles with no semantic change (reformatted author, tweaked date, new enclosure attribute, etc.). FreshRSS detects these as updated entries via hash mismatch and calls EntryBeforeInsert again, which previously triggered another LLM API call for every such pseudo-update.

The extension now stores a SHA-1 of the prompt it sent (plus the exact list of tags it assigned) on each classified entry, under the 'llm_classification' attribute namespace. On a feed update of an already-classified entry:

  • if the new prompt hashes to the same value, the prior tags are restored and no LLM call is made;
  • otherwise, behaviour follows the new 'Re-classify when content changes' toggle (default on): call the LLM and refresh, or keep the prior tags untouched.
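
The decision logic above could be sketched roughly as follows. This is a minimal Python illustration, not the extension's actual PHP code: `Entry`, `classify`, `add_tags`, the `previous` state dictionary, and the `call_llm` callback are hypothetical names chosen for the example. Only the SHA-1 prompt hash, the stored tag list, the 'llm_classification' namespace, and the toggle come from the description. The sketch also assumes that the stored hash is left unchanged when re-classification is disabled, which the description does not state.

```python
import hashlib
from dataclasses import dataclass, field

NAMESPACE = "llm_classification"  # attribute namespace named in the description


@dataclass
class Entry:
    """Hypothetical stand-in for a feed entry; not the real FreshRSS class."""
    tags: list = field(default_factory=list)
    attributes: dict = field(default_factory=dict)


def prompt_hash(prompt: str) -> str:
    """SHA-1 hex digest of the exact prompt text that would be sent."""
    return hashlib.sha1(prompt.encode("utf-8")).hexdigest()


def add_tags(entry: Entry, tags: list) -> None:
    """Append tags that are not already present, preserving order."""
    for tag in tags:
        if tag not in entry.tags:
            entry.tags.append(tag)


def classify(entry, prompt, call_llm, previous=None, reclassify_on_change=True):
    """Return 'skipped', 'kept', or 'classified' for an incoming entry.

    `previous` is the state stored on the already-classified entry
    ({"hash": ..., "tags": [...]}), or None for a brand-new entry.
    """
    new_hash = prompt_hash(prompt)
    if previous is not None:
        if previous["hash"] == new_hash:
            # Same prompt as last time: restore prior tags, no LLM call.
            add_tags(entry, previous["tags"])
            entry.attributes[NAMESPACE] = dict(previous)
            return "skipped"
        if not reclassify_on_change:
            # Toggle off: keep prior tags untouched.
            # Assumption: the stored hash is carried over unchanged.
            add_tags(entry, previous["tags"])
            entry.attributes[NAMESPACE] = dict(previous)
            return "kept"
    # New entry, or changed prompt with the toggle on: call the LLM.
    tags = call_llm(prompt)
    add_tags(entry, tags)
    entry.attributes[NAMESPACE] = {"hash": new_hash, "tags": list(tags)}
    return "classified"
```

Hashing the prompt rather than the raw entry means that changes to fields the prompt does not include (such as a tweaked date or a new enclosure attribute) produce the same digest and therefore no new request.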

When re-classifying, the prior tag list is used to remove only those exact tags (instead of the previous prefix-based heuristic), so manual tags sharing the prefix are preserved.
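
The difference between the two removal strategies can be illustrated with a small Python sketch. The function names, the `llm/` prefix, and the sample tags are hypothetical and chosen only for the example. The extension itself is not written in Python.

```python
def remove_by_prefix(tags, prefix):
    """Old heuristic: drop every tag that starts with the prefix."""
    return [t for t in tags if not t.startswith(prefix)]


def remove_exact(tags, prior_llm_tags):
    """New behaviour: drop only the tags recorded as assigned by the LLM."""
    prior = set(prior_llm_tags)
    return [t for t in tags if t not in prior]


# "llm/to-read" stands for a tag the user added by hand that happens
# to share the prefix; "llm/tech" was assigned by the LLM earlier.
tags = ["llm/tech", "llm/to-read", "news"]
prior_llm_tags = ["llm/tech"]

after_prefix = remove_by_prefix(tags, "llm/")      # loses the manual tag
after_exact = remove_exact(tags, prior_llm_tags)   # keeps the manual tag
```

With the prefix heuristic the manual `llm/to-read` tag is removed along with the LLM's own tag. With the stored tag list only `llm/tech` is removed.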


tcx4c70 commented Jun 1, 2026

With this fix, LLM API requests dropped from 4,000-8,000 per day to about 800 per day, which matches the number of new articles across my RSS sources.

Screenshot 2026-06-01 22 19 25