
Truncate recommend() result to caller's N when filter_items is set (#736) #757

Open
Cyberfilo wants to merge 1 commit into benfred:main from Cyberfilo:fix/736-filter-items-result-count

@Cyberfilo

Summary

`ItemItemRecommender.recommend(N=K, filter_items=F)` was returning up to `K + len(F)` items instead of `K`. Fixes #736.

Root cause

In `implicit/nearest_neighbours.py::ItemItemRecommender.recommend`, the post-mask slice reused an inflated `N` that was meant only for the over-fetch step:

```python
if filter_items is not None:
    N += len(filter_items)
...
if filter_items is not None:
    mask = np.isin(ids, filter_items, invert=True)
    ids, scores = ids[mask][:N], scores[mask][:N]  # uses inflated N
```

Fix

Store the caller-requested `N` before inflating, then truncate to that value after masking:

```python
requested_n = N
if filter_items is not None:
    N += len(filter_items)
...
if filter_items is not None:
    mask = np.isin(ids, filter_items, invert=True)
    ids, scores = ids[mask][:requested_n], scores[mask][:requested_n]
```

The over-fetch is still required to ensure enough rows survive the mask in the worst case (when every filtered item happens to be in the top-N scoring window). No behavior change for callers that don't use `filter_items`.
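The interplay between the over-fetch and the final slice can be sketched without the library. This is a minimal illustration, not the code in `implicit`: `top_n_excluding` and its `truncate_to_requested` flag are hypothetical names, and `heapq.nlargest` stands in for the library's top-N selection.

```python
import heapq


def top_n_excluding(scores, n, filter_items=None, truncate_to_requested=True):
    """Return up to n (item_id, score) pairs, best first, skipping filter_items.

    Hypothetical helper for illustration only; not the library's implementation.
    """
    requested_n = n
    excluded = set(filter_items or ())
    # Over-fetch: in the worst case every excluded item sits in the top window.
    fetch_n = n + len(excluded)
    window = heapq.nlargest(fetch_n, enumerate(scores), key=lambda pair: pair[1])
    kept = [(item, score) for item, score in window if item not in excluded]
    # The fix: slice with the caller's n, not the inflated fetch size.
    limit = requested_n if truncate_to_requested else fetch_n
    return kept[:limit]


scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]

# No excluded item lands in the top window, so slicing with the inflated size
# returns n + len(filter_items) rows.
buggy = top_n_excluding(scores, 3, filter_items=[7, 8, 9], truncate_to_requested=False)
fixed = top_n_excluding(scores, 3, filter_items=[7, 8, 9])

# Worst case: the excluded items are the top scorers, and the over-fetch is
# what leaves n rows after filtering.
worst = top_n_excluding(scores, 3, filter_items=[0, 1, 2])
```

With these inputs, `buggy` holds six rows while `fixed` and `worst` hold three each, which matches the reasoning above: the over-fetch guards the worst case, and only the final slice needed to change.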

Test plan

Manual reproducer:

```python
import numpy as np
from scipy.sparse import csr_matrix

from implicit.nearest_neighbours import CosineRecommender

model = CosineRecommender(K=20)
model.fit(csr_matrix(np.random.rand(10, 10)))
user_items = csr_matrix(np.ones((1, 10)))
ids, _ = model.recommend(
    0, user_items, N=3, filter_items=[1, 2, 3], filter_already_liked_items=False
)
assert len(ids) == 3  # before the fix: up to 6, i.e. N + len(filter_items)
```

The diff is 5 lines: one variable assignment, one comment block, and a one-identifier change (`N` to `requested_n`) in each of the two slices.
