Skip to content

fix: clear deleted knowledge bases from sessions#8259

Open
he-yufeng wants to merge 1 commit into
AstrBotDevs:masterfrom
he-yufeng:fix/kb-delete-clears-session-config
Open

fix: clear deleted knowledge bases from sessions#8259
he-yufeng wants to merge 1 commit into
AstrBotDevs:masterfrom
he-yufeng:fix/kb-delete-clears-session-config

Conversation

@he-yufeng
Copy link
Copy Markdown
Contributor

@he-yufeng he-yufeng commented May 20, 2026

Summary

Fixes #8251.

When a knowledge base is deleted, remove that KB id from every session-level kb_config.kb_ids entry. This avoids stale per-session bindings continuing to point at a deleted KB when unique_session is enabled.

This intentionally keeps the rest of each session KB config, such as top_k, unchanged. If the deleted KB was the only binding, kb_ids becomes an empty list so retrieval stops cleanly for that session.

To verify

PYTHONUTF8=1 TMP=.tmp/pytest TEMP=.tmp/pytest python -m pytest tests/test_kb_import.py::test_remove_deleted_kb_from_session_configs -q -p no:cacheprovider
PYTHONUTF8=1 TMP=.tmp/pytest TEMP=.tmp/pytest python -m pytest tests/test_kb_import.py -q -p no:cacheprovider
python -m ruff check astrbot/dashboard/routes/knowledge_base.py tests/test_kb_import.py
python -m py_compile astrbot/dashboard/routes/knowledge_base.py tests/test_kb_import.py
git diff --check

Summary by Sourcery

Ensure deleting a knowledge base also removes its ID from all stored session knowledge base configurations to prevent stale bindings.

Bug Fixes:

  • Clear deleted knowledge base IDs from per-session kb_config entries while preserving other settings like top_k.

Tests:

  • Add an async test covering removal of a deleted knowledge base from multiple session configurations, including edge cases with mixed and malformed entries.

@dosubot dosubot Bot added the size:M This PR changes 30-99 lines, ignoring generated files. label May 20, 2026
Copy link
Copy Markdown
Contributor

@sourcery-ai sourcery-ai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey - I've left some high level feedback:

  • In _remove_kb_from_session_configs, consider defensively handling the case where sp.session_get(None, "kb_config") returns None or a non-iterable to avoid a TypeError when iterating over prefs.
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- In `_remove_kb_from_session_configs`, consider defensively handling the case where `sp.session_get(None, "kb_config")` returns `None` or a non-iterable to avoid a `TypeError` when iterating over `prefs`.

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request implements a cleanup mechanism to remove deleted knowledge base IDs from session configurations. It introduces a new method _remove_kb_from_session_configs in the KnowledgeBaseRoute and updates the delete_kb endpoint to trigger this cleanup, accompanied by new unit tests. Feedback highlights that this business logic should ideally reside in the KnowledgeBaseManager for better architectural consistency and points out a potential race condition in the update loop that could lead to data loss during concurrent updates.

return self.core_lifecycle.kb_manager

@staticmethod
async def _remove_kb_from_session_configs(kb_id: str) -> int:
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The logic for cleaning up session configurations when a knowledge base is deleted is business logic that should ideally reside in the KnowledgeBaseManager (in astrbot/core/knowledge_base/kb_mgr.py) rather than the API route. Moving it to the core manager ensures that the cleanup happens regardless of how the deletion is triggered (e.g., via a CLI, a scheduled task, or another internal component), maintaining data consistency across the system.

Comment on lines +78 to +91
value = pref.value.get("val") if isinstance(pref.value, dict) else None
if not isinstance(value, dict):
continue

kb_ids = value.get("kb_ids")
if not isinstance(kb_ids, list) or kb_id not in kb_ids:
continue

new_value = {
**value,
"kb_ids": [item for item in kb_ids if item != kb_id],
}
await sp.session_put(pref.scope_id, "kb_config", new_value)
updated += 1
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

There is a potential race condition here. The code iterates over a pre-fetched list of session configurations (prefs). Because the loop contains await calls (specifically sp.session_put), the event loop can process other tasks between iterations. If a session's configuration is updated by another coroutine after the initial session_get but before the loop reaches that specific session, the concurrent update will be overwritten and lost.

To mitigate this, you should fetch the latest value for each session configuration inside the loop before applying the changes. Note that if you apply this suggestion, you will also need to update the unit test mock for session_get in tests/test_kb_import.py to handle non-null umo values.

Suggested change
value = pref.value.get("val") if isinstance(pref.value, dict) else None
if not isinstance(value, dict):
continue
kb_ids = value.get("kb_ids")
if not isinstance(kb_ids, list) or kb_id not in kb_ids:
continue
new_value = {
**value,
"kb_ids": [item for item in kb_ids if item != kb_id],
}
await sp.session_put(pref.scope_id, "kb_config", new_value)
updated += 1
# Fetch the latest value for this specific session to avoid overwriting concurrent updates
value = await sp.session_get(pref.scope_id, "kb_config")
if not isinstance(value, dict):
continue
kb_ids = value.get("kb_ids")
if not isinstance(kb_ids, list) or kb_id not in kb_ids:
continue
new_value = {
**value,
"kb_ids": [item for item in kb_ids if item != kb_id],
}
await sp.session_put(pref.scope_id, "kb_config", new_value)
updated += 1
References
  1. In a single-threaded asyncio event loop, synchronous functions (code blocks without 'await') are executed atomically and will not be interrupted by other coroutines. Therefore, they are safe from race conditions when modifying shared state within that block.

@he-yufeng he-yufeng force-pushed the fix/kb-delete-clears-session-config branch from c501840 to 5ce1775 Compare May 20, 2026 14:24
Copy link
Copy Markdown
Contributor Author

Updated to address the review feedback:

  • defensively skip missing/non-list session preference scans
  • re-read each session's current kb_config before writing, so cleanup does not overwrite concurrent changes from a stale pref snapshot
  • extended the unit test mock to cover per-session reads and added a missing-scan guard test

Validation:

  • ./.venv/Scripts/python.exe -m pytest tests/test_kb_import.py -q -p no:cacheprovider (5 passed)
  • ./.venv/Scripts/python.exe -m py_compile astrbot/dashboard/routes/knowledge_base.py tests/test_kb_import.py
  • ./.venv/Scripts/python.exe -m ruff check astrbot/dashboard/routes/knowledge_base.py tests/test_kb_import.py
  • git diff --check

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

size:M This PR changes 30-99 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug]

1 participant