UN-3159 [MISC] Add improved logging for retrieval operations #1747

chandrasekharan-zipstack · 2026-01-19T08:58:27Z

What

Improved logging during retrieval operations for better observability
Added cache stats logging for variable extraction to monitor hit rates
Optimized JSON repair to reduce unnecessary parsing

Why

Better observability into retrieval operations helps track and debug extraction performance
Cache stats logging provides insights into whether the lru_cache for variable extraction is beneficial
JSON repair optimization reduces unnecessary double parsing for single JSON objects

How

Enhanced logging in retrieval.py and simple.py to track retrieval operation flow
Integrated periodic cache stats logging (every 50 calls) in variable_replacement.py to report hit rate, cache size, and prompt size
Added heuristic in json_repair_helper.py to skip unnecessary double parsing when working with single JSON objects

Can this PR break any existing features? If yes, please list possible items. If no, please explain why.

No. These are additive changes that only add logging and a minor optimization. No API changes or data model modifications.

Database Migrations

Env Config

Relevant Docs

Related Issues or PRs

UN-3159

Dependencies Versions

Notes on Testing

These changes should be tested by:

Running retrieval operations and verifying improved logs are visible
Monitoring cache stats output from variable extraction
Verifying JSON repair optimization doesn't affect output correctness

Screenshots

- Add cache statistics logging to variable_replacement lru_cache (logs every 50 calls with hit rate, cache size)
- Move retry logging in simple retriever to only log when actually retrying (not for initial attempts)
- Optimize json_repair_helper with heuristic to skip double parsing when unnecessary
- Add detailed retrieval metrics logging to services/retrieval.py for better observability

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
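The retry-logging change in the simple retriever could be sketched as follows. This is an illustration under assumptions, not the actual SimpleRetriever code: `retrieve_with_retry`, its parameters, and the injectable `sleep` callable are hypothetical names. Only the behaviour (log just before a retry, then wait 2 seconds) is taken from the PR.

```python
import logging
import time
from typing import Callable

logger = logging.getLogger(__name__)


def retrieve_with_retry(
    fetch: Callable[[], set[str]],
    max_attempts: int = 2,
    delay_seconds: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> set[str]:
    """Call fetch() until it returns context, logging only when a retry happens."""
    context: set[str] = set()
    for attempt in range(1, max_attempts + 1):
        context = fetch()
        if context:
            return context
        # Nothing is logged for a successful first attempt; the message is
        # emitted only when another attempt will actually follow.
        if attempt < max_attempts:
            logger.info(
                "No context retrieved (attempt %d/%d); retrying in %.0f seconds",
                attempt,
                max_attempts,
                delay_seconds,
            )
            sleep(delay_seconds)
    return context
```

Passing `sleep` as a parameter is a design choice for this sketch only; it lets a test verify the retry path without waiting.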

coderabbitai · 2026-01-19T08:58:40Z

Summary by CodeRabbit

Chores
- Enhanced logging and instrumentation for retrieval operations with improved timing metrics and contextual information.
- Implemented caching for variable extraction to improve performance and added periodic cache statistics reporting.
- Improved retry messaging when retrieval context is unavailable due to system lag.

Walkthrough

These changes enhance logging and instrumentation across retrieval and variable extraction modules. SimpleRetriever logging is adjusted for retry reporting, variable extraction introduces a caching layer with statistics tracking, and retrieval operations now capture and log timing metrics alongside contextual information.

Changes

Cohort / File(s)	Summary
Logging Adjustments `prompt-service/src/unstract/prompt_service/core/retrievers/simple.py`	Removed initial per-request log; added retry logging when no context is retrieved (before 2-second sleep).
Caching Implementation `prompt-service/src/unstract/prompt_service/helpers/variable_replacement.py`	Introduced `_extract_variables_cached()` helper with LRU caching; `extract_variables_from_prompt()` now uses cached extraction; periodic cache statistics logging every 50 calls (hits, misses, rate, size).
Retrieval Instrumentation `prompt-service/src/unstract/prompt_service/services/retrieval.py`	Added timing and contextual logging across retrieval paths: `perform_retrieval` derives prompt_name and logs with vector_db_id; `run_retrieval` computes and logs elapsed time with retrieval metrics; `retrieve_complete_context` tracks completion time and logs with character length.
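The timing instrumentation described for `run_retrieval` might follow a pattern like the one below. It is a minimal sketch, assuming a hypothetical wrapper `run_retrieval_timed` and a `retrieve` callable that returns text chunks; the real signatures in services/retrieval.py are not shown in this PR conversation.

```python
import logging
import time
from typing import Callable

logger = logging.getLogger(__name__)


def run_retrieval_timed(
    retrieve: Callable[[], list[str]],
    prompt_name: str,
    vector_db_id: str,
) -> tuple[list[str], float]:
    """Run a retrieval callable and log elapsed time plus basic size metrics."""
    start = time.perf_counter()  # monotonic clock, suited to measuring durations
    chunks = retrieve()
    elapsed = time.perf_counter() - start
    total_chars = sum(len(chunk) for chunk in chunks)
    logger.info(
        "Retrieval for prompt '%s' (vector_db_id=%s) returned %d chunks, "
        "%d chars in %.3fs",
        prompt_name,
        vector_db_id,
        len(chunks),
        total_chars,
        elapsed,
    )
    return chunks, elapsed
```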

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 55.56% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (2 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title clearly and concisely describes the main objective of the pull request: adding improved logging for retrieval operations, which aligns with the actual changes across multiple files.
Description check	✅ Passed	The description covers all critical template sections with appropriate detail: What/Why/How are comprehensive, breaking changes are explicitly addressed, testing notes are provided, and related issues are documented.

prompt-service/src/unstract/prompt_service/utils/json_repair_helper.py

Remove the heuristic optimization - keeping only logging improvements in this PR.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

harini-venkataraman

LGTM

sonarqubecloud · 2026-01-21T06:08:21Z

Quality Gate passed

Issues
0 New issues
0 Accepted issues

Measures
0 Security Hotspots
0.0% Coverage on New Code
0.0% Duplication on New Code

See analysis details on SonarQube Cloud

github-actions · 2026-01-21T06:08:26Z

Test Results

Summary

✅ Runner Tests: 11 passed, 0 failed (11 total)
✅ SDK1 Tests: 66 passed, 0 failed (66 total)

Runner Tests - Full Report

filepath	function	passed	SUBTOTAL
runner/src/unstract/runner/clients/test_docker.py	test_logs	1	1
runner/src/unstract/runner/clients/test_docker.py	test_cleanup	1	1
runner/src/unstract/runner/clients/test_docker.py	test_cleanup_skip	1	1
runner/src/unstract/runner/clients/test_docker.py	test_client_init	1	1
runner/src/unstract/runner/clients/test_docker.py	test_get_image_exists	1	1
runner/src/unstract/runner/clients/test_docker.py	test_get_image	1	1
runner/src/unstract/runner/clients/test_docker.py	test_get_container_run_config	1	1
runner/src/unstract/runner/clients/test_docker.py	test_get_container_run_config_without_mount	1	1
runner/src/unstract/runner/clients/test_docker.py	test_run_container	1	1
runner/src/unstract/runner/clients/test_docker.py	test_get_image_for_sidecar	1	1
runner/src/unstract/runner/clients/test_docker.py	test_sidecar_container	1	1
TOTAL		11	11

SDK1 Tests - Full Report

chandrasekharan-zipstack self-assigned this Jan 19, 2026

chandrasekharan-zipstack requested review from Deepak-Kesavan and harini-venkataraman January 19, 2026 09:03

chandrasekharan-zipstack changed the title ~~UN-3159 [FEAT] Add improved logging for retrieval operations~~ UN-3159 [MISC] Add improved logging for retrieval operations Jan 19, 2026

Deepak-Kesavan reviewed Jan 19, 2026

prompt-service/src/unstract/prompt_service/utils/json_repair_helper.py (outdated, resolved)

Revert json_repair_helper.py changes

75fe8f9

Remove the heuristic optimization - keeping only logging improvements in this PR.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Deepak-Kesavan approved these changes Jan 21, 2026

harini-venkataraman approved these changes Jan 21, 2026

pk-zipstack and others added 2 commits January 21, 2026 11:22

Merge branch 'main' into feature/UN-3159-improve-retrieval-logging

4018b2e

Merge branch 'main' into feature/UN-3159-improve-retrieval-logging

18661ba

chandrasekharan-zipstack merged commit af6c81e into main Jan 21, 2026
7 checks passed

chandrasekharan-zipstack deleted the feature/UN-3159-improve-retrieval-logging branch January 21, 2026 06:11
