UN-3157 [FIX] Fix resource leaks in platform-service #1748

johnyrahul · 2026-01-19T11:16:02Z

Summary

Fix database cursor leaks by adding try/finally blocks to ensure cursors are always closed, even when exceptions occur
Add shared Redis connection pool to prevent per-request connection creation and potential connection exhaustion
Add comprehensive memory leak simulation tests to verify the fixes

Changes

Resource Leak Fixes

execute_query() - Wrap cursor operations in try/finally
validate_bearer_token() - Ensure cursor is closed on all code paths
get_adapter_instance_from_db() - Fix cursor leak when adapter not found (APIError)
get_prompt_instance_from_db() - Fix cursor leak when prompt not found
get_llm_profile_instance_from_db() - Fix cursor leak when LLM profile not found
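
The leak pattern behind these fixes can be sketched in isolation. The example below is a minimal illustration, not the service's code: `FakeCursor`, `query_leaky`, `query_safe`, and `count_open_after` are hypothetical names, and the "query" always fails so that the exception path is exercised on every call.

```python
class FakeCursor:
    """Hypothetical stand-in for a DB cursor that records open/closed state."""

    open_count = 0  # number of cursors currently open

    def __init__(self):
        FakeCursor.open_count += 1
        self.closed = False

    def execute(self, query):
        # Simulate a failing lookup, e.g. an adapter that does not exist.
        raise RuntimeError("query failed")

    def close(self):
        if not self.closed:
            self.closed = True
            FakeCursor.open_count -= 1


def query_leaky():
    cursor = FakeCursor()
    cursor.execute("SELECT 1")  # raises, so the close() below is never reached
    cursor.close()


def query_safe():
    cursor = FakeCursor()
    try:
        cursor.execute("SELECT 1")
    finally:
        cursor.close()  # runs on success and on exception


def count_open_after(func, requests=100):
    """Call func `requests` times, swallow its errors, return open cursors."""
    FakeCursor.open_count = 0
    for _ in range(requests):
        try:
            func()
        except RuntimeError:
            pass
    return FakeCursor.open_count


leaked_old = count_open_after(query_leaky)
leaked_new = count_open_after(query_safe)
print(leaked_old, leaked_new)
```

With every request failing, the version without `finally` leaves one open cursor per call, while the `try/finally` version leaves none.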

Redis Connection Pool

Add get_redis_pool() and get_redis_client() in extensions.py
Refactor cache endpoint to use shared connection pool instead of creating new connections per request
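
To illustrate why a shared pool reduces connection creation, here is a small simulation that uses only the standard library. It does not use the Redis client at all: `FakeConnection`, `FakePool`, `get_pool`, and both request handlers are hypothetical stand-ins, and the limit of 10 connections is only an illustrative value.

```python
import threading


class FakeConnection:
    """Hypothetical stand-in for a network connection; counts creations."""

    created = 0

    def __init__(self):
        FakeConnection.created += 1


class FakePool:
    """Tiny pool that reuses idle connections before creating new ones."""

    def __init__(self, max_connections=10):
        self.max_connections = max_connections
        self._idle = []
        self._in_use = 0
        self._lock = threading.Lock()

    def acquire(self):
        with self._lock:
            if self._idle:
                conn = self._idle.pop()
            elif self._in_use < self.max_connections:
                conn = FakeConnection()
            else:
                raise RuntimeError("pool exhausted")
            self._in_use += 1
            return conn

    def release(self, conn):
        with self._lock:
            self._in_use -= 1
            self._idle.append(conn)


_pool = None
_pool_lock = threading.Lock()


def get_pool():
    """Create the shared pool on first use and return the same one afterwards."""
    global _pool
    if _pool is None:
        with _pool_lock:
            if _pool is None:  # re-check under the lock
                _pool = FakePool(max_connections=10)
    return _pool


def handle_request_per_call():
    FakeConnection()  # old pattern: a brand-new connection for every request


def handle_request_pooled():
    pool = get_pool()
    conn = pool.acquire()
    try:
        pass  # a cache GET/SET/DEL would happen here
    finally:
        pool.release(conn)  # always hand the connection back


FakeConnection.created = 0
for _ in range(100):
    handle_request_per_call()
per_call_created = FakeConnection.created

FakeConnection.created = 0
for _ in range(100):
    handle_request_pooled()
pooled_created = FakeConnection.created
print(per_call_created, pooled_created)
```

In this sequential simulation the per-request pattern creates one connection per call, whereas the pooled pattern keeps reusing the first one.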

Tests Added

test_memory_leak_simulation.py - 9 tests demonstrating OLD vs NEW code behavior
Uses tracemalloc for actual memory growth measurement
Includes load test simulating 1000 requests with 30% failure rate

Test plan

Run memory leak simulation tests: uv run pytest tests/test_memory_leak_simulation.py -v -s
Manual testing of cache endpoint
Manual testing of adapter/prompt/LLM profile endpoints
Monitor PostgreSQL connection count under load
Monitor Redis connection count under load

Test Results

[OLD CODE] Open cursors after 100 failed requests: 100  ❌ LEAKED
[NEW CODE] Open cursors after 100 failed requests: 0    ✅ NO LEAK

[LOAD TEST] 1000 requests with 30% failure rate:
  OLD CODE leaked cursors: 100
  NEW CODE leaked cursors: 0

🤖 Generated with Claude Code

- Add try/finally blocks to ensure database cursors are always closed
- Add shared Redis connection pool to prevent per-request connection creation
- Fix execute_query(), validate_bearer_token() in platform.py
- Fix get_adapter_instance_from_db() in adapter_instance.py
- Fix get_prompt_instance_from_db(), get_llm_profile_instance_from_db() in prompt_studio.py
- Add get_redis_client() and safe_cursor() utilities in extensions.py

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

- Add test_memory_leak_simulation.py demonstrating OLD vs NEW code behavior
- Tests verify cursor/connection cleanup in exception scenarios
- Uses tracemalloc for actual memory growth measurement
- Includes load test simulating 1000 requests with 30% failure rate
- Add tests/README.md with instructions for running tests

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

coderabbitai · 2026-01-19T11:16:14Z

Summary by CodeRabbit

Bug Fixes
- Fixed database cursor and cache connection leaks to improve stability and memory use.
New Features
- Introduced shared Redis connection pooling and safer DB cursor management for more reliable request handling.
Tests
- Added comprehensive tests that simulate memory and connection leak scenarios to prevent regressions.
Documentation
- Added a tests README documenting test setup and resource-management testing patterns.
Chores
- Optimized connection reuse and cleanup for better performance.


Walkthrough

Introduce a shared Redis pool and client, add a safe_cursor context manager for DB cursor lifecycle, update controller and helper modules to use these utilities (replacing per-call Redis and manual cursor handling), and add tests plus README for leak/memory simulations.
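
A context manager of this kind might look roughly like the sketch below. This is not the code from `extensions.py`: an in-memory SQLite database stands in for the service's database, the `safe_cursor(query, params)` call shape is an assumption, and SQLite uses `?` placeholders where the PR's queries use `%s`.

```python
import sqlite3
from contextlib import contextmanager

# Assumption: an in-memory SQLite database stands in for the real DB.
_conn = sqlite3.connect(":memory:")
_conn.execute("CREATE TABLE platform_key (key TEXT, is_active INTEGER)")
_conn.execute("INSERT INTO platform_key VALUES ('abc', 1)")


@contextmanager
def safe_cursor(query, params=()):
    """Execute the query, yield the cursor, and close it on every exit path."""
    cursor = _conn.cursor()
    try:
        cursor.execute(query, params)
        yield cursor
    finally:
        cursor.close()


with safe_cursor(
    "SELECT key, is_active FROM platform_key WHERE key = ?", ("abc",)
) as cur:
    row = cur.fetchone()

# Once the block has exited, the cursor is closed and further use raises.
try:
    cur.fetchone()
    closed = False
except sqlite3.ProgrammingError:
    closed = True

print(row, closed)
```

Because the `finally` clause lives in one place, callers cannot forget to close the cursor, including when the body of the `with` block raises.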

Changes

Cohort / File(s)	Summary
Extensions / Shared resources `platform-service/src/unstract/platform_service/extensions.py`	Added `get_redis_pool()`, `get_redis_client()`, and `safe_cursor()` context manager (lazy Redis pool + shared client and cursor lifecycle utility).
Controller updates `platform-service/src/unstract/platform_service/controller/platform.py`	Replaced direct per-request Redis usage with `get_redis_client()`; switched DB cursor handling to `safe_cursor()` in `execute_query`, `validate_bearer_token`, and cache GET/POST/DELETE flows.
Helper modules `platform-service/src/unstract/platform_service/helper/adapter_instance.py`, `platform-service/src/unstract/platform_service/helper/prompt_studio.py`	Replaced ad-hoc cursor creation/closure with `safe_cursor()` and parameterized SQL (%s) for queries; removed direct `db` cursor usage.
Tests & docs `platform-service/tests/test_memory_leak_simulation.py`, `platform-service/tests/README.md`	Added comprehensive leak/memory simulation tests (cursor and Redis connection scenarios, memory profiling, load simulation) and a README documenting test usage and patterns.

Sequence Diagram(s)

mermaid
sequenceDiagram
participant Client
participant Controller
participant safe_cursor as "safe_cursor (DB)"
participant Redis
Client->>Controller: request (e.g., validate token / cache op)
Controller->>safe_cursor: with safe_cursor(query, params)
safe_cursor->>safe_cursor: execute query, yield cursor
safe_cursor->>Controller: return rows/result
Controller->>Redis: get_redis_client() -> reuse connection
Controller->>Redis: GET/SET/DEL cache
Controller-->>Client: response
note right of safe_cursor: cursor closed automatically upon exit

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~40 minutes

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 58.14% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (2 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title clearly summarizes the main change: fixing resource leaks in the platform-service, which aligns with the changeset's core objective.
Description check	✅ Passed	The PR description covers most required template sections including What, Why, How, breaking changes, testing, and related issues. However, Database Migrations, Env Config, and Dependencies Versions sections are missing.


coderabbitai

Actionable comments posted: 1

🤖 Fix all issues with AI agents

In `@platform-service/src/unstract/platform_service/controller/platform.py`:
- Around line 87-90: The query construction for platform_key_table interpolates
token directly into the SQL (query = f"""... WHERE key = '{token}'"""), creating
an SQL injection risk; change this to use a parameterized query consistent with
execute_query: build the SQL string using a %s placeholder for the key and pass
token as a parameter to execute_query (keep Env.DB_SCHEMA and platform_key_table
usage intact but do not interpolate token), ensuring execute_query receives the
parameter list/tuple so the DB driver performs safe binding.
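
The difference between interpolating the token and binding it as a parameter can be shown with a small self-contained sketch. It assumes an in-memory SQLite table as a stand-in for the platform key table, so the placeholder is `?` rather than the `%s` used in the PR; the table contents and the `malicious` value are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE platform_key (key TEXT, is_active INTEGER)")
conn.execute("INSERT INTO platform_key VALUES ('secret-token', 1)")

# A crafted "token" that closes the quote and appends an always-true clause.
malicious = "x' OR '1'='1"

# Unsafe: the token becomes part of the SQL text itself.
unsafe_query = f"SELECT key FROM platform_key WHERE key = '{malicious}'"
unsafe_rows = conn.execute(unsafe_query).fetchall()

# Safe: the driver binds the token as a value, never as SQL.
safe_rows = conn.execute(
    "SELECT key FROM platform_key WHERE key = ?", (malicious,)
).fetchall()

print(unsafe_rows, safe_rows)
```

The interpolated query matches every row in the table, while the parameterized query matches none because no stored key equals the crafted string.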

🧹 Nitpick comments (4)

platform-service/src/unstract/platform_service/extensions.py (2)
14-33: Thread safety consideration for lazy initialization.

The lazy initialization of _redis_pool using a global variable without synchronization could lead to a race condition when several threads share one process (e.g., a Flask app served by a threaded WSGI worker). While unlikely to cause issues in practice (worst case: the pool is created twice and one copy is discarded), consider using a lock for thread-safe initialization if this service runs with threading.
♻️ Thread-safe lazy initialization (optional)
+import threading
+
 # Redis connection pool (initialized lazily)
 _redis_pool: redis.ConnectionPool | None = None
+_redis_pool_lock = threading.Lock()
 
 
 def get_redis_pool() -> redis.ConnectionPool:
     """Get or create the Redis connection pool.
 
     Returns:
         redis.ConnectionPool: Shared connection pool for Redis operations.
     """
     global _redis_pool
     if _redis_pool is None:
-        # Import here to avoid circular imports
-        from unstract.platform_service.env import Env
-
-        _redis_pool = redis.ConnectionPool(
-            host=Env.REDIS_HOST,
-            port=Env.REDIS_PORT,
-            username=Env.REDIS_USERNAME,
-            password=Env.REDIS_PASSWORD,
-            max_connections=10,
-            decode_responses=False,
-        )
+        with _redis_pool_lock:
+            if _redis_pool is None:  # Double-check locking
+                from unstract.platform_service.env import Env
+
+                _redis_pool = redis.ConnectionPool(
+                    host=Env.REDIS_HOST,
+                    port=Env.REDIS_PORT,
+                    username=Env.REDIS_USERNAME,
+                    password=Env.REDIS_PASSWORD,
+                    max_connections=10,
+                    decode_responses=False,
+                )
     return _redis_pool
45-60: safe_cursor context manager is defined but not used.

The safe_cursor context manager is a useful utility, but the changed files (adapter_instance.py, prompt_studio.py, platform.py) all use inline try/finally blocks instead of this context manager. Consider either using safe_cursor in those files for consistency, or documenting that it's available for future use.
platform-service/tests/test_memory_leak_simulation.py (1)
16-44: Consider annotating mutable class attribute with ClassVar.

The _open_cursors class attribute is used for test tracking and works correctly, but for type correctness it should be annotated with ClassVar as flagged by static analysis.
♻️ Add ClassVar annotation
+from typing import ClassVar
+
 class FakeCursor:
     """Simulates a database cursor that tracks if it was closed."""
 
-    _open_cursors: list["FakeCursor"] = []
+    _open_cursors: ClassVar[list["FakeCursor"]] = []
platform-service/tests/README.md (1)
56-67: Add language specifier to fenced code block.

The sample output code block is missing a language specifier. Use text or console for output examples.
♻️ Add language specifier
 **Sample Output:**
-```
+```text
 [OLD CODE] Open cursors after 100 failed requests: 100  ❌ LEAKED
 [NEW CODE] Open cursors after 100 failed requests: 0    ✅ NO LEAK

platform-service/src/unstract/platform_service/controller/platform.py

muhammad-ali-e

Added some nits.

platform-service/tests/test_memory_leak_simulation.py

platform-service/src/unstract/platform_service/controller/platform.py

muhammad-ali-e · 2026-01-20T06:14:47Z

@johnyrahul please check sonarqubecloud

- Fix SQL injection vulnerability by using parameterized queries
- Refactor cursor handling to use safe_cursor context manager
- Add ClassVar type annotation in test file
- Add language specifier to README code block

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

coderabbitai

Actionable comments posted: 1

🤖 Fix all issues with AI agents

In `@platform-service/tests/test_memory_leak_simulation.py`:
- Around line 196-260: Tests use absolute tracemalloc snapshot diffs which are
flaky; update both test_memory_growth_with_leaked_objects and
test_no_memory_growth_with_proper_cleanup to call gc.collect() before sampling,
use tracemalloc.get_traced_memory() (start tracing, call gc.collect(), read
current/peak), compute a relative delta (e.g., percent change) instead of
summing snapshot compare_to results, and assert using relative thresholds (e.g.,
growth >X% for leaked case and <Y% for cleanup case) to make the assertions
stable across CI allocators.
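
One way to read this suggestion is sketched below. It is an assumption about how such a measurement could be structured, not the test code from the PR: `traced_growth`, `leaky_workload`, and `clean_workload` are hypothetical names, and the thresholds in the final check are illustrative.

```python
import gc
import tracemalloc


def traced_growth(workload):
    """Return bytes still allocated after workload(), relative to a baseline."""
    gc.collect()  # drop garbage left over from earlier code
    tracemalloc.start()
    try:
        baseline, _ = tracemalloc.get_traced_memory()
        workload()
        gc.collect()  # only count memory that is still reachable
        current, _ = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return current - baseline


_leaked = []


def leaky_workload():
    for _ in range(1000):
        _leaked.append(bytearray(1024))  # kept alive, so it counts as growth


def clean_workload():
    for _ in range(1000):
        buf = bytearray(1024)
        del buf  # released immediately


leak_growth = traced_growth(leaky_workload)
clean_growth = traced_growth(clean_workload)

# Compare the two cases against each other instead of using absolute byte counts.
print(leak_growth > 900_000, clean_growth < 0.1 * leak_growth)
```

Asserting on the ratio between the leaking and the cleaned-up case is less sensitive to allocator differences than asserting on a fixed number of bytes.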

🧹 Nitpick comments (2)

platform-service/src/unstract/platform_service/controller/platform.py (2)
84-96: Consider using explicit column names instead of SELECT *.

The query uses SELECT * with positional indexing (result_row[1], result_row[2]), which is fragile if the table schema changes. Consider selecting explicit columns to make the code more maintainable:
♻️ Suggested improvement
     platform_key_table = DBTable.PLATFORM_KEY
     query = f"""
-        SELECT * FROM \"{Env.DB_SCHEMA}\".{platform_key_table}
+        SELECT key, is_active FROM \"{Env.DB_SCHEMA}\".{platform_key_table}
         WHERE key = %s
     """

     try:
         with safe_cursor(query, (token,)) as cursor:
             result_row = cursor.fetchone()
             if not result_row or len(result_row) == 0:
                 app.logger.error(f"Authentication failed. bearer token not found {token}")
                 return False
-            platform_key = str(result_row[1])
-            is_active = bool(result_row[2])
+            platform_key = str(result_row[0])
+            is_active = bool(result_row[1])
340-346: Consider moving the success return to an else block for clarity.

The static analysis hint (TRY300) suggests moving return value, 200 to an else block. This is a minor style preference that makes it clearer that the return only happens when no exception occurs.
♻️ Optional style improvement
         try:
             redis_key = f"{account_id}:{key}"
             app.logger.info(f"Getting cached data for key: {redis_key}")
             value = r.get(redis_key)
             if value is None:
                 return "Not Found", 404
-            return value, 200
         except Exception as e:
             raise APIError(message=f"Error while getting cached data: {e}") from e
+        else:
+            return value, 200

platform-service/tests/test_memory_leak_simulation.py

github-actions · 2026-01-21T05:52:55Z

Test Results

Summary

✅ Runner Tests: 11 passed, 0 failed (11 total)
✅ SDK1 Tests: 66 passed, 0 failed (66 total)

Runner Tests - Full Report

filepath	function	passed	SUBTOTAL
runner/src/unstract/runner/clients/test_docker.py	test_logs	1	1
runner/src/unstract/runner/clients/test_docker.py	test_cleanup	1	1
runner/src/unstract/runner/clients/test_docker.py	test_cleanup_skip	1	1
runner/src/unstract/runner/clients/test_docker.py	test_client_init	1	1
runner/src/unstract/runner/clients/test_docker.py	test_get_image_exists	1	1
runner/src/unstract/runner/clients/test_docker.py	test_get_image	1	1
runner/src/unstract/runner/clients/test_docker.py	test_get_container_run_config	1	1
runner/src/unstract/runner/clients/test_docker.py	test_get_container_run_config_without_mount	1	1
runner/src/unstract/runner/clients/test_docker.py	test_run_container	1	1
runner/src/unstract/runner/clients/test_docker.py	test_get_image_for_sidecar	1	1
runner/src/unstract/runner/clients/test_docker.py	test_sidecar_container	1	1
TOTAL		11	11

SDK1 Tests - Full Report

sonarqubecloud · 2026-01-21T05:53:04Z

Quality Gate passed

Issues
3 New issues
0 Accepted issues

Measures
0 Security Hotspots
0.0% Coverage on New Code
0.0% Duplication on New Code

See analysis details on SonarQube Cloud

johnyrahul and others added 2 commits January 19, 2026 14:31

johnyrahul requested review from jaseemjaskp, muhammad-ali-e and vishnuszipstack and removed request for muhammad-ali-e January 19, 2026 11:18

Merge branch 'main' into fix/UN-3157-memory-leak-platformservice

6519b28

coderabbitai bot reviewed Jan 19, 2026

johnyrahul requested a review from ritwik-g January 19, 2026 11:19

jaseemjaskp approved these changes Jan 19, 2026

johnyrahul requested a review from hari-kuriakose January 20, 2026 04:57

muhammad-ali-e approved these changes Jan 20, 2026

coderabbitai bot reviewed Jan 20, 2026

muhammad-ali-e added 2 commits January 21, 2026 11:19

Merge branch 'main' into fix/UN-3157-memory-leak-platformservice

ee5c74d

Merge branch 'main' into fix/UN-3157-memory-leak-platformservice

7cfae6a

muhammad-ali-e merged commit 2a433ea into main Jan 21, 2026
7 checks passed

muhammad-ali-e deleted the fix/UN-3157-memory-leak-platformservice branch January 21, 2026 05:54
