Skip to content

FIX: Fixing test that does not work properly / Failing in connection pool#493

Open
subrata-ms wants to merge 1 commit intomainfrom
subrata-ms/ConnectionPoolTestFix
Open

FIX: Fixing test that does not work properly / Failing in connection pool#493
subrata-ms wants to merge 1 commit intomainfrom
subrata-ms/ConnectionPoolTestFix

Conversation

@subrata-ms
Copy link
Copy Markdown
Contributor

@subrata-ms subrata-ms commented Mar 31, 2026

Work Item / Issue Reference

AB#40818

GitHub Issue: #<ISSUE_NUMBER>


Summary

This pull request improves the robustness, reliability, and debuggability of the connection pooling and cleanup logic in the SQL Server Python driver. It introduces safer error handling in destructors, enhances logging for connection pool operations, and makes the pooling tests more reliable and meaningful by using unique connection identifiers and removing unnecessary skips.

Error handling and resource management:

  • The Connection destructor now safely handles exceptions from disconnect() by logging and suppressing them, preventing potential crashes due to exceptions during stack unwinding.
  • In disconnect(), errors from SQLDisconnect are logged instead of thrown, ensuring cleanup is always attempted and never throws from contexts like destructors or pool cleanup.

Logging and diagnostics:

  • Added detailed logging in the connection pool's acquire() method to track pool size, idle timeout, and connection idle times, aiding in diagnosing pool behavior.
  • Improved formatting and consistency of log messages in connection handle compaction routines. [1] [2]
  • Enhanced error log formatting for non-statement handle detection in disconnect().

Connection pooling configuration:

  • Removed the unnecessary std::once_flag from enable_pooling(), allowing reconfiguration of the pool for each test run.

Testing improvements:

  • Re-enabled and rewrote previously skipped pooling tests to be more reliable:
    • test_pool_idle_timeout_removes_connections now uses connection_id instead of @@SPID to uniquely identify physical connections and increases the idle wait to avoid flakiness in CI environments.
    • test_pool_removes_invalid_connections is now robust, directly simulates a dirty connection, and verifies that the pool discards invalid connections and creates new ones using connection_id.
    • Improved password replacement logic in test_pool_recovery_after_failed_connection for broader compatibility with different connection string formats.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR aims to make connection pooling behavior easier to diagnose and make pooling-related tests more reliable, while also hardening cleanup paths (notably destructors / disconnect) to avoid crashes during shutdown or error unwinding.

Changes:

  • Updates pooling tests to avoid SPID reuse issues and re-enables previously skipped pooling recovery tests.
  • Makes Connection cleanup paths more destructor-safe by suppressing exceptions and logging failures.
  • Adjusts pooling configuration to be reconfigurable across test runs and adds more pool acquisition diagnostics.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 5 comments.

File Description
tests/test_009_pooling.py Reworks idle-timeout and invalid-connection pooling tests; improves bad-password mutation logic.
mssql_python/pybind/ddbc_bindings.cpp Allows enable_pooling() to reconfigure defaults on each call (removes call_once).
mssql_python/pybind/connection/connection.cpp Makes destructor/disconnect safer by suppressing throws and improving log formatting.
mssql_python/pybind/connection/connection_pool.cpp Adds detailed debug logging around pool acquisition and idle pruning.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +35 to +36
LOG("ConnectionPool::acquire: checking conn idle_time=%lld vs timeout=%d",
(long long)idle_time, _idle_timeout_secs);
Copy link

Copilot AI Mar 31, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The added per-connection LOG calls occur while holding ConnectionPool::_mutex. LoggerBridge::log acquires a mutex and the Python GIL, so logging inside this critical section can significantly increase lock contention and, in worst cases, deadlock if Python logging handlers call back into code that needs the pool. Consider limiting this to a single summary log per acquire, or collecting idle_time values and logging after releasing _mutex (or behind a higher-verbosity flag).

Suggested change
LOG("ConnectionPool::acquire: checking conn idle_time=%lld vs timeout=%d",
(long long)idle_time, _idle_timeout_secs);

Copilot uses AI. Check for mistakes.
Comment on lines 53 to +60
Connection::~Connection() {
disconnect(); // fallback if user forgets to disconnect
try {
disconnect(); // fallback if user forgets to disconnect
} catch (...) {
// Never throw from a destructor — doing so during stack unwinding
// causes std::terminate(). Log and swallow.
LOG_ERROR("Exception suppressed in ~Connection destructor");
}
Copy link

Copilot AI Mar 31, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

~Connection now swallows all exceptions (good), but the log message loses the original error context. Consider catching const std::exception& first and logging ex.what() (then a catch-all for non-std exceptions), so debugging unexpected disconnect failures is possible without rethrowing.

Copilot uses AI. Check for mistakes.
// Log the error but do NOT throw — disconnect must be safe to call
// from destructors, reset() failure paths, and pool cleanup.
// Throwing here during stack unwinding causes std::terminate().
LOG_ERROR("SQLDisconnect failed (ret=%d), forcing handle cleanup", ret);
Copy link

Copilot AI Mar 31, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When SQLDisconnect fails, the code logs only the SQLRETURN and then suppresses the error. For debuggability, consider also retrieving and logging the ODBC diagnostic details (e.g., via SQLCheckError_Wrap(SQL_HANDLE_DBC, _dbcHandle, ret)), but without throwing, so cleanup remains noexcept-safe while still providing actionable error info.

Suggested change
LOG_ERROR("SQLDisconnect failed (ret=%d), forcing handle cleanup", ret);
LOG_ERROR("SQLDisconnect failed (ret=%d), forcing handle cleanup", ret);
// Best-effort: retrieve and log ODBC diagnostics for debuggability.
// This must not throw, to keep disconnect noexcept-safe.
try {
ErrorInfo err = SQLCheckError_Wrap(SQL_HANDLE_DBC, _dbcHandle, ret);
std::string diagMsg = WideToUTF8(err.ddbcErrorMsg);
LOG_ERROR("SQLDisconnect diagnostics: %s", diagMsg.c_str());
} catch (...) {
// Swallow all exceptions: cleanup paths must not throw.
LOG_ERROR("SQLDisconnect: failed to retrieve ODBC diagnostics");
}

Copilot uses AI. Check for mistakes.
Comment on lines +348 to +355
# Verify it's a different physical connection
new_cursor.execute(
"SELECT connection_id FROM sys.dm_exec_connections WHERE session_id = @@SPID"
)
new_conn_id = new_cursor.fetchone()[0]
assert (
original_conn_id != new_conn_id
), "Expected a new physical connection after pool discarded the dirty one"
Copy link

Copilot AI Mar 31, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test assumes that returning a connection to the pool with an open transaction will make reset() fail and force the pool to create a new physical connection. However, the driver’s reset() path uses SQL_ATTR_RESET_CONNECTION, which is designed to clean up session state (and may legitimately succeed even with an open transaction), in which case reusing the same connection_id is valid and this assertion becomes brittle. Consider asserting that the pool recovers and the session state is clean (e.g., @@TRANCOUNT = 0 / XACT_STATE() = 0 after reacquire), or allow either outcome (same connection reused after successful reset vs. new connection after discard).

Copilot uses AI. Check for mistakes.
Comment on lines +366 to 370
pwd_match = re.search(r"(Pwd|Password)=", conn_str, re.IGNORECASE)
if pwd_match:
key = pwd_match.group(0) # e.g. "PWD=" or "Pwd=" or "Password="
bad_conn_str = conn_str.replace(key, key + "wrongpassword")
else:
Copy link

Copilot AI Mar 31, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The connection-string mutation only replaces the key token (e.g. Pwd=) and prefixes wrongpassword onto the existing password value; it can also replace occurrences in parameter values and doesn’t handle whitespace around =. To make this deterministic and avoid corrupting unrelated parts of the string, consider using a regex substitution that replaces the value of the first Pwd/Password key-value pair (up to the next ; or end of string), preserving the rest of the connection string.

Suggested change
pwd_match = re.search(r"(Pwd|Password)=", conn_str, re.IGNORECASE)
if pwd_match:
key = pwd_match.group(0) # e.g. "PWD=" or "Pwd=" or "Password="
bad_conn_str = conn_str.replace(key, key + "wrongpassword")
else:
# Replace the value of the first Pwd/Password key-value pair with "wrongpassword"
pattern = re.compile(r"(?i)(Pwd|Password\s*=\s*)([^;]*)")
bad_conn_str, num_subs = pattern.subn(lambda m: m.group(1) + "wrongpassword", conn_str, count=1)
if num_subs == 0:

Copilot uses AI. Check for mistakes.
@github-actions
Copy link
Copy Markdown

📊 Code Coverage Report

🔥 Diff Coverage

82%


🎯 Overall Coverage

77%


📈 Total Lines Covered: 5788 out of 7433
📁 Project: mssql-python


Diff Coverage

Diff: main...HEAD, staged and unstaged changes

  • mssql_python/pybind/connection/connection.cpp (62.5%): Missing lines 59-60,132-135
  • mssql_python/pybind/connection/connection_pool.cpp (100%)
  • mssql_python/pybind/ddbc_bindings.cpp (100%)

Summary

  • Total: 35 lines
  • Missing: 6 lines
  • Coverage: 82%

mssql_python/pybind/connection/connection.cpp

Lines 55-64

  55         disconnect();  // fallback if user forgets to disconnect
  56     } catch (...) {
  57         // Never throw from a destructor — doing so during stack unwinding
  58         // causes std::terminate().  Log and swallow.
! 59         LOG_ERROR("Exception suppressed in ~Connection destructor");
! 60     }
  61 }
  62 
  63 // Allocates connection handle
  64 void Connection::allocateDbcHandle() {

Lines 128-139

  128                     // SAFETY ASSERTION: Only STMT handles should be in this vector
  129                     // This is guaranteed by allocStatementHandle() which only creates STMT handles
  130                     // If this assertion fails, it indicates a serious bug in handle tracking
  131                     if (handle->type() != SQL_HANDLE_STMT) {
! 132                         LOG_ERROR(
! 133                             "CRITICAL: Non-STMT handle (type=%d) found in _childStatementHandles. "
! 134                             "This will cause a handle leak!",
! 135                             handle->type());
  136                         continue;  // Skip marking to prevent leak
  137                     }
  138                     handle->markImplicitlyFreed();
  139                 }


📋 Files Needing Attention

📉 Files with overall lowest coverage (click to expand)
mssql_python.pybind.logger_bridge.cpp: 59.2%
mssql_python.pybind.ddbc_bindings.h: 67.8%
mssql_python.pybind.ddbc_bindings.cpp: 70.4%
mssql_python.row.py: 70.5%
mssql_python.pybind.connection.connection.cpp: 76.1%
mssql_python.__init__.py: 77.1%
mssql_python.ddbc_bindings.py: 79.6%
mssql_python.connection.py: 85.2%
mssql_python.cursor.py: 85.6%
mssql_python.helpers.py: 89.3%

🔗 Quick Links

⚙️ Build Summary 📋 Coverage Details

View Azure DevOps Build

Browse Full Coverage Report

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

pr-size: medium Moderate update size

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants