Skip to content

[WIP] Fix improper socket teardown without shutdown()#508

Closed
Copilot wants to merge 1 commit into
masterfrom
copilot/fix-socket-teardown-issues
Closed

[WIP] Fix improper socket teardown without shutdown()#508
Copilot wants to merge 1 commit into
masterfrom
copilot/fix-socket-teardown-issues

Conversation

Copy link
Copy Markdown

Copilot AI commented Jun 2, 2026

Thanks for asking me to work on this. I will get started on it and keep this PR's description up to date as I form a plan and make progress.


This section details on the original issue you should resolve

<issue_title>Socket closed without shutdown() causes TIME_WAIT accumulation and data loss risk</issue_title>
<issue_description># Socket closed without shutdown() causes TIME_WAIT accumulation and data loss risk

Summary

The clickhouse-cpp client calls close() / closesocket() directly on the TCP socket without first calling shutdown(). This is an improper socket teardown that causes two problems:

  1. TIME_WAIT socket accumulation on the client side when the client initiates the close, which can exhaust ephemeral ports under high connection churn.
  2. Potential data lossclose() on a socket with unread data in the receive buffer causes the OS to send a RST instead of a graceful FIN, which can discard in-flight data and confuse the peer.

The issue affects both plain TCP and SSL connections (no SSL_shutdown() either).

Root Cause

Socket::Close() in socket.cpp:

void Socket::Close() {
    CloseSocket(handle_);
    handle_ = INVALID_SOCKET;
}

Where CloseSocket is:

void CloseSocket(SOCKET socket) {
    if (socket == INVALID_SOCKET)
        return;
#if defined(_win_)
    closesocket(socket);
#else
    close(socket);
#endif
}

There is no shutdown(fd, SHUT_RDWR) or shutdown(fd, SHUT_WR) call before close() anywhere in the library. A search of the entire clickhouse/ source tree confirms zero uses of shutdown().

For SSL sockets, SSLSocket inherits from Socket and its destructor chain also never calls SSL_shutdown() before closing the underlying file descriptor.

Affected Code Paths

Every socket close goes through Socket::Close()CloseSocket():

  • Client::~Client()~Impl()~socket_Socket::~Socket()Close()
  • Client::Impl::ResetConnection()InitializeStreams() → old socket swapped out → ~Socket()
  • SocketRAIIWrapper::~SocketRAIIWrapper() in SocketConnect() (on connection failure)

Consequences

1. TIME_WAIT accumulation

When the client closes first without shutdown(), the TCP stack enters TIME_WAIT on the client side. Under high connection churn (many short-lived client instances), this can exhaust the available ephemeral port range (~28,000 ports on Linux default), causing connect() failures with EADDRNOTAVAIL.

2. RST instead of graceful FIN

Per POSIX/RFC 793, calling close() on a socket that has unread data in the kernel receive buffer causes the OS to send a TCP RST to the peer instead of a graceful FIN. This can happen when:

  • The server sent data the client never read (e.g., client destroys Client object mid-query)
  • The server sent an exception or log packet the client didn't consume

The RST causes the peer to see ECONNRESET, and any data the client sent that is still in-flight may be discarded by the peer's TCP stack.

3. SSL session not terminated properly

For SSL connections, SSL_shutdown() is never called. This means the TLS close_notify alert is never sent, which:

  • Prevents SSL session resumption/caching
  • May cause the server to log TLS errors
  • Violates the TLS specification (RFC 5246 §7.2.1)

Reproduction

void tcp_server(std::atomic<bool>& running, int port) {
    int server_fd = socket(AF_INET, SOCK_STREAM, 0);
    // ... bind, listen ...
    while (running) {
        int client_fd = accept(server_fd, nullptr, nullptr);
        if (client_fd >= 0) {
            // Read Hello, then close properly from server side
            recv(client_fd, buffer, sizeof(buffer), 0);
            shutdown(client_fd, SHUT_RDWR);
            close(client_fd);
        }
    }
}

// Client side — creates and destroys many connections
for (int i = 0; i < 15000; ++i) {
    try {
        clickhouse::Client client(opts);
    } catch (...) {}
}

// Check TIME_WAIT count — will be very high
// Linux: grep "06 " /proc/net/tcp | wc -l

Suggested Fix

Add shutdown() before close() in Socket::Close():

void Socket::Close() {
    if (handle_ == INVALID_SOCKET)
        return;

#if defined(_win_)
    shutdown(handle_, SD_BOTH);
    closesocket(handle_);
#else
    shutdown(handle_, SHUT_RDWR);
    close(handle_);
#endif
    handle_ = INVALID_SOCKET;
}

For SSL sockets, add SSL_shutdown() in the SSLSocket destructor before the base Socket::Close() runs:

SSLSocket::~SSLSocket() {
    if (ssl_) {
        SSL_shutdown(ssl_.get());
    }
}

Note: shutdown() may fail if the socket is already in an error state — the return value can be safely ignored here since we are closing regardless.

Environment

  • Library: clickhouse-cpp (all current versions)
  • Affected platforms: All (Linux, macOS, Windows)
  • Affected files: clickhouse/base/socket.cpp, clickhouse/base/sslsocket.cpp</issue_description>

Comments on the Issue (you are @copilot in this section)

Copilot stopped work on behalf of slabko due to an error June 2, 2026 15:46
Copilot AI requested a review from slabko June 2, 2026 15:46
@CLAassistant
Copy link
Copy Markdown

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.
You have signed the CLA already but the status is still pending? Let us recheck it.

@slabko slabko closed this Jun 2, 2026
@slabko slabko deleted the copilot/fix-socket-teardown-issues branch June 2, 2026 15:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Socket closed without shutdown() causes TIME_WAIT accumulation and data loss risk

3 participants