Bug: fix IPC named lock client-side robustness for Windows by gnodet · Pull Request #1910 · apache/maven-resolver

gnodet · 2026-06-08T14:09:46Z

Summary

Fixes Windows CI flakiness in IpcNamedLockFactoryIT.exclusiveAccess() (seen in PR #1909, CI run 27139141318).

Root cause: When the loser thread's client.lock() times out in doLockExclusively(), the cleanup call client.unlock(contextId) could either:

Hang forever — it used Long.MAX_VALUE timeout, so if the server/receiver had any issue, the thread blocked indefinitely
Throw RuntimeException — which propagated out of Access.run() (which only catches InterruptedException), causing the thread to terminate without calling loser.countDown(). The winner thread then waited on loser.await() forever, and the test hung until Surefire killed it at 25 seconds.

Changes

IpcNamedLock: Wrap cleanup unlock in tryUnlock() — a try-catch that swallows exceptions so the calling thread always returns false instead of crashing. This is the primary fix.
IpcClient.unlock(): Replace Long.MAX_VALUE timeout with 10 seconds. Normal unlock round-trips complete in microseconds; 10 seconds is generous enough while preventing infinite hangs.
IpcClient.stopServer(): Replace Long.MAX_VALUE timeout with 30 seconds.
IpcClient.receive(): On empty response, complete the future exceptionally instead of throwing IllegalStateException — the old behavior killed the receiver thread, breaking all subsequent operations on the shared client.
IpcClient.close(): Call close() on EOF so pending futures are properly cleaned up when the server disconnects, and reset the initialized flag to prevent stale state.

Test plan

All 27 IPC tests pass locally (unit + integration: IpcNamedLockFactoryIT, IpcAdapterNoForkIT, IpcAdapterIT)
Windows CI should no longer hang on exclusiveAccess

When lock acquisition times out in IpcNamedLock, the cleanup call to client.unlock() could either hang forever (Long.MAX_VALUE timeout) or throw RuntimeException. On Windows, where IPC socket behavior can differ, this causes test threads to either block indefinitely or crash without signaling their CountDownLatch, making the test appear to hang. Four changes: 1. IpcNamedLock: wrap cleanup unlock in try-catch (tryUnlock) so that a failure during timeout handling returns false instead of crashing the calling thread. This is the primary fix for the exclusiveAccess test hang on Windows. 2. IpcClient.unlock/stopServer: replace Long.MAX_VALUE timeouts with bounded values (10s/30s) so operations fail fast when the server is unresponsive. 3. IpcClient.receive: complete the future exceptionally on empty responses instead of throwing, which would kill the receiver thread and break all subsequent operations on the shared client. 4. IpcClient.close: properly call close() on EOF to clean up pending futures, and reset the initialized flag to prevent stale state.

Copilot

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

gnodet · 2026-06-08T15:10:31Z

retest

The send() method read the `output` field multiple times without holding the same lock as close(Throwable), which nulls it under synchronized(this). Between ensureInitialized() returning and reaching synchronized(output), the receiver thread could close the connection and null out the field, causing a NullPointerException at output.flush(). Fix by capturing the output field in a local variable immediately after ensureInitialized(), checking for null, and using the local throughout the synchronized block.

gnodet · 2026-06-08T16:26:18Z

Pushed a fix for the remaining NPE race condition in IpcClient.send().

Root cause: The send() method read the output field multiple times without holding the same lock as close(Throwable), which nulls the field under synchronized(this). Between ensureInitialized() returning and reaching synchronized(output), the receiver thread could call close(Throwable) — setting output = null — causing a NullPointerException at output.flush().

Fix: Capture output in a local variable immediately after ensureInitialized(), check it for null (throwing IOException("Connection closed") if the connection was already torn down), and use the local variable for both the synchronized block and all writes within it. This eliminates the TOCTOU race between the field read and its use.

gnodet · 2026-06-08T17:10:17Z

/rerun

gnodet · 2026-06-08T17:20:24Z

Rerunning CI to gather additional Windows stability data.

gnodet · 2026-06-08T18:00:25Z

retest this please

github-actions · 2026-06-08T20:05:19Z

@gnodet Please assign appropriate label to PR according to the type of change.

gnodet mentioned this pull request Jun 8, 2026

Fix for ChainedWorkspaceReader #1909

Draft

gnodet requested a review from Copilot June 8, 2026 14:21

Copilot started reviewing on behalf of gnodet June 8, 2026 14:21 View session

Copilot AI reviewed Jun 8, 2026

ci: retrigger Windows CI validation (run 3/3)

a097248

cstamas approved these changes Jun 8, 2026

View reviewed changes

gnodet merged commit 9975d78 into apache:master Jun 8, 2026
42 checks passed

github-actions Bot assigned gnodet Jun 8, 2026

github-actions Bot added this to the 2.0.19 milestone Jun 8, 2026

cstamas added the bug Something isn't working label Jun 8, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Bug: fix IPC named lock client-side robustness for Windows#1910

Bug: fix IPC named lock client-side robustness for Windows#1910
gnodet merged 3 commits into
apache:masterfrom
gnodet:gnodet/fix-ipc-windows

gnodet commented Jun 8, 2026

Uh oh!

Copilot AI left a comment

Uh oh!

gnodet commented Jun 8, 2026

Uh oh!

gnodet commented Jun 8, 2026

Uh oh!

gnodet commented Jun 8, 2026

Uh oh!

gnodet commented Jun 8, 2026

Uh oh!

gnodet commented Jun 8, 2026

Uh oh!

Uh oh!

github-actions Bot commented Jun 8, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

gnodet commented Jun 8, 2026

Summary

Changes

Test plan

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Uh oh!

gnodet commented Jun 8, 2026

Uh oh!

gnodet commented Jun 8, 2026

Uh oh!

gnodet commented Jun 8, 2026

Uh oh!

gnodet commented Jun 8, 2026

Uh oh!

gnodet commented Jun 8, 2026

Uh oh!

Uh oh!

github-actions Bot commented Jun 8, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants