ConcurrentModificationException in RegionCoordinatorTestSupport during EndWorker retry test #5546

@Yicong-Huang

What happened?

RegionExecutionCoordinatorSpec failed on main with java.util.ConcurrentModificationException: mutation occurred during iteration. The probe ControllerRpcProbe stores RPC calls in a mutable.ArrayBuffer (calls). The actor thread appends to this buffer while assertion helpers on the test thread call .filter(...) on it, with no synchronization between the two. Because ArrayBuffer iterators in Scala 2.13 are fail-fast (MutationTracker), this race surfaces as a hard ConcurrentModificationException (CME) instead of a silent wrong result.
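
The fail-fast check can be observed without any threads. The following is a minimal sketch, assuming a Scala 2.13 standard library that includes MutationTracker (as the stack trace in this report does). FailFastDemo, mutationDetected, and the string values are hypothetical names used only for illustration; they are not part of Texera.

```scala
import java.util.ConcurrentModificationException
import scala.collection.mutable.ArrayBuffer

object FailFastDemo {
  // Returns true if appending to the buffer after an iterator was created
  // makes that iterator throw ConcurrentModificationException.
  def mutationDetected(): Boolean = {
    // Placeholder entries; the real probe stores WorkerRpcCall values.
    val calls = ArrayBuffer("call-1", "call-2")
    val it = calls.iterator // the iterator remembers the buffer's mutation count
    calls += "call-3"       // stands in for the actor thread's append
    try {
      it.hasNext            // stands in for the test thread's filter(...)
      false
    } catch {
      case _: ConcurrentModificationException => true
    }
  }
}
```

In the failing test the append and the iteration happen on different threads, so whether the check trips depends on timing; the single-threaded version above only makes the mechanism visible.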

Failing read site: amber/src/test/scala/org/apache/texera/amber/engine/architecture/scheduling/RegionCoordinatorTestSupport.scala:108

def endWorkerCalls: Seq[WorkerRpcCall] =
  calls.filter(_.methodName == EndWorker).toSeq

Sibling helpers on lines 103, 106, 112 (methodTrace, initializedWorkers, startedWorkers, onlyEndWorkerCall) share the same unsynchronized read pattern.
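
One possible fix is to guard both the append and the reads with the same lock and have every helper work on an immutable snapshot. The sketch below is an assumption about how that could look, not the project's actual code: SynchronizedRpcProbe, record, snapshot, the simplified WorkerRpcCall shape, and the string-typed method names are all hypothetical stand-ins.

```scala
import scala.collection.mutable

// Hypothetical, simplified stand-in for the real WorkerRpcCall type.
final case class WorkerRpcCall(workerId: String, methodName: String)

final class SynchronizedRpcProbe {
  private val lock = new Object
  private val calls = mutable.ArrayBuffer.empty[WorkerRpcCall]

  // Called from the actor thread: append under the lock.
  def record(call: WorkerRpcCall): Unit = lock.synchronized {
    calls += call
    ()
  }

  // Copy the buffer under the same lock; the returned Vector is immutable,
  // so callers can filter or map it without racing with record.
  private def snapshot: Vector[WorkerRpcCall] = lock.synchronized {
    calls.toVector
  }

  // Called from the test thread: every helper reads from a snapshot.
  def endWorkerCalls: Seq[WorkerRpcCall] =
    snapshot.filter(_.methodName == "EndWorker")

  def methodTrace: Seq[String] =
    snapshot.map(_.methodName)
}
```

Routing every helper through a single snapshot method keeps the locking in one place, so a new helper cannot accidentally iterate the live buffer. A concurrent collection such as java.util.concurrent.CopyOnWriteArrayList would be another option for a probe that is written rarely and read often.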

Codecov record: https://app.codecov.io/gh/apache/texera/tests/main?parameter=FAILED_TESTS (filter to RegionExecutionCoordinatorSpec)

How to reproduce?

Run on main:

sbt 'amber/testOnly org.apache.texera.amber.engine.architecture.scheduling.RegionExecutionCoordinatorSpec'

The test "RegionExecutionCoordinator should retry EndWorker failures and delay gracefulStop until a retry succeeds" fails non-deterministically. On the CI commit referenced below it failed twice in the same run; subsequent runs on later commits passed. How often it fails depends on actor mailbox timing, that is, on whether an append lands while a helper is iterating the buffer.

Version/Branch

1.3.0-incubating-SNAPSHOT (main)

Commit Hash (Optional)

c0615e1

Relevant log output

java.util.ConcurrentModificationException: mutation occurred during iteration
      at scala.collection.mutable.MutationTracker$.checkMutations(MutationTracker.scala:43)
      at scala.collection.mutable.CheckedIndexedSeqView$CheckedIterator.hasNext(CheckedIndexedSeqView.scala:47)
      at scala.collection.StrictOptimizedIterableOps.filterImpl(StrictOptimizedIterableOps.scala:225)
      at scala.collection.StrictOptimizedIterableOps.filterImpl$(StrictOptimizedIterableOps.scala:222)
      at scala.collection.mutable.ArrayBuffer.filterImpl(ArrayBuffer.scala:41)
      at scala.collection.StrictOptimizedIterableOps.filter(StrictOptimizedIterableOps.scala:218)
      at scala.collection.StrictOptimizedIterableOps.filter$(StrictOptimizedIterableOps.scala:218)
      at scala.collection.mutable.ArrayBuffer.filter(ArrayBuffer.scala:41)
      at org.apache.texera.amber.engine.architecture.scheduling.RegionCoordinatorTestSupport$ControllerRpcProbe.endWorkerCalls(RegionCoordinatorTestSupport.scala:108)
      at org.apache.texera.amber.engine.architecture.scheduling.RegionExecutionCoordinatorSpec.$anonfun$new$5(RegionExecutionCoordinatorSpec.scala:99)
      at org.apache.texera.amber.engine.architecture.scheduling.RegionCoordinatorTestSupport$.waitUntil(RegionCoordinatorTestSupport.scala:213)
