test: mitigate e2e simulator hang / retry flakes by mikehardy · Pull Request #9057 · invertase/react-native-firebase

mikehardy · 2026-06-20T02:22:21Z

Summary

This PR is intended to hold a continued series of e2e flake fixes.

Simulator hang / Jet retry recovery — Preserve the mocha-remote runner across transient Jet WS disconnects, kill and reboot the resolved iOS simulator before retries, and ensure debug Jet retries still wait for Metro before relaunching the app.

Test plan

- iOS E2E debug CI passes on merge (or Jet attempt-2 retry recovers cleanly after a transient 1006/session desync)
- iOS E2E release CI remains green

gemini-code-assist · 2026-06-20T02:22:29Z

Summary of Changes

Hello, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces a series of stability improvements for end-to-end testing, specifically targeting flaky simulator behavior and WebSocket connection issues. By enhancing the recovery logic for both the test runner and the underlying iOS simulator, the changes aim to reduce CI noise caused by transient infrastructure failures.

Highlights

- Mocha Remote Server Resilience: Updated the mocha-remote-server to handle transient WebSocket disconnects by introducing a grace period and state management to preserve the test runner during brief network interruptions.
- iOS Simulator Recovery: Implemented a more robust recovery mechanism for iOS E2E tests by rebooting the simulator when specific launch failures occur, rather than simply terminating the app.
- E2E Test Stability: Expanded the list of retryable launch failures and added caching for Metro status to improve the reliability of the test suite execution.
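
The grace-period idea in the highlights above can be sketched as follows. This is a minimal illustration, not the code in the mocha-remote-server patch: the class name `ReconnectGrace` and the `onFatal` callback are hypothetical, and the real patch may track more state.

```javascript
// Hypothetical sketch of a reconnect grace timer; names are illustrative only.
class ReconnectGrace {
  constructor(graceMs, onFatal) {
    this.graceMs = graceMs; // how long to wait for the client to come back
    this.onFatal = onFatal; // invoked only if the client never reconnects
    this.timer = null;
  }

  // Call when the client socket drops unexpectedly (for example close code 1006).
  disconnected() {
    if (this.timer !== null) return; // a grace window is already running
    this.timer = setTimeout(() => {
      this.timer = null;
      this.onFatal();
    }, this.graceMs);
  }

  // Call when a client connects again. Returns true if a pending grace
  // window was cancelled, meaning the existing runner can be kept.
  reconnected() {
    if (this.timer === null) return false;
    clearTimeout(this.timer);
    this.timer = null;
    return true;
  }
}
```

A server would call `disconnected()` from its socket close handler and `reconnected()` from its connection handler, tearing the runner down only inside `onFatal`.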

Ignored Files

Ignored by pattern: .github/workflows/** (1)
- .github/workflows/scripts/boot-simulator.sh

¹ Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

gemini-code-assist

Code Review

This pull request introduces a patch to mocha-remote-server to handle transient client disconnects with a reconnect grace timer, and updates the E2E test suite to reboot the iOS simulator upon retryable launch failures. Feedback highlights a critical race condition in the server patch where a restarted client process may hang waiting for a run command that is never sent. Additionally, the synchronous reboot of the iOS simulator blocks the Node.js event loop, potentially freezing the WebSocket server; it is recommended to refactor this to be asynchronous and properly awaited.
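
The review's point about the event loop can be illustrated with a small sketch. It assumes the reboot is done by shelling out to `xcrun simctl shutdown` and `xcrun simctl boot`; the function name `rebootSimulator` and the injectable `run` parameter are hypothetical and not taken from the PR.

```javascript
const { execFile } = require('node:child_process');
const { promisify } = require('node:util');

// Promise-returning execFile: the event loop (and any WebSocket server on it)
// keeps running while the child process executes, unlike execSync/execFileSync.
const execFileAsync = promisify(execFile);

// Hypothetical helper; `run` is injectable so the logic can be exercised
// without a real simulator.
async function rebootSimulator(udid, run = execFileAsync) {
  // Shutdown fails if the device is already shut down; that is safe to ignore.
  await run('xcrun', ['simctl', 'shutdown', udid]).catch(() => {});
  await run('xcrun', ['simctl', 'boot', udid]);
}
```

Callers need to `await rebootSimulator(udid)` before relaunching the app, otherwise the retry can race with the boot.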

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

codecov · 2026-06-20T02:46:46Z

Codecov Report

❌ Patch coverage is 70.58824% with 10 lines in your changes missing coverage. Please review.
✅ Project coverage is 62.23%. Comparing base (f375acf) to head (7aa8d80).
⚠️ Report is 2 commits behind head on main.

Additional details and impacted files

@@             Coverage Diff              @@
##               main    #9057      +/-   ##
============================================
+ Coverage     60.92%   62.23%   +1.31%     
============================================
  Files           457      351     -106     
  Lines         33665    23396   -10269     
  Branches       5479     3978    -1501     
============================================
- Hits          20508    14558    -5950     
+ Misses        12026     8361    -3665     
+ Partials       1131      477     -654

Flag	Coverage Δ
android-native	`?`
e2e-ts-android	`?`
e2e-ts-ios	`51.46% <16.67%> (-0.02%)`	⬇️
e2e-ts-macos	`26.13% <11.12%> (ø)`
ios-native	`51.46% <16.67%> (-0.02%)`	⬇️
jest	`62.35% <80.00%> (-0.01%)`	⬇️

Flags with carried forward coverage won't be shown.

gemini-code-assist Bot reviewed Jun 20, 2026

Comment thread .yarn/patches/mocha-remote-server-npm-1.13.2-619a29d2e3.patch

Comment thread tests/e2e/firebase.test.js

Comment thread tests/e2e/firebase.test.js

mikehardy force-pushed the continued-e2e-deflake branch 2 times, most recently from 897a66b to 98c2229 on June 21, 2026 21:42

mikehardy added 21 commits June 22, 2026 21:16

test: mitigate e2e simulator hang / retry flakes

a9c9918

build(android): mitigate build OOM errors w/worker cap and added heap

c8ef565

Cap Gradle workers at min(physical_cpus, 6) to limit parallel heap pressure; 5GB daemon heap handles peak packageDebugAndroidTest load. Scales with hardware without overwhelming low-core CI machines.
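
The worker cap described above is a simple clamp. The sketch below shows the arithmetic only; the function name is hypothetical, and `os.cpus().length` reports logical CPUs, so it is an approximation of the physical core count the commit refers to.

```javascript
const os = require('node:os');

// Clamp the worker count to the ceiling (6 in the commit message),
// never going below one worker.
function gradleWorkerCap(cpuCount, ceiling = 6) {
  return Math.max(1, Math.min(cpuCount, ceiling));
}

// Assumption: logical CPU count stands in for physical cores here.
console.log(`--max-workers=${gradleWorkerCap(os.cpus().length)}`);
```

On a 4-core runner this yields 4 workers, and on a 16-core workstation it yields 6.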

test(analytics): better logging and test handling for getSessionId failures

6d9029e

Run modular getSessionId probes before all other analytics tests; drop namespace getSessionId coverage to avoid cross-test session interference.

test: ensure e2e app context gets real value for global.isCI

0deeda4

Inline CI at Metro bundle time so Jet tests on device see the CI runner flag instead of an undefined process.env lookup.

style(lint): run prettier on all files

00fe9ac

style(lint): fix eslint config so prettier rules apply as intended

23d94b8

test(ios): raise reconnect grace period

96cce6a

Increase Jet reconnectGraceMs from 15s to 30s so transient WS 1006/1001 drops can recover before fatal exit during long debug+coverage runs.

test(ios): wait shutdown before install

9616bb9

Poll simctl boot state up to 120s before simctl install to avoid LaunchServices races when rebooting between Jet retry attempts.
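
Polling with a deadline, as described above, might look like the sketch below. The `probe` callback is a placeholder: how the PR actually queries simulator state is not shown here, so this only illustrates the timeout loop.

```javascript
// Repeatedly call `probe` until it returns a truthy value or the deadline
// passes. Resolves to true on success and false on timeout.
async function pollUntil(probe, { timeoutMs = 120000, intervalMs = 1000 } = {}) {
  const deadline = Date.now() + timeoutMs;
  for (;;) {
    if (await probe()) return true;
    if (Date.now() >= deadline) return false;
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
}
```

The 120000 ms default mirrors the 120s in the commit message; the 1000 ms interval is an assumption.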

test(ios): log disconnect context on Jet WS

ff08292

Log loadavg/memory on transient disconnect, proactively pull coverage after reconnect, and default reconnect grace to 30s in the Jet patch.

test(ios): pull coverage after WS reconnect

e809a2f

Send pull-coverage when mocha-remote client reconnects mid-run and log coverage-ready receipt; align server reconnect grace default to 30s.

test(ios): add WS keepalive and coverage retry

5074221

Ping keepalive on connect, log send readyState failures, and retry coverage-ready upload up to 3 times with backoff after reconnect.
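
A bounded retry with backoff, as in the coverage-ready upload described above, can be sketched like this. The commit only says "backoff"; the exponential schedule and the 500 ms base delay are assumptions, and `task` stands in for the actual upload.

```javascript
// Run `task` up to `attempts` times, waiting longer between each failure.
// Rethrows the last error if every attempt fails.
async function retryWithBackoff(task, { attempts = 3, baseDelayMs = 500 } = {}) {
  let lastError;
  for (let attempt = 1; attempt <= attempts; attempt++) {
    try {
      return await task(attempt);
    } catch (error) {
      lastError = error;
      if (attempt < attempts) {
        // Assumed schedule: baseDelayMs, then 2x, then 4x, and so on.
        const delayMs = baseDelayMs * 2 ** (attempt - 1);
        await new Promise((resolve) => setTimeout(resolve, delayMs));
      }
    }
  }
  throw lastError;
}
```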

test(ios): add resource monitor script

9621f97

Snapshot load, top, and e2e-related process stats every 10s into resource-monitor.log for correlating flakes with CPU/memory pressure.
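
For illustration, a periodic resource snapshot could be written as below. The PR's monitor is a separate script whose contents are not shown here, so this Node.js version is only a sketch: the line format and the function names are hypothetical, and it omits the `top` and per-process stats the commit mentions.

```javascript
const fs = require('node:fs');
const os = require('node:os');

// Format one log line with a timestamp, the three load averages and free memory.
function snapshotLine(now = new Date()) {
  const load = os.loadavg().map((value) => value.toFixed(2)).join(',');
  const freeMb = Math.round(os.freemem() / (1024 * 1024));
  return `${now.toISOString()} loadavg=${load} free_mb=${freeMb}`;
}

// Append a snapshot every `intervalMs` (10s in the commit message).
// Returns the timer so the caller can clearInterval() it on teardown.
function startMonitor(logPath, intervalMs = 10000) {
  return setInterval(() => fs.appendFileSync(logPath, snapshotLine() + '\n'), intervalMs);
}
```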

test(ios): add flake summary script

2758353

Collate jet-ws, rnfb-e2e, lifecycle, and launch markers from CI logs into flake-summary.txt for faster post-run triage.
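
Collating marker lines from a log can be sketched as follows. The marker strings used in the usage note (such as `[jet-ws]`) are guesses at a plausible log format, not the markers the script actually greps for.

```javascript
// Group every log line that contains a marker under that marker.
// A line containing several markers is listed under each of them.
function summarizeMarkers(logText, markers) {
  const summary = new Map(markers.map((marker) => [marker, []]));
  for (const line of logText.split(/\r?\n/)) {
    for (const marker of markers) {
      if (line.includes(marker)) summary.get(marker).push(line);
    }
  }
  return summary;
}
```

Calling `summarizeMarkers(log, ['[jet-ws]', '[launch]'])` returns a `Map` whose values can be written out section by section into a summary file.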

test(ios): upload filtered e2e diagnostics

3ad714a

Stream testing/SpringBoard logs, run resource monitor, tee Detox output, write flake summary, and upload new diagnostic artifacts on failure.

test(ios): extend retryable launch patterns

0e4abc0

Match FrontBoard/FBSOpenApplication launch errors and treat coverage teardown WebSocket failures as retryable Jet session failures.
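
Classifying a launch error as retryable usually comes down to matching its message against a pattern list. The sketch below assumes the two names from the commit message appear verbatim in the error text; the actual patterns in tests/e2e/firebase.test.js may be broader or stricter.

```javascript
// Assumed patterns, taken from the names in the commit message.
// No `g` flag: a global regex keeps lastIndex between test() calls.
const RETRYABLE_LAUNCH_PATTERNS = [/FrontBoard/i, /FBSOpenApplication/i];

function isRetryableLaunchFailure(message) {
  const text = String(message);
  return RETRYABLE_LAUNCH_PATTERNS.some((pattern) => pattern.test(text));
}
```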

test(ios): log simctl install state

c5a4647

Dump get_app_container/listapps before and after each launch attempt and log the Detox failure reason when launchAppWithRetry gives up.

test(ios): reboot after slow terminate

8c96d9a

Time terminateApp during launch retries and reboot the simulator when terminate exceeds RNFB_SLOW_TERMINATE_MS before relaunching.
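
The timing check described above can be sketched as a small wrapper. `terminate` and `reboot` are injected placeholders for the real Detox and simulator calls, and the clock is injectable so the threshold logic can be tested; only the `RNFB_SLOW_TERMINATE_MS` name comes from the commit.

```javascript
// Time the terminate call and reboot only when it was slower than `slowMs`.
async function terminateAndMaybeReboot({ terminate, reboot, slowMs, now = Date.now }) {
  const startedAt = now();
  await terminate();
  const elapsedMs = now() - startedAt;
  const rebooted = elapsedMs > slowMs;
  if (rebooted) await reboot();
  return { elapsedMs, rebooted };
}

// Possible wiring (the 10000 ms fallback is an assumption, not the PR's default):
// const slowMs = Number(process.env.RNFB_SLOW_TERMINATE_MS) || 10000;
```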

test(ios): tune release launch retries

f7303b8

Use shorter release launch timeout, skip delete on inner retry, and log liveMetro/delete flags to distinguish release stalls from Metro issues.

test(ios): escalate launch failures to Jet

21953c0

Mark exhausted inner launch retries as Jet-retryable so debug FrontBoard failures get a full simulator reboot instead of a terminal false.

test(ios): log retry eligibility tree

a2e5252

Emit structured retry-eligibility checks on Jet attempt failure so CI logs show which sub-condition blocked or allowed the second attempt.
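
One way to make a retry decision auditable is to evaluate each sub-condition by name and log the whole result. This is a sketch of that idea with hypothetical check names; the conditions the PR actually evaluates are not listed in the commit message.

```javascript
// Evaluate every named check (no short-circuiting) so the log shows
// exactly which sub-condition blocked or allowed the retry.
function evaluateRetryEligibility(checks) {
  const results = checks.map(({ name, test }) => ({ name, passed: Boolean(test()) }));
  return { eligible: results.every((result) => result.passed), results };
}

// Example with made-up check names:
const decision = evaluateRetryEligibility([
  { name: 'attemptsRemaining', test: () => true },
  { name: 'failureIsRetryable', test: () => false },
]);
console.log(JSON.stringify(decision));
```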

docs(okf): document iOS e2e deflake instrumentation

ddf48bb

Update OKF bundle with new artifacts, boot-simulator shutdown wait, Jet WS/coverage handshake mitigations, FrontBoard launch flakes, and local stress iteration guidance.

mikehardy force-pushed the continued-e2e-deflake branch 2 times, most recently from ddf48bb to b7c94a6 on June 23, 2026 04:20

style(lint): run prettier on new files

6364ad2

mikehardy force-pushed the continued-e2e-deflake branch from b7c94a6 to f9b005f on June 23, 2026 13:21

mikehardy added 2 commits June 23, 2026 09:42

test(android): mitigate android orchestration desync with retry

f143294

test(ios): add tart ephemeral e2e reproduction pipeline

7aa8d80

Host-orchestrated Tart VMs with detached iteration, session-scoped artifacts, virtiofs completion polling, and optional SCP harvest (--no-sync-artifacts).

mikehardy force-pushed the continued-e2e-deflake branch from f9b005f to 7aa8d80 on June 23, 2026 14:51

mikehardy added 3 commits June 23, 2026 10:56

test(ci): add Android e2e resource monitor

41dc13c

Snapshot host and guest loadavg during Detox runs, upload the log as a CI artifact, and include it in flake-summary triage.

test(android): wait for load and package handler after reboot

d8252b7

Drop bootanim gate (CI uses -no-boot-anim). After adb reboot, wait for boot_completed, package handler queue, and guest loadavg below 5 before starting Jet attempt 2.
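
The load gate described above can be sketched by parsing the Linux `/proc/loadavg` format, which a guest would return for something like `adb shell cat /proc/loadavg`. Using the 1-minute average is an assumption, as are the function names; the commit only states "loadavg below 5".

```javascript
// Parse the first three fields of /proc/loadavg, e.g. "0.52 0.58 0.59 1/467 12345".
function parseLoadavg(text) {
  const [one, five, fifteen] = text.trim().split(/\s+/).slice(0, 3).map(Number);
  if (![one, five, fifteen].every(Number.isFinite)) {
    throw new Error(`unexpected loadavg output: ${text}`);
  }
  return { one, five, fifteen };
}

// Assumption: the gate compares the 1-minute average against the threshold.
function isLoadSettled(text, threshold = 5) {
  return parseLoadavg(text).one < threshold;
}
```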

test(android): drain Jet attempt before outer retry

07814bd

Await orchestration teardown, stop Jet, and force-stop the app before adb reboot so attempt 2 does not race with attempt 1 instrumentation.

mikehardy wants to merge 27 commits into main from continued-e2e-deflake
