
test: mitigate e2e simulator hang / retry flakes #9057

Open
mikehardy wants to merge 27 commits into main from continued-e2e-deflake

Conversation

@mikehardy

Collaborator

Summary

This PR is intended to hold a continued series of e2e flake fixes.

  1. Simulator hang / Jet retry recovery — Preserve the mocha-remote runner across transient Jet WS disconnects, kill and reboot the resolved iOS simulator before retries, and ensure debug Jet retries still wait for Metro before relaunching the app.

Test plan

  • iOS E2E debug CI passes on merge (or Jet attempt-2 retry recovers cleanly after a transient 1006/session desync)
  • iOS E2E release CI remains green

@gemini-code-assist

Contributor

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces a series of stability improvements for end-to-end testing, specifically targeting flaky simulator behavior and WebSocket connection issues. By enhancing the recovery logic for both the test runner and the underlying iOS simulator, the changes aim to reduce CI noise caused by transient infrastructure failures.

Highlights

  • Mocha Remote Server Resilience: Updated the mocha-remote-server to handle transient WebSocket disconnects by introducing a grace period and state management to preserve the test runner during brief network interruptions.
  • iOS Simulator Recovery: Implemented a more robust recovery mechanism for iOS E2E tests by rebooting the simulator when specific launch failures occur, rather than simply terminating the app.
  • E2E Test Stability: Expanded the list of retryable launch failures and added caching for Metro status to improve the reliability of the test suite execution.

Ignored Files

  • Ignored by pattern: .github/workflows/** (1)
    • .github/workflows/scripts/boot-simulator.sh

@gemini-code-assist (bot) left a comment

Contributor

Code Review

This pull request introduces a patch to mocha-remote-server to handle transient client disconnects with a reconnect grace timer, and updates the E2E test suite to reboot the iOS simulator upon retryable launch failures. Feedback highlights a critical race condition in the server patch where a restarted client process may hang waiting for a run command that is never sent. Additionally, the synchronous reboot of the iOS simulator blocks the Node.js event loop, potentially freezing the WebSocket server; it is recommended to refactor this to be asynchronous and properly awaited.
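The event-loop concern in the review can be illustrated with a minimal sketch. It assumes the reboot amounts to `xcrun simctl shutdown` followed by `xcrun simctl boot`; the helper name `rebootSimulator` and its injectable `run` parameter are hypothetical and do not come from this PR.

```javascript
const { execFile } = require('node:child_process');
const { promisify } = require('node:util');

const execFileAsync = promisify(execFile);

// Hypothetical helper: reboot a simulator without blocking the Node.js event loop.
// `run` is injectable so the call sequence can be exercised without Xcode installed.
async function rebootSimulator(udid, run = execFileAsync) {
  try {
    await run('xcrun', ['simctl', 'shutdown', udid]);
  } catch (error) {
    // Assumption: a shutdown failure means the device was already shut down.
  }
  await run('xcrun', ['simctl', 'boot', udid]);
}
```

Because every step is awaited rather than run through a synchronous child-process call, a WebSocket server in the same process can keep answering pings while the simulator restarts.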

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

Comment thread .yarn/patches/mocha-remote-server-npm-1.13.2-619a29d2e3.patch
Comment thread tests/e2e/firebase.test.js
Comment thread tests/e2e/firebase.test.js
@codecov

codecov Bot commented Jun 20, 2026


Codecov Report

❌ Patch coverage is 70.58824% with 10 lines in your changes missing coverage. Please review.
✅ Project coverage is 62.23%. Comparing base (f375acf) to head (7aa8d80).
⚠️ Report is 2 commits behind head on main.

Additional details and impacted files
@@             Coverage Diff              @@
##               main    #9057      +/-   ##
============================================
+ Coverage     60.92%   62.23%   +1.31%     
============================================
  Files           457      351     -106     
  Lines         33665    23396   -10269     
  Branches       5479     3978    -1501     
============================================
- Hits          20508    14558    -5950     
+ Misses        12026     8361    -3665     
+ Partials       1131      477     -654     

| Flag | Coverage Δ |
| --- | --- |
| android-native | ? |
| e2e-ts-android | ? |
| e2e-ts-ios | 51.46% <16.67%> (-0.02%) ⬇️ |
| e2e-ts-macos | 26.13% <11.12%> (ø) |
| ios-native | 51.46% <16.67%> (-0.02%) ⬇️ |
| jest | 62.35% <80.00%> (-0.01%) ⬇️ |

Flags with carried forward coverage won't be shown. Click here to find out more.


@mikehardy force-pushed the continued-e2e-deflake branch 2 times, most recently from 897a66b to 98c2229 on June 21, 2026 21:42
mikehardy added 21 commits June 22, 2026 21:16
Cap Gradle workers at min(physical_cpus, 6) to limit parallel heap
pressure; 5GB daemon heap handles peak packageDebugAndroidTest load.
Scales with hardware without overwhelming low-core CI machines.
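The worker cap described in this commit message reduces to a one-line rule. The sketch below is illustrative only: `cappedWorkers` is a hypothetical name, and Node's `os.cpus()` reports logical cores, which may differ from the physical count the commit refers to.

```javascript
const os = require('node:os');

// Hypothetical helper: cap the worker count at min(cpuCount, limit), never below 1.
function cappedWorkers(cpuCount, limit = 6) {
  return Math.max(1, Math.min(cpuCount, limit));
}

// os.cpus() counts logical cores, so this is only an approximation of physical_cpus.
console.log(`--max-workers=${cappedWorkers(os.cpus().length)}`);
```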
…ilures

Run modular getSessionId probes before all other analytics tests; drop
namespace getSessionId coverage to avoid cross-test session interference.
Inline CI at Metro bundle time so Jet tests on device see the CI runner
flag instead of an undefined process.env lookup.
Increase Jet reconnectGraceMs from 15s to 30s so transient WS 1006/1001
drops can recover before fatal exit during long debug+coverage runs.
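A reconnect grace window of this kind can be sketched as a cancellable timer. This is not the patched Jet or mocha-remote-server code; the class `ReconnectGrace` and its method names are hypothetical, and only the idea of a `reconnectGraceMs` delay comes from the commit message.

```javascript
// Hypothetical sketch of a reconnect grace window.
class ReconnectGrace {
  constructor(graceMs, onExpired) {
    this.graceMs = graceMs;
    this.onExpired = onExpired;
    this.timer = null;
  }

  // Client dropped: start the countdown instead of failing the run immediately.
  disconnected() {
    this.cancel();
    this.timer = setTimeout(() => {
      this.timer = null;
      this.onExpired();
    }, this.graceMs);
  }

  // Client came back inside the window: keep the existing runner alive.
  reconnected() {
    this.cancel();
  }

  cancel() {
    if (this.timer !== null) {
      clearTimeout(this.timer);
      this.timer = null;
    }
  }
}
```

With `new ReconnectGrace(30000, exitFatal)`, a drop followed by a reconnect within 30 seconds never reaches `exitFatal` (a placeholder callback name).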
Poll simctl boot state up to 120s before simctl install to avoid
LaunchServices races when rebooting between Jet retry attempts.
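Polling with a deadline, as this commit describes, might look like the sketch below. The function `waitForBoot` and its options are hypothetical; the `isBooted` probe is injected, and in practice it could be built on the output of `xcrun simctl list devices` (an assumption, since the PR's actual probe is not shown here).

```javascript
// Hypothetical poller: resolves true once `isBooted()` reports true, false on timeout.
async function waitForBoot(isBooted, { timeoutMs = 120000, intervalMs = 1000 } = {}) {
  const deadline = Date.now() + timeoutMs;
  for (;;) {
    if (await isBooted()) return true;
    // Stop if another sleep would take us past the deadline.
    if (Date.now() + intervalMs > deadline) return false;
    await new Promise(resolve => setTimeout(resolve, intervalMs));
  }
}
```

A caller would run the install step only when `waitForBoot` resolves `true`, and treat `false` as a failed reboot.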
Log loadavg/memory on transient disconnect, proactively pull coverage
after reconnect, and default reconnect grace to 30s in the Jet patch.
Send pull-coverage when mocha-remote client reconnects mid-run and log
coverage-ready receipt; align server reconnect grace default to 30s.
Ping keepalive on connect, log send readyState failures, and retry
coverage-ready upload up to 3 times with backoff after reconnect.
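The "up to 3 times with backoff" behaviour can be sketched generically. `retryWithBackoff` is a hypothetical helper, and the doubling delay with a 500 ms base is an assumption; the commit message does not state the actual backoff schedule.

```javascript
// Hypothetical helper: run `task` up to `attempts` times, doubling the delay between tries.
async function retryWithBackoff(task, { attempts = 3, baseDelayMs = 500 } = {}) {
  let lastError;
  for (let attempt = 1; attempt <= attempts; attempt += 1) {
    try {
      return await task(attempt);
    } catch (error) {
      lastError = error;
      if (attempt < attempts) {
        // Delays of baseDelayMs, 2x, 4x, ... between consecutive attempts.
        const delayMs = baseDelayMs * 2 ** (attempt - 1);
        await new Promise(resolve => setTimeout(resolve, delayMs));
      }
    }
  }
  throw lastError;
}
```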
Snapshot load, top, and e2e-related process stats every 10s into
resource-monitor.log for correlating flakes with CPU/memory pressure.
Collate jet-ws, rnfb-e2e, lifecycle, and launch markers from CI logs
into flake-summary.txt for faster post-run triage.
Stream testing/SpringBoard logs, run resource monitor, tee Detox output,
write flake summary, and upload new diagnostic artifacts on failure.
Match FrontBoard/FBSOpenApplication launch errors and treat coverage
teardown WebSocket failures as retryable Jet session failures.
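Classifying a launch error as retryable usually comes down to pattern matching on the failure message. The patterns below are illustrative only, derived from the two names in the commit message; the PR's real list is not visible in this conversation, and `isRetryableLaunchFailure` is a hypothetical name.

```javascript
// Illustrative patterns only; not the PR's actual list.
const RETRYABLE_LAUNCH_PATTERNS = [/FrontBoard/i, /FBSOpenApplication/i];

// Returns true when the failure text matches any known-transient launch error.
function isRetryableLaunchFailure(message) {
  return RETRYABLE_LAUNCH_PATTERNS.some(pattern => pattern.test(String(message)));
}
```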
Dump get_app_container/listapps before and after each launch attempt and
log the Detox failure reason when launchAppWithRetry gives up.
Time terminateApp during launch retries and reboot the simulator when
terminate exceeds RNFB_SLOW_TERMINATE_MS before relaunching.
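The slow-terminate check can be sketched as a timing wrapper. Everything here is an assumption except the name `RNFB_SLOW_TERMINATE_MS`: the sketch treats it as an environment variable with a made-up 10 second fallback, and `terminate` and `reboot` are injected placeholder functions.

```javascript
// Hypothetical wrapper: time `terminate()` and call `reboot()` when it was slow.
// Assumption: RNFB_SLOW_TERMINATE_MS is an environment variable; 10000 is an invented fallback.
async function terminateWithSlowCheck(
  terminate,
  reboot,
  thresholdMs = Number(process.env.RNFB_SLOW_TERMINATE_MS) || 10000
) {
  const startedAt = Date.now();
  await terminate();
  const elapsedMs = Date.now() - startedAt;
  const rebooted = elapsedMs > thresholdMs;
  if (rebooted) await reboot();
  return { elapsedMs, rebooted };
}
```

Returning the elapsed time alongside the decision makes it easy to log both in CI output.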
Use shorter release launch timeout, skip delete on inner retry, and log
liveMetro/delete flags to distinguish release stalls from Metro issues.
Mark exhausted inner launch retries as Jet-retryable so debug FrontBoard
failures get a full simulator reboot instead of a terminal false.
Emit structured retry-eligibility checks on Jet attempt failure so CI
logs show which sub-condition blocked or allowed the second attempt.
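A structured eligibility check of this sort can be sketched as a record of named booleans. The check names below are hypothetical; the PR's real sub-conditions are not listed in this conversation.

```javascript
// Hypothetical check names; shown only to illustrate the logging shape.
function evaluateRetryEligibility({ attempt, maxAttempts, failureIsRetryable }) {
  const checks = {
    attemptsRemaining: attempt < maxAttempts,
    failureIsRetryable: Boolean(failureIsRetryable),
  };
  // Retry only when every sub-condition holds; the log shows which one blocked it.
  const eligible = Object.values(checks).every(Boolean);
  return { eligible, checks };
}

console.log(JSON.stringify(evaluateRetryEligibility({ attempt: 1, maxAttempts: 2, failureIsRetryable: true })));
```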
Update OKF bundle with new artifacts, boot-simulator shutdown wait,
Jet WS/coverage handshake mitigations, FrontBoard launch flakes, and
local stress iteration guidance.
@mikehardy force-pushed the continued-e2e-deflake branch 2 times, most recently from ddf48bb to b7c94a6 on June 23, 2026 04:20
@mikehardy force-pushed the continued-e2e-deflake branch from b7c94a6 to f9b005f on June 23, 2026 13:21
Host-orchestrated Tart VMs with detached iteration, session-scoped artifacts,
virtiofs completion polling, and optional SCP harvest (--no-sync-artifacts).
@mikehardy force-pushed the continued-e2e-deflake branch from f9b005f to 7aa8d80 on June 23, 2026 14:51
Snapshot host and guest loadavg during Detox runs, upload the log as
a CI artifact, and include it in flake-summary triage.
Drop bootanim gate (CI uses -no-boot-anim). After adb reboot, wait for
boot_completed, package handler queue, and guest loadavg below 5 before
starting Jet attempt 2.
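Two of the readiness signals named here can be combined in a small gate. The sketch assumes the inputs come from `adb shell getprop sys.boot_completed` and `adb shell cat /proc/loadavg`; the function names are hypothetical, and the package handler queue check is left out because the commit message gives no detail on how it is probed.

```javascript
// Parse the 1-minute load average from /proc/loadavg content, e.g. "0.52 0.58 0.59 1/389 1234".
function parseLoadAvg(text) {
  const value = Number.parseFloat(String(text).trim().split(/\s+/)[0]);
  return Number.isNaN(value) ? null : value;
}

// Hypothetical gate: boot flag is "1" and the guest load average is below the limit.
function guestIsReady({ bootCompleted, loadAvgText }, maxLoad = 5) {
  const load = parseLoadAvg(loadAvgText);
  return String(bootCompleted).trim() === '1' && load !== null && load < maxLoad;
}
```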
Await orchestration teardown, stop Jet, and force-stop the app before
adb reboot so attempt 2 does not race with attempt 1 instrumentation.
1 participant