Disable qnn_16a16w Llama runner test (OOM on linux.2xlarge) #20511
The test-llama-runner-qnn-linux (qnn_16a16w) job has been OOM-killed on linux.2xlarge since PR pytorch#19660 landed, blocking viable/strict from advancing for 73+ commits. Disable it while the Qualcomm team investigates the memory regression and potential accuracy issue.
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/20511
❌ 3 New Failures, 3 Unrelated Failures, 1 Unclassified Failure
As of commit e620e96 with merge base aada6d7:
- NEW FAILURES: jobs that failed on this commit.
- UNCLASSIFIED FAILURE: DrCI could not classify one job because the workflow did not run on the merge base. The failure may be pre-existing on trunk or introduced by this PR.
- FLAKY: one job failed, but likely due to flakiness present on trunk.
- BROKEN TRUNK: jobs whose failures were also present on the merge base. 👉 Rebase onto the `viable/strict` branch to avoid these failures.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
Pull request overview
This PR temporarily removes the qnn_16a16w configuration from the Llama runner QNN Linux CI matrix to prevent OOM failures on linux.2xlarge, unblocking viable/strict while the underlying regression is investigated.
Changes:
- Drop `qnn_16a16w` from the `pt2e_quantize` matrix for the `test-llama-runner-qnn-linux` job in both trunk and PR workflows.
- Add a TODO note indicating the configuration should be re-enabled once the OOM issue is resolved.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| .github/workflows/trunk.yml | Removes qnn_16a16w from the trunk CI matrix for the QNN Llama runner job. |
| .github/workflows/pull.yml | Removes qnn_16a16w from the PR CI matrix for the QNN Llama runner job. |
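
For context, the change to each workflow file amounts to narrowing one axis of the job's build matrix. The following is a minimal sketch rather than a copy of the actual workflow: the `dtype` and `mode` axes are assumptions inferred from the job name `test-llama-runner-qnn-linux (fp32, qnn_16a16w, qnn)`, and all other keys of the job are omitted.

```yaml
# Illustrative sketch only; axis names other than pt2e_quantize are assumed.
jobs:
  test-llama-runner-qnn-linux:
    strategy:
      matrix:
        dtype: [fp32]               # assumed from the job name
        # TODO(T12345): re-enable qnn_16a16w once OOM on linux.2xlarge is resolved
        pt2e_quantize: [qnn_8a8w]   # the list previously also included qnn_16a16w
        mode: [qnn]                 # assumed from the job name
```

Because GitHub Actions expands the matrix into one job per combination of axis values, removing `qnn_16a16w` from the list stops that variant from being scheduled while leaving the `qnn_8a8w` variant untouched.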
# TODO(T12345): re-enable qnn_16a16w once OOM on linux.2xlarge is resolved
pt2e_quantize: [qnn_8a8w]

# TODO(T12345): re-enable qnn_16a16w once OOM on linux.2xlarge is resolved
pt2e_quantize: [qnn_8a8w]
@winskuo-quic can you please approve it if test-llama-runner-qnn-linux (fp32, qnn_16a16w, qnn) / linux-job passes on this PR?
winskuo-quic
left a comment
LGTM. Thanks a lot for the support
larryliu0820
left a comment
Have you tried linux.4xlarge?