Added llama3.1-70b Benchmarking recipe on A3-Mega nodes by krishnakanthankam-qt · Pull Request #246 · AI-Hypercomputer/gpu-recipes

krishnakanthankam-qt · 2026-06-05T11:08:17Z

Description

Title

Add Llama 3.1 70B Recipe and Optimized Sequential Benchmarking

Summary

Introduces a high-performance recipe for serving and benchmarking Llama 3.1 70B on A3-Mega GKE node pools.

google-cla · 2026-06-05T11:08:27Z

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

depksingh · 2026-06-12T05:36:19Z

+
+This recipe supports the following models. Running TRTLLM inference benchmarking on these models are only tested and validated on A3-Mega GKE nodes with certain combination of TP, PP, EP, number of GPU chips, input & output sequence length, precision, etc.
+
+Example model configuration YAML files included in this repo only show a certain combination of parallelism hyperparameters and configs for benchmarking purposes. Input and output length in `/home/akrishnakanth/gpu-recipes/inference/a3mega/llama3.1-70b/trtllm-gke/values.yaml` need to be adjusted according to the model and its configs.


we can remove this

depksingh · 2026-06-12T05:46:20Z

-    rm -rf $engine_dir
-    rm -f $dataset_file
+    rm -rf $engine_dir || true
+    rm -f $dataset_file || true


Please remove
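The review comment does not give a reason for the request. As background, the sketch below (not taken from the PR) shows that `rm` with `-f` already exits with status 0 when its operands do not exist, so `|| true` only changes behavior for other failures such as permission errors. The paths are hypothetical stand-ins for the launcher script's `engine_dir` and `dataset_file`.

```shell
#!/usr/bin/env bash
# Hypothetical paths for illustration only; the real values are set in the launcher script.
workdir="$(mktemp -d)"
engine_dir="${workdir}/engine"           # never created, so it does not exist
dataset_file="${workdir}/dataset.json"   # never created either

# With -f, rm ignores operands that do not exist and exits 0.
rm -rf "${engine_dir}"
engine_status=$?
rm -f "${dataset_file}"
dataset_status=$?

echo "engine_status=${engine_status} dataset_status=${dataset_status}"

# Clean up the temporary directory used by this sketch.
rmdir "${workdir}"
```

Quoting the variables, as done here, also keeps a path containing spaces from being split into several operands.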

krishnakanthankam-qt added 2 commits June 5, 2026 16:07

Added new recipe for llama3.1-70b on A3-mega nodes

21ba9f6

modified trtllm-launcher.sh for backward compatibility

d4c2a9c

depksingh marked this pull request as draft June 5, 2026 11:14

krishnakanthankam-qt added 2 commits June 11, 2026 14:29

streamlined launcher script and modified helm deployment

5feb2d3

added ld_prelaod path as env to the container

ebc6650

depksingh reviewed Jun 12, 2026


krishnakanthankam-qt added 4 commits June 12, 2026 12:47

standardize readme

8bc5552

modified readme

8cdcfe1

updated readme

e1c28af

fixed readme.md

caba832


krishnakanthankam-qt wants to merge 8 commits into AI-Hypercomputer:main from krishnakanthankam-qt:main


2 participants

