Feature/add gpu mlp estimation package #47
Merged
yoshifuminakamura merged 2 commits on Jun 12, 2026
Signed-off-by: Yoshifumi Nakamura <nakamura@riken.jp>
Connect GENESIS GPU runs to the PerfTools MLP_NN/v1.5 estimator path for early CI validation. GENESIS now emits a gpu_kernel_region estimation section when GPU MLP profiling is enabled and a padata archive is available, and the gpu_kernel_mlp_v15 section package can consume BenchKit padata archives containing Nsight Compute raw CSV data.

Add an NCU-to-PerfTools input bridge that extracts profile_raw.csv from padata, normalizes the Nsight Compute columns observed on MiyabiG, fills the current v1.5 static GPU spec gaps for known GPUs, and produces a prepared CSV before invoking predict_v15.py. Temporary NCU extraction stays outside the uploaded estimation artifact bundle so raw profiler data is not duplicated.

Add local validation support via scripts/test_estimate_submit.sh, mirroring test_submit.sh style for scheduler submission and adding an --estimate-only mode that can run inside Apptainer with PERFTOOLS/SIF or the corresponding BK_* variables.

Record GENESIS results with the GENESIS-specific Exp p8 and declare the same baseline Exp for estimation. This avoids falling through to the common CASE0 default, which is QWS-specific and is not a valid GENESIS experiment label.

Rename the lightweight estimation bundle from estimation_inputs to estimation_artifacts because it now carries prepared inputs, prediction outputs, and logs. Result Server storage, client restore paths, tests, and docs now use results/estimation_artifacts and received_estimation_artifacts.

Add canonical Result Server APIs /api/ingest/estimation-artifacts and /api/query/estimation-artifacts while keeping the old estimation-inputs endpoints as compatibility aliases. send_estimate.sh posts to the new endpoint and falls back to the legacy endpoint on 404.
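The endpoint fallback described above can be sketched in a few lines. This is an illustration only, not the logic in send_estimate.sh: the `post` callback, the `post_with_fallback` helper, and the two stand-in servers are hypothetical, and the exact legacy path is an assumption (the description only says the old estimation-inputs endpoints remain as aliases).

```python
from typing import Callable

# Canonical ingest path, quoted from the description above.
CANONICAL_PATH = "/api/ingest/estimation-artifacts"
# Assumption: the legacy alias uses the same prefix with the old bundle name.
LEGACY_PATH = "/api/ingest/estimation-inputs"


def post_with_fallback(post: Callable[[str], int]) -> tuple[str, int]:
    """Try the canonical endpoint; retry the legacy one only on HTTP 404.

    `post` is a hypothetical transport callback that uploads the bundle to
    the given path and returns the HTTP status code.
    """
    status = post(CANONICAL_PATH)
    if status == 404:
        # An older Result Server does not know the new route yet.
        return LEGACY_PATH, post(LEGACY_PATH)
    # Any other status (success or error) is reported as-is, without a retry.
    return CANONICAL_PATH, status


def old_server(path: str) -> int:
    """Stand-in for a server that only exposes the legacy route."""
    return 200 if path == LEGACY_PATH else 404


def new_server(path: str) -> int:
    """Stand-in for a server that exposes the canonical route."""
    return 200 if path == CANONICAL_PATH else 404


if __name__ == "__main__":
    print(post_with_fallback(old_server))
    print(post_with_fallback(new_server))
```

Restricting the retry to 404 keeps real failures (for example 5xx responses) visible instead of masking them behind a second request.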
Avoid duplicate large uploads: send_results.sh no longer uploads estimation bundles, send_estimate.sh excludes raw profiler archives such as *.ncu-rep, profile_raw.csv, padata*.tgz, and nested tgz files, and HTTP 413 for estimation artifact upload is treated as non-fatal after the Estimate JSON has been ingested.

Allow local matrix generation without CI_PIPELINE_SOURCE by recording PARENT_PIPELINE_SOURCE=local, and align the PerfTools smoke-mode documentation with the repository-wide Python 3.12+ runtime expectation.

Temporary bring-up wiring is intentionally explicit in GitLab CI. The following are provisional switches:

- BK_QWS_GPU_MLP_SMOKE
- BK_ESTIMATE_RUNNER_TAG=fncx-estimate-python
- BK_GPU_MLP_PERFTOOLS_REPO/REF
- BK_GENESIS_GPU_MLP_PROFILE
- BK_GPU_MLP_NCU_LAUNCH_COUNT
- BK_GPU_MLP_SOURCE_GPU
- BK_GPU_MLP_KERNEL_COUNT

Remove or replace them once the real estimator runner/package flow is settled.

Validation:

- WSL bash -n and shellcheck -S error for changed shell scripts
- test_send_estimate_artifacts.sh, test_qws_gpu_mlp_smoke_estimation.sh, test_estimation_gpu_kernel_mlp_v15.sh, test_genesis_gpu_mlp_estimation.sh, and test_send_results_profile_data.sh
- WSL result_server pytest for API, upload limits, audit logging, CSRF, and rate limiting
- git diff --check

Signed-off-by: Yoshifumi Nakamura <nakamura@riken.jp>
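As a rough illustration of the upload rules in this commit message, the sketch below filters raw profiler files out of a bundle listing and classifies upload statuses. It is not the code in send_estimate.sh. The function names and sample file names are hypothetical, "nested tgz" is assumed to mean any .tgz below the bundle's top level, and treating every non-2xx status other than the 413 case as fatal is also an assumption.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# File-name patterns quoted from the commit message.
RAW_PROFILER_PATTERNS = ("*.ncu-rep", "profile_raw.csv", "padata*.tgz")


def is_excluded(relative_path: str) -> bool:
    """Return True if a bundle-relative path should not be uploaded."""
    path = PurePosixPath(relative_path)
    if any(fnmatchcase(path.name, pattern) for pattern in RAW_PROFILER_PATTERNS):
        return True
    # Assumption: "nested tgz" means any .tgz inside a subdirectory.
    return path.suffix == ".tgz" and len(path.parts) > 1


def select_uploadable(paths: list[str]) -> list[str]:
    """Keep only the paths that pass the exclusion rules, in input order."""
    return [p for p in paths if not is_excluded(p)]


def artifact_upload_is_fatal(status: int, estimate_json_ingested: bool) -> bool:
    """HTTP 413 is tolerated only once the Estimate JSON is already ingested."""
    if 200 <= status < 300:
        return False
    return not (status == 413 and estimate_json_ingested)


if __name__ == "__main__":
    # Hypothetical bundle listing, for illustration only.
    listing = ["prepared.csv", "logs/predict.log", "run1.ncu-rep",
               "raw/profile_raw.csv", "padata0.tgz", "sub/other.tgz"]
    print(select_uploadable(listing))
```

Matching on the file name rather than the full path means a raw profiler file is excluded wherever it sits in the bundle.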
This PR intentionally includes temporary CI switches for the GPU MLP estimator bring-up. They are documented in the commit message and should be replaced once the estimator runner/package flow is finalized.