test(systest): upload deps to local cluster cache without redirect server#10487
Open
basvandijk wants to merge 2 commits into
Open
test(systest): upload deps to local cluster cache without redirect server#10487basvandijk wants to merge 2 commits into
basvandijk wants to merge 2 commits into
Conversation
…rver upload_systest_dep.sh no longer queries the redirect server (artifacts.idx.dfinity.network). Instead it checks the local cluster's bazel-remote cache directly and uploads there if the dependency is missing. The local cluster name (needed to build the artifacts.<cluster>.dfinity.network download URL) is auto-detected from the in-cluster Kubernetes API server certificate SAN (api.<cluster>.dfinity.network), which works on both CI runners and devenvs. It can be overridden via the SYSTEST_UPLOAD_CLUSTER environment variable.
Contributor
There was a problem hiding this comment.
Pull request overview
This PR updates rs/tests/upload_systest_dep.sh to eliminate reliance on the cross-cluster redirect server by checking/uploading system-test dependencies directly against the local cluster’s bazel-remote cache, while preserving the output contract (stdout emits the artifacts URL).
Changes:
- Replace redirect-server lookup with a direct
HEADexistence check against in-cluster bazel-remote (dep_in_cache). - Add local cluster auto-detection via Kubernetes API server certificate SANs, with
SYSTEST_UPLOAD_CLUSTERoverride (resolve_cluster). - Keep upload target the same (local bazel-remote) and re-check local cache after upload until the blob is served.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
- Validate SYSTEST_UPLOAD_CLUSTER (and the auto-detected name) against a strict cluster-name pattern before it is interpolated into the artifacts.<cluster>.dfinity.network URL. - Add an explicit openssl presence check with a clear error message. - Bound the API-server probe with 'timeout 15' so DNS/network stalls fail fast. - Use 'grep -m1' instead of 'grep | head' to avoid SIGPIPE under pipefail.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
What
rs/tests/upload_systest_dep.shno longer queries the redirect server (https://artifacts.idx.dfinity.network). Instead it checks the local cluster's bazel-remote cache directly and uploads the dependency there if it is missing.Why
To reduce flakiness and operational complexity 557d727 forced testnet allocation to the DC the github runner was hosted in. We would like to have similar behaviour for manually run system-tests from devenvs. To make this easier to achieve we should always serve images from the cluster we're running the test from. That is what this commit implements.
Users should then manually pass
--set-required-host-features=dc=$DCwhere$DCis their local DC to force testnet allocation to that DC. We could also consider automating this but it will be more tricky for cloud-engines and performance tests.How
dep_in_cache()does aHEADagainsthttp://server.bazel-remote.svc.cluster.local:8080/cas/<sha>(200= present,404= absent, anything else is a hard error).resolve_cluster()auto-detects the local cluster (e.g.zh1-idx1) from the in-cluster Kubernetes API server certificate SAN (api.<cluster>.dfinity.network). This works on both CI runners and devenvs and yields the exact cluster name. It can be overridden with theSYSTEST_UPLOAD_CLUSTERenvironment variable.https://artifacts.<cluster>.dfinity.network/cas/<sha>on stdout, preserving the contract withrun_systest.sh.The
NODE_NAME-based derivation was intentionally avoided: it only encodes the DC (e.g.dm1), not the full cluster (dm1-idx1vsdm1-idx2), and it does not exist in devenvs.Testing
Validated in a
zh1-idx1devenv:bash -nandshellcheckclean.artifacts.zh1-idx1.dfinity.network/cas/<sha>URL actually serves the uploaded blob.SYSTEST_UPLOAD_CLUSTER=dm1-idx9correctly overrides auto-detection.bazel test //rs/tests/idx:basic_health_testpasses.Notes
openssland reachability ofkubernetes.default.svc:443from the build container (both available on CI runners and devenvs).SYSTEST_UPLOAD_CLUSTERis the escape hatch otherwise.