Skip to content

ci(java): require explicit java_version for variant builds, drop repo-var fallback#35987

Open
sfreudenthaler wants to merge 8 commits into
mainfrom
issue-35986-explicit-java-variant
Open

ci(java): require explicit java_version for variant builds, drop repo-var fallback#35987
sfreudenthaler wants to merge 8 commits into
mainfrom
issue-35986-explicit-java-variant

Conversation

@sfreudenthaler

@sfreudenthaler sfreudenthaler commented Jun 4, 2026

Copy link
Copy Markdown
Member

Fixes #35986

What

Makes .sdkmanrc the single knob for the dotCMS Java version and makes the pipeline self-heal the JDK base image instead of failing when it's missing. Follow-up to #35914 (Java 25 default) and #35915 (variant auto-build retired).

Design pivot (agreed with @wezell, #feat-java25 Jun 5): the earlier version of this PR failed fast when a variant's Java version was empty/malformed/equal-to-default. That's a stepping stone, not the destination. Instead: check whether dotcms/java-base:${version} exists and build + push it on demand if not — reused both in the variant release and in the normal dotCMS image build, so ticking or overriding .sdkmanrc "just works".

Self-healing base image

The dotCMS image is FROM dotcms/java-base:${SDKMAN_JAVA_VERSION} (dotCMS/src/main/docker/original/Dockerfile). That base was previously only ever built by the manual cicd_manual_build-java-base.yml workflow, so bumping .sdkmanrc to a version whose base hadn't been pre-built would break every build.

  • New composite action core-cicd/ensure-java-base — anonymous existence check (docker buildx imagetools inspect); when the tag is missing it builds the base:
    • Docker Hub creds present → build multi-arch and push (publishes it for everyone).
    • No creds (fork PR) → build locally (--load, host arch) so the current build still resolves FROM — the build never fails just because the base was absent.
    • Restores the default Docker builder afterward so maven-job's subsequent plain docker build / docker save is unaffected.
  • Wired into maven-job between get-sdkman-version and the dotCMS image build, reusing the already-resolved SDKMAN_JAVA_VERSION so the base tag and the FROM build-arg stay in lockstep. This is the single chokepoint every Docker build flows through → covers PR, trunk, release, and variant builds.
  • Optional DOCKER_USERNAME / DOCKER_TOKEN secrets threaded through cicd_comp_build-phase.yml and passed by the publishing callers (release, variant, trunk, merge-queue, nightly, lts, manual-deploy) so a missing base gets published. PR builds don't pass them → graceful local-only build. The dotCMS image build itself still pulls the public base anonymously, so these are not required for a normal build.
  • cicd_manual_build-java-base.yml refactored to reuse the action (force: true), removing the duplicated buildx / build-push logic — one source of truth for how the base is built.
flowchart TD
    SDK[".sdkmanrc / -Dsdkman.java.version"] --> RES["maven-job: resolve<br/>SDKMAN_JAVA_VERSION"]
    RES --> ENS["ensure-java-base action"]
    ENS -->|"exists on Docker Hub"| NOOP["no-op"]
    ENS -->|"missing + creds"| PUSH["build multi-arch + PUSH<br/>dotcms/java-base:VERSION"]
    ENS -->|"missing, no creds"| LOAD["build amd64 + --load locally"]
    NOOP --> IMG["docker build dotCMS image<br/>FROM dotcms/java-base:VERSION"]
    PUSH --> IMG
    LOAD --> IMG
Loading

Variant workflow (cicd_7-release-java-variant.yml) — manual-only guard kept

workflow_dispatch-only, so its guard never affects automated runs. java_version stays required and an equals-default request still errors (it would be a redundant duplicate of the primary release image — the empty-dispatch protection that prevents overwriting primary Docker tags). A missing base is no longer a dead-end: it's self-healed by the shared maven-job step. Also added an explicit empty-default_java check so a wrong branch name or missing/malformed .sdkmanrc fails loudly instead of the equality check silently passing.

.sdkmanrc as single source of truth in the pom (retained from earlier review feedback)

Unchanged from the prior iteration and survives the rework intact:

  • jdk.min.version chains to the effective ${dotcms.core.compiler.release}.
  • A build-helper:regex-property execution derives sdkmanrc.java.major from the .sdkmanrc java entry.
  • An enforcer rule fails every build (local + CI, at validate) when dotcms.core.compiler.release.default doesn't match the .sdkmanrc major, naming both files in the remediation message.
  • .mvn/maven-build-cache-config.xml gains runAlways for the property-read / regex-property / enforcer executions so a cache hit can't skip the drift guard.

Net effect — a future default-Java cutover (e.g. java-29) is: update .sdkmanrc + one number in parent/pom.xml, dispatch/merge as usual, and the base image is produced automatically. Forget the pom number and every build fails at validate with remediation instructions.

Why

After #35915 the variant workflow is manual-only, but an empty dispatch still silently fell back to repo vars (now pointing at the Java 25 default) — a redundant duplicate of the primary image; deleting those vars without a guard would let an empty dispatch overwrite the primary release Docker tags. More broadly, the only way to introduce a new Java version's base image was a manual pre-build step. This PR removes both foot-guns: the workflow requires a deliberate variant version, and the base image builds itself on demand wherever it's needed.

Verification

  • actionlint clean on all changed workflows; every changed YAML parses.
  • ensure-java-base decision logic dry-run across all paths: exists→no-op, missing+creds→push multi-arch, missing+no-creds→load amd64, force+push=false→validation-only, push=true+no-creds→error.
  • pom files untouched, so prior ./mvnw validate results (default / variant-override / drift-simulation) still hold.
  • Remaining (CI): confirm a normal build logs "base exists → skip"; dispatch cicd_manual_build-java-base.yml for a throwaway version; dispatch the variant workflow against a Java version whose base is absent and confirm it self-heals then builds.

⚠️ Post-merge step

Delete the now-unused repository variables (settings, not code):

RELEASE_JAVA_VARIANT_VERSION   (currently 25.0.2-ms)
RELEASE_JAVA_VARIANT_SUFFIX    (currently java-25)

🤖 Generated with Claude Code

…-var fallback

The java-variant workflow still fell back to RELEASE_JAVA_VARIANT_VERSION /
RELEASE_JAVA_VARIANT_SUFFIX when dispatched with an empty java_version. With
Java 25 now the default (#35914) those vars point at the default itself, so an
empty dispatch produced a redundant duplicate of the primary release image —
and deleting the vars without a guard would have made an empty dispatch fall
through to .sdkmanrc with no suffix, overwriting the primary release tags.

Make .sdkmanrc the single source of truth for the default Java:
- java_version is now a required dispatch input; all vars.RELEASE_JAVA_VARIANT_*
  fallbacks are removed (build, deployment, release, concurrency).
- New fail-fast guard in extract-version reads .sdkmanrc from the release
  branch being built and errors when the requested version is empty or equal
  to the default.
- Header docs updated; run-name simplified for the dispatch-only trigger.

After this merges, the RELEASE_JAVA_VARIANT_VERSION and
RELEASE_JAVA_VARIANT_SUFFIX repository variables should be deleted.

Fixes #35986

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@sfreudenthaler sfreudenthaler requested a review from a team as a code owner June 4, 2026 20:00
@github-actions github-actions Bot added the Area : CI/CD PR changes GitHub Actions/workflows label Jun 4, 2026
@claude

claude Bot commented Jun 4, 2026

Copy link
Copy Markdown
Contributor

Claude finished @sfreudenthaler's task in 3m 50s —— View job


PR Review

  • Read changed files
  • Review workflow changes
  • Review pom.xml changes
  • Review new ensure-java-base action
  • Review maven-job changes
  • Post findings

Overall this is well-constructed. The approach is sound: single source of truth for the Java version, fail-fast guard on the variant workflow, self-healing base image. A few things worth a second look:

ensure-java-base/action.yml

built output is set at plan time, before the build step executes.
steps.plan.outputs.built is set to ${need_build} — if the actual Build Java Base Image step fails, the action still emits built=true. No callers in the current codebase consume this output, so it's harmless now, but could mislead if a caller is added later. Consider emitting it as a step output from a post-build step instead, or documenting that it means "build was attempted."

Restore default Docker builder is gated on steps.buildx.outcome == 'success' but runs under always().
This is the right pattern, but note: the buildx step ID appears at line 161 in the action, yet it's skipped (if: steps.plan.outputs.need_build == 'true') on the common no-op path. On the no-op path, steps.buildx.outcome is 'skipped', so the restore step is also skipped — which is correct since no builder was set up. This is fine, just confirm the no-op path leaves the default builder untouched (it does, since setup-buildx-action was never called).

No tr -d '\r' on the awk read in maven-job's get-sdkman-version step.
cicd_7-release-java-variant.yml correctly strips \r from the GitHub API response (line 141). But the get-sdkman-version step in maven-job (line 334) uses awk -F= '/^java=/ {print $2}' on the local .sdkmanrc without stripping carriage returns:

SDKMAN_JAVA_VERSION=$(awk -F= '/^java=/ {print $2}' .sdkmanrc)

If .sdkmanrc is ever committed with CRLF line endings (e.g. on a Windows dev workstation), this would produce a tag like dotcms/java-base:25.0.2-ms\r and silently fail or create a bad image tag. Low risk on Linux CI runners, but inconsistent with the CRLF guard elsewhere. Fix this →

parent/pom.xml

The enforcer <message> for the null-property case may mislead on a .sdkmanrc parse failure.
Line 1196:

<message>Property sdkmanrc.java.major was not derived from .sdkmanrc - the derive-sdkmanrc-java-major execution did not run before the enforcer.</message>

This fires when sdkmanrc.java.major is null/empty. With runAlways set in the cache config, the most likely cause of an empty value isn't a missing execution — it's a malformed .sdkmanrc (no java= line, CRLF breaking the regex, etc.). The message directing devs to check execution ordering could be a dead end. Consider: "sdkmanrc.java.major is unset — check that .sdkmanrc contains a valid java=<version> entry." Fix this →

jdk.min.version now follows -Ddotcms.core.compiler.release overrides — verify behavior for the release-11 pinned modules.
jdk.min.version = ${dotcms.core.compiler.release} means passing -Ddotcms.core.compiler.release=11 lowers the minimum JDK floor to 11. The PR says this passes the full reactor validate. If a submodule overrides dotcms.core.compiler.release in its own POM (rather than via -D), that module's jdk.min.version will also track its local override — which is probably correct. Worth confirming with the release-11-pinned modules the PR mentions.

.mvn/maven-build-cache-config.xml

runAlways with artifactId="maven-enforcer-plugin" affects all enforcer executions across all modules, not just enforce-java-version.
Every enforce goal in every submodule will be re-run on cache hits. This is likely the correct intent (all enforcement should fire regardless of cache state), but it widens scope beyond the stated goal and may measurably slow down cached builds in a large reactor. Consider scoping by execution ID if the enforcer plugin supports it, or document the intentional scope.

cicd_7-release-java-variant.yml

artifact_suffix input is still required: false even though the effective suffix is always derived.
The input description says "if not set, derived from java-version major." This is intentional flexibility, but it means a caller that supplies a non-normalized suffix (e.g. java-25 vs java25) gets a different effective suffix than one using the auto-derived value. The normalization strip at line 166 (while [[ "${artifact_suffix}" =~ ^[-_] ]]) only strips leading separators, not trailing ones or mixed ones. This is fine for the documented convention but could silently produce unexpected artifact names if a user passes something like _java-25_ms. Very low risk for a manual-only workflow.

No permissions: contents: read on the extract-version job.
The job calls gh api repos/…/contents/.sdkmanrc using ${{ github.token }}. The default token has contents: read unless the workflow-level permissions block restricts it, but the variant workflow has no top-level permissions: block. This is fine — the default is read-only — but worth adding explicitly given that the other jobs do declare permissions.

No issues found

  • The artifact_suffix derivation once in extract-version and passing it through all phases correctly fixes the java21 vs java-21 mismatch flagged by Codex.
  • The sdkman.java.version override in maven-job correctly wires variant JDK into the Docker image.
  • The manual build-java-base workflow correctly delegates to ensure-java-base with force=true.
  • Cache-hit runAlways entries correctly cover all three executions that must fire regardless of cache state.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: ebcb99b3dc

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread .github/workflows/cicd_7-release-java-variant.yml Outdated

@sfreudenthaler sfreudenthaler left a comment

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ready for review

Addresses Codex P1 on #35987: with the repo-var fallback removed, an empty
artifact_suffix let each phase derive its own default — maven-job derives
java${major} while the deployment phase derives java-${major} — so the build
uploaded docker-build-context-java21 and deployment tried to download
docker-build-context-java-21, failing after a successful build.

Derive the effective suffix once in extract-version (explicit input wins,
else java-{major} per the release variant convention) and pass the same
value to the build, deployment, and release phases. Also validates that
java_version starts with a numeric major before deriving.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- Strip \r when parsing .sdkmanrc: a CRLF-committed file (nothing in
  .gitattributes normalizes it) would leave a trailing \r in default_java,
  making the same-as-default equality guard silently fail open.
- Fail with a clear message when the release branch / .sdkmanrc cannot be
  read via the API, or when the file has no java= entry, instead of
  surfacing a raw gh error.

Addresses review feedback on #35987.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…e blocks

Team is Mac/Linux only and the current blob is clean LF, so the
unreachable-in-practice error wrappers (unreadable branch, missing java=
entry) aren't worth the reading cost; a failing gh api call already fails
the step. Keep the zero-cost tr -d '\r' so a CRLF-committed file can never
make the same-as-default guard fail open.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

@wezell wezell left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this, or can we make this, update the mvn parent/pox.xml variables ? These in particular:

        <jdk.min.version>25</jdk.min.version>
        <maven.compiler.argument.source>${maven.compiler.source}</maven.compiler.argument.source>
        <maven.compiler.argument.target>${maven.compiler.target}</maven.compiler.argument.target>
        <maven.compiler.argument.testSource>${maven.compiler.testSource}</maven.compiler.argument.testSource>
        <maven.compiler.argument.testTarget>${maven.compiler.testTarget}</maven.compiler.argument.testTarget>

Or is this not a good idea? It would be very nice to have .sdkman variable be the single source of truth.

@sfreudenthaler

Copy link
Copy Markdown
Member Author

Does this, or can we make this, update the mvn parent/pox.xml variables ? These in particular:

        <jdk.min.version>25</jdk.min.version>
        <maven.compiler.argument.source>${maven.compiler.source}</maven.compiler.argument.source>
        <maven.compiler.argument.target>${maven.compiler.target}</maven.compiler.argument.target>
        <maven.compiler.argument.testSource>${maven.compiler.testSource}</maven.compiler.argument.testSource>
        <maven.compiler.argument.testTarget>${maven.compiler.testTarget}</maven.compiler.argument.testTarget>

Or is this not a good idea? It would be very nice to have .sdkman variable be the single source of truth.

let me look tonight. agree and it's in the spirit of what i'm trying to push towards in this PR

…t Java version

Addresses review feedback on the variant workflow: changing the default Java
version previously required hunting down literals in parent/pom.xml alongside
.sdkmanrc. Now a bump is .sdkmanrc plus one number, and the build fails fast
at validate if they drift.

parent/pom.xml:
- jdk.min.version now chains to the effective dotcms.core.compiler.release
  (follows both the default and deliberate -D overrides) instead of a literal
- new build-helper regex-property execution derives sdkmanrc.java.major from
  the .sdkmanrc java entry (via the existing ext.java property load)
- new enforcer requireProperty rule fails the build when
  dotcms.core.compiler.release.default does not match the .sdkmanrc major
- dotcms.core.compiler.release.default stays a literal on purpose: IDEs derive
  the project language level from the static model and cannot evaluate
  plugin-derived properties; the enforcer guard makes drift impossible instead

.mvn/maven-build-cache-config.xml:
- force properties:read-project-properties, build-helper:regex-property and
  enforcer:enforce to run on cache hits; skipping the property loads while
  populating the regex-property mojo fails with unresolved parameters, and the
  drift guard must fire regardless of cache state

maven-job action:
- when java-version is overridden (variant builds), also pass
  -Dsdkman.java.version so the Docker image produced by the Maven build embeds
  the variant JDK instead of the committed .sdkmanrc default

Verified: default validate passes (cached and uncached back-to-back), drift
(.sdkmanrc=26 vs default 25) fails with a remediation message, variant
(-Dsdkman.java.version=21.0.8-ms -Ddotcms.core.compiler.release=21) and
back-compat (-Ddotcms.core.compiler.release=11) overrides pass, full reactor
validate passes, and runtime.docker.sdkman.java.version resolves to the
variant JDK under override.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@github-actions github-actions Bot added Area : Backend PR changes Java/Maven backend code and removed AI: Safe To Rollback labels Jun 5, 2026
@sfreudenthaler

sfreudenthaler commented Jun 5, 2026

Copy link
Copy Markdown
Member Author

@wezell ready for rereivew. got it down to 2 changes. that's the best claude thinks we can do while keeping good ide support.

90fb73cb5e

@sfreudenthaler sfreudenthaler requested a review from wezell June 5, 2026 15:29
…tent comment, stale suffix doc

- run-name: java_version is a required input, so drop the dead || '' branch
- java-guard: document that .sdkmanrc is read from the branch tip while the
  build checks out the tag, and why divergence is harmless (explicit suffix
  means primary tags can never be overwritten)
- maven-job: clarify the artifact-suffix fallback is for callers without a
  later download phase; the variant workflow always passes an explicit value

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Matches the enforcer's default binding; makes the in-phase ordering
dependency on derive-sdkmanrc-java-major visible to readers.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@sfreudenthaler sfreudenthaler enabled auto-merge June 5, 2026 17:14
@sfreudenthaler sfreudenthaler marked this pull request as draft June 9, 2026 23:31
auto-merge was automatically disabled June 9, 2026 23:31

Pull request was converted to draft

@sfreudenthaler

Copy link
Copy Markdown
Member Author

🚧 Converting to draft — design pivot agreed with @wezell

Parking this pending a redesign. Capturing the agreed direction from the #feat-java25 thread (Jun 5) so the context isn't lost.

What this PR does today (the narrow hardening)

  • java_version becomes a required dispatch input; all vars.RELEASE_JAVA_VARIANT_* fallbacks removed
  • fail-fast guard in extract-version (empty / malformed / equals-default → error)
  • .sdkmanrc as single source of truth in parent/pom.xml via a regex-property + enforcer drift guard
  • variant Docker image fix in maven-job (-Dsdkman.java.version now passed through)

This is all valid and green — but it's fail-fast-if-the-version-is-wrong, which is a stepping stone, not the destination.

The direction instead (@wezell)

Make .sdkmanrc the single knob and have the pipeline self-heal the base image rather than erroring:

An action that checks whether the Java base exists — e.g. dotcms/java-base:25.0.2-ms on Docker Hub. If it does, return OK. If it does not, build and push that image for that version before building dotCMS. Reuse that action here (variant release) and also earlier when building the main dotCMS image, so we can tick or override the .sdkmanrc variable and everything just works.

Net design:

  1. New reusable action: ensure java-base:${version} exists → build+push if missing
  2. Wire it into both the normal dotCMS image build and the variant-release workflow
  3. .sdkmanrc (or an explicit override) is the only thing you change to switch Java versions — no manual base-image prep, no fail-fast dead-ends

Status / urgency

Java 25 prod rollout completed Jun 8, so the variant workflow is no longer time-pressured. Reopening for review once reworked around the base-image-existence action. The pom/enforcer single-source-of-truth piece here likely survives the rework intact.

…ing version

Redesign per @wezell: make .sdkmanrc the single knob and have the pipeline
build the JDK base image on demand, rather than fail-fast when it is absent.

- New composite action core-cicd/ensure-java-base: anonymous existence check
  (buildx imagetools inspect); when missing, build+push multi-arch if Docker Hub
  creds are present, else build locally (--load, amd64) so the current build can
  still proceed. Restores the default Docker builder afterward so maven-job's
  subsequent plain docker build/save is unaffected.
- Wire it into maven-job between get-sdkman-version and the dotCMS image build,
  reusing the already-resolved SDKMAN_JAVA_VERSION so base tag and FROM stay in
  lockstep. Single chokepoint => covers main, PR, release and variant builds.
- Thread optional DOCKER_USERNAME/DOCKER_TOKEN secrets through build-phase to the
  publishing callers (release, variant, trunk, merge-queue, nightly, lts,
  manual-deploy) so a missing base is published; PR builds degrade to local build.
- Refactor cicd_manual_build-java-base.yml to reuse the action (force build),
  removing the duplicated buildx/build-push logic.
- Variant workflow: keep the manual-only required/equals-default guard, add an
  explicit empty-default_java check (fails loudly on a bad branch/.sdkmanrc), and
  document that the base image now self-heals.

The pom .sdkmanrc single-source-of-truth + drift enforcer from earlier in this PR
are retained unchanged.

Refs #35986

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@sfreudenthaler sfreudenthaler marked this pull request as ready for review June 9, 2026 23:56

@wezell wezell left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Love this. Once merged, let's exercise the job by ticking our .sdkmanrc java to 25.0.2-ms

@wezell wezell added this pull request to the merge queue Jun 10, 2026
@github-merge-queue github-merge-queue Bot removed this pull request from the merge queue due to failed status checks Jun 10, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

AI: Safe To Rollback Area : Backend PR changes Java/Maven backend code Area : CI/CD PR changes GitHub Actions/workflows

Projects

Status: No status

Development

Successfully merging this pull request may close these issues.

Require explicit java_version for manual java-variant builds; remove repo-var fallback

2 participants