
Dockerfile for CMS Analytics #237

Open
Akila94 wants to merge 2 commits into wso2:main from Akila94:cms-analytics-docker

Conversation

Member

@Akila94 Akila94 commented Mar 6, 2026

Purpose

  • This PR introduces a Dockerfile for CMS analytics to be deployed on Choreo.

Summary by CodeRabbit

Chores

  • Added Docker image and container configuration for the analytics service deployment, including Ubuntu 24.04 base operating system, OpenJDK 21 Java runtime environment, Ballerina Swan Lake application framework, Fluent Bit monitoring and logging integration, and automated service initialization scripting.
  • Updated repository configuration to preserve required service artifacts.


coderabbitai bot commented Mar 6, 2026

Walkthrough

A new Docker analytics image infrastructure is introduced, including a Dockerfile that builds an Ubuntu-based container with Ballerina, OpenJDK 21, and Fluent Bit. An entrypoint script orchestrates startup of both the FHIR service and Fluent Bit monitoring. The .gitignore is updated to include the FHIR service JAR artifact.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Version Control Configuration: `.gitignore` | Added exception rule to track `fhir-service/resources/analytics/docker/fhir.service.jar` while maintaining general `*.jar` ignore pattern. |
| Docker Container Setup: `fhir-service/resources/analytics/docker/Dockerfile`, `fhir-service/resources/analytics/docker/entrypoint.sh` | New Docker image definition based on Ubuntu 24.04 with non-root user (choreo/UID 10001), OpenJDK 21, Ballerina Swan Lake, and Fluent Bit. Entrypoint script launches Java FHIR service in background followed by Fluent Bit server. |

Sequence Diagram

sequenceDiagram
    participant Container as Container Runtime
    participant Entrypoint as entrypoint.sh
    participant FhirService as FHIR Service<br/>(Java JAR)
    participant FluentBit as Fluent Bit

    Container->>Entrypoint: Execute /home/ballerina/entrypoint.sh
    activate Entrypoint
    
    Entrypoint->>FhirService: exec java -jar fhir.service.jar &
    activate FhirService
    FhirService-->>Entrypoint: Running (background)
    
    Entrypoint->>FluentBit: /opt/fluent-bit/bin/fluent-bit -c fluent-bit.conf
    activate FluentBit
    FluentBit-->>Entrypoint: Monitoring active
    
    Note over FhirService,FluentBit: Both services running concurrently

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐰 In Docker's nest, a JAR takes flight,
Ballerina twirls with Fluent Bit's sight,
Non-root and secure, the setup's just right,
Analytics dancing through the night! 🌙

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Description check | ⚠️ Warning | The description is significantly incomplete. Only the Purpose section is provided with minimal detail; all other required sections (Goals, Approach, User stories, Release note, Documentation, Training, Certification, Marketing, Automation tests, Security checks, Samples, Related PRs, Migrations, Test environment, Learning) are missing. | Complete the pull request description by filling in all required template sections, particularly Goals, Approach, test coverage details, security verification checklist, and test environment specifications. |
✅ Passed checks (2 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title accurately describes the main change: introducing a Dockerfile for CMS Analytics deployment. It is concise, clear, and directly reflects the primary objective. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

🧹 Nitpick comments (3)
fhir-service/resources/analytics/docker/Dockerfile (3)

39-39: Consider adding a HEALTHCHECK instruction.

Without a HEALTHCHECK, container orchestrators cannot detect if the Java service inside becomes unresponsive. This is especially important given the dual-process setup.

💡 Example HEALTHCHECK
HEALTHCHECK --interval=30s --timeout=5s --start-period=10s --retries=3 \
    CMD curl -f http://localhost:8080/health || exit 1

Adjust the port and endpoint to match your service's health endpoint.
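
If the probe ever needs more than a one-line curl, the HEALTHCHECK CMD could call a small script instead. The following is a minimal sketch, not something this PR contains: `check_health` is a hypothetical name, the default URL reuses the placeholder port and path from the example above (neither is confirmed by the PR), and curl is assumed to be present in the image.

```shell
#!/bin/sh
# Hypothetical health probe. Docker treats exit status 0 as healthy
# and 1 as unhealthy, so every failure is mapped to 1.
check_health() {
    # Assumed default; adjust host, port and path to the real endpoint.
    url=${1:-http://localhost:8080/health}
    if curl -fsS --max-time 5 "$url" >/dev/null 2>&1; then
        return 0
    fi
    return 1
}
```

A script built around this would end with `check_health || exit 1` so that the exit status reaches Docker.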

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@fhir-service/resources/analytics/docker/Dockerfile` at line 39, Add a Docker
HEALTHCHECK to the Dockerfile (near the ENTRYPOINT
["/home/ballerina/entrypoint.sh"]) so orchestrators can detect an unresponsive
Java process; implement a HEALTHCHECK that periodically curls your service
health endpoint (adjust host, port and path to match the Java service, e.g.,
localhost:8080/health) with sensible --interval, --timeout, --start-period and
--retries values; place this instruction before or after the ENTRYPOINT line in
the Dockerfile so the container reports unhealthy when the HTTP check fails.

10-15: Consider consolidating apt-get update calls to reduce layers and build time.

There are four separate apt-get update calls (lines 6, 10, 17, 21). Consolidating package installations into fewer RUN instructions reduces image layers and speeds up builds.

♻️ Example consolidated structure
RUN apt-get update \
    && apt-get install -y --no-install-recommends \
        openjdk-21-jre-headless \
        wget \
        curl \
        gnupg \
        ca-certificates \
    && wget https://dist.ballerina.io/downloads/2201.12.3/ballerina-2201.12.3-swan-lake-linux-x64.deb \
    && dpkg -i ballerina-2201.12.3-swan-lake-linux-x64.deb \
    && rm ballerina-2201.12.3-swan-lake-linux-x64.deb \
    # ... fluent-bit installation ... \
    && apt-get clean \
    && rm -rf /var/lib/apt/lists/*
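
To confirm how many calls remain after consolidating, a quick count could look like the sketch below. `count_apt_updates` is a hypothetical helper, and it simply counts matching lines, so a line that contains the command twice is counted once.

```shell
#!/bin/sh
# Print the number of lines in a Dockerfile that invoke `apt-get update`.
count_apt_updates() {
    grep -c 'apt-get update' "$1"
}
```

For example, `count_apt_updates fhir-service/resources/analytics/docker/Dockerfile` should print 1 once the RUN steps are merged.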
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@fhir-service/resources/analytics/docker/Dockerfile` around lines 10 - 15,
Multiple RUN blocks call apt-get update separately which increases image layers
and build time; consolidate package installs and related commands into a single
RUN block (the existing RUN that installs wget and dpkg-installs Ballerina is a
good location) by merging all apt-get install targets (e.g.,
openjdk-21-jre-headless, wget, curl, gnupg, ca-certificates, and any fluent-bit
deps) into that RUN, run apt-get update once, perform installs, run the wget +
dpkg -i ballerina-2201.12.3-swan-lake-linux-x64.deb, then clean with apt-get
clean and rm -rf /var/lib/apt/lists/* so the Dockerfile's RUN that contains wget
and dpkg -i ballerina-2201.12.3-swan-lake-linux-x64.deb is the consolidated
place to remove the extra apt-get update calls and reduce layers.

6-8: Add --no-install-recommends to reduce image size.

Installing the full JDK without --no-install-recommends pulls in many unnecessary packages. Since only a runtime is needed to execute the JAR, consider using openjdk-21-jre-headless instead.

♻️ Proposed fix
 RUN apt-get update \
-    && apt-get install -y openjdk-21-jdk \
+    && apt-get install -y --no-install-recommends openjdk-21-jre-headless \
     && rm -rf /var/lib/apt/lists/*
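
The same flag applies to the other install steps. A rough way to spot any that still lack it is sketched below; `find_unflagged_installs` is a hypothetical helper and it assumes the flag sits on the same line as `apt-get install`, which is how the steps in this review are written.

```shell
#!/bin/sh
# Print "LINE:TEXT" for each `apt-get install` line in a Dockerfile
# that does not also carry --no-install-recommends.
find_unflagged_installs() {
    grep -n 'apt-get install' "$1" | grep -v -e '--no-install-recommends'
}
```

The function prints nothing (and returns non-zero) when every install line already has the flag.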
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@fhir-service/resources/analytics/docker/Dockerfile` around lines 6 - 8, In
the Dockerfile RUN step that currently installs openjdk-21-jdk, replace the full
JDK install with a minimal runtime and disable recommended packages: use apt-get
install -y --no-install-recommends openjdk-21-jre-headless (keep the apt-get
update and the rm -rf /var/lib/apt/lists/* cleanup) so the image size is reduced
and only the runtime needed to run the JAR is installed; look for the RUN line
that mentions openjdk-21-jdk and update it accordingly.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@fhir-service/resources/analytics/docker/Dockerfile`:
- Around line 21-28: The Dockerfile RUN step currently pipes curl to sh (curl
https://raw.githubusercontent.com/fluent/fluent-bit/master/install.sh | sh)
which is a supply-chain risk; replace this by adding the official Fluent Bit APT
repository with a pinned package version (import the fluentbit.key via curl ->
gpg --dearmor > /usr/share/keyrings/fluentbit-keyring.gpg, add the apt source
list for fluent-bit, apt-get update, then apt-get install
fluent-bit=<desired-version>) or, if you must run the installer script, download
the script first (curl -o install.sh), verify its checksum/signature, then
execute it; update the RUN block to remove any direct curl | sh usage and ensure
package version pinning and signature/checksum verification for the fluent-bit
installer.

In `@fhir-service/resources/analytics/docker/entrypoint.sh`:
- Around line 1-8: The entrypoint.sh starts the Java service in the background
with "exec java -jar /home/ballerina/fhir.service.jar &" and then runs Fluent
Bit in foreground, so if Java crashes the container stays up; also "exec" is
meaningless when backgrounding. Fix by running the primary service in foreground
and the sidecar in background (or use a supervisor): stop backgrounding the Java
process (remove "&" and the incorrect use of exec for it) and launch
/opt/fluent-bit/bin/fluent-bit -c /etc/fluent-bit/fluent-bit.conf in the
background, or replace the script with a proper process supervisor (e.g.,
supervisord or tini) to manage both processes and ensure container exit/restart
on Java failure; also add or ensure health checks for the Java process.


ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 08749126-9dae-47a0-aff7-83e5462fe58e

📥 Commits

Reviewing files that changed from the base of the PR and between 29499a7 and 6302d00.

📒 Files selected for processing (3)
  • .gitignore
  • fhir-service/resources/analytics/docker/Dockerfile
  • fhir-service/resources/analytics/docker/entrypoint.sh

Comment on lines +21 to +28
RUN apt-get update \
    && apt-get install -y --no-install-recommends \
        curl \
        ca-certificates \
    && sh -c 'curl https://packages.fluentbit.io/fluentbit.key | gpg --dearmor > /usr/share/keyrings/fluentbit-keyring.gpg' \
    && curl https://raw.githubusercontent.com/fluent/fluent-bit/master/install.sh | sh \
    && apt-get clean \
    && rm -rf /var/lib/apt/lists/*

⚠️ Potential issue | 🟠 Major

Piping curl output directly to sh is a supply chain risk.

Executing curl ... | sh without verifying the script's integrity exposes the build to potential supply chain attacks if the remote script is compromised.

Consider:

  1. Pinning to a specific Fluent Bit version via APT repository
  2. Or downloading the script, verifying a checksum, then executing
🔒 Safer alternative using APT repository with pinned version
 RUN apt-get update \
     && apt-get install -y --no-install-recommends \
         curl \
         ca-certificates \
-    && sh -c 'curl https://packages.fluentbit.io/fluentbit.key | gpg --dearmor > /usr/share/keyrings/fluentbit-keyring.gpg' \
-    && curl https://raw.githubusercontent.com/fluent/fluent-bit/master/install.sh | sh \
+        gnupg \
+        lsb-release \
+    && curl https://packages.fluentbit.io/fluentbit.key | gpg --dearmor -o /usr/share/keyrings/fluentbit-keyring.gpg \
+    && echo "deb [signed-by=/usr/share/keyrings/fluentbit-keyring.gpg] https://packages.fluentbit.io/ubuntu/noble noble main" \
+       > /etc/apt/sources.list.d/fluent-bit.list \
+    && apt-get update \
+    && apt-get install -y --no-install-recommends fluent-bit=<desired-version> \
     && apt-get clean \
     && rm -rf /var/lib/apt/lists/*
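
For the second option (download, verify a checksum, then execute), the verification step could be sketched as follows. `verify_sha256` is a hypothetical helper; it assumes `sha256sum` from coreutils is available, as it is on Ubuntu, and that the file name contains no newlines or backslashes.

```shell
#!/bin/sh
# verify_sha256 FILE EXPECTED_HEX
# Succeeds only if FILE's SHA-256 digest equals EXPECTED_HEX.
verify_sha256() {
    # sha256sum -c expects "<hash>  <filename>" (two spaces) per line.
    printf '%s  %s\n' "$2" "$1" | sha256sum -c - >/dev/null 2>&1
}
```

In the RUN step the installer would then be saved with `curl -o install.sh`, checked with `verify_sha256 install.sh "<expected-checksum>"`, and only executed if the check passes. A checksum stays valid only if the download URL points at an immutable ref such as a tag or commit; the `master` branch can change at any time.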

Comment on lines +1 to +8
#!/bin/bash
set -e

# Start the Ballerina service
exec java -jar /home/ballerina/fhir.service.jar &

# Start Fluent Bit server
/opt/fluent-bit/bin/fluent-bit -c /etc/fluent-bit/fluent-bit.conf

⚠️ Potential issue | 🟠 Major

No process supervision—container stays up if Java crashes.

The Java service runs in background while Fluent Bit runs in foreground. If the Java process crashes, Fluent Bit keeps running and the container won't restart, leaving the service unavailable without alerting Kubernetes/Choreo.

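Additionally, exec ... & is misleading: a backgrounded command runs in a subshell, so exec replaces only that subshell and the script continues to the Fluent Bit line exactly as it would without exec.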

Consider:

  1. Using a process supervisor (e.g., supervisord, tini with a wrapper)
  2. Or running Java in foreground and Fluent Bit in background (if Java is the primary service)
  3. Adding health checks to detect Java failures
🔧 Simpler alternative: Java foreground, Fluent Bit background
 #!/bin/bash
 set -e

+# Start Fluent Bit in background
+/opt/fluent-bit/bin/fluent-bit -c /etc/fluent-bit/fluent-bit.conf &
+
 # Start the Ballerina service
-exec java -jar /home/ballerina/fhir.service.jar &
-
-# Start Fluent Bit server
-/opt/fluent-bit/bin/fluent-bit -c /etc/fluent-bit/fluent-bit.conf
+exec java -jar /home/ballerina/fhir.service.jar

This way, if Java crashes, the container exits and the orchestrator can restart it.
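
The effect on the exit status can be tried outside a container with stand-in commands. This is a sketch under stated assumptions: `run_with_sidecar` is a hypothetical helper, the two arguments stand in for Fluent Bit and the Java service, and the primary is not started with `exec` here so that its status can be observed and the sidecar cleaned up afterwards.

```shell
#!/bin/sh
# run_with_sidecar SIDECAR_CMD PRIMARY_CMD
# Runs the sidecar in the background and the primary in the foreground,
# then returns the primary's exit status.
run_with_sidecar() {
    sh -c "$1" &
    sidecar_pid=$!
    status=0
    sh -c "$2" || status=$?
    # The primary has exited: stop the sidecar and reap it.
    kill "$sidecar_pid" 2>/dev/null || true
    wait "$sidecar_pid" 2>/dev/null || true
    return "$status"
}
```

Calling `run_with_sidecar 'sleep 5' 'exit 3'` returns 3 immediately, which mirrors how a Java crash would end the container instead of being hidden behind a still-running Fluent Bit.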

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-#!/bin/bash
-set -e
-# Start the Ballerina service
-exec java -jar /home/ballerina/fhir.service.jar &
-# Start Fluent Bit server
-/opt/fluent-bit/bin/fluent-bit -c /etc/fluent-bit/fluent-bit.conf
+#!/bin/bash
+set -e
+# Start Fluent Bit in background
+/opt/fluent-bit/bin/fluent-bit -c /etc/fluent-bit/fluent-bit.conf &
+# Start the Ballerina service
+exec java -jar /home/ballerina/fhir.service.jar

