feat: add health check endpoint and related schemas with tests by jokob-sk · Pull Request #1521 · netalertx/NetAlertX

jokob-sk · 2026-02-17T23:04:32Z

Summary by CodeRabbit

New Features
- Added a new /health API endpoint returning system health metrics (DB size, memory use, 1m CPU load, disk usage, CPU temp); accessible via authenticated requests.
Tests
- Added comprehensive tests validating endpoint auth, response structure, metric ranges, error handling, and HTTP method behavior.
Documentation
- Updated Code Standards with pre-step review, planning steps, and explicit maintainability principles (DRY, no duplicates, size guideline).

coderabbitai · 2026-02-17T23:04:54Z

📝 Walkthrough

Walkthrough

Adds a new health-check feature: collects system metrics (DB size, memory, load, storage, CPU temp), exposes a secured GET /health endpoint with a HealthCheckResponse schema, updates startup registration, and adds comprehensive endpoint tests and code-standard guidance.

Changes

Cohort / File(s)	Summary
Health check implementation `server/api_server/health_endpoint.py`	New module providing functions to collect system metrics: `get_db_size_mb`, `get_mem_usage_pct`, `get_load_avg_1m`, `get_storage_pct`, `get_cpu_temp`, and an aggregator `get_health_status`. Includes error handling and logging.
API integration `server/api_server/api_server_start.py`	Registers a new authenticated GET `/health` route, imports the health collector, exposes `check_health` route handler and `HealthCheckResponse` in the public OpenAPI surface. Returns 200 with metrics or 500 on error.
OpenAPI schema `server/api_server/openapi/schemas.py`	Adds `HealthCheckResponse` (inherits `BaseResponse`) with fields: `db_size_mb`, `mem_usage_pct`, `load_1m`, `storage_pct`, `cpu_temp`; includes example and `extra="allow"`.
Tests `test/api_endpoints/test_health_endpoints.py`	New comprehensive tests: authentication (missing/invalid/valid), response shape and types, metric range/consistency checks, error-mocking path, HTTP method restrictions, and token-via-query support.
Documentation / Standards `.github/skills/code-standards/SKILL.md`	Adds pre-step review and planning guidance, clarifies review process, mandates test creation as final step, and adds maintainability bullets (avoid duplicates, DRY, sub-500 LOC guideline).

Sequence Diagram

sequenceDiagram
    participant Client
    participant API as API Server
    participant Handler as check_health()
    participant Collector as get_health_status()
    participant System as System Resources

    Client->>API: GET /health (Authorization)
    API->>API: validate auth
    alt auth valid
        API->>Handler: invoke check_health()
        Handler->>Collector: request aggregated metrics
        Collector->>System: get_db_size_mb()
        System-->>Collector: db_size_mb
        Collector->>System: get_mem_usage_pct()
        System-->>Collector: mem_usage_pct
        Collector->>System: get_load_avg_1m()
        System-->>Collector: load_1m
        Collector->>System: get_storage_pct()
        System-->>Collector: storage_pct
        Collector->>System: get_cpu_temp()
        System-->>Collector: cpu_temp
        Collector-->>Handler: health_dict
        Handler-->>API: 200 HealthCheckResponse
        API-->>Client: 200 OK with metrics
    else auth invalid
        API-->>Client: 403 Forbidden
    end
    alt exception during collection
        Handler-->>API: error
        API-->>Client: 500 Error payload
    end

Poem

🐰 I hopped through bytes and midnight logs,
I fetched the bays where data clogs,
DB size, load, and memory hum,
CPU whispers, disk drums drum,
A rabbit cheers — the system's plumb!

🚥 Pre-merge checks | ✅ 3

✅ Passed checks (3 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title accurately summarizes the main changes: adding a health check endpoint with schemas and tests across multiple files.
Docstring Coverage	✅ Passed	Docstring coverage is 87.50% which is sufficient. The required threshold is 80.00%.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing touches

📝 Generate docstrings

🧪 Generate unit tests (beta)

Create PR with unit tests
Post copyable unit tests in a comment
Commit unit tests in branch chore_timestamps

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 2

🧹 Nitpick comments (3)

.github/skills/code-standards/SKILL.md (1)

8-14: Duplicated guidance within the same document.

Lines 12–14 restate the "DRY" and "< 500 LOC" rules that already appear in the dedicated sections at Lines 18–22. Consider removing the duplicates from the bullet list to keep this skill file itself DRY, or consolidating into one place.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/skills/code-standards/SKILL.md around lines 8 - 14, The SKILL.md
bullet list duplicates the "DRY" and "< 500 LOC" rules that also appear in the
dedicated sections at Lines 18–22; remove the duplicate bullets from the
top-level list (the lines containing "follow DRY principle" and "code files
should be less than 500 LOC for better maintainability") or consolidate them by
keeping the guidance only in the dedicated "DRY" and "500 LOC" sections and
replacing the bullets with a single cross-reference to those sections so the
document itself follows DRY.

server/api_server/health_endpoint.py (2)

110-127: Minor static-analysis clean-ups in get_cpu_temp.

A few quick wins flagged by Ruff:

RUF019 (Line 116): Replace the redundant key-check with dict.get.
B007 (Line 120): Rename the unused loop variable sensor_type to _.

♻️ Proposed diff

-        if "coretemp" in temps and temps["coretemp"]:
-            return int(temps["coretemp"][0].current)
+        coretemp = temps.get("coretemp")
+        if coretemp:
+            return int(coretemp[0].current)
 
         # Fallback to first sensor with data
-        for sensor_type, readings in temps.items():
+        for _, readings in temps.items():
             if readings:
                 return int(readings[0].current)

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@server/api_server/health_endpoint.py` around lines 110 - 127, In
get_cpu_temp, simplify the 'coretemp' presence check by using
temps.get("coretemp") instead of temps["coretemp"] and guard with truthiness,
and rename the unused loop variable sensor_type to _ in the fallback for-loop
(for _, readings in temps.items()) to address the B007 unused-variable warning
while keeping readings as-is and preserving existing behavior and exception
handling.

49-52: Use psutil.virtual_memory().percent directly.

psutil.virtual_memory() already exposes a .percent attribute, avoiding the manual used / total * 100 calculation and potential floating-point edge cases.

♻️ Proposed simplification

     try:
         vm = psutil.virtual_memory()
-        pct = int((vm.used / vm.total) * 100)
-        return max(0, min(100, pct))  # Clamp to 0-100
+        return max(0, min(100, int(vm.percent)))
     except Exception as e:

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@server/api_server/health_endpoint.py` around lines 49 - 52, Replace the
manual calculation using vm.used/vm.total with psutil's built-in percent: get vm
via psutil.virtual_memory(), set pct = int(vm.percent) (or
int(psutil.virtual_memory().percent) directly) and then clamp with max(0,
min(100, pct)). Update the references to vm and pct in the current function (the
block that currently computes pct = int((vm.used / vm.total) * 100)) to use
vm.percent to avoid floating-point edge cases.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@server/api_server/api_server_start.py`:
- Around line 1948-1959: In check_health, avoid returning the raw exception text
to callers; retain the detailed error in the server log via the existing mylog
call and change the response body to use a fixed, generic error message (e.g.,
"Internal server error" or "Failed to retrieve health status") instead of str(e)
so it matches other 500 handlers like handle_500_error; update the return in the
except block of check_health (which calls get_health_status and logs via mylog)
to remove str(e) from the "message" field and return only the generic message
and success: False.

In `@server/api_server/health_endpoint.py`:
- Around line 42-55: get_mem_usage_pct and get_storage_pct currently return -1
on exception which violates HealthCheckResponse.mem_usage_pct/storage_pct (ge=0,
le=100); change both functions to return None on error (instead of -1) and
update the OpenAPI schema types for HealthCheckResponse.mem_usage_pct and
storage_pct to Optional[int] (allow null) while keeping the ge=0, le=100
constraints for non-null values so the response stays compliant and the
generated JSON schema permits null values.

---

Nitpick comments:
In @.github/skills/code-standards/SKILL.md:
- Around line 8-14: The SKILL.md bullet list duplicates the "DRY" and "< 500
LOC" rules that also appear in the dedicated sections at Lines 18–22; remove the
duplicate bullets from the top-level list (the lines containing "follow DRY
principle" and "code files should be less than 500 LOC for better
maintainability") or consolidate them by keeping the guidance only in the
dedicated "DRY" and "500 LOC" sections and replacing the bullets with a single
cross-reference to those sections so the document itself follows DRY.

In `@server/api_server/health_endpoint.py`:
- Around line 110-127: In get_cpu_temp, simplify the 'coretemp' presence check
by using temps.get("coretemp") instead of temps["coretemp"] and guard with
truthiness, and rename the unused loop variable sensor_type to _ in the fallback
for-loop (for _, readings in temps.items()) to address the B007 unused-variable
warning while keeping readings as-is and preserving existing behavior and
exception handling.
- Around line 49-52: Replace the manual calculation using vm.used/vm.total with
psutil's built-in percent: get vm via psutil.virtual_memory(), set pct =
int(vm.percent) (or int(psutil.virtual_memory().percent) directly) and then
clamp with max(0, min(100, pct)). Update the references to vm and pct in the
current function (the block that currently computes pct = int((vm.used /
vm.total) * 100)) to use vm.percent to avoid floating-point edge cases.

server/api_server/api_server_start.py

server/api_server/health_endpoint.py

…y and storage usage

…ertX into chore_timestamps

coderabbitai

🧹 Nitpick comments (1)

server/api_server/health_endpoint.py (1)

106-112: Two minor style issues in get_cpu_temp (Ruff RUF019 + B007)

Line 106 — the "coretemp" in temps and guard is redundant; temps.get("coretemp") expresses the same intent in one step (Ruff RUF019).
Line 110 — sensor_type is never referenced inside the loop body; rename it to _ (Ruff B007).

♻️ Proposed fix

-        if "coretemp" in temps and temps["coretemp"]:
-            return int(temps["coretemp"][0].current)
+        coretemp = temps.get("coretemp")
+        if coretemp:
+            return int(coretemp[0].current)

         # Fallback to first sensor with data
-        for sensor_type, readings in temps.items():
+        for _, readings in temps.items():
             if readings:
                 return int(readings[0].current)

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@server/api_server/health_endpoint.py` around lines 106 - 112, get_cpu_temp
contains two stylistic issues: replace the redundant membership check "coretemp"
in temps and temps["coretemp"] with temps.get("coretemp") to simplify the guard,
and rename the unused loop variable sensor_type to _ in the for-loop that
iterates over temps.items() so the unused variable is explicit; update the
return logic to use temps.get("coretemp") and change "for sensor_type, readings
in temps.items()" to "for _, readings in temps.items()" in the get_cpu_temp
function.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@server/api_server/health_endpoint.py`:
- Around line 106-112: get_cpu_temp contains two stylistic issues: replace the
redundant membership check "coretemp" in temps and temps["coretemp"] with
temps.get("coretemp") to simplify the guard, and rename the unused loop variable
sensor_type to _ in the for-loop that iterates over temps.items() so the unused
variable is explicit; update the return logic to use temps.get("coretemp") and
change "for sensor_type, readings in temps.items()" to "for _, readings in
temps.items()" in the get_cpu_temp function.

feat: add health check endpoint and related schemas with tests

264cae3

Merge branch 'main' into chore_timestamps

fa36adb

coderabbitai bot reviewed Feb 17, 2026

View reviewed changes

server/api_server/api_server_start.py Show resolved Hide resolved

server/api_server/health_endpoint.py Outdated Show resolved Hide resolved

jokob-sk added 2 commits February 17, 2026 23:16

fix: update health check response and schema to handle nullable memor…

bc97a80

…y and storage usage

Merge branch 'chore_timestamps' of https://github.com/netalertx/NetAl…

f2af4ff

…ertX into chore_timestamps

coderabbitai bot reviewed Feb 17, 2026

View reviewed changes

jokob-sk merged commit 32e2d57 into main Feb 17, 2026
5 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

feat: add health check endpoint and related schemas with tests#1521

feat: add health check endpoint and related schemas with tests#1521
jokob-sk merged 4 commits intomainfrom
chore_timestamps

jokob-sk commented Feb 17, 2026 •

edited by coderabbitai bot

Loading

Uh oh!

coderabbitai bot commented Feb 17, 2026 •

edited

Loading

Walkthrough

Changes

Sequence Diagram

Poem

Uh oh!

coderabbitai bot left a comment

Uh oh!

Uh oh!

Uh oh!

coderabbitai bot left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Comments

Uh oh!

Conversation

jokob-sk commented Feb 17, 2026 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by CodeRabbit

Uh oh!

coderabbitai bot commented Feb 17, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Sequence Diagram

Poem

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Comments

jokob-sk commented Feb 17, 2026 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Feb 17, 2026 •

edited

Loading