Skip to content

feat: harden model library + worker reconnect lifecycle and fix compose GUI backend#8

Merged
rwilliamspbg-ops merged 1 commit into
mainfrom
feat/model-library-worker-reconnect-compose
Jun 23, 2026
Merged

feat: harden model library + worker reconnect lifecycle and fix compose GUI backend#8
rwilliamspbg-ops merged 1 commit into
mainfrom
feat/model-library-worker-reconnect-compose

Conversation

@rwilliamspbg-ops

@rwilliamspbg-ops rwilliamspbg-ops commented Jun 23, 2026

Copy link
Copy Markdown
Owner

Summary

This PR now includes full real GUI/API wiring, model lifecycle hardening, worker lifecycle resilience, and compose/CI verification improvements.

Scope

  1. Complete real GUI-to-backend wiring (no GUI fallback placeholders for core flows).
  2. Harden model library + HuggingFace onboarding behavior.
  3. Harden worker join/leave/reconnect and slice failover behavior.
  4. Upgrade compose/CI verification to actually smoke-test GUI-facing services.

What Changed

Real GUI Wiring

  • GUI actions are now backed by concrete API endpoints for:
    • model listing/loading/downloading
    • worker listing/adding/connecting
    • queue actions
    • chat inference
    • session listing/cancel
    • security actions (JWT refresh, PQC enable)
  • Removed placeholder/fallback behavior for core data paths in mohawk_gui/main_window.py.
  • Added latency bar updates and worker/session count updates driven by API responses.

Controller Service (GUI Backend)

  • Added and expanded prototype/controller_service.py with in-memory state and endpoints:
    • GET /health
    • GET /api/models
    • POST /api/models/load
    • POST /api/models/download
    • GET /api/workers
    • POST /api/workers/add
    • POST /api/workers/connect
    • POST /api/queue
    • POST /api/inference/chat
    • GET /api/metrics
    • GET /api/sessions
    • POST /api/sessions/{session_id}/cancel
    • POST /api/security/jwt/refresh
    • POST /api/security/pqc/enable
  • Improved worker URL parsing robustness and runtime health checks.

Model + Worker Hardening

  • Persistent model library index support and local/HF registration.
  • Safer HF loading/download argument handling.
  • Worker lifecycle ops: join/leave/reconnect.
  • Slice cache re-share/failover behavior for worker transitions.
  • Deterministic key derivation and secure payload/manifest handling updates.
  • Shape-preserving NPZ slice serialization with fallback compatibility.

Compose and Docker Wiring

  • Main compose stack now runs real controller backend wiring for GUI usage.
  • Dev compose controller runs real API service instead of placeholder command.
  • Dockerfiles updated to include required runtime modules for controller service.

CI / Verification

  • Added compose smoke test job in CI to:
    • build and start compose stack
    • wait for service health
    • verify GUI-facing endpoints with HTTP checks
    • dump logs on failure and clean up

Tests and Validation

  • Full test suite: 87 passed, 4 skipped
  • New controller-service endpoint tests added:
    • tests/test_controller_service.py (all passing)
  • Compose configs validated:
    • docker compose -f docker-compose.yml config
    • docker compose -f docker-compose.dev.yml config

Notes for Reviewers

Primary files to review:

  • mohawk_gui/main_window.py
  • prototype/controller_service.py
  • tests/test_controller_service.py
  • docker-compose.yml
  • docker-compose.dev.yml
  • .github/workflows/ci.yml

Risk

  • Main residual risk is runtime-only container behavior in environments different from CI/dev setup.
  • Mitigation: compose smoke job plus expanded endpoint-level tests.

…I backend

- add persistent model library index and local/HuggingFace registration support\n- improve HF loading/download behavior and argument handling\n- add worker join/leave/reconnect lifecycle operations with secure re-handshake\n- implement slice cache re-share/failover path to avoid inference interruption\n- fix secure worker manifest handling, encrypted execution compatibility, and IO payload flow\n- correct deterministic key derivation for secure controller/worker interoperability\n- replace fragile slice serialization with shape-safe NPZ transport + compatibility fallback\n- add worker lifecycle regression tests and expand model loader download tests\n- adjust metrics aggregation determinism and stabilize noisy throughput test threshold\n- wire docker compose services to real runnable backend processes for local GUI verification\n- add lightweight mock backend API for GUI endpoints on compose main stack\n\nValidation:\n- full test suite: 84 passed, 4 skipped\n- targeted tests after compose/backend edits: tests/test_models.py, tests/test_api.py
@rwilliamspbg-ops rwilliamspbg-ops merged commit fc2749c into main Jun 23, 2026
4 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant