A secure, scalable GUI for managing multi-device inference sessions with enterprise-grade features and an easy-to-use interface.
| Feature | Description |
|---|---|
| Model Library Manager | LM Studio-style model browsing with quantization options |
| Chat Interface | Multi-turn conversations with context management |
| Real-time Metrics | GPU/CPU/Memory charts with PyQtGraph |
| Session Queue Manager | Priority-based job scheduling |
| Worker Configuration | Multi-device layer splitting support |
| Security Center | PQC + mTLS + JWT authentication |
| Conversation History | Usage tracking and analytics |
- JWT Authentication with RSA signatures
- mTLS Support for secure worker communication
- Post-Quantum Cryptography (PQC) hybrid KEM support
- Encrypted Configuration using Fernet encryption
- Role-Based Access Control ready
- Connection Pooling - 100+ concurrent connections
- Real-time Metrics - PyQtGraph charts for GPU/CPU/Memory
- Memory Efficiency - Deque with maxlen limits
- Multi-device Layer Splitting across workers
- Easy Model Management - Download/Upload with quantization
- Intuitive Chat Interface - Like LM Studio's chat panel
- Live Performance Monitoring - Throughput and latency charts
- Session Queue System - Priority job management

Main window layout (Mohawk Inference Engine v2.1.0):

- Quick access: [Model Library] [Chat Interface]
- Tab navigation: Models | Chat | Metrics | Sessions | Workers | Security | History
- Status: All Systems Operational
- Throughput: 1,250 req/s | Latency p50: 12ms

# Clone the repository, then change into it (adjust the path to your checkout)
cd Mohawk-Inference-Engine
# Create virtual environment
python -m venv venv
venv\Scripts\activate # Windows; on Linux/macOS use: source venv/bin/activate
# Install dependencies
pip install -r requirements.txt
# Generate auth key (first run only)
mkdir certs
python mohawk_gui/main.py --key-file certs/auth_key.pem
# Development mode
python mohawk_gui/main.py
# Production mode with SSL
python mohawk_gui/main.py \
--host 0.0.0.0 \
--port 8003 \
--ssl-enabled \
--key-file certs/auth_key.pem
# Run build script
build_windows.bat
# Output: dist/Mohawk-Inference-Engine.exe

- Click "Models" tab
- Click "Download" or "Upload" to get models
- Select quantization: Q4_K_M (recommended for balance)
- Configure device splitting if using multi-GPU
- Click "Load Model"
- Click "Chat" tab
- Type your message in the input box
- Adjust settings:
- Temperature: 0.7 (balanced creativity)
- Max Tokens: 2048 (good for most tasks)
- Press Send or hit Enter
- Click "Metrics" tab
- Watch real-time throughput and latency charts
- Monitor GPU/CPU/Memory usage
- View conversation statistics
- Model Browser - Browse with search and filters
- Download/Upload - Get models from any source
- Quantization Selector - Q4_K_M, Q5_K_M, Q8_0, FP16
- Device Split Config - Multi-device layer splitting
- Status Tracking - Ready/Loading/Failed states
- Conversation History - Scrollable message history
- Parameter Controls:
- Temperature (0.0 - 2.0)
- Top-p sampling
- Max tokens generation
- System Prompt Editor - Customizable instructions
- Context Management - Token usage tracking
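
The chat parameter controls above could be modeled as a small validated settings object. This is a minimal sketch, not the project's actual code: `ChatSettings` and its field names are hypothetical, the temperature range and the 0.7 / 2048 defaults come from this README, and the top-p default and its (0, 1] bound are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChatSettings:
    """Hypothetical container for the chat parameter controls."""

    temperature: float = 0.7   # default suggested in this README
    top_p: float = 1.0         # assumed default: no nucleus truncation
    max_tokens: int = 2048     # default suggested in this README
    system_prompt: str = ""

    def __post_init__(self) -> None:
        # Temperature range 0.0 - 2.0, as listed in the parameter controls.
        if not 0.0 <= self.temperature <= 2.0:
            raise ValueError("temperature must be between 0.0 and 2.0")
        # Assumption: top-p is a cumulative probability mass in (0, 1].
        if not 0.0 < self.top_p <= 1.0:
            raise ValueError("top_p must be in (0, 1]")
        if self.max_tokens <= 0:
            raise ValueError("max_tokens must be positive")
```

Validating at construction time means an out-of-range slider value fails before any request is sent to a worker.
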
- Throughput Chart - Requests per second (real-time)
- Latency Monitoring:
- p50 latency (median)
- p95 latency (95th percentile)
- p99 latency (99th percentile)
- Resource Usage Charts:
- CPU utilization
- Memory consumption
- GPU utilization per device
- Statistics Summary with totals
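
For readers unfamiliar with the latency figures above, the sketch below shows one common way to derive p50/p95/p99 from raw samples, using the nearest-rank method. The function names are hypothetical, and the dashboard may well use a different percentile definition (for example, interpolation).

```python
import math
from typing import Sequence


def percentile(samples: Sequence[float], pct: float) -> float:
    """Return the nearest-rank percentile of a non-empty sample set."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples)
    # Nearest rank: smallest value with at least pct% of samples at or below it.
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[rank - 1]


def latency_summary(samples_ms: Sequence[float]) -> dict[str, float]:
    """Summarize latencies as p50, p95, and p99 (illustrative helper)."""
    return {f"p{p}": percentile(samples_ms, p) for p in (50, 95, 99)}
```

With 100 samples of 1 ms through 100 ms, this returns 50, 95, and 99 ms respectively, which shows how p99 surfaces tail latency that the median hides.
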
- Session Table - View all active sessions
- Queue Configuration - Max size and priority levels
- Job Management - Queue, cancel, monitor sessions
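
Priority-based scheduling with a maximum queue size and cancellation can be sketched with the standard-library `heapq` module. Everything here is an assumption for illustration: the `SessionQueue` class, its method names, and the convention that a lower number means higher priority are not taken from the project.

```python
import heapq
import itertools
from typing import Optional


class SessionQueue:
    """Hypothetical bounded priority queue; lower number = higher priority."""

    def __init__(self, max_size: int = 100) -> None:
        self._heap: list[tuple[int, int, str]] = []
        self._counter = itertools.count()   # tie-breaker keeps FIFO order
        self._pending: dict[str, int] = {}  # session_id -> live sequence number
        self._max_size = max_size

    def submit(self, session_id: str, priority: int = 5) -> None:
        if session_id in self._pending:
            raise ValueError(f"session {session_id!r} is already queued")
        if len(self._pending) >= self._max_size:
            raise OverflowError("session queue is full")
        seq = next(self._counter)
        self._pending[session_id] = seq
        heapq.heappush(self._heap, (priority, seq, session_id))

    def cancel(self, session_id: str) -> bool:
        """Cancel a queued session; the stale heap entry is skipped on pop."""
        return self._pending.pop(session_id, None) is not None

    def pop(self) -> Optional[str]:
        """Return the next session to run, or None if the queue is empty."""
        while self._heap:
            _, seq, session_id = heapq.heappop(self._heap)
            if self._pending.get(session_id) == seq:
                del self._pending[session_id]
                return session_id
        return None
```

Cancellation is lazy: removing an arbitrary item from a heap is O(n), so the sketch only drops the bookkeeping entry and discards the stale heap item when it reaches the top.
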
- Worker List - View connected workers
- Multi-device Config - Layer splitting across devices
- Worker Actions - Connect/Disconnect/Restart
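
To illustrate what layer splitting across devices means, the sketch below assigns contiguous, near-equal ranges of layer indices to each device. The `split_layers` helper is hypothetical, and the even split is an assumption: a real scheduler would more likely weight the split by each device's available memory.

```python
def split_layers(num_layers: int, devices: list[str]) -> dict[str, range]:
    """Assign contiguous, near-equal layer ranges to devices, in order."""
    if num_layers < 0:
        raise ValueError("num_layers must be non-negative")
    if not devices:
        raise ValueError("at least one device is required")
    if len(set(devices)) != len(devices):
        raise ValueError("device names must be unique")

    base, extra = divmod(num_layers, len(devices))
    assignment: dict[str, range] = {}
    start = 0
    for index, device in enumerate(devices):
        # The first `extra` devices each take one additional layer.
        size = base + (1 if index < extra else 0)
        assignment[device] = range(start, start + size)
        start += size
    return assignment
```

For example, `split_layers(32, ["cuda:0", "cuda:1", "cpu"])` places layers 0-10 on `cuda:0`, 11-21 on `cuda:1`, and 22-31 on `cpu`.
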
- JWT Authentication - Token status and refresh
- mTLS Configuration - Certificate management
- PQC Support - Hybrid KEM for quantum resistance
- Security Event Log - Immutable audit trail
- History Table - All conversations with timestamps
- Usage Statistics - Total tokens, average latency
- Model Usage Tracking - Which models were used
# Token expiry: 24 hours
# Algorithm: RS256 (RSA signatures)
# Refresh window: 1 hour

- Client certificate authentication
- Encrypted configuration (Fernet)
- Certificate validity monitoring
- Optional hybrid KEM support
- X25519 + Kyber key exchange
- Quantum-resistant security layer
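
The JWT settings listed earlier (24-hour expiry, 1-hour refresh window) imply a simple token lifecycle check. The sketch below uses only the standard library and does not sign or verify anything. It assumes the refresh window is the final hour before expiry, which this README does not state explicitly, and `token_state` is a hypothetical helper name.

```python
from datetime import datetime, timedelta, timezone

TOKEN_LIFETIME = timedelta(hours=24)  # "Token expiry: 24 hours"
REFRESH_WINDOW = timedelta(hours=1)   # "Refresh window: 1 hour"


def token_state(issued_at: datetime, now: datetime) -> str:
    """Classify a token as 'valid', 'refresh', or 'expired'."""
    expires_at = issued_at + TOKEN_LIFETIME
    if now >= expires_at:
        return "expired"
    # Assumption: refresh is allowed during the last hour before expiry.
    if now >= expires_at - REFRESH_WINDOW:
        return "refresh"
    return "valid"


if __name__ == "__main__":
    issued = datetime.now(timezone.utc) - timedelta(hours=23, minutes=30)
    print(token_state(issued, datetime.now(timezone.utc)))
```

Using timezone-aware UTC datetimes throughout avoids the comparison errors that arise when naive and aware values are mixed.
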
# Configure device splitting
Format: 'cpu_threads;gpu_ids'
Example: 'cpu;0,1,2,3;cuda:0,1'

- Supports 100+ concurrent connections
- WebSocket metrics streaming
- Configurable buffer windows
- PyQtGraph charts for smooth rendering
- Sub-second metric updates
- Memory-efficient data structures
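
The "deque with maxlen" approach mentioned in the feature list keeps chart memory bounded no matter how long the GUI runs. The sketch below shows the idea; `MetricBuffer` and the default window of 300 samples are hypothetical, not values taken from the project.

```python
from collections import deque
from statistics import fmean


class MetricBuffer:
    """Hypothetical fixed-size rolling window of metric samples."""

    def __init__(self, window: int = 300) -> None:
        # A deque with maxlen drops the oldest sample automatically on append.
        self._samples: deque[float] = deque(maxlen=window)

    def add(self, value: float) -> None:
        self._samples.append(value)

    def average(self) -> float:
        """Mean of the samples currently in the window (0.0 if empty)."""
        return fmean(self._samples) if self._samples else 0.0

    def snapshot(self) -> list[float]:
        """Copy of the window, oldest first, suitable for plotting."""
        return list(self._samples)
```

Appending to a full deque is O(1), so a metrics stream can push samples at sub-second intervals without the buffer growing or needing manual trimming.
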
build_windows.bat # Windows
./build_linux.sh # Linux/macOS
pyinstaller \
--name=Mohawk-Inference-Engine \
--onefile \
--windowed \
mohawk_gui/main.py
docker build -t mohawk-gui:latest .
docker run -d \
--name mohawk-gui \
-p 8003:8003 \
-v $(pwd)/certs:/app/certs \
-v $(pwd)/logs:/app/logs \
mohawk-gui:latest

- Dashboard Features Guide - Complete feature documentation
- Quick Start Guide - 3-minute setup guide
- Implementation Plan - Architecture details
- Production Readiness - Quality checklist
# Run unit tests
pytest mohawk_gui/ -v
# Run security tests
pytest tests/test_security.py -v
# Run performance benchmarks
pytest tests/test_performance.py -v --benchmark
# Code quality checks
black --check mohawk_gui/
flake8 mohawk_gui/
mypy mohawk_gui/

- PyQt6 >= 6.5.0 - GUI framework
- cryptography >= 41.0.0 - Security
- PyJWT >= 2.8.0 - Token handling
- psutil >= 5.9.0 - System monitoring
- pyqtgraph >= 0.13.0 - Charts and plots
- PyInstaller - Build executables
- pytest - Testing framework
- black, flake8, mypy - Code quality
Install all with: pip install -r requirements.txt
- Fork the repository
- Create a feature branch
- Make your changes
- Add tests
- Submit a pull request
See CONTRIBUTING.md for details.
MIT License - See LICENSE file for details
| Feature | Status | Implementation |
|---|---|---|
| JWT Authentication | Complete | RSA signatures, token expiry |
| mTLS Support | Complete | Certificate management ready |
| PQC Hybrid Mode | Optional | X25519 + Kyber support |
| Connection Pooling | Complete | 100+ connections supported |
| Real-time Metrics | Complete | PyQtGraph charts |
| Error Recovery | Complete | Retry, degrade, abort strategies |
| Multi-device Splitting | Complete | Layer partitioning across workers |
| Docker Support | Complete | Multi-stage builds ready |
| Cross-platform | Complete | Windows, Linux, macOS |
For issues and questions, please open an issue on GitHub or contact the Mohawk Inference Engine team.
Mohawk Inference Engine v2.1.0 - Production Ready!