Overview
This epic tracks the transition from embedded workers (implemented in #10) to a fully distributed service architecture for horizontal scaling.
Context
- Issue #10 (Task Execution Workflow and State Management) implements embedded workers in the API server for simplicity and ease of development
- This epic plans the future separation into independent services for production scalability
Goals
Transform VoidRunner from a single-process to a multi-service architecture:
Current (Issue #10):    Future (This Epic):
┌─────────────────┐     ┌─────────────┐  ┌───────────────────┐
│   API Server    │     │ API Server  │  │ Scheduler Service │
│  ┌───────────┐  │  →  │             │  │   ┌───────────┐   │
│  │  Workers  │  │     │             │  │   │  Workers  │   │
│  └───────────┘  │     │             │  │   └───────────┘   │
└─────────────────┘     └─────────────┘  └───────────────────┘
Implementation Tasks
1. Service Communication
- Implement service discovery between API and scheduler
- Add health check coordination between services
- Create inter-service communication patterns
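One way to approach health check coordination is to have each service combine the status of its dependencies into a single report. The sketch below is illustrative only: `HealthReport`, `aggregate_health`, and the check names are hypothetical and not part of VoidRunner.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class HealthReport:
    healthy: bool
    details: Dict[str, str]


def aggregate_health(checks: Dict[str, Callable[[], bool]]) -> HealthReport:
    """Run each named check and combine the results into one report.

    `checks` maps a hypothetical dependency name (for example "redis" or
    "scheduler") to a zero-argument probe that returns True when healthy.
    """
    details: Dict[str, str] = {}
    for name, check in checks.items():
        try:
            details[name] = "ok" if check() else "failing"
        except Exception as exc:  # a probe that raises counts as unhealthy
            details[name] = f"error: {exc}"
    return HealthReport(
        healthy=all(status == "ok" for status in details.values()),
        details=details,
    )
```

A health endpoint on the API could call `aggregate_health` with one probe for the queue and one for the scheduler, and report itself as degraded rather than down when only the scheduler probe fails.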
2. Configuration Management
- Split configuration between API and scheduler services
- Add service-specific configuration validation
- Environment variable management for both services
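The configuration split could look roughly like the following: each service loads and validates only the settings it needs from the environment. The variable names (`VOIDRUNNER_API_PORT`, `VOIDRUNNER_REDIS_URL`, `VOIDRUNNER_WORKER_COUNT`) and the defaults are assumptions made for illustration, not existing VoidRunner settings.

```python
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class APIConfig:
    port: int
    redis_url: str


@dataclass(frozen=True)
class SchedulerConfig:
    worker_count: int
    redis_url: str


def _require(env: Mapping[str, str], key: str) -> str:
    """Return a non-empty setting or raise a descriptive error."""
    value = env.get(key, "").strip()
    if not value:
        raise ValueError(f"missing required setting: {key}")
    return value


def load_api_config(env: Mapping[str, str]) -> APIConfig:
    # Hypothetical variable name and default port.
    port = int(env.get("VOIDRUNNER_API_PORT", "8080"))
    if not 1 <= port <= 65535:
        raise ValueError(f"VOIDRUNNER_API_PORT out of range: {port}")
    return APIConfig(port=port, redis_url=_require(env, "VOIDRUNNER_REDIS_URL"))


def load_scheduler_config(env: Mapping[str, str]) -> SchedulerConfig:
    # Hypothetical variable name and default worker count.
    worker_count = int(env.get("VOIDRUNNER_WORKER_COUNT", "4"))
    if worker_count < 1:
        raise ValueError(f"VOIDRUNNER_WORKER_COUNT must be >= 1: {worker_count}")
    return SchedulerConfig(
        worker_count=worker_count,
        redis_url=_require(env, "VOIDRUNNER_REDIS_URL"),
    )
```

In practice each service would pass `os.environ` to its own loader at startup, so a missing or invalid value fails fast in the service that depends on it.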
3. Deployment Architecture
- Update Docker compose for multi-service deployment
- Create Kubernetes manifests for both services
- Add load balancing and service mesh integration
4. Monitoring and Observability
- Distributed tracing across services
- Service-specific metrics and dashboards
- Centralized logging configuration
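For tracing across the queue boundary, the core idea is that the API attaches a trace identifier to each enqueued task and the scheduler reads it back, so logs and spans from both services can be correlated. The sketch below shows only that propagation step with the standard library; the message fields `task_id` and `trace_id` are hypothetical, and a real setup would more likely use an established tracing library.

```python
import json
import uuid
from typing import Optional, Tuple


def enqueue_payload(task_id: str, trace_id: Optional[str] = None) -> str:
    """Serialize a task message, reusing the caller's trace ID if one exists."""
    message = {
        "task_id": task_id,
        # Start a new trace only when the incoming request carried none.
        "trace_id": trace_id or uuid.uuid4().hex,
    }
    return json.dumps(message)


def dequeue_payload(raw: str) -> Tuple[str, str]:
    """Parse a queued message and return (task_id, trace_id) for the worker."""
    message = json.loads(raw)
    return message["task_id"], message["trace_id"]
```

The scheduler side would include the returned `trace_id` in every log line and metric label it emits for that task.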
5. Development Experience
- Multi-service development environment setup
- Hot reload for both services during development
- Integration testing framework for distributed setup
Benefits
- Horizontal Scaling: Scale API and workers independently
- Resource Optimization: Allocate resources based on service needs
- Fault Isolation: API failures don't affect worker processing
- Technology Flexibility: Different services can use different tech stacks
Prerequisites
- ✅ Issue #10: Task Execution Workflow and State Management (embedded workers)
- ✅ Stable queue system with Redis
- ✅ Comprehensive worker pool implementation
Success Criteria
- API and scheduler can run as independent services
- Zero-downtime deployment for each service
- Worker capacity scales approximately linearly as scheduler instances are added
- Maintained API backward compatibility
- Production deployment documentation
Related Issues
- Closes: #10, Task Execution Workflow and State Management (prerequisite)
- Related: #8, Container Execution Engine (epic)
Deployment Strategy
- Phase 1: Implement service separation while maintaining embedded option
- Phase 2: Add service orchestration and communication
- Phase 3: Production deployment with monitoring
- Phase 4: Deprecate embedded workers option
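Keeping the embedded option during Phase 1 suggests a single startup switch that decides which components a process runs. A minimal sketch, assuming a hypothetical `mode` setting with the values `embedded`, `api`, and `scheduler` (none of these names come from the VoidRunner codebase):

```python
from typing import List

# Hypothetical run modes: "embedded" keeps today's single-process behaviour.
VALID_MODES = ("embedded", "api", "scheduler")


def components_for_mode(mode: str) -> List[str]:
    """Return the components a process should start for the given mode."""
    if mode == "embedded":
        return ["api", "workers"]
    if mode == "api":
        return ["api"]
    if mode == "scheduler":
        return ["workers"]
    raise ValueError(f"unknown mode {mode!r}; expected one of {VALID_MODES}")
```

With a switch like this, Phase 4 would amount to removing the `embedded` branch once the separate services are proven in production.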
This epic enables VoidRunner to scale horizontally for enterprise deployments while maintaining the simplicity of embedded workers for development and small deployments.