Dynamically deploy models to match request targets #1
Merged
Signed-off-by: Phoevos Kalemkeris <phoevos.kalemkeris@ucl.ac.uk>
Revamp configuration logic using Pydantic models for better validation and maintainability and extend settings with options related to model deployment through the gateway. Signed-off-by: Phoevos Kalemkeris <phoevos.kalemkeris@ucl.ac.uk>
Introduce model records to keep track of deployed models, their deployment type, and idle TTLs. Signed-off-by: Phoevos Kalemkeris <phoevos.kalemkeris@ucl.ac.uk>
Purge containers based on deployment type, implementing the following removal strategy for each one:
- Static: Never remove
- Manual: Remove if the TTL specified during manual deployment has been exceeded
- Auto: Remove if the model has been idle for longer than the TTL specified during auto-deployment, according to the database model record

Signed-off-by: Phoevos Kalemkeris <phoevos.kalemkeris@ucl.ac.uk>
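The removal strategy above can be sketched as a single decision function. This is a minimal illustration, not the ripper's actual code: `DeploymentType`, `ModelRecord`, and `should_remove` are hypothetical names, and the sketch assumes each record stores a deployment timestamp, a last-used timestamp, and a TTL in seconds.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from enum import Enum
from typing import Optional


class DeploymentType(Enum):
    STATIC = "static"
    MANUAL = "manual"
    AUTO = "auto"


@dataclass
class ModelRecord:
    # Hypothetical record shape; the real database model may differ.
    name: str
    deployment_type: DeploymentType
    deployed_at: datetime
    last_used_at: datetime
    ttl_seconds: Optional[int] = None


def should_remove(record: ModelRecord, now: datetime) -> bool:
    """Decide whether a deployed model's container should be purged."""
    if record.deployment_type is DeploymentType.STATIC:
        return False  # static deployments are never removed
    if record.ttl_seconds is None:
        return False  # assumption: a missing TTL means "keep"
    ttl = timedelta(seconds=record.ttl_seconds)
    if record.deployment_type is DeploymentType.MANUAL:
        # Fixed TTL, measured from the moment of deployment.
        return now - record.deployed_at > ttl
    # AUTO: idle TTL, measured from the last recorded use.
    return now - record.last_used_at > ttl
```

Keeping the decision in a pure function of the record and the current time makes each deployment type easy to test without starting containers.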
Downgrade mlflow to avoid compatibility issues with CMS MLflow server. Signed-off-by: Phoevos Kalemkeris <phoevos.kalemkeris@ucl.ac.uk>
Rename internal services to avoid clashes with services in the CMS network. If we move to Docker Swarm or Kubernetes in the future, we should be able to use FQDNs to avoid conflicts, but for now this appears to be the simplest solution:
* minio -> object-store
* postgres -> db
* rabbitmq -> queue

Signed-off-by: Phoevos Kalemkeris <phoevos.kalemkeris@ucl.ac.uk>
Update integration tests to work with the new config. Tweak config handling and service discovery to fix integration tests:
* Explicitly pass config.json path when loading config
* Ensure the API can work with IPs as model identifiers since we're forced to use them in the integration tests environment (i.e. accessing containers from the localhost)

Signed-off-by: Phoevos Kalemkeris <phoevos.kalemkeris@ucl.ac.uk>
The integration tests update the config.json file when running, adding the MLflow and MinIO connection details. These are specific to the local testing environment and likely the given run, and therefore don't need to be committed to the repository. This commit adds the config.json file to the .gitignore to prevent accidental commits in the future. Signed-off-by: Phoevos Kalemkeris <phoevos.kalemkeris@ucl.ac.uk>
8736059 to 0eb9b94
Provide an admin API to expose on-demand model configuration management, including creation, updating, retrieval, listing, and soft-deletion. Configurations are now stored in the database, with versioning support. The Python client is also updated to support these operations. Signed-off-by: Phoevos Kalemkeris <phoevos.kalemkeris@ucl.ac.uk>
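To make the versioning and soft-deletion semantics concrete, here is a minimal in-memory sketch. It is not the gateway's database layer: `ConfigVersion`, `OnDemandConfigStore`, and their methods are hypothetical, and the sketch assumes that every update appends a new version and that soft-deletion only flags the latest version instead of dropping rows.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class ConfigVersion:
    model_name: str
    version: int
    settings: Dict[str, Any]
    deleted: bool = False


@dataclass
class OnDemandConfigStore:
    # Append-only list standing in for a database table (assumption).
    _rows: List[ConfigVersion] = field(default_factory=list)

    def _latest(self, model_name: str) -> Optional[ConfigVersion]:
        versions = [r for r in self._rows if r.model_name == model_name]
        return versions[-1] if versions else None

    def get(self, model_name: str) -> Optional[ConfigVersion]:
        """Return the latest version unless it has been soft-deleted."""
        latest = self._latest(model_name)
        return None if latest is None or latest.deleted else latest

    def upsert(self, model_name: str, settings: Dict[str, Any]) -> ConfigVersion:
        """Create or update a config by appending a new version."""
        latest = self._latest(model_name)
        version = 1 if latest is None else latest.version + 1
        row = ConfigVersion(model_name, version, dict(settings))
        self._rows.append(row)
        return row

    def soft_delete(self, model_name: str) -> bool:
        """Flag the active version as deleted; history is kept."""
        current = self.get(model_name)
        if current is None:
            return False
        current.deleted = True
        return True

    def list_active(self) -> List[ConfigVersion]:
        names = dict.fromkeys(r.model_name for r in self._rows)
        return [c for c in (self.get(n) for n in names) if c is not None]
```

Because rows are never removed, earlier versions remain available for auditing, and re-creating a soft-deleted config simply continues the version sequence.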
0eb9b94 to fec32bf
Pull request overview
This pull request introduces dynamic model deployment capabilities to the CogStack Model Gateway, enabling automatic deployment of models when they're requested. The changes include:
- A comprehensive refactoring of the configuration system from a simple dictionary-based approach to a Pydantic-validated schema
- Introduction of three deployment types (AUTO, MANUAL, STATIC) with different lifecycle management strategies
- New database models for tracking deployed models and on-demand model configurations
- Auto-deployment functionality that can deploy models on-demand when requests target them
- Enhanced ripper service to support multiple TTL strategies (fixed TTL for manual deployments, idle TTL for auto deployments)
Key Changes
- Refactored configuration system with Pydantic validation and hierarchical structure
- Added model lifecycle management with database tracking for usage and idle time
- Implemented auto-deployment with health checking and concurrent deployment protection
- Updated all services (gateway, scheduler, ripper) to use the new config and model management systems
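The "health checking and concurrent deployment protection" point can be illustrated with a small asyncio sketch. It is an assumption-laden outline rather than the code in `auto_deploy.py`: `AutoDeployer`, `ensure_deployed`, and the injected `deploy` and `is_healthy` callables are hypothetical, and the per-model lock only protects a single process (a multi-worker gateway would likely need a database-level guard instead).

```python
import asyncio
from typing import Awaitable, Callable, Dict, Set


class AutoDeployer:
    def __init__(
        self,
        deploy: Callable[[str], Awaitable[None]],
        is_healthy: Callable[[str], Awaitable[bool]],
        retries: int = 5,
        delay: float = 0.01,
    ) -> None:
        self._deploy = deploy
        self._is_healthy = is_healthy
        self._retries = retries
        self._delay = delay
        self._locks: Dict[str, asyncio.Lock] = {}
        self._ready: Set[str] = set()

    async def ensure_deployed(self, model_name: str) -> bool:
        """Deploy the model at most once, then wait until it is healthy."""
        lock = self._locks.setdefault(model_name, asyncio.Lock())
        async with lock:
            if model_name in self._ready:
                return True  # another request already deployed it
            await self._deploy(model_name)
            # Poll the health check a bounded number of times.
            for _ in range(self._retries):
                if await self._is_healthy(model_name):
                    self._ready.add(model_name)
                    return True
                await asyncio.sleep(self._delay)
            return False
```

With this shape, several requests that target the same undeployed model queue on one lock, so only the first triggers a deployment and the rest reuse its result.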
Reviewed changes
Copilot reviewed 48 out of 51 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| tests/unit/ripper/test_main.py | Comprehensive test coverage for new multi-TTL ripper logic with different deployment types |
| tests/unit/common/test_models.py | New file with extensive tests for ModelManager and OnDemandModelConfig |
| tests/unit/common/test_config.py | Updated tests for Pydantic-based config system |
| tests/unit/common/test_tasks.py | Fixed logging import path |
| tests/unit/client/test_client.py | Updated client tests for renamed timeout parameters and new API endpoints |
| tests/integration/test_api.py | Added comprehensive integration tests for deployment and on-demand config APIs |
| tests/integration/utils.py | Added utilities for managing deployed containers in tests |
| tests/conftest.py | Added shared db_manager fixture |
| cogstack_model_gateway/common/config/ | Complete config system refactor with Pydantic models |
| cogstack_model_gateway/common/models.py | New model management system with database tracking |
| cogstack_model_gateway/common/containers.py | Enhanced container discovery and management utilities |
| cogstack_model_gateway/common/tracking.py | Extended tracking client with model type resolution |
| cogstack_model_gateway/gateway/core/auto_deploy.py | New auto-deployment implementation with health checks |
| cogstack_model_gateway/gateway/routers/ | Updated model and admin API routers |
| cogstack_model_gateway/ripper/main.py | Enhanced ripper with multi-TTL support |
| cogstack_model_gateway/scheduler/main.py | Updated to use new config system |
| cogstack_model_gateway/migrations/versions/ | New database migrations for model tracking |
| docker-compose.yaml | Service name updates and config file volume mounts |
| config.json | New JSON-based configuration file |

    mock_client_instance.request.return_value = mock_response

    async with GatewayClient(base_url="http://test-gateway.com") as client:
        config = await client.update_on_demand_config(

Variable config is not used.

    },
    )

    initial_count = count_deployed_model_containers()

Variable initial_count is not used.

    task_uuid = response.json()["uuid"]

    # Completion should take at least a few seconds for container startup
    task = wait_for_task_completion(task_uuid, tm, expected_status=Status.SUCCEEDED)


    # for 'autogenerate' support
    # from myapp import mymodel
    # target_metadata = mymodel.Base.metadata
    from cogstack_model_gateway.common.models import Model  # noqa: E402, F401

Import of 'Model' is not used.
Suggested change

    from cogstack_model_gateway.common.models import Model  # noqa: E402, F401

No description provided.