🐛 Workload should still be resilient when catalog is deleted #2439
Pull request overview
This PR adds comprehensive end-to-end tests to verify that installed OLM extensions continue functioning correctly when their source catalog is deleted. The tests cover both standard runtime and experimental Boxcutter runtime scenarios.
Changes:
- Added new feature file with 8 scenarios testing catalog deletion resilience
- Implemented CatalogIsDeleted function to support catalog deletion in tests
- Added step registrations for ClusterExtension update operations
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| test/e2e/steps/steps.go | Adds CatalogIsDeleted function and step registrations for testing catalog deletion and ClusterExtension updates |
| test/e2e/features/catalog-deletion-resilience.feature | Defines 8 test scenarios covering extension resilience, resource restoration, config changes, version upgrades, and revision behavior when catalog is deleted |
Codecov Report
Coverage Diff
| | main | #2439 | +/- |
|---|---|---|---|
| Coverage | 73.00% | 73.31% | +0.31% |
| Files | 100 | 100 | |
| Lines | 7641 | 7727 | +86 |
| Hits | 5578 | 5665 | +87 |
| Misses | 1625 | 1620 | -5 |
| Partials | 438 | 442 | +4 |
Enables installed extensions to continue working when their source catalog becomes unavailable or is deleted. When resolution fails due to catalog unavailability, the operator now continues reconciling with the currently installed bundle instead of failing.
Changes:
- Resolution falls back to installed bundle when catalog unavailable
- Unpacking skipped when maintaining current installed state
- Helm and Boxcutter appliers handle nil contentFS gracefully
- Version upgrades properly blocked without catalog access
This ensures workloads remain stable and operational even when the catalog they were installed from is temporarily unavailable or deleted, while appropriately preventing version changes that require catalog access.
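The fallback described above can be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: `Bundle`, `errCatalogUnavailable`, `resolveFunc`, and `resolveWithFallback` are hypothetical names, and the exact-string version comparison is a simplifying assumption standing in for real version-range handling.

```go
package main

import (
	"errors"
	"fmt"
)

// Bundle is a hypothetical stand-in for a resolved or installed bundle.
type Bundle struct {
	Name    string
	Version string
}

// errCatalogUnavailable is a hypothetical sentinel meaning the catalog was
// deleted or could not be reached.
var errCatalogUnavailable = errors.New("catalog unavailable")

// resolveFunc is a hypothetical resolver signature; an empty wantVersion
// means the spec does not pin a version.
type resolveFunc func(pkg, wantVersion string) (*Bundle, error)

// resolveWithFallback returns the catalog result when resolution succeeds.
// When the catalog is unavailable and a bundle is already installed, it keeps
// the installed bundle. A requested version that differs from the installed
// one is still rejected, so upgrades stay blocked without catalog access.
func resolveWithFallback(resolve resolveFunc, pkg, wantVersion string, installed *Bundle) (*Bundle, bool, error) {
	b, err := resolve(pkg, wantVersion)
	if err == nil {
		return b, false, nil
	}
	if !errors.Is(err, errCatalogUnavailable) {
		return nil, false, err // unrelated failure: do not mask it
	}
	if installed == nil {
		return nil, false, fmt.Errorf("nothing installed to fall back to: %w", err)
	}
	if wantVersion != "" && wantVersion != installed.Version {
		return nil, false, fmt.Errorf("version change to %q requires catalog access: %w", wantVersion, err)
	}
	return installed, true, nil
}

func main() {
	// Simulate a deleted catalog: every resolution attempt fails.
	deleted := func(pkg, wantVersion string) (*Bundle, error) {
		return nil, errCatalogUnavailable
	}
	installed := &Bundle{Name: "test-operator", Version: "1.0.0"}

	b, fellBack, err := resolveWithFallback(deleted, "test-operator", "", installed)
	fmt.Println(b.Version, fellBack, err)

	_, _, err = resolveWithFallback(deleted, "test-operator", "1.2.0", installed)
	fmt.Println("upgrade blocked:", err != nil)
}
```

Returning a separate "fell back" flag lets the caller skip unpacking and report the degraded state without inspecting the error.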
/hold
TBD internal discussion: https://redhat-internal.slack.com/archives/C06KP34REFJ/p1768241320285279?thread_ts=1768235428.251779&cid=C06KP34REFJ
Problem
When a catalog becomes unavailable (deleted, registry offline, network issues), installed extensions break or stop being maintained. This PR ensures extensions continue working with their installed version until the catalog becomes available again.
What This Fixes
Issues on main when catalog is unavailable/deleted:
Note: Boxcutter already maintains resources via CER controller; Helm did not.
Solution
Added smart fallback logic:
Key Changes
- reconcileExistingRelease() to maintain resources when contentFS == nil
- contentFS == nil (CER controller maintains)
What "Extension Continues Working" Means
An extension continues working when:
- Installed=True
Testing
Added comprehensive e2e test suite in test/e2e/features/catalog-deletion-resilience.feature:
All scenarios tested for both Helm and Boxcutter runtimes where applicable.
What Still Requires Catalog (Correct Behavior)
Resolution Fails?