Conversation
There was a problem hiding this comment.
Pull request overview
This PR makes the load_and_authorize_remix method deterministic by explicitly ordering remix records when multiple records exist with the same (remixed_from_id, user_id) pair. The fix addresses a production issue where Experience CS was receiving inconsistent project identifiers due to nondeterministic query results. The change ensures that users always receive their most recently created remix instead of a random one.
Changes:
- Modified
RemixesController#load_and_authorize_remixto use.where().order().first!instead offind_by!, ordering bycreated_at: :desc, updated_at: :descto deterministically return the most recent remix
| @project = Project.where(remixed_from_id: project.id, user_id: current_user&.id) | ||
| .order(created_at: :desc, updated_at: :desc) | ||
| .first! |
There was a problem hiding this comment.
The query filters on both remixed_from_id and user_id, but there's only a single-column index on remixed_from_id (see db/schema.rb:241). Consider adding a composite index on (remixed_from_id, user_id) to improve query performance, especially since this query is now also ordering by created_at and updated_at. This would make the WHERE clause more efficient and potentially help with the ordering if the index also includes those columns.
There was a problem hiding this comment.
I have a follow-up task for this in my epic.
| def load_and_authorize_remix | ||
| @project = Project.find_by!(remixed_from_id: project.id, user_id: current_user&.id) | ||
| @project = Project.where(remixed_from_id: project.id, user_id: current_user&.id) | ||
| .order(created_at: :desc, updated_at: :desc) |
There was a problem hiding this comment.
The ordering here (created_at: :desc, updated_at: :desc) is inconsistent with a similar query in app/controllers/api/lessons_controller.rb:98 which orders remixes by created_at: :asc and only uses created_at for ordering. Consider whether both places should use the same ordering logic, or if there's a specific reason for the difference. If the goal is to consistently return the "most recent" remix, then LessonsController#user_remix should also be updated to use descending order.
There was a problem hiding this comment.
Probably both should use the most-recently-created logic. Would appreciate a reviewer's view on this.
There was a problem hiding this comment.
😵💫 This is a bit of a muddle, I was hoping that there wasn't that many affected but I saw you said in the epic it's in the 10s of thousands.
I agree on making everywhere that loads projects consistent before we clear the duplicates and put the unique index in.
Probably both should use the most-recently-created logic.
The thing I'd want to minimize is hiding/losing projects that people have worked on. Picking the most recently created sounds like it might be a good approach, but would it be possible to confirm that? Maybe running a query to see if many duplicate projects with an older created_at have a more recent updated_at?
There are things that hang off of projects ('components' and 'feedback' via 'school_projects') so it might be good to check if they would become hidden.
There was a problem hiding this comment.
Yes, I agree. I'll do some more analysis on this before we merge it. I have some SQL in my notes that I can adapt to show some of this, I think. I'm about out of brainpower for today but I'll keep you in the loop.
ecaa5e2 to
529bb80
Compare
This commit changes RemixesController#load_and_authorize_remix to deterministically always return the most recent Project record based on created_at and, secondarily, updated_at dates. While, in principle, there should not be multiple records where (remixed_from_id:, user_id:) are the same, in practice there are and there is no database constraint to prevent there being. The previous code would nondeterministically return one of possibly many records, causing downstream errors in Experience CS since each would have a different identifier, not matching the one stored in Experience CS. Overarching epic: RaspberryPiFoundation/experience-cs#1826
529bb80 to
dac5345
Compare
This commit changes
RemixesController#load_and_authorize_remixto deterministically always return the most recent Project record based oncreated_atand, secondarily,updated_atdates.While, in principle, there should not be multiple records where
(remixed_from_id:, user_id:)are the same, in practice there are and there is no database constraint to prevent there being.The previous code would nondeterministically return one of possibly many records, causing downstream errors in Experience CS since each would have a different identifier, not matching the one stored in Experience CS.
Overarching epic:
https://github.com/RaspberryPiFoundation/experience-cs/issues/1826
Status
Points for consideration:
What's changed?
In
RemixesController#load_ and_authorize_remixinstead of usingProject.find_by!we now useProejct.where(...).order(created_at: :desc, updated_at: :desc).first!Steps to perform after deploying to production
There are no additional steps required and this behaviour change should only affect users who have multiple remix records - these users will now only be able to access their most-recently-created remix instead of getting a random one back.