Hi! It looks like you correctly set up a CI job that uses the autofix.ci GitHub Action, but the autofix.ci GitHub App has not been installed for this repository. This means that autofix.ci unfortunately does not have the permissions to fix this pull request. If you are the repository owner, please install the app and then restart the CI workflow! 😃
on-demand Sync Mode
ca40064 to
a41deb1
stevensJourney left a comment
The implementation looks good to me. Pending mutations are the only possible concern I can think of.
const oldDataWhenClause = toInlinedWhereClause(compiledOldData)
const viewWhereClause = toInlinedWhereClause(compiledView)

await disposeTracking?.()
One important case to consider here is pending mutations.
The current flow for mutations is:
- Code creates the mutation on the collection, e.g. `collection.insert(...)`.
- The mutation runs through `PowerSyncTransactor`, which asynchronously writes the operation to SQLite. Writing the change to SQLite immediately creates a diff record in `trackedTableName`.
- `PowerSyncTransactor` then records that latest diff record in `PendingOperationStore` and waits for the record to be observed by the `onChange` handler. Observing the record results in the change being written to the TanStackDB collection. Because the mutation promise waits until the change has been reported to the collection, the TanStackDB collection can drop the optimistic state at the correct time.
What we should confirm and cater for in this case is that we're potentially dropping the trigger (which also deletes the destination table) and replacing it with a trigger which may have a different filter. This means that there could be a potential case where the diff record being awaited for processing in PendingOperationStore is never detected, and the pending mutation never resolves.
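To make the concern concrete, here is a minimal sketch of how awaited diff records could be left unresolved when the trigger is replaced. It is not the PR's code: `PendingOperations`, `markObserved` and `flushBeforeTriggerSwap` are hypothetical names, diff ids are assumed to increase monotonically, and the explicit flush is only one possible mitigation.

```typescript
// Hypothetical model of a pending-operation store; not the PR's actual class.
type Waiter = { diffId: number; settle: (observed: boolean) => void };

class PendingOperations {
  private waiters: Waiter[] = [];

  // Called after a mutation is written to SQLite. The promise settles once
  // the diff record with this id has been observed (true) or given up on (false).
  waitFor(diffId: number): Promise<boolean> {
    return new Promise((resolve) => {
      this.waiters.push({ diffId, settle: resolve });
    });
  }

  // Called by the change handler after applying diff records up to `lastSeenId`.
  // Assumes diff ids increase monotonically.
  markObserved(lastSeenId: number): void {
    this.settleWhere((w) => w.diffId <= lastSeenId, true);
  }

  // Called before the trigger and its destination table are dropped. Records
  // still awaited can no longer be observed, so settle them explicitly
  // instead of leaving the mutation promises pending forever.
  flushBeforeTriggerSwap(): void {
    this.settleWhere(() => true, false);
  }

  // Number of mutations still waiting on a diff record.
  get size(): number {
    return this.waiters.length;
  }

  private settleWhere(match: (w: Waiter) => boolean, observed: boolean): void {
    const remaining: Waiter[] = [];
    for (const w of this.waiters) {
      if (match(w)) w.settle(observed);
      else remaining.push(w);
    }
    this.waiters = remaining;
  }
}
```

Without something like the flush step, a waiter whose diff record lived in the dropped table would stay in the store indefinitely. Whether to resolve, reject, or re-check such mutations against the source table is a design decision this sketch leaves open.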
On-demand collection sync
This PR introduces an `on-demand` sync mode for collections, building on top of the existing eager implementation.
Instead of copying the entire source table into the collection upfront, `on-demand` mode only syncs the subset of data relevant to active live queries. We achieve this by implementing the `loadSubset` and `unloadSubset` handlers that TanstackDB calls when live queries are registered or deregistered.
How it works:
When `loadSubset` is called, we receive the query's where expression from the TanstackDB query API. We compile this down to a SQLite `WHERE` clause (taking a comparable approach to what Electric does for PostgreSQL), and the PoC covers every where expression supported by the TanstackDB query API. The compiled expression is added to our set of tracked expressions, and we refresh the diff trigger with all accumulated expressions OR'd together. `unloadSubset` removes the expression and refreshes the diff trigger accordingly.
The existing diff trigger and tracking table infrastructure is reused. The only difference is that the trigger now watches a constrained dataset defined by the combined query expressions rather than the full source table.
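As a rough illustration of the compile-and-combine step, the sketch below turns a small expression tree into an inlined SQLite `WHERE` fragment and ORs the tracked fragments together. The `Expr` shape, `compile` and `combineTracked` are hypothetical and much simpler than the real TanstackDB expression IR or the PR's compiler. Values are inlined because SQLite does not allow bound parameters inside trigger definitions.

```typescript
// Hypothetical, simplified expression shape; the real TanstackDB IR differs.
type Literal = string | number | boolean | null;
type Expr =
  | { kind: "ref"; column: string }
  | { kind: "val"; value: Literal }
  | { kind: "func"; name: "eq" | "gt" | "lt" | "and" | "or"; args: Expr[] };

const OPERATORS: Record<string, string> = {
  eq: "=",
  gt: ">",
  lt: "<",
  and: "AND",
  or: "OR",
};

// Escape a literal and embed it directly in the SQL text.
function inlineValue(value: Literal): string {
  if (value === null) return "NULL";
  if (typeof value === "number") return String(value);
  if (typeof value === "boolean") return value ? "1" : "0";
  return `'${value.replace(/'/g, "''")}'`;
}

// Recursively compile an expression to a parenthesised SQL fragment.
// Comparisons against NULL would need IS NULL; that case is omitted here.
function compile(expr: Expr): string {
  switch (expr.kind) {
    case "ref":
      return `"${expr.column.replace(/"/g, '""')}"`;
    case "val":
      return inlineValue(expr.value);
    case "func":
      return `(${expr.args.map(compile).join(` ${OPERATORS[expr.name]} `)})`;
  }
}

// OR together every tracked WHERE fragment for the refreshed trigger.
function combineTracked(tracked: Iterable<string>): string {
  const clauses = Array.from(tracked);
  // Assumption: with no active live queries, nothing should be tracked.
  if (clauses.length === 0) return "0";
  return clauses.map((c) => `(${c})`).join(" OR ");
}
```

Under these assumptions, `loadSubset` would add the output of `compile` to a set of tracked fragments and `unloadSubset` would remove it, with `combineTracked` recomputed on each change.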
Stale data eviction on unload
Since where expressions are OR'd together, adding queries only ever widens the synced dataset. When a query is deregistered, however, its data may become stale since it's no longer actively synced. To handle this, `unloadSubset` evicts entries from the collection that match the departing query but not any of the remaining queries, effectively: `SELECT id FROM ${viewName} WHERE (${departingWhereSQL}) AND NOT (${remainingWhereSQL})`.
Future Work (red area)
Map out the possible queries we can expect with TanstackDB, and map what it would look like if we derived sync stream information from the predicates. This would allow us to constrain/optimise the data synced from the PowerSync service to the local PowerSync SQLite data, which could in turn make client-side `on-demand` sync faster.
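For reference, the eviction query described under "Stale data eviction on unload" could be assembled roughly as follows. `buildEvictionQuery` is a hypothetical helper, not taken from the PR, and it assumes the view name and WHERE fragments are trusted, already-compiled SQL. It also assumes that when no queries remain, the `AND NOT (...)` part is simply dropped.

```typescript
// Hypothetical helper; inputs are assumed to be trusted, already-compiled SQL.
function buildEvictionQuery(
  viewName: string,
  departingWhereSQL: string,
  remainingWhereSQL: string[],
): string {
  const base = `SELECT id FROM ${viewName} WHERE (${departingWhereSQL})`;
  // With no remaining queries, every row matched by the departing query is stale.
  if (remainingWhereSQL.length === 0) return base;
  // Rows still covered by any remaining query must be kept.
  const remaining = remainingWhereSQL.map((w) => `(${w})`).join(" OR ");
  return `${base} AND NOT (${remaining})`;
}
```

One detail to keep in mind: in SQL, `NOT` applied to a `NULL` result is still `NULL`, so rows for which the remaining predicates evaluate to `NULL` are not selected by this form. Whether that matters depends on the columns the predicates touch.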