fix(export): stream large database dumps with keyset pagination (#59) #233
Open
trongtruong110-ux wants to merge 1 commit into
…rbase#59) Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Problem
`dumpDatabaseRoute` builds the entire dump, every row of every table, into a single in-memory string before responding (`let dumpContent = ''` … then `new Blob([dumpContent])`). For a large database this exceeds the isolate's memory limit and the request fails outright, which is what #59 reports.

Changes
Rewrites the SQL dump path to stream instead of buffer:
- Streams the response through a pull-driven `ReadableStream`. The runtime only asks for the next page once the previous chunk has flushed downstream, so peak memory is roughly one page regardless of database size.
- Pages with a keyset cursor, `WHERE _rowid_ > ? ORDER BY _rowid_ LIMIT ?`, rather than `LIMIT`/`OFFSET`. `OFFSET` re-scans every skipped row on each page (O(n²) overall), which is unusable on a large table. `WITHOUT ROWID` tables (which have no `_rowid_`) fall back to `LIMIT`/`OFFSET`.
- Encodes values as SQL literals: `NULL`, numbers, `0`/`1` for booleans, `X'..'` for blobs, and escaped strings. Internal `sqlite_*` tables are skipped.
- Failures that occur before the stream starts still return a `500`; the table list is resolved eagerly for that reason.

No route or public API change: `dumpDatabaseRoute(dataSource, config)` is unchanged (an optional `pageSize` argument is added for tests and tuning).

Tests
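The pull-driven keyset loop described in the Changes list can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in this PR: `Row`, `QueryFn`, `keysetPages`, `allPages`, `dumpStream`, and the `__cursor` alias are hypothetical names, and the query function is synchronous here only to keep the sketch short.

```typescript
// Hypothetical types: the PR's real data source interface is not shown here.
type Row = { __cursor: number; [column: string]: unknown };
type QueryFn = (sql: string, params: unknown[]) => Row[];

// Yields one page at a time, resuming after the last rowid seen (the keyset cursor).
function* keysetPages(query: QueryFn, table: string, pageSize: number): Generator<Row[]> {
  const quoted = `"${table.replace(/"/g, '""')}"`;
  let cursor: number | null = null;
  while (true) {
    const rows: Row[] =
      cursor === null
        ? query(`SELECT _rowid_ AS __cursor, * FROM ${quoted} ORDER BY _rowid_ LIMIT ?`, [pageSize])
        : query(
            `SELECT _rowid_ AS __cursor, * FROM ${quoted} WHERE _rowid_ > ? ORDER BY _rowid_ LIMIT ?`,
            [cursor, pageSize],
          );
    if (rows.length === 0) return;
    yield rows;
    if (rows.length < pageSize) return; // a short page means the table is exhausted
    cursor = rows[rows.length - 1].__cursor;
  }
}

// Walks every table in order, one page at a time.
function* allPages(query: QueryFn, tables: string[], pageSize: number) {
  for (const table of tables) {
    for (const rows of keysetPages(query, table, pageSize)) yield { table, rows };
  }
}

// Wraps the page iterator in a ReadableStream. renderPage is supplied by the
// caller and turns one page into dump text (for example INSERT statements).
function dumpStream(
  query: QueryFn,
  tables: string[],
  renderPage: (table: string, rows: Row[]) => string,
  pageSize = 1000,
): ReadableStream<Uint8Array> {
  const encoder = new TextEncoder();
  const pages = allPages(query, tables, pageSize);
  return new ReadableStream<Uint8Array>({
    // pull() is called only while the stream's internal queue is below its
    // high-water mark, so roughly one page is buffered ahead of the consumer.
    pull(controller) {
      const next = pages.next();
      if (next.done) {
        controller.close();
        return;
      }
      controller.enqueue(encoder.encode(renderPage(next.value.table, next.value.rows)));
    },
  });
}
```

Because each query seeks directly to the cursor position, the total work stays linear in the number of rows, whereas an `OFFSET`-based loop would re-read all previously emitted rows on every page.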
`src/export/dump.test.ts` is rewritten to cover: streamed output with headers, multi-page keyset pagination (asserting the cursor query is used and `OFFSET` is not), value encoding (NULL / number / boolean / blob / quote-escaping), the `WITHOUT ROWID` fallback, an empty database, and the 500 error path.

Follow-up
This fixes the memory blow-up for the synchronous `/export/dump` endpoint. For databases large enough to also exceed the Worker CPU-time limit within a single request, a follow-up can offload the dump to R2 via a Durable Object alarm. Happy to do that as a separate PR.

/claim #59
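For reference, the value-encoding rules listed under Changes could look roughly like this. It is a minimal sketch rather than the PR's implementation: `toSqlLiteral` and `insertStatement` are hypothetical names, and mapping non-finite numbers to `NULL` is an assumption made here so the output stays valid SQL.

```typescript
// Hypothetical helper: encodes one JavaScript value as a SQLite literal.
function toSqlLiteral(value: unknown): string {
  if (value === null || value === undefined) return "NULL";
  if (typeof value === "number") {
    // Assumption: NaN and Infinity have no SQL literal, so emit NULL.
    return Number.isFinite(value) ? String(value) : "NULL";
  }
  if (typeof value === "bigint") return value.toString();
  if (typeof value === "boolean") return value ? "1" : "0";
  if (value instanceof ArrayBuffer) value = new Uint8Array(value);
  if (value instanceof Uint8Array) {
    // Blob literal: X'..' with two lowercase hex digits per byte.
    const hex = Array.from(value, (byte) => byte.toString(16).padStart(2, "0")).join("");
    return `X'${hex}'`;
  }
  // Strings (and anything else, via String()): double embedded single quotes.
  return `'${String(value).replace(/'/g, "''")}'`;
}

// Hypothetical helper: renders one row as an INSERT statement.
function insertStatement(table: string, row: Record<string, unknown>): string {
  const ident = (name: string) => `"${name.replace(/"/g, '""')}"`;
  const columns = Object.keys(row).map(ident).join(", ");
  const values = Object.values(row).map(toSqlLiteral).join(", ");
  return `INSERT INTO ${ident(table)} (${columns}) VALUES (${values});`;
}

const example = insertStatement("users", {
  id: 1,
  name: "O'Brien",
  active: true,
  avatar: new Uint8Array([0xde, 0xad]),
  note: null,
});
// example is:
// INSERT INTO "users" ("id", "name", "active", "avatar", "note") VALUES (1, 'O''Brien', 1, X'dead', NULL);
```

Doubling single quotes is the standard SQL escape for string literals, which is why no backslash escaping appears in the sketch.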