fix(prisma): add retry for Aurora Serverless v2 connection errors #121
Open
…, #105)

Why: Aurora Serverless v2 with auto-pause (0 ACU) drops connections on idle_session_timeout and takes ~15s to resume. Without retry, both runtime queries and CDK deployment migrations fail on transient errors. Also, DATABASE_URL (including the password) was logged to CloudWatch.

What:
- Remove console.log(DATABASE_URL) that leaked credentials to CloudWatch
- Add Prisma client extension with retry on transient connection errors (P2024, P1001, P1017, idle-session timeout, ECONNRESET)
- Add exponential backoff retry to migration-runner for prisma db push
- Optimize connection params: connection_limit=1, connect_timeout=30

The default pool_timeout (10s) is insufficient for Aurora Serverless v2 auto-pause resume (~15s). Also, PrismaClientInitializationError for pool timeout has errorCode=undefined, so message-based detection is needed.
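
The classification this commit describes (match on error code where one exists, fall back to the message where it is undefined) could look roughly like the sketch below. It is not the code from this PR: the function name `isTransientConnectionError` and the exact message patterns are assumptions for illustration, and it deliberately avoids importing Prisma so it can run standalone.

```typescript
// Error codes the PR lists as retryable.
const RETRYABLE_CODES = new Set(["P2024", "P1001", "P1017"]);

// ASSUMPTION: these message fragments are illustrative; check them against
// the messages your Prisma and PostgreSQL versions actually emit.
const RETRYABLE_MESSAGE_PATTERNS: RegExp[] = [
  /ECONNRESET/,
  /idle-session timeout/i,
  /Timed out fetching a new connection from the connection pool/i,
];

// Hypothetical helper: returns true when an error looks like a transient
// connection failure that is worth retrying.
function isTransientConnectionError(err: unknown): boolean {
  if (typeof err !== "object" || err === null) return false;
  const e = err as { code?: unknown; errorCode?: unknown; message?: unknown };

  // Known request errors expose `code`; initialization errors expose
  // `errorCode`, which may be undefined (the case this commit addresses).
  for (const code of [e.code, e.errorCode]) {
    if (typeof code === "string" && RETRYABLE_CODES.has(code)) return true;
  }

  // Fall back to message-based detection when no usable code is present.
  const message = typeof e.message === "string" ? e.message : "";
  return RETRYABLE_MESSAGE_PATTERNS.some((pattern) => pattern.test(message));
}
```

Anything that matches neither a code nor a pattern (auth failures, schema errors) returns false and would be rethrown immediately by the caller.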
Issue
close #104
close #105
Problem
The starter kit has three issues with Prisma + Aurora Serverless v2 (auto-pause enabled with minCapacity: 0):
- Credential leak: console.log(process.env.DATABASE_URL) in prisma.ts outputs the full connection string, including the password, to CloudWatch Logs.
- No runtime retry: Aurora drops idle connections after idle_session_timeout (60s) and takes ~15s to resume from auto-pause (docs). Without retry, queries fail with transient errors (P1017, ECONNRESET) and do not recover.
- No migration retry: migration-runner.ts runs prisma db push without retry. During cdk deploy, Aurora may still be resuming, causing P1001 ("Can't reach database server") and failing the entire deployment.
Solution
- Remove console.log(DATABASE_URL) to fix the credential leak.
- Add a Prisma client extension (Prisma.defineExtension with $allModels.$allOperations) that retries transient connection errors with exponential backoff. Retryable errors: P2024, P1001, P1017, idle-session timeout, ECONNRESET. Non-retryable errors (auth failures, schema errors) are thrown immediately.
- Add retry to migration-runner.ts for prisma db push with exponential backoff (base 3s, max 5 attempts, ~100s worst case, within the Lambda 5min timeout). Only P1001 / connection refused are retried.
- Optimize connection params: connection_limit=1 (Lambda handles one request per instance), connect_timeout=30 (accommodates auto-pause resume time).
Changes
- webapp/src/lib/prisma.ts: Remove console.log, remove verbose log option, add retry extension via $extends
- webapp/src/jobs/migration-runner.ts: Extract runPrismaDbPush with retry loop, structured logging
- cdk/lib/constructs/database.ts: Change connection options to ?connection_limit=1&connect_timeout=30
Verification
- console.log(process.env.DATABASE_URL) is removed
- cdk deploy succeeds even when Aurora is resuming from 0 ACU
- tsc --noEmit passes
- prettier --check passes
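
For reviewers who want a picture of the retry loop described under Solution, here is a minimal sketch of exponential backoff with a retryable-error predicate. It is not the code in this PR: `withRetry`, `backoffDelayMs`, and the option names are hypothetical, and the doubling factor is an assumption (the description only states base 3s and max 5 attempts). The `sleep` function is injectable so the loop can be exercised without real waiting.

```typescript
type RetryOptions = {
  maxAttempts: number; // total attempts, including the first one
  baseDelayMs: number; // delay before the first retry
  isRetryable: (err: unknown) => boolean;
  sleep?: (ms: number) => Promise<void>; // injectable for tests
};

const defaultSleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

// ASSUMPTION: the delay doubles after each failed attempt
// (base, 2*base, 4*base, ...). `attempt` is 1-based.
function backoffDelayMs(baseDelayMs: number, attempt: number): number {
  return baseDelayMs * 2 ** (attempt - 1);
}

// Hypothetical helper: runs `fn`, retrying on retryable errors until
// `maxAttempts` is reached. Non-retryable errors are rethrown immediately.
async function withRetry<T>(
  fn: () => Promise<T>,
  opts: RetryOptions,
): Promise<T> {
  const sleep = opts.sleep ?? defaultSleep;
  for (let attempt = 1; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      const exhausted = attempt >= opts.maxAttempts;
      if (exhausted || !opts.isRetryable(err)) throw err;
      await sleep(backoffDelayMs(opts.baseDelayMs, attempt));
    }
  }
}
```

With baseDelayMs of 3000 and maxAttempts of 5, this sketch sleeps 3s, 6s, 12s, and 24s between attempts, 45s in total. The ~100s worst case quoted above would also have to cover the time each failed attempt itself takes; that breakdown is not given in the PR.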