Add plan for gathering data for public profile #855
> The `feature_cte` already exists in the CTE chain. Rows where `feature_id` is null (untagged usage) are excluded — the `SELECT` returns no rows if the subquery is null, so the upsert is a no-op.
WARNING: The claim that "the SELECT returns no rows if the subquery is null" is incorrect in PostgreSQL.
A scalar subquery that returns NULL (either because feature_cte returns zero rows, or returns a row with feature_id = NULL) still causes the outer SELECT to produce one row — with NULL in the feature_id column. It does not produce zero rows.
Since the table schema defines feature_id integer NOT NULL (line 156), the INSERT would hit a NOT NULL constraint violation, which would fail the entire CTE chain — potentially blocking the primary microdollar_usage insert.
The CTE needs an explicit filter, e.g. `SELECT ... WHERE (SELECT feature_id FROM feature_cte) IS NOT NULL`, or a rewrite as a JOIN to `feature_cte` so that no row is produced when there's no matching feature.
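The scalar-subquery behavior described above is standard SQL and can be checked quickly. The sketch below uses Python's built-in `sqlite3` to illustrate it (the semantics match PostgreSQL here); `feature_cte_sim` is a hypothetical table standing in for the `feature_cte` result set.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Empty table: simulates feature_cte producing zero rows.
conn.execute("CREATE TABLE feature_cte_sim (feature_id INTEGER)")

# A scalar subquery over zero rows yields NULL, but the outer SELECT
# still emits ONE row containing NULL -- it does not emit zero rows.
rows = conn.execute("SELECT (SELECT feature_id FROM feature_cte_sim)").fetchall()
print(rows)  # [(None,)]

# With the explicit IS NOT NULL filter, the SELECT produces no rows,
# so a dependent INSERT ... SELECT becomes a true no-op.
rows = conn.execute(
    "SELECT (SELECT feature_id FROM feature_cte_sim) "
    "WHERE (SELECT feature_id FROM feature_cte_sim) IS NOT NULL"
).fetchall()
print(rows)  # []
```

In PostgreSQL, without the filter the single NULL row would then trip the `NOT NULL` constraint on `feature_id` and fail the whole statement, as noted above.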
> 1. Looks up `user_period_cache` by `shared_url_token`
> 2. If fresh (`computed_at` within 6h), returns `data` directly
> 3. If stale or missing, calls `computePublicProfileData()`, then returns the result
WARNING: The read path doesn't address the thundering herd / cache stampede problem.
If a popular profile's cache expires, multiple concurrent requests will all see "stale or missing" simultaneously and all trigger computePublicProfileData(). While the recomputation is cheap (~50-100ms per the plan), under high concurrency this still means redundant work and unnecessary DB writes.
The implementation sketch should mention one of:
- A lock/mutex (e.g. `SELECT ... FOR UPDATE SKIP LOCKED` or `pg_advisory_lock`) so only one request recomputes
- Stale-while-revalidate: serve the stale cache immediately and trigger a single async refresh (mentioned as a "possible" pro on line 122 but not reflected in the implementation sketch)

Line 122 already acknowledges this as a possibility — it should be promoted to the actual implementation plan.
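The single-flight idea behind both options can be sketched in a few lines. This is an illustrative in-process version only (the class name `SingleFlightCache` and the `compute_fn` parameter are hypothetical, not from the PR); the Postgres advisory-lock variant would apply the same shape across processes.

```python
import threading

class SingleFlightCache:
    """Only one caller recomputes a key; others get the stale value or wait."""

    def __init__(self, compute_fn):
        self.compute_fn = compute_fn
        self.cache = {}                 # key -> last computed value
        self.locks = {}                 # key -> per-key recompute lock
        self.guard = threading.Lock()   # protects self.locks

    def get(self, key):
        with self.guard:
            lock = self.locks.setdefault(key, threading.Lock())
        if lock.acquire(blocking=False):
            # We won the race: recompute and refresh the cache.
            try:
                value = self.compute_fn(key)
                self.cache[key] = value
                return value
            finally:
                lock.release()
        # Someone else is recomputing: serve the stale value immediately
        # (stale-while-revalidate), or wait for the winner if there is none.
        if key in self.cache:
            return self.cache[key]
        with lock:
            return self.cache[key]
```

A production version would also carry the 6h freshness check so winners skip recomputation when the cache is still fresh; that is omitted here for brevity.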
> `AND mum.created_at >= $batch_start`
> `AND mum.created_at < $batch_end`
> `GROUP BY mu.kilo_user_id, mum.feature_id, mum.created_at::date`
> `ON CONFLICT (kilo_user_id, feature_id, usage_date)`
WARNING: The backfill ON CONFLICT clause is additive, not idempotent.
`request_count = user_feature_daily_usage.request_count + EXCLUDED.request_count` adds to the existing value. If a batch is partially processed and then retried (e.g., the batch for Jan 1-31 commits Jan 1-15 and then the full batch is retried), the Jan 1-15 rows get double-counted.
Two safer alternatives:
- Delete before insert: run `DELETE FROM user_feature_daily_usage WHERE usage_date >= $batch_start AND usage_date < $batch_end` before each batch insert, making retries safe.
- Overwrite instead of add: use `= EXCLUDED.request_count` and `= EXCLUDED.total_tokens` (without adding to the existing value), so re-running a batch produces the same result.

Option 2 is simpler but only works if each batch covers non-overlapping date ranges and the write-path upserts haven't started yet.
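The additive-vs-overwrite difference is easy to demonstrate. The sketch below uses `sqlite3`, whose `ON CONFLICT ... DO UPDATE` upsert mirrors PostgreSQL's for this purpose; the single-column `daily_usage` table is a simplified stand-in for `user_feature_daily_usage`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE daily_usage (usage_date TEXT PRIMARY KEY, request_count INTEGER)"
)

# Additive upsert: each retry ADDS the batch's counts again.
additive = """INSERT INTO daily_usage VALUES ('2025-01-01', 5)
    ON CONFLICT(usage_date) DO UPDATE
    SET request_count = request_count + excluded.request_count"""

# Overwrite upsert: each retry writes the SAME final value (idempotent).
overwrite = """INSERT INTO daily_usage VALUES ('2025-01-01', 5)
    ON CONFLICT(usage_date) DO UPDATE
    SET request_count = excluded.request_count"""

# Simulate a retried batch: run the same statement twice.
conn.execute(additive)
conn.execute(additive)
count_additive = conn.execute("SELECT request_count FROM daily_usage").fetchone()[0]
print(count_additive)  # 10 -- the retry double-counted

conn.execute("DELETE FROM daily_usage")
conn.execute(overwrite)
conn.execute(overwrite)
count_overwrite = conn.execute("SELECT request_count FROM daily_usage").fetchone()[0]
print(count_overwrite)  # 5 -- re-running the batch is safe
```

This is why the additive form is only correct on the live write path (where each event is seen once), while a backfill that can be retried needs delete-before-insert or the overwrite form.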
**Code Review Summary**

Status: 3 Issues Found | Recommendation: Address before merge

**Issue Details:** WARNING (the three comments above)

**Files Reviewed:** 1 file
This adds the plan for gathering the data we would need for users to be able to expose a public profile to the world.