Skip to content

Add plan for gathering data for public profile#855

Closed
olearycrew wants to merge 1 commit intomainfrom
bdo/public-profile-data
Closed

Add plan for gathering data for public profile#855
olearycrew wants to merge 1 commit intomainfrom
bdo/public-profile-data

Conversation

@olearycrew
Copy link
Contributor

This adds the plan for gathering data we would need for users to be able to expose a public profile to the world

@olearycrew olearycrew requested a review from jeanduplessis March 5, 2026 16:30
)
```

The `feature_cte` already exists in the CTE chain. Rows where `feature_id` is null (untagged usage) are excluded — the `SELECT` returns no rows if the subquery is null, so the upsert is a no-op.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

WARNING: The claim that "the SELECT returns no rows if the subquery is null" is incorrect in PostgreSQL.

A scalar subquery that returns NULL (either because feature_cte returns zero rows, or returns a row with feature_id = NULL) still causes the outer SELECT to produce one row — with NULL in the feature_id column. It does not produce zero rows.

Since the table schema defines feature_id integer NOT NULL (line 156), the INSERT would hit a NOT NULL constraint violation, which would fail the entire CTE chain — potentially blocking the primary microdollar_usage insert.

The CTE needs an explicit filter, e.g.:

SELECT ... WHERE (SELECT feature_id FROM feature_cte) IS NOT NULL

or rewrite as a JOIN to feature_cte so that no row is produced when there's no matching feature.


1. Looks up `user_period_cache` by `shared_url_token`
2. If fresh (computed_at within 6h), returns `data` directly
3. If stale or missing, calls `computePublicProfileData()`, then returns the result
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

WARNING: The read path doesn't address the thundering herd / cache stampede problem.

If a popular profile's cache expires, multiple concurrent requests will all see "stale or missing" simultaneously and all trigger computePublicProfileData(). While the recomputation is cheap (~50-100ms per the plan), under high concurrency this still means redundant work and unnecessary DB writes.

The implementation sketch should mention one of:

  • A lock/mutex (e.g. SELECT ... FOR UPDATE SKIP LOCKED or pg_advisory_lock) so only one request recomputes
  • Stale-while-revalidate: serve the stale cache immediately and trigger an async single refresh (mentioned as a "possible" pro on line 122 but not reflected in the implementation sketch)

Line 122 already acknowledges this as a possibility — it should be promoted to the actual implementation plan.

AND mum.created_at >= $batch_start
AND mum.created_at < $batch_end
GROUP BY mu.kilo_user_id, mum.feature_id, mum.created_at::date
ON CONFLICT (kilo_user_id, feature_id, usage_date)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

WARNING: The backfill ON CONFLICT clause is additive, not idempotent.

request_count = user_feature_daily_usage.request_count + EXCLUDED.request_count adds to the existing value. If a batch is partially processed and then retried (e.g., batch for Jan 1-31 commits Jan 1-15, then the full batch is retried), Jan 1-15 rows get double-counted.

Two safer alternatives:

  1. Delete before insert: DELETE FROM user_feature_daily_usage WHERE usage_date >= $batch_start AND usage_date < $batch_end before each batch insert, making retries safe.
  2. Overwrite instead of add: Use = EXCLUDED.request_count and = EXCLUDED.total_tokens (without adding to the existing value), so re-running a batch produces the same result.

Option 2 is simpler but only works if each batch covers non-overlapping date ranges and the write-path upserts haven't started yet.

@kilo-code-bot
Copy link
Contributor

kilo-code-bot bot commented Mar 5, 2026

Code Review Summary

Status: 3 Issues Found | Recommendation: Address before merge

Overview

Severity Count
CRITICAL 0
WARNING 3
SUGGESTION 0
Issue Details (click to expand)

WARNING

File Line Issue
plans/public-profile-cache.md 186 Incorrect claim about NULL subquery behavior — the SELECT still produces a row with NULL feature_id, which would violate the NOT NULL constraint and fail the entire CTE chain. The CTE needs a WHERE ... IS NOT NULL filter or a JOIN instead.
plans/public-profile-cache.md 201 Read path doesn't address thundering herd / cache stampede when multiple concurrent requests hit a stale cache. Should specify a locking strategy or promote the stale-while-revalidate idea from the pros into the implementation sketch.
plans/public-profile-cache.md 221 Backfill ON CONFLICT is additive, not idempotent — retrying a partially-processed batch would double-count rows. Should either delete-before-insert or use overwrite semantics.
Files Reviewed (1 file)
  • plans/public-profile-cache.md - 3 issues

Fix these issues in Kilo Cloud

@olearycrew olearycrew closed this Mar 12, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant