feat: add custom 404 page with Rudderstack docs_404 tracking #191
Conversation
Overrides Starlight's built-in 404 to fire a `docs_404` track event via Rudderstack with the originally-requested broken URL and referrer.

Without this, every 404 hit logs https://docs.warp.dev/404/ as the page URL in analytics, which hides the paths that are actually broken. The new event captures `broken_url` (`window.location.href`) and `referrer` (`document.referrer`) so we can identify the top missing redirects and fix them systematically.

The script uses the RS queue pattern (`is:inline` + `window.rudderanalytics`) so it fires reliably before the async SDK finishes loading.

Part of the broken-redirects fix plan.

Co-Authored-By: Oz <oz-agent@warp.dev>
I'm starting a first review of this pull request. You can view the conversation on Warp. I completed the review and no human review was requested for this pull request. Powered by Oz
Overview
This PR adds a custom src/pages/404.astro page that uses Starlight chrome and emits a docs_404 RudderStack event for missing docs URLs. No approved spec context was available, so this review is based on the attached PR description and annotated diff.
Concerns
- The new telemetry payload sends the full requested URL and full referrer to analytics, which can include query strings or fragments containing sensitive values. Capture only the redirect-monitoring data needed, or redact sensitive components before tracking.
Security
- Full 404 URLs and referrers can contain tokens, email addresses, OAuth codes, or other private values; avoid sending them unredacted to RudderStack.
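The redaction requested here could be sketched roughly as below. This is an illustration in Python (the language of the PR's helper scripts), not the code in `src/pages/404.astro`; the page itself would need the equivalent logic in browser JavaScript. The function names `redact_url` and `build_docs_404_properties` are hypothetical, and the sketch assumes that dropping the query string, fragment, and any userinfo is enough for redirect monitoring. Sensitive values embedded in the path itself would not be caught.

```python
from urllib.parse import urlsplit, urlunsplit


def redact_url(url: str) -> str:
    """Keep scheme, host, and path; drop userinfo, query string, and fragment."""
    if not url:
        return ""
    parts = urlsplit(url)
    # netloc may look like "user:password@host:port"; keep only "host:port".
    host = parts.netloc.rpartition("@")[2]
    return urlunsplit((parts.scheme, host, parts.path, "", ""))


def build_docs_404_properties(href: str, referrer: str) -> dict:
    """Hypothetical helper: build the docs_404 payload from redacted values."""
    return {
        "broken_url": redact_url(href),
        "referrer": redact_url(referrer),
    }


if __name__ == "__main__":
    print(build_docs_404_properties(
        "https://docs.warp.dev/old/page/?token=abc#section",
        "https://example.com/search?q=user@example.com",
    ))
```

Keeping the path is what makes the event useful for finding missing redirects, so the sketch removes only the components that the review flags as likely to carry tokens or personal data.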
Verdict
Found: 0 critical, 1 important, 0 suggestions
Request changes
Comment /oz-review on this pull request to retrigger a review (up to 3 times on the same pull request).
Co-authored-by: oz-for-oss[bot] <277970191+oz-for-oss[bot]@users.noreply.github.com>
* fix: add 52 missing redirects from GitBook → Astro migration audit

Ran an audit comparing all paths from old GitBook SUMMARY.md files against the live Astro pages and existing vercel.json redirect sources. Found 52 paths that were live on GitBook but returned 404 on the new site.

Most gaps are in the guides section, which was reorganized from a developer-workflows/beginner/power-user/etc hierarchy into flat category sections (agent-workflows, configuration, devops, external-tools, frontend, getting-started, build-an-app-in-warp).

Also adds the two redirect audit tooling scripts for future use:
- scripts/audit_redirects.py: extracts GitBook paths and diffs against vercel.json
- scripts/add_missing_redirects.py: applies the mapping to vercel.json

Part of the broken-redirects fix plan (Steps 3-4).

Co-Authored-By: Oz <oz-agent@warp.dev>

* feat: add weekly-404-monitor skill for scheduled 404 gap detection

Adds a new Oz skill + helper script for a recurring Monday morning agent that surfaces broken docs.warp.dev URLs, diffs them against vercel.json, and posts a Slack summary.

The skill (SKILL.md) defines the full runbook:
- Queries stg_website_events for docs_404 track events (past 7 days)
- Loads vercel.json redirect sources from disk or GitHub raw URL
- Identifies uncovered broken URLs and computes week-over-week delta
- Posts a Block Kit Slack summary to the docs team channel
- Writes a CSV artifact to data/404-reports/ (run artifact, not committed)
- Handles no-data state (when PR #191 hasn't been live long enough)
- Handles failure gracefully with Slack error notices

The helper (run_404_report.py) implements the BigQuery queries via Metabase API, diff logic, and CSV writing. Outputs JSON summary to stdout for the agent to use when constructing the Slack message.

Schedule: every Monday 9am PT (cron: 0 17 * * 1 UTC).

Deploy via oz.warp.dev once METABASE_API_KEY, SLACK_BOT_TOKEN, and SLACK_CHANNEL_ID are set.

Part of the broken-redirects fix plan.

Co-Authored-By: Oz <oz-agent@warp.dev>

* fix: consolidate redirects with globs, add %2B variants for + paths

Addresses two review comments on PR #192:

Petra's suggestion: collapse 3 groups of redirects with identical slug-to-slug mappings into Vercel :slug* glob rules:
- /guides/integrations/:slug* → /guides/external-tools/:slug/
- /guides/mcp-servers/:slug* → /guides/external-tools/:slug/
- /guides/developer-workflows/devops/:slug* → /guides/devops/:slug/

13 individual rules → 3 globs (future GitBook paths in these sections will also be covered automatically).

oz-for-oss bot: add %2B-encoded duplicates for the 6 source paths that contain literal '+' characters. Vercel uses path-to-regexp for source matching where '+' may be misinterpreted. The %2B variants ensure URL-encoded requests are also covered.

Co-Authored-By: Oz <oz-agent@warp.dev>

* fix: remove invalid %2B source patterns, fix glob catch-all destinations

Vercel deployment was failing with 'invalid-route-source-pattern' due to two issues in the glob/redirect additions:

1. %2B-encoded source patterns: Vercel's routing engine rejects URL-encoded characters (%) in source patterns. Removed 6 %2B duplicates. They're unnecessary anyway: Vercel normalises paths before route matching, so a literal + source already covers requests where the client sent %2B.
2. Glob destination mismatch: when the source uses :slug* (catch-all), the destination must also reference :slug* to substitute the full captured path. Changed /:slug/ → /:slug*/ in all 3 glob rules.

Co-Authored-By: Oz <oz-agent@warp.dev>

* fix: remove 6 source patterns with regex metacharacters (+ and .)

Vercel uses path-to-regexp v6.1.0 for source pattern matching. In this version, '+' and '.' are regex metacharacters (+ = one-or-more, . = any-char) that cause 'invalid-route-source-pattern' errors when they appear as literals in non-parameter positions.

The 6 affected patterns all came from old GitBook URLs where '+' was used as a separator (e.g. 'react-+-tailwind', 'd3.js-+-javascript'). These are obscure legacy URLs with negligible bookmark traffic. The 46 other redirects and 3 :slug* glob patterns are unaffected.

Co-Authored-By: Oz <oz-agent@warp.dev>

* docs: update weekly-404-monitor skill to use buzz environment

- Use existing buzz environment (already has warpdotdev/docs checked out)
- Target Slack channel: #growth-docs
- Document the 3 secrets to verify in buzz: METABASE_API_KEY, SLACK_BOT_TOKEN, SLACK_CHANNEL_ID

Co-Authored-By: Oz <oz-agent@warp.dev>

---------

Co-authored-by: Oz <oz-agent@warp.dev>
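The step of diffing broken URLs against vercel.json redirect sources, mentioned in the weekly-404-monitor commit, might look something like the sketch below. It is not the contents of run_404_report.py, which this page does not show. The helper names are hypothetical, and the matching is a deliberate simplification: a source ending in a `:name*` catch-all is treated as a plain path prefix, and every other source is compared literally, ignoring a trailing slash. Vercel's real matcher (path-to-regexp) supports more syntax than this.

```python
import json


def load_redirect_sources(vercel_json_text: str) -> list[str]:
    """Collect the `source` of every rule under the top-level `redirects` key."""
    config = json.loads(vercel_json_text)
    return [rule["source"] for rule in config.get("redirects", [])]


def normalize(path: str) -> str:
    """Ignore a trailing slash so /a/b and /a/b/ compare equal."""
    return path.rstrip("/") or "/"


def is_covered(path: str, sources: list[str]) -> bool:
    """Approximate check: does any redirect source match this path?"""
    target = normalize(path)
    for source in sources:
        if source.endswith("*") and "/:" in source:
            # Catch-all such as /guides/integrations/:slug*
            # Assumption: everything before "/:" is a literal prefix.
            prefix = source.split("/:", 1)[0]
            if target == prefix or target.startswith(prefix + "/"):
                return True
        elif normalize(source) == target:
            return True
    return False


def uncovered_paths(broken_paths: list[str], sources: list[str]) -> list[str]:
    """Broken paths that no redirect source covers, in input order."""
    return [p for p in broken_paths if not is_covered(p, sources)]


if __name__ == "__main__":
    example = json.dumps({"redirects": [
        {"source": "/guides/integrations/:slug*",
         "destination": "/guides/external-tools/:slug*/"},
        {"source": "/old-page", "destination": "/new-page/"},
    ]})
    sources = load_redirect_sources(example)
    print(uncovered_paths(["/guides/integrations/github/", "/missing/"], sources))
```

A prefix approximation can over-report coverage for sources with single-segment parameters, so a production check would more likely compile each source with a path-to-regexp-compatible matcher.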
Summary
Adds a custom `src/pages/404.astro` page that overrides Starlight's built-in 404 to fire a Rudderstack `docs_404` track event with the originally-requested broken URL.

Why this matters: Every 404 visit currently logs `https://docs.warp.dev/404/` as the page URL in analytics. This means we have ~3,800 broken-URL visits/week but zero visibility into which URLs are broken. With this change, the event captures `broken_url` (the actual path the visitor tried to reach) and `referrer`, enabling the weekly automated 404 monitoring agent (Step 5 of the redirect-fix plan) to surface the top missing redirects.

Changes
`src/pages/404.astro` (new file)
- Uses `StarlightPage` to inherit full site chrome (nav, custom Head, Rudderstack snippet, Vercel Analytics)
- Adds an `is:inline` script that fires `window.rudderanalytics.track('docs_404', { broken_url, referrer })` via the existing RS queue pattern, which works reliably before the async SDK finishes loading

Testing
- `npm run build` passes (337 pages built)

Part of
Broken redirects fix plan, Step 1 (instrumentation prerequisite for all other steps).

Co-Authored-By: Oz <oz-agent@warp.dev>
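To illustrate how `docs_404` events could be turned into a list of top missing redirects, here is a minimal sketch. It assumes each event has already been fetched as a dict with a `broken_url` key. That event shape and the function name `top_broken_paths` are hypothetical and not taken from the PR. Counts are grouped by path only, so the same page requested with different query strings is counted once per hit under a single path.

```python
from collections import Counter
from urllib.parse import urlsplit


def top_broken_paths(events: list[dict], limit: int = 10) -> list[tuple[str, int]]:
    """Rank broken paths by hit count, most frequent first.

    Ties are broken alphabetically so the output is deterministic.
    """
    counts = Counter(
        urlsplit(event["broken_url"]).path
        for event in events
        if event.get("broken_url")  # skip events with a missing or empty URL
    )
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:limit]


if __name__ == "__main__":
    sample_events = [
        {"broken_url": "https://docs.warp.dev/a/"},
        {"broken_url": "https://docs.warp.dev/b/?x=1"},
        {"broken_url": "https://docs.warp.dev/b/"},
        {"broken_url": ""},
    ]
    for path, hits in top_broken_paths(sample_events):
        print(f"{hits:>5}  {path}")
```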