feat: add custom 404 page with Rudderstack docs_404 tracking #191

Merged
rachaelrenk merged 2 commits into main from oz/fix-404-instrumentation
Jun 8, 2026

Conversation

@rachaelrenk

Summary

Adds a custom src/pages/404.astro page that overrides Starlight's built-in 404 to fire a Rudderstack docs_404 track event with the originally-requested broken URL.

Why this matters: Every 404 visit currently logs https://docs.warp.dev/404/ as the page URL in analytics. This means we have ~3,800 broken-URL visits/week but zero visibility into which URLs are broken. With this change, the event captures broken_url (the actual path the visitor tried to reach) and referrer, enabling the weekly automated 404 monitoring agent (Step 5 of the redirect-fix plan) to surface the top missing redirects.

Changes

src/pages/404.astro (new file)

  • Uses StarlightPage to inherit full site chrome (nav, custom Head, Rudderstack snippet, Vercel Analytics)
  • Adds an is:inline script that fires window.rudderanalytics.track('docs_404', { broken_url, referrer }) via the existing RS queue pattern — works reliably before the async SDK finishes loading
  • Minimal "Page not found" content with a link back to home
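
The queue pattern in the second bullet can be modeled independently of the browser. The real snippet is inline JavaScript on the page; the Python sketch below only illustrates the idea that calls made before the SDK loads are buffered and replayed afterwards. The class `QueuedAnalytics`, its `attach` method, and the sample URL are hypothetical names for illustration, not part of the Rudderstack SDK.

```python
from typing import Any, Callable, Optional


class QueuedAnalytics:
    """Buffers track calls until a real sender is attached, then replays them in order."""

    def __init__(self) -> None:
        self._pending = []  # (event, properties) pairs recorded before load
        self._send: Optional[Callable[[str, dict], None]] = None

    def track(self, event: str, properties: dict) -> None:
        if self._send is None:
            # SDK not loaded yet: remember the call instead of dropping it.
            self._pending.append((event, properties))
        else:
            self._send(event, properties)

    def attach(self, send: Callable[[str, dict], None]) -> None:
        """Hypothetical hook run once the SDK has loaded; flushes the buffer."""
        self._send = send
        for event, properties in self._pending:
            send(event, properties)
        self._pending.clear()


delivered = []  # stands in for events that reached the analytics backend
analytics = QueuedAnalytics()

# The 404 page fires its event immediately, before the SDK is available.
analytics.track("docs_404", {"broken_url": "https://docs.warp.dev/old-page/", "referrer": ""})

# Later the SDK finishes loading and the buffered call is delivered.
analytics.attach(lambda event, props: delivered.append((event, props)))
```

The point of the design is ordering: the page does not need to wait for, or even know about, the moment the SDK becomes ready.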

Testing

  • npm run build passes (337 pages built)
  • Page inherits the same Rudderstack, Vercel Analytics, and Kapa AI setup as all other pages

Part of

Broken redirects fix plan — Step 1 (instrumentation prerequisite for all other steps).

Co-Authored-By: Oz <oz-agent@warp.dev>

Overrides Starlight's built-in 404 to fire a `docs_404` track event
via Rudderstack with the originally-requested broken URL and referrer.

Without this, every 404 hit logs https://docs.warp.dev/404/ as the
page URL in analytics — hiding which paths are actually broken. The
new event captures `broken_url` (window.location.href) and
`referrer` (document.referrer) so we can identify the top missing
redirects and fix them systematically.

The script uses the RS queue pattern (is:inline + window.rudderanalytics)
so it fires reliably before the async SDK finishes loading.

Part of the broken-redirects fix plan.

Co-Authored-By: Oz <oz-agent@warp.dev>
@cla-bot cla-bot Bot added the cla-signed label Jun 8, 2026
@vercel

vercel Bot commented Jun 8, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project   Deployment   Actions            Updated (UTC)
docs      Ready        Preview, Comment   Jun 8, 2026 6:31pm

@oz-for-oss

oz-for-oss Bot commented Jun 8, 2026

@rachaelrenk

I'm starting a first review of this pull request.

You can view the conversation on Warp.

I completed the review and no human review was requested for this pull request.

Comment /oz-review on this pull request to retrigger a review (up to 3 times on the same pull request).

Powered by Oz

@oz-for-oss oz-for-oss Bot left a comment

Overview

This PR adds a custom src/pages/404.astro page that uses Starlight chrome and emits a docs_404 RudderStack event for missing docs URLs. No approved spec context was available, so this review is based on the attached PR description and annotated diff.

Concerns

  • The new telemetry payload sends the full requested URL and full referrer to analytics, which can include query strings or fragments containing sensitive values. Capture only the redirect-monitoring data needed, or redact sensitive components before tracking.

Security

  • Full 404 URLs and referrers can contain tokens, email addresses, OAuth codes, or other private values; avoid sending them unredacted to RudderStack.
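
One way to address this concern is to strip everything except the path before tracking. This is a minimal sketch of that redaction logic, assuming redirect monitoring only needs the requested path and the referring site. On the page itself the same logic would have to live in the inline JavaScript. The function names `path_only` and `referrer_origin` are hypothetical.

```python
from urllib.parse import urlsplit


def path_only(url: str) -> str:
    """Return just the path of a URL, dropping scheme, host, query string, and fragment."""
    return urlsplit(url).path or "/"


def referrer_origin(referrer: str) -> str:
    """Reduce a referrer to scheme://host; credentials, port, path, and query are dropped."""
    parts = urlsplit(referrer)
    if not parts.scheme or not parts.hostname:
        return ""  # empty or relative referrer: nothing safe to report
    return f"{parts.scheme}://{parts.hostname}"


# Hypothetical inputs carrying the kinds of values the review warns about.
broken = path_only("https://docs.warp.dev/old/page/?token=abc123&email=a%40b.com#frag")
source = referrer_origin("https://user:pw@example.com:8443/callback?code=xyz")
print(broken, source)
```

Dropping the query string loses some debugging detail, so a team might instead keep an allowlist of known-safe parameters. That trade-off is not decided in this thread.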

Verdict

Found: 0 critical, 1 important, 0 suggestions

Request changes

Comment thread src/pages/404.astro Outdated
@rachaelrenk rachaelrenk self-assigned this Jun 8, 2026
Co-authored-by: oz-for-oss[bot] <277970191+oz-for-oss[bot]@users.noreply.github.com>
@rachaelrenk rachaelrenk merged commit cfa40ed into main Jun 8, 2026
8 checks passed
@rachaelrenk rachaelrenk deleted the oz/fix-404-instrumentation branch June 8, 2026 18:32
rachaelrenk added a commit that referenced this pull request Jun 8, 2026
Adds a new Oz skill + helper script for a recurring Monday morning
agent that surfaces broken docs.warp.dev URLs, diffs them against
vercel.json, and posts a Slack summary.

The skill (SKILL.md) defines the full runbook:
- Queries stg_website_events for docs_404 track events (past 7 days)
- Loads vercel.json redirect sources from disk or GitHub raw URL
- Identifies uncovered broken URLs and computes week-over-week delta
- Posts a Block Kit Slack summary to the docs team channel
- Writes a CSV artifact to data/404-reports/ (run artifact, not committed)
- Handles no-data state (when PR #191 hasn't been live long enough)
- Handles failure gracefully with Slack error notices

The helper (run_404_report.py) implements the BigQuery queries via
Metabase API, diff logic, and CSV writing. Outputs JSON summary to
stdout for the agent to use when constructing the Slack message.
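
The diff, delta, CSV, and JSON steps described above can be sketched as follows. This is not the contents of run_404_report.py, which are not shown in this thread. The function names, the sample hit counts, and the previous-week total of 56 are hypothetical, and the sketch only checks exact-match redirect sources, so glob sources such as `:slug*` would need extra handling.

```python
import csv
import io
import json
from collections import Counter


def uncovered_paths(hits: Counter, redirect_sources: set) -> list:
    """Broken paths with no exact-match redirect source, most-hit first."""
    missing = [(path, n) for path, n in hits.items() if path not in redirect_sources]
    return sorted(missing, key=lambda item: (-item[1], item[0]))


def week_over_week(current: int, previous: int):
    """Percent change versus the previous week, or None when there is no baseline."""
    if previous == 0:
        return None
    return round(100.0 * (current - previous) / previous, 1)


# Hypothetical sample data standing in for query results and vercel.json sources.
hits = Counter({"/old/install/": 40, "/guides/legacy/": 25, "/features/foo/": 5})
sources = {"/guides/legacy/"}
rows = uncovered_paths(hits, sources)

report = io.StringIO()  # stand-in for a CSV file under data/404-reports/
writer = csv.writer(report)
writer.writerow(["broken_path", "hits"])
writer.writerows(rows)

summary = {
    "total_hits": sum(hits.values()),
    "uncovered": len(rows),
    "wow_delta_pct": week_over_week(sum(hits.values()), 56),
}
print(json.dumps(summary))  # JSON on stdout for the agent to read
```

Returning `None` for a missing baseline matches the no-data state in the runbook: the first week after instrumentation has nothing to compare against.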

Schedule: every Monday 9am PT (cron: 0 17 * * 1 UTC).
Deploy via oz.warp.dev once METABASE_API_KEY, SLACK_BOT_TOKEN, and
SLACK_CHANNEL_ID are set.

Part of the broken-redirects fix plan.

Co-Authored-By: Oz <oz-agent@warp.dev>
rachaelrenk added a commit that referenced this pull request Jun 9, 2026

* fix: add 52 missing redirects from GitBook → Astro migration audit

Ran an audit comparing all paths from old GitBook SUMMARY.md files
against the live Astro pages and existing vercel.json redirect sources.
Found 52 paths that were live on GitBook but returned 404 on the new site.

Most gaps are in the guides section, which was reorganized from a
developer-workflows/beginner/power-user/etc hierarchy into flat
category sections (agent-workflows, configuration, devops, external-tools,
frontend, getting-started, build-an-app-in-warp).

Also adds the two redirect audit tooling scripts for future use:
- scripts/audit_redirects.py — extracts GitBook paths and diffs against vercel.json
- scripts/add_missing_redirects.py — applies the mapping to vercel.json
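
The audit described above amounts to extracting page paths from SUMMARY.md and subtracting everything that is either live or already redirected. The sketch below is an assumption about how such a diff could work, not the code in scripts/audit_redirects.py. The function names, the sample SUMMARY.md, and the sample page and redirect sets are hypothetical.

```python
import re

# Markdown inline links: [label](target)
LINK = re.compile(r"\[[^\]]*\]\(([^)\s]+)\)")


def normalize(path: str) -> str:
    """Compare paths without regard to leading or trailing slashes."""
    return "/" + path.strip("/")


def summary_paths(summary_md: str) -> set:
    """Extract URL paths from the links of a GitBook-style SUMMARY.md."""
    paths = set()
    for target in LINK.findall(summary_md):
        if "://" in target:
            continue  # external link, not a docs page
        target = target.split("#", 1)[0]
        if target.endswith("README.md"):
            target = target[: -len("README.md")]  # section index page
        elif target.endswith(".md"):
            target = target[: -len(".md")]
        paths.add(normalize(target))
    return paths


def missing_redirects(old_paths: set, live_pages: set, redirect_sources: set) -> list:
    """Old paths that are neither live pages nor exact-match redirect sources."""
    covered = {normalize(p) for p in live_pages} | {normalize(p) for p in redirect_sources}
    return sorted(p for p in old_paths if p not in covered)


summary = """# Table of contents
* [Home](README.md)
* [Guides](guides/README.md)
  * [Old MCP guide](guides/mcp-servers/example.md)
  * [Kept page](getting-started/install.md)
* [Blog](https://example.com/blog)
"""
gaps = missing_redirects(
    summary_paths(summary),
    live_pages={"/", "/getting-started/install/"},
    redirect_sources={"/guides/"},
)
print(gaps)
```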

Part of the broken-redirects fix plan (Steps 3-4).

Co-Authored-By: Oz <oz-agent@warp.dev>

* feat: add weekly-404-monitor skill for scheduled 404 gap detection

Adds a new Oz skill + helper script for a recurring Monday morning
agent that surfaces broken docs.warp.dev URLs, diffs them against
vercel.json, and posts a Slack summary.

The skill (SKILL.md) defines the full runbook:
- Queries stg_website_events for docs_404 track events (past 7 days)
- Loads vercel.json redirect sources from disk or GitHub raw URL
- Identifies uncovered broken URLs and computes week-over-week delta
- Posts a Block Kit Slack summary to the docs team channel
- Writes a CSV artifact to data/404-reports/ (run artifact, not committed)
- Handles no-data state (when PR #191 hasn't been live long enough)
- Handles failure gracefully with Slack error notices

The helper (run_404_report.py) implements the BigQuery queries via
Metabase API, diff logic, and CSV writing. Outputs JSON summary to
stdout for the agent to use when constructing the Slack message.

Schedule: every Monday 9am PT (cron: 0 17 * * 1 UTC).
Deploy via oz.warp.dev once METABASE_API_KEY, SLACK_BOT_TOKEN, and
SLACK_CHANNEL_ID are set.

Part of the broken-redirects fix plan.

Co-Authored-By: Oz <oz-agent@warp.dev>
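
A quick arithmetic check on the schedule above, assuming fixed offsets of UTC-8 for Pacific Standard Time and UTC-7 for Pacific Daylight Time: a cron expression evaluated in UTC does not follow daylight saving, so `0 17 * * 1` lands on 9am Pacific only during standard time and on 10am during daylight time.

```python
from datetime import datetime, timedelta, timezone

PST = timezone(timedelta(hours=-8), "PST")  # Pacific Standard Time
PDT = timezone(timedelta(hours=-7), "PDT")  # Pacific Daylight Time

# One firing of "0 17 * * 1": Monday, June 8, 2026 at 17:00 UTC.
fire = datetime(2026, 6, 8, 17, 0, tzinfo=timezone.utc)

standard_hour = fire.astimezone(PST).hour
daylight_hour = fire.astimezone(PDT).hour
print(fire.weekday(), standard_hour, daylight_hour)  # weekday 0 is Monday
```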

* fix: consolidate redirects with globs, add %2B variants for + paths

Addresses two review comments on PR #192:

Petra's suggestion: collapse 3 groups of redirects with identical
slug-to-slug mappings into Vercel :slug* glob rules:
  - /guides/integrations/:slug* → /guides/external-tools/:slug/
  - /guides/mcp-servers/:slug* → /guides/external-tools/:slug/
  - /guides/developer-workflows/devops/:slug* → /guides/devops/:slug/
13 individual rules → 3 globs (future GitBook paths in these sections
will also be covered automatically).

oz-for-oss bot: add %2B-encoded duplicates for the 6 source paths
that contain literal '+' characters. Vercel uses path-to-regexp for
source matching where '+' may be misinterpreted. The %2B variants
ensure URL-encoded requests are also covered.

Co-Authored-By: Oz <oz-agent@warp.dev>

* fix: remove invalid %2B source patterns, fix glob catch-all destinations

Vercel deployment was failing with 'invalid-route-source-pattern' due
to two issues in the glob/redirect additions:

1. %2B-encoded source patterns — Vercel's routing engine rejects
   URL-encoded characters (%) in source patterns. Removed 6 %2B
   duplicates. They're unnecessary anyway: Vercel normalises paths
   before route matching, so a literal + source already covers
   requests where the client sent %2B.

2. Glob destination mismatch — when the source uses :slug* (catch-all),
   the destination must also reference :slug* to substitute the full
   captured path. Changed /:slug/ → /:slug*/ in all 3 glob rules.
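
The consolidation and the destination fix can be sketched together. This is an illustration under stated assumptions, not the change made in the PR: the function `consolidate`, the `alpha` and `beta` slugs, the `/old-home` rule, and the `"permanent": True` flag are all hypothetical. The emitted rule follows the corrected form described in item 2, where the destination repeats `:slug*`.

```python
import json
from collections import defaultdict


def consolidate(redirects: list) -> list:
    """Collapse redirects that only move an identical trailing slug between two prefixes."""
    groups = defaultdict(list)
    passthrough = []
    for rule in redirects:
        src_prefix, _, src_slug = rule["source"].rstrip("/").rpartition("/")
        dst_prefix, _, dst_slug = rule["destination"].rstrip("/").rpartition("/")
        if src_slug and src_slug == dst_slug:
            groups[(src_prefix, dst_prefix)].append(rule)
        else:
            passthrough.append(rule)

    result = list(passthrough)
    for (src_prefix, dst_prefix), rules in groups.items():
        if len(rules) < 2:
            result.extend(rules)  # a single rule is not worth a glob
            continue
        result.append({
            "source": f"{src_prefix}/:slug*",
            # The destination repeats :slug* so the full captured path is substituted.
            "destination": f"{dst_prefix}/:slug*/",
            "permanent": True,  # assumption; the thread does not state the redirect type
        })
    return result


rules = [
    {"source": "/guides/integrations/alpha", "destination": "/guides/external-tools/alpha/", "permanent": True},
    {"source": "/guides/integrations/beta", "destination": "/guides/external-tools/beta/", "permanent": True},
    {"source": "/old-home", "destination": "/", "permanent": True},
]
merged = consolidate(rules)
print(json.dumps(merged, indent=2))
```

A glob matches more than the rules it replaces, so any path under the old prefix is redirected, including ones with no page at the destination. The earlier commit treats that wider coverage as a benefit.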

Co-Authored-By: Oz <oz-agent@warp.dev>

* fix: remove 6 source patterns with regex metacharacters (+ and .)

Vercel uses path-to-regexp v6.1.0 for source pattern matching. In this
version, '+' and '.' are regex metacharacters (+ = one-or-more, . =
any-char) that cause 'invalid-route-source-pattern' errors when they
appear as literals in non-parameter positions.

The 6 affected patterns all came from old GitBook URLs where '+' was
used as a separator (e.g. 'react-+-tailwind', 'd3.js-+-javascript').
These are obscure legacy URLs with negligible bookmark traffic.

The 46 other redirects and 3 :slug* glob patterns are unaffected.
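
A pre-deploy check for such patterns could look like the sketch below. It flags only the two characters this commit message names and takes the message's account of path-to-regexp at face value rather than verifying it. The function name `suspect_characters` and the `/guides/` prefix on the sample sources are hypothetical; the two slugs are the ones quoted above.

```python
import re

# A named parameter such as :slug, optionally followed by a modifier (*, +, ?).
PARAM = re.compile(r":[A-Za-z_][A-Za-z0-9_]*[*+?]?")
SUSPECT = set("+.")  # the characters named in the commit message


def suspect_characters(source: str) -> list:
    """Suspect characters left in `source` once :param segments are removed."""
    literal_part = PARAM.sub("", source)
    return sorted({ch for ch in literal_part if ch in SUSPECT})


checks = {
    source: suspect_characters(source)
    for source in (
        "/guides/react-+-tailwind",
        "/guides/d3.js-+-javascript",
        "/guides/integrations/:slug*",
    )
}
for source, found in checks.items():
    print(source, found)
```

Running a check like this over vercel.json before pushing would surface the problem locally instead of as a failed deployment.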

Co-Authored-By: Oz <oz-agent@warp.dev>

* docs: update weekly-404-monitor skill to use buzz environment

- Use existing buzz environment (already has warpdotdev/docs checked out)
- Target Slack channel: #growth-docs
- Document the 3 secrets to verify in buzz: METABASE_API_KEY, SLACK_BOT_TOKEN, SLACK_CHANNEL_ID

Co-Authored-By: Oz <oz-agent@warp.dev>

---------

Co-authored-by: Oz <oz-agent@warp.dev>