Skip to content

docs(TSP-1291): expand Monitor dashboard task status filter documentation#671

Open
claude[bot] wants to merge 4 commits into
mainfrom
docs/TSP-1291
Open

docs(TSP-1291): expand Monitor dashboard task status filter documentation#671
claude[bot] wants to merge 4 commits into
mainfrom
docs/TSP-1291

Conversation

@claude

@claude claude Bot commented Jun 11, 2026

Copy link
Copy Markdown

Summary

  • Renamed "Conversation status filter" to Task status filter in the Monitor dashboard setup steps
  • Documented all available agent task statuses with descriptions (completed, escalated, pending approval, execution limit reached, errored)
  • Added guidance on when to use status filtering and why (focus evaluation credit on specific outcomes)
  • Added a Note callout documenting new workforce dashboard behavior: evaluations trigger on every state transition, and status options are tailored to the resource type

Linear issue

https://linear.app/relevance/issue/TSP-1291/

…tion

Update step 5 in the Monitor dashboard setup to document the renamed
task status filter, available agent statuses, and new workforce
evaluation behavior where evaluations trigger on every state transition.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@claude claude Bot added the docs-drafter Documentation drafted by Claude label Jun 11, 2026
@linear

linear Bot commented Jun 11, 2026

Copy link
Copy Markdown

TSP-1291

@mintlify

mintlify Bot commented Jun 11, 2026

Copy link
Copy Markdown
Contributor

Preview deployment for your docs. Learn more about Mintlify Previews.

Project Status Preview Updated (UTC)
relevanceai 🟢 Ready View Preview Jun 11, 2026, 3:23 AM

💡 Tip: Enable Workflows to automatically generate PRs for you.

@github-actions

This comment was marked as outdated.

Replace non-existent agent statuses with the real labels from
AGENT_STATUS_GROUPS / STATE_LABELS:
- 'Execution limit reached' is workforce-only, not an agent status
- 'Errored' is not a status label
Real agent failed states: Timed out, Unrecoverable error, Exhausted retries.
Also corrected the workforce note (evals do not trigger on every state
transition; transient states are intentionally omitted).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@jordanc-relevanceai

Copy link
Copy Markdown
Collaborator

Fact-check fix pushed ✅

Corrected the agent task-status filter list — it listed two statuses that don't exist for agents:

  • Execution limit reached is a workforce-only status (WORKFORCE_STATUS_GROUPS), not an agent one — removed.
  • Errored isn't a real status label — removed.

Replaced with the actual agent failed-state labels from AGENT_STATUS_GROUPS / STATE_LABELS: Timed out, Unrecoverable error, Exhausted retries.

Also fixed the workforce note: evals do not trigger on every state transition (active/transient states are intentionally omitted — observability evals on them would grade an in-flight trace with no output). Kept the accurate point that workforce dashboards expose different statuses (e.g. Execution limit reached).

Source: apps/builder-app/features/evals/dashboardStatusFilters.ts (relevance-api-node).

@github-actions

This comment was marked as outdated.

@jordanc-relevanceai jordanc-relevanceai marked this pull request as draft June 11, 2026 03:49
Remove the agent status list, usage guidance, and workforce note from the
Monitor dashboard setup steps. Rename the field reference to **Filter tasks**.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Note blank filter evaluates all tasks; use "statuses" over "outcomes".

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@github-actions

Copy link
Copy Markdown
Contributor

🎯 Vibe check

Reviewed: 1 file (1 with issues, 0 clean)

Scores

Dimension Score What's holding it back
🟢 Consistency 8/10 Bold label inside <Info> callout (line 8); bullet list inside <Accordion> (lines 359–362) — both violate CLAUDE.md rules. No banned words, no British spellings, capitalization is clean throughout.
🟢 Technical clarity 9/10 UI element names are precise, edge cases are well-covered (truncation behavior, credit calculation, 10-Check limit). The "Filter tasks" option in Monitor step 5 could name the actual status values rather than just saying "leave blank to evaluate all tasks."
🟢 Non-technical clarity 9/10 Concepts are explained before instructions, the progressive structure from overview → sections → deep-dive works well. No jargon drops.
🟡 Structure 7/10 <CardGroup> used for the Best practices section — CLAUDE.md says this component is not appropriate for best practices that read naturally as flowing bullets.

Score key: 🟢 9–10, 🟡 6–8, 🔴 1–5.

Overall vibe: This is a well-crafted, comprehensive feature page — specific UI references, good edge-case coverage, and a logical flow from concept through how-to through reference. Three CLAUDE.md compliance items to fix (two component-content mismatches, one wrong component choice) before it's fully clean.

🔧 Issues (2)
  • build/agents/build-your-agent/evals.mdx:8**Rollout Status**: is a bold label inside a callout. CLAUDE.md explicitly prohibits bold labels inside callouts. Remove the bold label: <Info>Evals is rolling out progressively, starting with Enterprise customers. If you don't see this feature in your account yet, reach out to your account manager to discuss access.</Info>

  • build/agents/build-your-agent/evals.mdx:359–362 — Bullet list inside an <Accordion>. CLAUDE.md says accordion content must use flowing sentences, not bullet lists. Rewrite as: "Credits are calculated by adding the Agent task run, the simulator (which uses an LLM to generate the user persona's messages), and every Check that runs on the conversation — each Check, especially LLM Judge, makes its own LLM call."

🧩 Component suggestions (1)
  • build/agents/build-your-agent/evals.mdx:323–342<CardGroup cols={2}> for the Best practices section. CLAUDE.md says CardGroup is not appropriate for best practices that read naturally as flowing bullets. These six tips are practical guidance, not navigable destinations or parallel feature choices — a plain bulleted list under ## Best practices serves the reader better and matches the pattern used in sibling pages like tools.mdx.
⚠️ Contradictions (0)

No contradictions found between this file and the context pages read.

🔋 Credit usage
Item Count
Files reviewed 1
Context pages read 2
Total lines processed ~527

Files read: build/agents/build-your-agent/evals.mdx (390 lines), build/agents/build-your-agent/build-overview.mdx (26 lines), build/agents/build-your-agent/tools.mdx (111 lines)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

docs-drafter Documentation drafted by Claude

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants