20 changes: 20 additions & 0 deletions src/lib/api-navigation.ts
@@ -144,6 +144,26 @@ export const apiNavigation: ApiNavGroup[] = [
      }
    ]
  },
  {
    "title": "Simulation Analytics",
    "items": [
      {
        "title": "Get Simulation Metrics",
        "href": "/docs/api/simulation-analytics/metrics",
        "method": "GET"
      },
      {
        "title": "Get Simulation Runs",
        "href": "/docs/api/simulation-analytics/runs",
        "method": "GET"
      },
      {
        "title": "Get Simulation Analytics",
        "href": "/docs/api/simulation-analytics/analytics",
        "method": "GET"
      }
    ]
  },
  {
    "title": "Annotation Scores",
    "items": [
8 changes: 8 additions & 0 deletions src/lib/navigation.ts
@@ -743,6 +743,14 @@ export const tabNavigation: NavTab[] = [
    { title: 'Execute Run Test', href: '/docs/api/run-tests/executeruntest' },
  ]
},
{
  title: 'Simulation Analytics',
  items: [
    { title: 'Get Simulation Metrics', href: '/docs/api/simulation-analytics/metrics' },
    { title: 'Get Simulation Runs', href: '/docs/api/simulation-analytics/runs' },
    { title: 'Get Simulation Analytics', href: '/docs/api/simulation-analytics/analytics' },
  ]
},
{
  title: 'Annotation Scores',
  items: [
239 changes: 239 additions & 0 deletions src/pages/docs/api/simulation-analytics/analytics.mdx
@@ -0,0 +1,239 @@
---
title: "Get Simulation Analytics"
description: "Retrieve aggregated analytics — eval scores, eval averages, system summary, and FMA suggestions — for a simulation run."
---

# Get Simulation Analytics

Returns the aggregated analytics view for a simulation run. This corresponds to the **Analytics tab** in the FutureAGI UI — eval scores (radar chart data), per-metric averages, system summary, and critical issues with Fix My Agent suggestions.

<ApiPlayground
  method="GET"
  endpoint="/sdk/api/v1/simulation/analytics/"
  baseUrl="https://api.futureagi.com"
  parameters={[
    {"name": "run_test_name", "in": "query", "required": false, "description": "Name of the run test. Uses the latest completed execution.", "type": "string"},
    {"name": "execution_id", "in": "query", "required": false, "description": "UUID of a specific test execution.", "type": "string"},
    {"name": "eval_name", "in": "query", "required": false, "description": "Comma-separated list of eval names to filter.", "type": "string"},
    {"name": "summary", "in": "query", "required": false, "description": "Include FMA explanation summary. Default: true.", "type": "boolean"}
  ]}
/>

## Authentication

This endpoint uses API key authentication. Include both headers in every request:

```bash
X-Api-Key: YOUR_API_KEY
X-Secret-Key: YOUR_SECRET_KEY
```
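For repeated calls, the two headers can be attached once to a reusable session instead of being passed on every request. A minimal sketch using the `requests` library (the key values are placeholders):

```python
import requests

# Reusable session that sends both auth headers on every request
session = requests.Session()
session.headers.update({
    "X-Api-Key": "YOUR_API_KEY",
    "X-Secret-Key": "YOUR_SECRET_KEY",
})

# Every call made through this session is now authenticated, e.g.:
# session.get("https://api.futureagi.com/sdk/api/v1/simulation/analytics/",
#             params={"run_test_name": "My Agent Test"})
```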

## Query Parameters

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `run_test_name` | string | One of `run_test_name` / `execution_id` | Name of the run test. Returns analytics for the latest completed execution. |
| `execution_id` | UUID | One of `run_test_name` / `execution_id` | UUID of a test execution. Returns analytics for that execution. |
| `eval_name` | string | No | Comma-separated eval names to filter. Only matching evals are included. |
| `summary` | boolean | No | Include FMA explanation summary and critical issues. Default: `true`. |
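Because one of `run_test_name` or `execution_id` is required, a client can validate its inputs before calling the API. A hypothetical helper, not part of any SDK, that builds the query dict under the assumption that exactly one identifier should be sent:

```python
def build_analytics_params(run_test_name=None, execution_id=None,
                           eval_name=None, summary=True):
    """Build the analytics query params, enforcing exactly one identifier."""
    if bool(run_test_name) == bool(execution_id):
        raise ValueError("Provide exactly one of run_test_name or execution_id")
    params = {"summary": str(summary).lower()}  # "true" / "false"
    if run_test_name:
        params["run_test_name"] = run_test_name
    else:
        params["execution_id"] = execution_id
    if eval_name:
        params["eval_name"] = eval_name  # comma-separated eval names
    return params
```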

## Responses

### 200 — Analytics for an execution

Returns eval scores, averages, system summary, and optionally FMA suggestions.

```json
{
  "status": true,
  "result": {
    "execution_id": "d2fa3f2c-...",
    "run_test_name": "My Agent Test",
    "status": "completed",
    "eval_results": [
      {
        "name": "conversation_coherence",
        "id": "...",
        "output_type": "Pass/Fail",
        "total_pass_rate": 85.0,
        "result": [
          {
            "name": "coherence_check",
            "id": "...",
            "total_cells": 48,
            "output": {
              "pass": 85.0,
              "fail": 15.0,
              "pass_count": 41,
              "fail_count": 7
            }
          }
        ]
      },
      {
        "name": "conversation_resolution",
        "id": "...",
        "output_type": "Pass/Fail",
        "total_pass_rate": 92.0,
        "result": [...]
      }
    ],
    "eval_averages": {
      "avg_conversation_coherence": 85.0,
      "avg_conversation_resolution": 92.0,
      "avg_bias_detection": 100.0
    },
    "system_summary": {
      "total_calls": 50,
      "completed_calls": 48,
      "failed_calls": 2,
      "avg_score": 82.5,
      "avg_response_time_ms": 290.0,
      "total_duration_seconds": 6000
    },
    "eval_explanation_summary": {
      "coherence_check": [
        {
          "cluster_name": "Pricing contradictions",
          "call_execution_ids": ["uuid1", "uuid2"],
          "description": "Agent gives different prices when asked about the same product."
        }
      ]
    },
    "eval_explanation_summary_status": "completed"
  }
}
```

### 200 — By `run_test_name` with no completed executions

```json
{
  "status": true,
  "result": {
    "run_test_name": "My Agent Test",
    "message": "No completed executions found.",
    "eval_results": [],
    "eval_averages": {},
    "system_summary": {}
  }
}
```

### 200 — With `summary=false`

Same response shape, but without the `eval_explanation_summary` and `eval_explanation_summary_status` fields.

### 400

Missing or invalid parameters.

### 404

The specified run test or execution was not found.

### 500

Internal server error.

## Response Fields

### `eval_results`

Detailed eval scores broken down by eval template and config. Each entry includes pass/fail counts, rates, or score percentiles depending on the eval type.

### `eval_averages`

Flat key-value map of averaged eval scores across all calls. Keys follow the pattern `avg_{eval_name}`. Useful for quick comparisons and threshold checks.

### `system_summary`

Aggregated system-level metrics: call counts, average score, response time, and total duration.
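As a quick sanity check, the counts in `system_summary` can be turned into a failure rate. A small sketch using the values from the sample response above:

```python
# Values taken from the sample system_summary above
system_summary = {
    "total_calls": 50,
    "completed_calls": 48,
    "failed_calls": 2,
    "avg_score": 82.5,
}

# Fraction of calls that failed, as a percentage
failure_rate = 100.0 * system_summary["failed_calls"] / system_summary["total_calls"]
print(f"Failure rate: {failure_rate:.1f}%")  # 2 of 50 calls failed -> 4.0%
```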

### `eval_explanation_summary`

LLM-generated analysis that clusters failure reasons and provides actionable improvement suggestions. This is the same data shown in the **Critical Issues** panel in the UI.
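The separate `eval_explanation_summary_status` field suggests the summary may be generated asynchronously. A small helper sketch, assuming statuses other than `completed` can appear while generation is in progress (only `completed` is shown in the sample response above):

```python
def summary_ready(result: dict) -> bool:
    """Return True once the FMA explanation summary is available.

    Assumption: statuses other than "completed" may appear while the
    summary is still being generated; only "completed" is documented
    in the sample response.
    """
    return result.get("eval_explanation_summary_status") == "completed"


# Example against the sample response shape
result = {
    "eval_explanation_summary_status": "completed",
    "eval_explanation_summary": {"coherence_check": []},
}
if summary_ready(result):
    issues = result["eval_explanation_summary"]
```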

## Code Examples

### cURL

```bash
# Get full analytics for latest execution of a run test
curl "https://api.futureagi.com/sdk/api/v1/simulation/analytics/?run_test_name=My%20Agent%20Test" \
-H "X-Api-Key: YOUR_API_KEY" \
-H "X-Secret-Key: YOUR_SECRET_KEY"

# Get analytics for a specific execution, no FMA
curl "https://api.futureagi.com/sdk/api/v1/simulation/analytics/?execution_id=YOUR_EXECUTION_ID&summary=false" \
-H "X-Api-Key: YOUR_API_KEY" \
-H "X-Secret-Key: YOUR_SECRET_KEY"

# Filter to specific evals only
curl "https://api.futureagi.com/sdk/api/v1/simulation/analytics/?execution_id=YOUR_EXECUTION_ID&eval_name=Coherence,Resolution" \
-H "X-Api-Key: YOUR_API_KEY" \
-H "X-Secret-Key: YOUR_SECRET_KEY"
```

### Python — Automated promotion gate

```python
import requests

url = "https://api.futureagi.com/sdk/api/v1/simulation/analytics/"
headers = {
    "X-Api-Key": "YOUR_API_KEY",
    "X-Secret-Key": "YOUR_SECRET_KEY",
}

response = requests.get(url, headers=headers, params={
    "run_test_name": "My Agent Test",
})
response.raise_for_status()  # surface HTTP errors early
data = response.json()["result"]

# Check if agent meets promotion criteria
eval_averages = data["eval_averages"]
min_threshold = 80.0

all_passing = all(
    score >= min_threshold
    for key, score in eval_averages.items()
    if key.startswith("avg_")
)

if all_passing:
    print("Agent meets quality bar — promoting to production.")
else:
    # Feed critical issues into your LLM for improvement suggestions
    issues = data.get("eval_explanation_summary", {})
    for eval_name, clusters in issues.items():
        for cluster in clusters:
            print(f"[{eval_name}] {cluster['cluster_name']}: {cluster['description']}")
```

### JavaScript — Dashboard integration

```javascript
const response = await fetch(
  "https://api.futureagi.com/sdk/api/v1/simulation/analytics/?run_test_name=My%20Agent%20Test",
  {
    headers: {
      "X-Api-Key": "YOUR_API_KEY",
      "X-Secret-Key": "YOUR_SECRET_KEY",
    },
  }
);

const { result } = await response.json();

// Build radar chart data from eval_results
// (note: "eval" is a reserved binding name in strict/module code)
const radarData = result.eval_results.map((evalResult) => ({
  label: evalResult.name,
  value: evalResult.total_pass_rate ?? evalResult.total_avg ?? 0,
}));

// Display system summary
console.log(`Calls: ${result.system_summary.total_calls}`);
console.log(`Avg Score: ${result.system_summary.avg_score}`);
console.log(`Avg Response Time: ${result.system_summary.avg_response_time_ms}ms`);
```