# PhishGuardAI API Reference

Complete API documentation for all PhishGuardAI endpoints.
- Base URLs
- Authentication
- Prediction Endpoints
- Explainability Endpoints
- Observability Endpoints
- Error Responses
- Rate Limiting
- Example Workflows
## Base URLs

| Environment | Base URL |
|---|---|
| Local Development | `http://localhost:8000` |
| Docker Compose | `http://gateway:8000` |
| Production | `https://phishguard.example.com` |
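For scripts that target more than one environment, the table above can be encoded as a small lookup. This is an illustrative sketch, not part of the API; the `PHISHGUARD_ENV` variable and `base_url` helper are hypothetical names:

```python
import os

# Deployment base URLs from the table above.
BASE_URLS = {
    "local": "http://localhost:8000",
    "docker": "http://gateway:8000",
    "production": "https://phishguard.example.com",
}

def base_url(env=None):
    """Resolve the PhishGuardAI base URL for a deployment environment.

    Falls back to the (hypothetical) PHISHGUARD_ENV environment variable,
    then to local development.
    """
    env = env or os.environ.get("PHISHGUARD_ENV", "local")
    if env not in BASE_URLS:
        raise ValueError(f"Unknown environment: {env!r}")
    return BASE_URLS[env]

print(base_url("local"))  # http://localhost:8000
```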
## Authentication

**Current Status:** No authentication required (local development).

**Planned:** API key authentication via the `X-API-Key` header:

```bash
# Future implementation
curl -X POST "https://api.phishguard.com/predict" \
  -H "X-API-Key: your-api-key" \
  -H "Content-Type: application/json" \
  -d '{"url":"http://example.com"}'
```

## Prediction Endpoints

### POST /predict

Classify a URL as legitimate, phishing, or uncertain.
**Request Body:**

```jsonc
{
  "url": "string"  // Required: URL to analyze
}
```

**Response (200 OK):**

```jsonc
{
  "url": "string",         // Analyzed URL
  "decision": "string",    // ALLOW | REVIEW | BLOCK
  "reason": "string",      // Decision rationale
  "p_malicious": "float",  // Phishing probability [0,1]
  "source": "string",      // whitelist | model
  "model_name": "string",  // Model identifier
  "features": {            // Extracted features
    "TLDLegitimateProb": "float",
    "CharContinuationRate": "float",
    "SpacialCharRatioInURL": "float",
    "URLCharProb": "float",
    "LetterRatioInURL": "float",
    "NoOfOtherSpecialCharsInURL": "int",
    "DomainLength": "int"
  },
  "judge": {               // Present only for gray-zone decisions
    "verdict": "string",     // LEAN_PHISH | LEAN_LEGIT | UNCERTAIN
    "rationale": "string",   // Human-readable explanation
    "judge_score": "float",  // Judge confidence [0,1]
    "context": {
      "backend": "string",   // llm | stub_fallback
      "model": "string"      // LLM model name
    }
  }
}
```

**Examples:**
**Example 1: Whitelisted Domain**

```bash
curl -X POST "http://localhost:8000/predict" \
  -H "Content-Type: application/json" \
  -d '{"url":"https://github.com"}'
```

```json
{
  "url": "https://github.com",
  "decision": "ALLOW",
  "reason": "domain-whitelist",
  "p_malicious": 0.01,
  "source": "whitelist",
  "features": null,
  "judge": null
}
```

**Example 2: High-Confidence Phishing**
```bash
curl -X POST "http://localhost:8000/predict" \
  -H "Content-Type: application/json" \
  -d '{"url":"http://verify-account-urgent.tk"}'
```

```json
{
  "url": "http://verify-account-urgent.tk",
  "decision": "BLOCK",
  "reason": "policy-band",
  "p_malicious": 0.9995,
  "source": "model",
  "model_name": "7-feature-production-v1",
  "features": {
    "TLDLegitimateProb": 0.12,
    "CharContinuationRate": 0.08,
    "SpacialCharRatioInURL": 0.19,
    "URLCharProb": 1.0,
    "LetterRatioInURL": 0.81,
    "NoOfOtherSpecialCharsInURL": 5,
    "DomainLength": 24
  },
  "judge": null
}
```

**Example 3: Gray Zone with Judge**
```bash
curl -X POST "http://localhost:8000/predict" \
  -H "Content-Type: application/json" \
  -d '{"url":"http://npm.org"}'
```

```json
{
  "url": "http://npm.org",
  "decision": "ALLOW",
  "reason": "judge-short-domain-lean-legit",
  "p_malicious": 0.35,
  "source": "model",
  "model_name": "7-feature-production-v1",
  "features": {
    "TLDLegitimateProb": 0.85,
    "CharContinuationRate": 0.0,
    "SpacialCharRatioInURL": 0.125,
    "URLCharProb": 1.0,
    "LetterRatioInURL": 0.875,
    "NoOfOtherSpecialCharsInURL": 1,
    "DomainLength": 7
  },
  "judge": {
    "verdict": "LEAN_LEGIT",
    "rationale": "Domain 'npm.org' is a well-known package manager. Short domain length (7 chars) is expected for legitimate tech infrastructure. TLD .org is commonly used by open-source projects.",
    "judge_score": 0.15,
    "context": {
      "backend": "llm",
      "model": "llama3.2:1b",
      "is_short_domain_case": true
    }
  }
}
```

**Error Responses:**
```jsonc
// 400 Bad Request - Missing URL
{
  "error": "Missing required field: url"
}

// 422 Unprocessable Entity - Invalid URL
{
  "error": "Invalid URL format"
}

// 503 Service Unavailable - Model service down
{
  "error": "Model service unavailable",
  "retry_after": 60
}
```

## Explainability Endpoints

### POST /predict/explain

Get SHAP feature importance values for a URL prediction.
**Request Body:**

```jsonc
{
  "url": "string"  // Required: URL to explain
}
```

**Response (200 OK):**

```jsonc
{
  "url": "string",
  "p_malicious": "float",
  "base_value": "float",       // Model baseline
  "features": {
    "feature_name": {
      "value": "float",        // Actual feature value
      "shap_value": "float",   // SHAP contribution
      "importance": "float"    // |shap_value|
    }
  },
  "top_features": ["string"],  // Top 3 by importance
  "model_name": "string",
  "explanation": "string",
  "note": "string"
}
```

**Example:**
```bash
curl -X POST "http://localhost:8000/predict/explain" \
  -H "Content-Type: application/json" \
  -d '{"url":"http://suspicious-login.info"}'
```

```json
{
  "url": "http://suspicious-login.info",
  "p_malicious": 0.8542,
  "base_value": 0.318,
  "features": {
    "CharContinuationRate": {
      "value": 0.1,
      "shap_value": -0.523,
      "importance": 0.523
    },
    "NoOfOtherSpecialCharsInURL": {
      "value": 6,
      "shap_value": 0.342,
      "importance": 0.342
    },
    "TLDLegitimateProb": {
      "value": 0.43,
      "shap_value": -0.026,
      "importance": 0.026
    },
    "SpacialCharRatioInURL": {
      "value": 0.19,
      "shap_value": 0.145,
      "importance": 0.145
    },
    "URLCharProb": {
      "value": 1.0,
      "shap_value": 0.0,
      "importance": 0.0
    },
    "LetterRatioInURL": {
      "value": 0.81,
      "shap_value": 0.089,
      "importance": 0.089
    },
    "DomainLength": {
      "value": 21,
      "shap_value": -0.042,
      "importance": 0.042
    }
  },
  "top_features": [
    "CharContinuationRate",
    "NoOfOtherSpecialCharsInURL",
    "SpacialCharRatioInURL"
  ],
  "model_name": "7-feature-production-v1",
  "explanation": "Positive SHAP values push towards phishing; negative towards legitimate",
  "note": "SHAP computed on base estimator (before calibration) for approximate feature importance"
}
```

**Error Responses:**
```jsonc
// 500 Internal Server Error - SHAP computation failed
{
  "error": "SHAP explanation failed: <error_message>",
  "details": "<traceback>"
}

// 503 Service Unavailable - SHAP not installed
{
  "error": "SHAP not installed. Install with: pip install shap"
}
```

**Dashboard Access:**

```bash
# Visual SHAP dashboard
open http://localhost:8000/explain
```

## Observability Endpoints

### Health Check

Reports overall service health and dependency status.
**Response (200 OK):**

```json
{
  "status": "healthy",
  "model_loaded": true,
  "model_service": "connected",
  "judge_backend": "llm",
  "timestamp": "2025-10-23T12:34:56Z"
}
```

**Response (503 Service Unavailable):**

```json
{
  "status": "unhealthy",
  "model_loaded": false,
  "model_service": "disconnected",
  "timestamp": "2025-10-23T12:34:56Z"
}
```

### GET /stats

Decision statistics.
**Response (200 OK):**

```jsonc
{
  "policy": {
    "ALLOW": 5234,      // Policy band ALLOWs
    "REVIEW": 678,      // Policy band REVIEWs
    "BLOCK": 3421       // Policy band BLOCKs
  },
  "judge": {
    "LEAN_PHISH": 234,  // Judge phishing verdicts
    "LEAN_LEGIT": 312,  // Judge legitimate verdicts
    "UNCERTAIN": 132    // Judge uncertain verdicts
  },
  "final": {
    "ALLOW": 5546,      // Final ALLOWs (policy + judge)
    "REVIEW": 132,      // Final REVIEWs (human escalation)
    "BLOCK": 3655       // Final BLOCKs (policy + judge)
  },
  "uptime_seconds": 3600
}
```

### Configuration

Returns the current service configuration.
**Response (200 OK):**

```json
{
  "thresholds": {
    "low": 0.011,
    "high": 0.998,
    "optimal": 0.5
  },
  "model_name": "7-feature-production-v1",
  "judge_backend": "llm",
  "judge_model": "llama3.2:1b",
  "gray_zone_rate": 0.12
}
```

## Error Responses

All error responses share a common format:

```jsonc
{
  "error": "string",      // Human-readable error message
  "details": "string",    // Optional: additional context
  "timestamp": "string"   // ISO 8601 timestamp
}
```

### Status Codes

| Code | Meaning | Common Causes |
|---|---|---|
| 200 | OK | Successful request |
| 400 | Bad Request | Missing required field, invalid JSON |
| 422 | Unprocessable Entity | Invalid URL format |
| 500 | Internal Server Error | Model inference error, SHAP failure |
| 503 | Service Unavailable | Model service down, dependencies missing |
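A client can branch on these codes to decide between failing fast and retrying. The sketch below is illustrative only (the `handle_response` helper is not part of any PhishGuardAI SDK); it operates on an already-parsed status code and JSON body, so the branching logic is visible without a live service:

```python
def handle_response(status, payload):
    """Map a /predict HTTP status and JSON body to a client action."""
    if status == 200:
        return ("ok", payload["decision"])
    if status in (400, 422):
        # Caller error: fix the request, do not retry.
        raise ValueError(payload.get("error", "bad request"))
    if status == 503:
        # Transient: wait retry_after seconds, then try again.
        return ("retry", payload.get("retry_after", 60))
    raise RuntimeError(f"Unexpected status {status}: {payload}")

print(handle_response(200, {"decision": "ALLOW"}))                # ('ok', 'ALLOW')
print(handle_response(503, {"error": "Model service unavailable",
                            "retry_after": 60}))                  # ('retry', 60)
```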
## Rate Limiting

**Current Status:** No rate limiting (local development).

**Planned:** 100 requests/minute per IP, reported via the following response headers:

```
X-RateLimit-Limit: 100
X-RateLimit-Remaining: 95
X-RateLimit-Reset: 1698012000
```

**Response (429 Too Many Requests):**

```json
{
  "error": "Rate limit exceeded",
  "retry_after": 60
}
```

## Example Workflows

### Batch URL Scanning

```bash
#!/bin/bash
# Scan a list of URLs and print the decision for each
urls=(
  "https://google.com"
  "http://phishing-site.tk"
  "http://npm.org"
)

for url in "${urls[@]}"; do
  echo "Scanning: $url"
  response=$(curl -s -X POST "http://localhost:8000/predict" \
    -H "Content-Type: application/json" \
    -d "{\"url\":\"$url\"}")
  decision=$(echo "$response" | jq -r '.decision')
  p_malicious=$(echo "$response" | jq -r '.p_malicious')
  echo "  Decision: $decision (p=$p_malicious)"
  echo ""
done
```

**Output:**
```
Scanning: https://google.com
  Decision: ALLOW (p=0.01)

Scanning: http://phishing-site.tk
  Decision: BLOCK (p=0.9995)

Scanning: http://npm.org
  Decision: ALLOW (p=0.35)
```
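For downstream analysis it is often handier to collect scan results as CSV rather than plain text. The helper below is a sketch, not part of the API; it formats already-fetched `/predict` responses, so it runs without a live service:

```python
import csv
import io

def results_to_csv(results):
    """Render a list of /predict response dicts as CSV text."""
    buf = io.StringIO()
    writer = csv.writer(buf)
    writer.writerow(["url", "decision", "p_malicious"])
    for r in results:
        writer.writerow([r["url"], r["decision"], r["p_malicious"]])
    return buf.getvalue()

# Responses shaped like the examples above:
sample = [
    {"url": "https://google.com", "decision": "ALLOW", "p_malicious": 0.01},
    {"url": "http://phishing-site.tk", "decision": "BLOCK", "p_malicious": 0.9995},
]
print(results_to_csv(sample))
```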
### Python Integration

```python
import requests

def scan_url(url: str):
    """Scan a URL and fetch a SHAP explanation if it looks suspicious."""
    # Get prediction
    response = requests.post(
        "http://localhost:8000/predict",
        json={"url": url}
    )
    result = response.json()

    # If suspicious, get explanation
    if result["p_malicious"] > 0.5:
        explain_response = requests.post(
            "http://localhost:8000/predict/explain",
            json={"url": url}
        )
        result["shap"] = explain_response.json()
    return result

# Scan URLs
urls = [
    "http://example-shop.com",
    "http://verify-account.tk",
    "http://bit.ly/abc123"
]

for url in urls:
    result = scan_url(url)
    print(f"URL: {url}")
    print(f"  Decision: {result['decision']}")
    print(f"  p_malicious: {result['p_malicious']:.4f}")
    if "shap" in result:
        top_features = result["shap"]["top_features"]
        print(f"  Top features: {', '.join(top_features)}")
    print()
```

### Monitoring Stats

```bash
# Get stats every 5 seconds
watch -n 5 'curl -s http://localhost:8000/stats | jq .'
```

**Output:**
```json
{
  "policy": {
    "ALLOW": 5234,
    "REVIEW": 678,
    "BLOCK": 3421
  },
  "judge": {
    "LEAN_PHISH": 234,
    "LEAN_LEGIT": 312,
    "UNCERTAIN": 132
  },
  "final": {
    "ALLOW": 5546,
    "REVIEW": 132,
    "BLOCK": 3655
  },
  "uptime_seconds": 3600
}
```

## Related Documentation

- README.md - Project overview
- DEPLOYMENT.md - Deployment guide
- ARCHITECTURE.md - Design decisions
- JUDGE.md - LLM judge system
- Last Updated: October 23, 2025
- Version: 1.0.0