skill-evaluation

Here are 9 public repositories matching this topic...

Evol-ai / SkillCompass

Evaluate agent skill quality. Find the weakest link. Fix it. Prove it worked.

ai-agents skill-evaluation anthropic agent-skills claude-code skill-rating claude-code-skill openclaw openclaw-skill

Updated Apr 23, 2026
JavaScript

SirryChen / triage-skill-creator

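Triage-trainer: build a customized triage Skill for your personal assistant from scratch, giving it the ability to accurately recommend which hospital department to visit.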

skill triage skill-evaluation skill-creator agent-trainer

Updated Mar 28, 2026
HTML

Evaluation framework for LLM knowledge inputs — prompts, RAG corpora, skills, agent workflows. Fix the model, vary the artifact. Built-in statistical rigor: bootstrap CI, Krippendorff α, length-debias, saturation curves.

benchmark ai evaluation-framework claude knowledge-engineering skill-evaluation llm prompt-engineering prompt-testing llm-evaluation rag-evaluation llm-judge claude-code agent-evaluation bootstrap-ci krippendorff-alpha evaluation-as-code multi-judge-ensemble

Updated May 4, 2026
TypeScript

duck-ai-yy / skill-safety-reviewer

A skill that reviews whether skills found online are safe to install, aimed at developers without a technical background.

ai-safety cowork skill-evaluation tool-evaluation claude-ai claude-skills safety-reviewer

Updated Mar 22, 2026

WilliamWJHuang / agent-skill-evaluator

Evaluate agent SKILL.md files for structure, security, quality, and domain correctness.

linter quality-assurance security-analysis ai-agents skill-evaluation agent-skills

Updated Apr 18, 2026
Python

saniyaacharya04 / interviewforge

AI-powered mock interview platform with automated scoring, role-based questions, modern React UI, FastAPI backend, and a fully implemented freemium SaaS architecture.

machine-learning-application technical-interviews fastapi skill-evaluation ai-tools assessment-platform interview-simulator ai-scoring