skill-evaluation

Here are 9 public repositories matching this topic...

oh-my-knowledge

Evaluation framework for LLM knowledge inputs — prompts, RAG corpora, skills, agent workflows. Fix the model, vary the artifact. Built-in statistical rigor: bootstrap CI, Krippendorff α, length-debias, saturation curves.

  • Updated May 4, 2026
  • TypeScript

Detect malicious code and security risks in AI skill files before installation to protect AI agents from hidden threats and obfuscation techniques.

  • Updated May 4, 2026
  • Python
