Content Preview
---
name: evaluation
description: "Build evaluation frameworks for agent systems"
risk: safe
source: "https://github.com/muratcankoylan/Agent-Skills-for-Context-Engineering/tree/main/skills/evaluation"
date_added: "2026-02-27"
---
## When to Use This Skill
Build evaluation frameworks for agent systems
Use this skill when working with build evaluation frameworks for agent systems.
# Evaluation Methods for Agent Systems
Evaluation of agent systems requires different approaches than traditional
How to Use
Recommended: Install to project (local)
mkdir -p .claude/skills
curl -o .claude/skills/evaluation.md \
https://raw.githubusercontent.com/sickn33/antigravity-awesome-skills/main/skills/evaluation/SKILL.md
Skill is scoped to this project only. Add .claude/skills/ to your .gitignore if you don't want to commit it.
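The .gitignore step can be sketched as follows. This is a minimal example, assuming a POSIX shell run from the project root with .gitignore in that directory; the duplicate check is an addition for illustration and is not part of the original instructions.

```shell
# Run from the project root (assumption: .gitignore lives there).
touch .gitignore
# If the file does not end with a newline, add one so the new entry starts on its own line.
[ -z "$(tail -c 1 .gitignore)" ] || echo >> .gitignore
# Append the entry only when an identical line is not already present.
grep -qxF '.claude/skills/' .gitignore || echo '.claude/skills/' >> .gitignore
```

Running it a second time leaves the file unchanged, so it is safe to keep in a setup script.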
Alternative: Clone full repo
git clone https://github.com/sickn33/antigravity-awesome-skills
Then reference at skills/evaluation/SKILL.md
Related Skills
evaluation_methodology
Multi-Agent System Evaluation Methodology
engineering, evaluation, methodology, agent
by alirezarezvani · alirezarezvani-claude-skills
advanced-evaluation
This skill should be used when the user asks to "implement LLM-as-judge", "compare model outputs", "create evaluation rubrics", "mitigate evaluation bias", or mentions direct scoring, pairwise comparison, position bias, evaluation pipelines, or automated quality assessment.
data-ai, advanced, evaluation
by sickn33 (Antigravity) · antigravity-awesome-skills
ai-engineering-toolkit
6 production-ready AI engineering workflows: prompt evaluation (8-dimension scoring), context budget planning, RAG pipeline design, agent security audit (65-point checklist), eval harness building, and product sense coaching.
security, prompt-engineering, rag, security
by sickn33 (Antigravity) · antigravity-awesome-skills
hugging-face-evaluation
Add and manage evaluation results in Hugging Face model cards. Supports extracting eval tables from README content, importing scores from Artificial Analysis API, and running custom model evaluations with vLLM/lighteval. Works with the model-index metadata format.
development, huggingface, evaluation
by sickn33 (Antigravity) · antigravity-awesome-skills