advanced-evaluation
This skill should be used when the user asks to "implement LLM-as-judge", "compare model outputs", "create evaluation rubrics", "mitigate evaluation bias", or mentions direct scoring, pairwise comparison, position bias, evaluation pipelines, or automated quality assessment.
Content Preview
---
name: advanced-evaluation
description: This skill should be used when the user asks to "implement LLM-as-judge", "compare model outputs", "create evaluation rubrics", "mitigate evaluation bias", or mentions direct scoring, pairwise comparison, position bias, evaluation pipelines, or automated quality assessment.
risk: safe
source: community
date_added: 2026-03-18
---
# Advanced Evaluation
This skill covers production-grade techniques for evaluating LLM outputs using LLMs as judges. It synt
How to Use
Recommended: Install to project (local)
mkdir -p .claude/skills
curl -o .claude/skills/advanced-evaluation.md \
https://raw.githubusercontent.com/sickn33/antigravity-awesome-skills/main/skills/advanced-evaluation/SKILL.md
The skill is scoped to this project only. Add .claude/skills/ to your .gitignore if you don't want to commit it.
Alternative: Clone full repo
git clone https://github.com/sickn33/antigravity-awesome-skills
Then reference the skill at skills/advanced-evaluation/SKILL.md
Related Skills
data-scientist
Expert data scientist for advanced analytics, machine learning, and statistical modeling. Handles complex data analysis, predictive modeling, and business intelligence.
by sickn33 (Antigravity) · antigravity-awesome-skills
data
Data engineering for Apache Airflow and Astronomer. Author DAGs with best practices, debug pipeline failures, trace data lineage, profile tables, migrate Airflow 2 to 3, and manage local and cloud deployments.
by Anthropic · anthropic-official-plugins
data-engineering-data-driven-feature
Build features guided by data insights, A/B testing, and continuous measurement using specialized agents for analysis, implementation, and experimentation.
by sickn33 (Antigravity) · antigravity-awesome-skills
data-quality-frameworks
Implement data quality validation with Great Expectations, dbt tests, and data contracts. Use when building data quality pipelines, implementing validation rules, or establishing data contracts.
by sickn33 (Antigravity) · antigravity-awesome-skills