advanced-evaluation

This skill should be used when the user asks to "implement LLM-as-judge", "compare model outputs", "create evaluation rubrics", "mitigate evaluation bias", or mentions direct scoring, pairwise comparison, position bias, evaluation pipelines, or automated quality assessment.

## Content Preview
```yaml
---
name: advanced-evaluation
description: This skill should be used when the user asks to "implement LLM-as-judge", "compare model outputs", "create evaluation rubrics", "mitigate evaluation bias", or mentions direct scoring, pairwise comparison, position bias, evaluation pipelines, or automated quality assessment.
risk: safe
source: community
date_added: 2026-03-18
---
```

# Advanced Evaluation

This skill covers production-grade techniques for evaluating LLM outputs using LLMs as judges, including direct scoring, pairwise comparison, rubric design, and bias mitigation.
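As a minimal illustration of the pairwise-comparison and position-bias themes the skill covers, the sketch below judges each pair of responses twice with the candidate order swapped and only counts a win when both orderings agree. The `judge` callable is an assumption for illustration — in practice it would wrap an LLM API call that returns `"A"` or `"B"` for the preferred response.

```python
def debiased_pairwise(judge, prompt, resp_1, resp_2):
    """Return 'resp_1', 'resp_2', or 'tie' after order-swapped judging."""
    first = judge(prompt, resp_1, resp_2)   # resp_1 shown in position "A"
    second = judge(prompt, resp_2, resp_1)  # order swapped
    if first == "A" and second == "B":
        return "resp_1"  # preferred in both orderings
    if first == "B" and second == "A":
        return "resp_2"
    return "tie"  # disagreement between orderings signals position bias

# Hypothetical deterministic stub judge for demonstration only:
# it simply prefers the longer response.
def stub_judge(prompt, a, b):
    return "A" if len(a) > len(b) else "B"

print(debiased_pairwise(stub_judge, "Summarize X", "short", "a much longer answer"))
# -> resp_2
```

Counting a disagreement as a tie is the conservative choice; alternatives include randomizing order per pair or averaging win rates across both orderings.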
## How to Use

Recommended: Install to project (local)

```shell
mkdir -p .claude/skills
curl -o .claude/skills/advanced-evaluation.md \
  https://raw.githubusercontent.com/sickn33/antigravity-awesome-skills/main/skills/advanced-evaluation/SKILL.md
```

The skill is scoped to this project only. Add `.claude/skills/` to your `.gitignore` if you don't want to commit it.

Alternative: Clone full repo

```shell
git clone https://github.com/sickn33/antigravity-awesome-skills
```

Then reference the skill at `skills/advanced-evaluation/SKILL.md`.

## Related Skills