llm_evaluation_frameworks

Name: llm_evaluation_frameworks
Author: alirezarezvani

Concrete metrics, scoring methods, comparison tables, and A/B testing frameworks.

Content Preview

# LLM Evaluation Frameworks

Concrete metrics, scoring methods, comparison tables, and A/B testing frameworks.

## Frameworks Index

1. [Evaluation Metrics Overview](#1-evaluation-metrics-overview)
2. [Text Generation Metrics](#2-text-generation-metrics)
3. [RAG-Specific Metrics](#3-rag-specific-metrics)
4. [Human Evaluation Frameworks](#4-human-evaluation-frameworks)
5. [A/B Testing for Prompts](#5-ab-testing-for-prompts)
6. [Benchmark Datasets](#6-benchmark-datasets)
7. [Evaluation Pipeline De

How to Use

Recommended: Install to project (local)

mkdir -p .claude/skills
curl -o .claude/skills/llm_evaluation_frameworks.md \
  https://raw.githubusercontent.com/alirezarezvani/claude-skills/main/engineering-team/senior-prompt-engineer/references/llm_evaluation_frameworks.md

The skill is scoped to this project only. Add .claude/skills/ to your .gitignore if you don't want to commit it.
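
A minimal sketch of the .gitignore step, assuming the command is run from the project root (the directory that contains .claude/). The grep guard is an added convenience, not part of the original instructions; it keeps the entry from being appended twice if the command is run again.

```shell
# Append .claude/skills/ to .gitignore only if that exact line is not already present.
# If .gitignore does not exist yet, grep fails quietly and echo creates the file.
grep -qxF '.claude/skills/' .gitignore 2>/dev/null || echo '.claude/skills/' >> .gitignore
```

Running `git status` afterwards should no longer list files under .claude/skills/ as untracked.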

Alternative: Clone full repo

git clone https://github.com/alirezarezvani/claude-skills

Then reference the file at engineering-team/senior-prompt-engineer/references/llm_evaluation_frameworks.md, relative to the root of the cloned repository.