data-engineering

Data engineering patterns for ETL pipelines, data warehousing, Apache Spark, and data quality validation

Content Preview
---
name: data-engineering
description: Data engineering patterns for ETL pipelines, data warehousing, Apache Spark, and data quality validation
---

# Data Engineering

## ETL Pipeline Pattern

```python
from datetime import datetime
from dataclasses import dataclass

@dataclass
class PipelineResult:
    records_extracted: int
    records_transformed: int
    records_loaded: int
    errors: list[str]
    duration_seconds: float

class OrderPipeline:
    def __init__(self, source_db, warehouse_d
```
How to Use

Recommended: Install to project (local)

mkdir -p .claude/skills
curl -o .claude/skills/data-engineering.md \
  https://raw.githubusercontent.com/rohitg00/awesome-claude-code-toolkit/main/skills/data-engineering/SKILL.md

The skill is scoped to this project only. Add .claude/skills/ to your .gitignore if you don't want to commit it.
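
The .gitignore step could be done as follows. This is a minimal sketch, assuming it is run from the project root and that any existing .gitignore ends with a newline; it adds the entry only if it is not already listed.

```shell
# Create .gitignore if it does not exist yet (assumes the current directory is the project root).
touch .gitignore
# Append the skills directory only when an identical line is not already present.
grep -qxF '.claude/skills/' .gitignore || echo '.claude/skills/' >> .gitignore
```

Running it a second time leaves the file unchanged, so it is safe to include in a setup script.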

Alternative: Clone full repo

git clone https://github.com/rohitg00/awesome-claude-code-toolkit

Then reference the skill file at skills/data-engineering/SKILL.md in the cloned repository.
