dask
Distributed computing for larger-than-RAM pandas/NumPy workflows. Use when you need to scale existing pandas/NumPy code beyond memory or across clusters. Best for parallel file processing, distributed ML, and integration with existing pandas code. For out-of-core analytics on a single machine, use vaex; for in-memory speed, use polars.
Content Preview
---
name: dask
description: Distributed computing for larger-than-RAM pandas/NumPy workflows. Use when you need to scale existing pandas/NumPy code beyond memory or across clusters. Best for parallel file processing, distributed ML, and integration with existing pandas code. For out-of-core analytics on a single machine, use vaex; for in-memory speed, use polars.
license: BSD-3-Clause license
metadata:
  skill-author: K-Dense Inc.
---
# Dask
## Overview
Dask is a Python library for parallel and dis…
How to Use
Recommended: Install to project (local)
mkdir -p .claude/skills
curl -o .claude/skills/dask.md \
https://raw.githubusercontent.com/K-Dense-AI/claude-scientific-skills/main/scientific-skills/dask/SKILL.md
Skill is scoped to this project only. Add .claude/skills/ to your .gitignore if you don't want to commit it.
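One way to handle the .gitignore step is sketched below. It assumes you run it from the project root and that any existing .gitignore ends with a newline; the entry is appended only when it is not already listed, so the snippet can be run more than once.

```shell
# Create .gitignore if it does not exist yet (assumption: we are in the project root)
touch .gitignore

# Append the skills directory only when that exact line is not already present
grep -qxF '.claude/skills/' .gitignore || echo '.claude/skills/' >> .gitignore
```

Running `git status` afterwards should no longer show files under .claude/skills/ as untracked.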
Alternative: Clone full repo
git clone https://github.com/K-Dense-AI/claude-scientific-skills
Then reference at scientific-skills/dask/SKILL.md
Related Skills
get-available-resources
This skill should be used at the start of any computationally intensive scientific task to detect and report available system resources (CPU cores, GPUs, memory, disk space). It creates a JSON file with resource information and strategic recommendations that inform computational approach decisions such as whether to use parallel processing (joblib, multiprocessing), out-of-core computing (Dask, Zarr), GPU acceleration (PyTorch, JAX), or memory-efficient strategies. Use this skill before running analyses, training models, processing large datasets, or any task where resource constraints matter.
by K-Dense-AI · claude-scientific-skills
polars
Fast in-memory DataFrame library for datasets that fit in RAM. Use when pandas is too slow but data still fits in memory. Lazy evaluation, parallel execution, Apache Arrow backend. Best for 1-100GB datasets, ETL pipelines, faster pandas replacement. For larger-than-RAM data use dask or vaex.
by K-Dense-AI · claude-scientific-skills
zarr-python
Chunked N-D arrays for cloud storage. Compressed arrays, parallel I/O, S3/GCS integration, NumPy/Dask/Xarray compatible, for large-scale scientific computing pipelines.
by K-Dense-AI · claude-scientific-skills
geoffrey-hinton
Agent that simulates Geoffrey Hinton: Godfather of Deep Learning, 2018 Turing Award winner, creator of backpropagation and Deep Belief Networks.
general · persona · deep-learning · ai-safety
by sickn33 (Antigravity) · antigravity-awesome-skills