Data Tools

| Repo | Stars | Issues | Contributors | Version | Last Publish | Forks | Watchers |
| --- | --- | --- | --- | --- | --- | --- | --- |
| pandas | 43407 | 3609 | 30 | Pandas 2.2.3 | 2024-09-20 | 17830 | 1110 |
| polars | 29512 | 2087 | 30 | Python Polars 1.9.0 | 2024-10-01 | 1875 | 163 |
| spark | 39395 | 208 | 30 |  |  | 28222 | 2022 |
| dask | 12454 | 1109 | 30 | 2024.9.1 | 2024-09-28 | 1698 | 212 |
| modin | 9806 | 661 | 30 | Modin 0.32.0 | 2024-09-11 | 651 | 114 |
| duckdb | 23123 | 332 | 30 | v1.1.1 Bugfix Release | 2024-09-24 | 1843 | 205 |
| vaex | 8269 | 530 | 30 | Version linked to the paper | 2018-03-29 | 590 | 143 |
| fugue | 1975 | 36 | 22 | Support `dict[str,Any]` as transformer input and output | 2024-06-28 | 94 | 24 |
| ibis | 5130 | 235 | 30 | 9.5.0 | 2024-09-11 | 590 | 84 |
| Daft | 2156 | 229 | 30 | v0.3.5 | 2024-10-01 | 145 | 16 |

pandas

Flexible and powerful data analysis / manipulation library for Python, providing labeled data structures similar to R data.frame objects, statistical functions, and much more

polars

Dataframes powered by a multithreaded, vectorized query engine, written in Rust

spark

Apache Spark - A unified analytics engine for large-scale data processing

dask

Parallel computing with task scheduling

modin

Modin: Scale your Pandas workflows by changing a single line of code

duckdb

DuckDB is an analytical in-process SQL database management system

vaex

Out-of-Core hybrid Apache Arrow/NumPy DataFrame for Python, ML, visualization and exploration of big tabular data at a billion rows per second 🚀

fugue

A unified interface for distributed computing. Fugue executes SQL, Python, Pandas, and Polars code on Spark, Dask and Ray without any rewrites.

ibis

The portable Python dataframe library

Daft

Distributed DataFrame for Python designed for the cloud, powered by Rust