Data Tools
Repo | Stars | Issues | Contributors | Version | Last Publish | Forks | Watchers |
---|---|---|---|---|---|---|---|
pandas | 43721 | 3618 | 30 | Pandas 2.2.3 | 2024-09-20 | 17937 | 1110 |
polars | 30235 | 2179 | 30 | Rust Polars 0.44.2 | 2024-11-01 | 1954 | 167 |
spark | 39817 | 227 | 30 | | | 28300 | 2021 |
dask | 12577 | 1096 | 30 | 2024.11.0 | 2024-11-08 | 1709 | 212 |
modin | 9876 | 666 | 30 | Modin 0.32.0 | 2024-09-11 | 651 | 117 |
duckdb | 24094 | 329 | 30 | v1.1.3 Bugfix Release | 2024-11-04 | 1909 | 207 |
vaex | 8290 | 534 | 30 | Version linked to the paper | 2018-03-29 | 590 | 144 |
fugue | 2005 | 37 | 22 | Support `dict[str,Any]` as transformer input and output | 2024-06-28 | 94 | 25 |
ibis | 5281 | 257 | 30 | 9.5.0 | 2024-09-11 | 595 | 85 |
Daft | 2311 | 259 | 30 | v0.3.11 | 2024-11-07 | 160 | 16 |
pandas
Flexible and powerful data analysis / manipulation library for Python, providing labeled data structures similar to R data.frame objects, statistical functions, and much more
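As a rough illustration of the "labeled data structures" and "statistical functions" mentioned above, here is a minimal sketch. The column names and values are made up for the example and do not come from any of the listed projects.

```python
import pandas as pd

# Hypothetical sample data, invented purely for illustration.
df = pd.DataFrame(
    {
        "city": ["Oslo", "Oslo", "Bergen", "Bergen"],
        "temp_c": [3.0, 5.0, 7.0, 9.0],
    }
)

# Columns are addressed by label rather than by position.
mean_by_city = df.groupby("city")["temp_c"].mean()

# Built-in statistics; Series.std() uses the sample formula (ddof=1) by default.
overall_std = df["temp_c"].std()

print(mean_by_city)
print(overall_std)
```

The grouped result is itself a labeled Series indexed by city, so individual values can be looked up by name, for example `mean_by_city["Oslo"]`.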
polars
Dataframes powered by a multithreaded, vectorized query engine, written in Rust
spark
Apache Spark - A unified analytics engine for large-scale data processing
dask
Parallel computing with task scheduling
modin
Modin: Scale your Pandas workflows by changing a single line of code
duckdb
DuckDB is an analytical in-process SQL database management system
vaex
Out-of-Core hybrid Apache Arrow/NumPy DataFrame for Python, ML, visualization and exploration of big tabular data at a billion rows per second 🚀
fugue
A unified interface for distributed computing. Fugue executes SQL, Python, Pandas, and Polars code on Spark, Dask and Ray without any rewrites.
ibis
The portable Python dataframe library
Daft
Distributed data engine for Python/SQL designed for the cloud, powered by Rust