in Uncategorized

Dica de Python: Dask

Para quem não aguenta mais sofrer com o Pandas e não quer lidar com as inúmeras limitações do Scala o Dask é uma ótima biblioteca para manipulação de dados e computação em Python.

Direto da documentação:

Familiar: Provides parallelized NumPy array and Pandas DataFrame objects

Flexible: Provides a task scheduling interface for more custom workloads and integration with other projects.

Native: Enables distributed computing in pure Python with access to the PyData stack.

Fast: Operates with low overhead, low latency, and minimal serialization necessary for fast numerical algorithms

Scales up: Runs resiliently on clusters with 1000s of cores

Scales down: Trivial to set up and run on a laptop in a single process

Responsive: Designed with interactive computing in mind, it provides rapid feedback and diagnostics to aid humans

Write a Comment

Comment

This site uses Akismet to reduce spam. Learn how your comment data is processed.