dask

24 Tricks to Speed Up Your Python, Super Practical!

假如想象 Posted on 2021-01-14 04:46:30
云哥 previously discussed concrete ways to speed up Python from the following nine angles, 24 tricks in total, each with a before/after comparison, which makes them very practical: analyzing code run time, speeding up lookups, speeding up loops, speeding up functions, using the standard library for speedups, NumPy vectorization, speeding up Pandas, Dask, and multi-threading/multi-processing. On that basis I mainly polished the formatting so that readers can read and learn from it more easily.
“ Part 1: Analyzing code run time ”
1. Measure the time of a single run of a piece of code: plain method; quick method (Jupyter)
2. Measure the average time over repeated runs: plain method; quick method (Jupyter)
3. Profile run time by function call: plain method; quick method (Jupyter)
4. Profile run time line by line: plain method; quick method (Jupyter)
“ Part 2: Speed up your lookups ”
5. Use a set rather than a list for in lookups: slow version; fast version
6. Use a dict rather than two lists for matching lookups: slow version; fast version
“ Part 3: Speed up your loops ”
7. Prefer for loops over while loops: slow version; fast version
8. Avoid repeated computation inside the loop body: slow version; fast version
“ Part 4: Speed up your functions ”
9. Use caching to speed up recursive functions: slow version; fast version
10. Replace recursion with a loop: slow version; fast version
11. Use Numba to speed up Python functions: slow version; fast version
“ Part 5: Use standard-library functions for speedups ”
12. Use collections.Counter to speed up counting: slow version; fast version
13
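The excerpt above cuts off before the article's code samples, so here is a minimal sketch of two of the listed tricks, set-based in lookups (trick 5) and caching a recursive function (trick 9). The data sizes and function names are my own illustration, not the original article's code.

import timeit
from functools import lru_cache

# Trick 5: membership tests on a set are O(1) on average, on a list they are O(n).
data_list = list(range(100_000))
data_set = set(data_list)

t_list = timeit.timeit(lambda: 99_999 in data_list, number=1_000)
t_set = timeit.timeit(lambda: 99_999 in data_set, number=1_000)
print(f"list lookup: {t_list:.4f}s, set lookup: {t_set:.4f}s")

# Trick 9: cache a recursive function so each subproblem is computed only once.
@lru_cache(maxsize=None)
def fib(n):
    return n if n < 2 else fib(n - 1) + fib(n - 2)

print(fib(100))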

Flask author Armin Ronacher: I'm not feeling the async pressure

╄→гoц情女王★ Posted on 2021-01-12 02:59:05
https://zhuanlan.zhihu.com/p/102307133 English original | I'm not feeling the async pressure [1] Author | Armin Ronacher, 2020.01.01 Translator | 豌豆花下猫@Python猫 Note: this translation is released under the CC BY-NC-SA 4.0 [2] license; the content has been lightly edited, please keep a link to the original when reposting, and do not use it for commercial or illegal purposes. Async is all the rage. Async Python, async Rust, Go, Node, .NET: pick your favorite language ecosystem and it is using some form of async. How good async turns out to be depends a great deal on the language ecosystem and its runtime, but overall it has some nice benefits. It makes one thing extremely simple: waiting for an operation that may take a while to complete. It is so simple that it has created countless new ways to blow one's foot off. The case I want to discuss is the one where you don't realize you have shot yourself in the foot until the system becomes overloaded, which is the topic of back pressure management. A related term in protocol design is flow control. What is back pressure? There are many explanations of back pressure, and a good one I recommend reading is Backpressure explained — the resisted flow of data through software [3]. Therefore
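The linked posts explain the concept in depth; as a minimal sketch of what back pressure looks like in async Python (my own illustration, not code from the article), a bounded asyncio.Queue makes a fast producer wait whenever a slow consumer falls behind:

import asyncio

async def producer(queue: asyncio.Queue) -> None:
    for i in range(10):
        # put() awaits while the queue is full: the slow consumer pushes back
        # on the fast producer, which is exactly what back pressure means.
        await queue.put(i)
        print(f"produced {i}")

async def consumer(queue: asyncio.Queue) -> None:
    while True:
        item = await queue.get()
        await asyncio.sleep(0.5)  # simulate slow downstream work
        print(f"consumed {item}")
        queue.task_done()

async def main() -> None:
    queue: asyncio.Queue = asyncio.Queue(maxsize=2)  # the bounded queue is the key
    worker = asyncio.create_task(consumer(queue))
    await producer(queue)
    await queue.join()  # wait until every queued item has been processed
    worker.cancel()

asyncio.run(main())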

How can I efficiently transpose a 67 gb file/Dask dataframe without loading it entirely into memory?

那年仲夏 Posted on 2020-12-31 08:44:54
Question: I have 3 rather large files (67 GB, 36 GB, 30 GB) that I need to train models on. However, the features are rows and the samples are columns. Since Dask hasn't implemented transpose and stores DataFrames split by row, I need to write something to do this myself. Is there a way I can efficiently transpose without loading everything into memory? I've got 16 GB of RAM at my disposal and am using a Jupyter notebook. I have written some rather slow code, but would really appreciate a faster solution. The speed of
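One answer-style sketch (not from the question, and with made-up file names) assumes the input is a plain CSV whose header row holds the sample names: read a block of columns at a time with pandas' usecols, transpose each block in memory, and write it out as a row chunk of the transposed table, which dask can then read back lazily.

import pandas as pd
import dask.dataframe as dd

src = "features_by_samples.csv"            # hypothetical 67 GB input, features as rows
cols = pd.read_csv(src, nrows=0).columns   # sample names, read from the header only

block = 500  # how many sample columns to hold in memory at once; tune to your RAM
for i in range(0, len(cols), block):
    subset = list(cols[i:i + block])
    # Each pass parses the whole file but keeps only `block` columns in memory,
    # so peak memory is roughly n_rows * block values.
    chunk = pd.read_csv(src, usecols=subset)
    # After transposing, these samples become rows of the output; column 0 of the
    # written file holds the sample name (the old column label).
    chunk.T.to_csv(f"transposed_part_{i:06d}.csv", header=False)

# The transposed parts can now be treated as one lazy dask dataframe.
ddf = dd.read_csv("transposed_part_*.csv", header=None)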

Using Dask's NEW to_sql for improved efficiency (memory/speed) or alternative to get data from dask dataframe into SQL Server Table

喜夏-厌秋 Posted on 2020-12-29 06:52:31
Question: My ultimate goal is to use SQL/Python together for a project with too much data for pandas to handle (at least on my machine). So, I have gone with dask to: (1) read in data from multiple sources (mostly SQL Server tables/views); (2) manipulate/merge the data into one large dask dataframe of ~10 million+ rows and 52 columns, some of which contain long unique strings; (3) write it back to SQL Server on a daily basis, so that my PowerBI report can automatically refresh the data. For #1 and #2, they
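As a minimal sketch of the dask side of step (3), assuming a dask version that ships DataFrame.to_sql (roughly 2.11 and later) and a working SQL Server connection string via SQLAlchemy/pyodbc; the URI, table name, and source files here are placeholders, not the asker's setup:

import dask.dataframe as dd

# Placeholder pyodbc connection string for SQL Server; substitute your own DSN/credentials.
uri = "mssql+pyodbc://user:password@MY_DSN"

# Stand-in for the real read/merge pipeline that produces the ~10M-row dataframe.
ddf = dd.read_csv("merged_output_*.csv")

ddf.to_sql(
    "report_table",       # target table name (placeholder)
    uri,                  # dask takes a URI string rather than an engine, so workers can reconnect
    if_exists="replace",  # rebuild the table on each daily refresh
    index=False,
    chunksize=10_000,     # rows per INSERT batch handed to pandas/SQLAlchemy
    parallel=False,       # True lets partitions be written concurrently, at the cost of ordering
)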

dask handle delayed failures

断了今生、忘了曾经 Posted on 2020-12-16 02:25:11
Question: How can I port the following function to dask in order to parallelize it?

from time import sleep
from dask.distributed import Client
from dask import delayed
client = Client(n_workers=4)
from tqdm import tqdm
tqdm.pandas()

# linear
things = [1, 2, 3]
_x = []
_y = []

def my_slow_function(foo):
    sleep(2)
    x = foo
    y = 2 * foo
    assert y < 5
    return x, y

for foo in tqdm(things):
    try:
        x_v, y_v = my_slow_function(foo)
        _x.append(x_v)
        if y_v is not None:
            _y.append(y_v)
    except AssertionError:
        print(f'failed: {foo}')
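Not an official answer, but one common way to port this kind of loop (the helper name safe_call and the None sentinel convention are my own): wrap the call in a delayed function that turns the expected failure into a sentinel, build all the tasks first, compute them in one go on the cluster, then filter the results.

from time import sleep
import dask
from dask import delayed
from dask.distributed import Client

client = Client(n_workers=4)

things = [1, 2, 3]

def my_slow_function(foo):
    sleep(2)
    x = foo
    y = 2 * foo
    assert y < 5
    return x, y

@delayed
def safe_call(foo):
    # Catch the failure inside the task so one bad input does not fail the whole graph.
    try:
        return my_slow_function(foo)
    except AssertionError:
        return None  # sentinel marking a failed input

# Build the task graph lazily, then run all tasks in parallel on the cluster.
results = dask.compute(*[safe_call(foo) for foo in things])

_x = [r[0] for r in results if r is not None]
_y = [r[1] for r in results if r is not None]
print(_x, _y)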
