Sorting in pandas for large datasets

故里飘歌 2020-12-08 11:21

I would like to sort my data by a given column, specifically p-values. However, the issue is that I am not able to load my entire data into memory. Thus, the following doesn

5 answers
  •  时光说笑
    2020-12-08 11:55

    Here is my honest suggestion. There are three options you can try.

    1. I like pandas for its rich documentation and features, but I have been advised to use NumPy, since it feels comparatively faster for larger datasets. You could also look at other tools that make the job easier.

    2. If you are using Python 3, you can break your big dataset into chunks and process them concurrently with threads. I am too lazy for this and it doesn't look elegant; besides, I believe pandas, NumPy and SciPy are built with the hardware in mind to enable multithreading.

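    3. This is the one I prefer; in my opinion it is the easy and lazy technique. Check the documentation at http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.sort.html (note that in newer pandas releases DataFrame.sort has been replaced by DataFrame.sort_values).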
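
    Option 2 (splitting the data into chunks) can be sketched roughly as below. This is only a minimal single-threaded sketch, not a tested recipe for your data: it assumes the data is a CSV file, that the sort column is numeric with no missing values, and that the file has at least one data row. The function name sort_csv_in_chunks and its parameters are made up for illustration. Each chunk is sorted in memory and written to a temporary file, then the sorted files are merged with heapq.merge, which only keeps one row per chunk file in memory.

```python
import contextlib
import csv
import heapq
import os
import tempfile

import pandas as pd


def sort_csv_in_chunks(src_path, dst_path, column, chunksize=100_000):
    """Sort a CSV file by one numeric column without loading it all at once.

    Hypothetical helper: assumes `column` is numeric, has no missing values,
    and that the file contains at least one data row.
    """
    with tempfile.TemporaryDirectory() as tmp_dir:
        run_paths = []

        # Pass 1: read one chunk at a time, sort it in memory,
        # and write it out as a sorted temporary file (a "run").
        for i, chunk in enumerate(pd.read_csv(src_path, chunksize=chunksize)):
            run_path = os.path.join(tmp_dir, f"run_{i}.csv")
            chunk.sort_values(by=column).to_csv(run_path, index=False)
            run_paths.append(run_path)

        # Pass 2: k-way merge of the sorted runs, streaming row by row.
        with contextlib.ExitStack() as stack:
            readers = [
                csv.DictReader(stack.enter_context(open(p, newline="")))
                for p in run_paths
            ]
            out = stack.enter_context(open(dst_path, "w", newline=""))
            writer = csv.DictWriter(out, fieldnames=readers[0].fieldnames)
            writer.writeheader()
            merged = heapq.merge(*readers, key=lambda row: float(row[column]))
            writer.writerows(merged)
```

    You would call it with something like sort_csv_in_chunks("input.csv", "sorted.csv", "pvalue"), where the file names and the "pvalue" column are placeholders. Pick chunksize so that one chunk fits comfortably in memory.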

    With option 3, you can also pass the 'kind' parameter to the pandas sort function you are using; it selects the sorting algorithm.
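
    For illustration, here is a small sketch of the 'kind' parameter with DataFrame.sort_values, the current name of the pandas sort method. The column names "gene" and "pvalue" are invented for the example. Keep in mind that this still sorts in memory, so it only helps once the data (or a chunk of it) fits.

```python
import pandas as pd

# Toy data; "gene" and "pvalue" are hypothetical column names.
df = pd.DataFrame({
    "gene": ["a", "b", "c", "d"],
    "pvalue": [0.05, 0.001, 0.05, 0.2],
})

# kind may be "quicksort" (the default), "mergesort" or "heapsort".
# "mergesort" is stable: rows with equal p-values keep their original order.
sorted_df = df.sort_values(by="pvalue", kind="mergesort")
print(sorted_df)
```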

    Godspeed my friend.
