Handling large dense matrices in Python

[愿得一人] 2021-02-04 21:33

Basically, what is the best way to go about storing and using dense matrices in Python?

I have a project that generates similarity metrics between every item in an array

6 Answers
  •  故里飘歌
    2021-02-04 22:03

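    You can reduce memory use by storing values as uint8 (one byte each, range 0 to 255), but be careful: NumPy array arithmetic on fixed-width integers wraps around silently on overflow instead of raising an error. A uint16 requires two bytes, so the minimum memory requirement in your example is 8000*8000*30*2 bytes = 3.84 GB.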
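The overflow risk mentioned above can be seen in a few lines. This is a minimal sketch with made-up values (the arrays `a` and `b` are illustrative, not from the question); it only shows that uint8 array addition wraps modulo 256 and that widening to uint16 before adding avoids it.

```python
import numpy as np

# uint8 stores integers 0..255; array arithmetic wraps around silently.
a = np.array([200, 250], dtype=np.uint8)   # illustrative values
b = np.array([100, 10], dtype=np.uint8)

wrapped = a + b                     # result stays uint8 and wraps modulo 256
safe = a.astype(np.uint16) + b      # widen one operand first, then add

print(wrapped.dtype, wrapped)
print(safe.dtype, safe)
```

Widening costs twice the memory for the result, so it is usually worth doing only on the intermediate that can actually exceed 255.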

    If the second example fails, then you need a new machine: its memory requirement is only 20000*20000*2 bytes = 800 MB.
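The arithmetic in this answer can be reproduced without allocating anything. The sketch below assumes the two examples are a uint16 array of shape (8000, 8000, 30) and a uint16 array of shape (20000, 20000), which is what the products above imply; `dense_bytes` is a hypothetical helper name.

```python
import math
import numpy as np

def dense_bytes(shape, dtype):
    """Raw data size in bytes of a dense array with this shape and dtype."""
    return math.prod(shape) * np.dtype(dtype).itemsize

# Shapes assumed from the products quoted in the answer.
first = dense_bytes((8000, 8000, 30), np.uint16)
second = dense_bytes((20000, 20000), np.uint16)

print(first / 1e9, "GB")    # decimal gigabytes
print(second / 1e6, "MB")   # decimal megabytes
```

Swapping `np.uint16` for `np.uint8` or `np.float64` shows how strongly the dtype choice drives the total.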

    My advice is to create smaller matrices and use "top", "ps v", or the GNOME System Monitor to check the memory used by your Python process. Start by examining a single thread with a small matrix and do the math. Note that you can release a variable x by writing del x; the memory is freed once no other references to the object remain. This is useful for testing.
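One way to "do the math" from inside Python, as a rough sketch: allocate a small matrix, read its buffer size from `nbytes`, and scale up to the full size. The sizes `n_small` and `n_full` are assumptions chosen for illustration; `nbytes` reports only the array's data buffer, so the process total shown by top will be somewhat higher.

```python
import numpy as np

n_small = 1000     # small test size (assumption for illustration)
n_full = 20000     # target size discussed in the answer

x = np.zeros((n_small, n_small), dtype=np.uint8)
bytes_small = x.nbytes                      # size of the data buffer only

# A dense square matrix grows with the square of its side length.
scale = (n_full // n_small) ** 2
estimate = bytes_small * scale

print(bytes_small, "bytes for the small matrix")
print(estimate / 1e6, "MB estimated for the full matrix")

del x   # drop the only reference so the buffer can be freed
```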

    How much memory does your machine have? How much memory does PyTables use to create a 20000*20000 table? How much memory does NumPy use to create a 20000*20000 table using uint8?
