Create large random boolean matrix with numpy

孤街浪徒 2021-01-03 20:06

I am trying to create a huge boolean matrix which is randomly filled with True and False with a given probability p. At f

3 answers

清酒与你 (original poster)

2021-01-03 21:03
The problem is your RAM: the values are held in memory while the matrix is being created. I just created this matrix using this command:

np.random.choice(a=[False, True], size=(N, N), p=[p, 1-p])

I used an AWS i3 instance with 64GB of RAM and 8 cores. To create this matrix, htop shows that it takes up ~20GB of RAM. Here is a benchmark in case you care:
```
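import numpy as np

# IPython: time the one-shot call (N and p are defined beforehand)
%time np.random.choice(a=[False, True], size=(N, N), p=[p, 1-p])

CPU times: user 18.3 s, sys: 3.4 s, total: 21.7 s
Wall time: 21.7 s


# Assumed: the post does not show where `mask` is created;
# a preallocated global bool array is used here.
mask = np.empty((N, N), dtype=bool)

def mask_method(N, p):
    # Fill the preallocated matrix one row at a time
    for i in range(N):
        mask[i] = np.random.choice(a=[False, True], size=N, p=[p, 1-p])
        if i % 100 == 0:
            print(i)  # progress indicator

%time mask_method(N, p)

CPU times: user 20.9 s, sys: 1.55 s, total: 22.5 s
Wall time: 22.5 s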
```
Note that the mask method only takes up ~9GB of RAM at its peak.
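For reference, the size of the final matrix itself is easy to estimate: a NumPy bool array stores one byte per element, so anything above N * N bytes at peak is temporary working memory. A small sketch of that arithmetic, where the value of `N` is only illustrative (it is not the N from the benchmark) and `bool_bytes` / `float_bytes` are names made up for this example:

```python
import numpy as np

N = 1_000  # illustrative size only, not the N used in the benchmark

# Final result: one byte per element
bool_bytes = np.empty((N, N), dtype=bool).nbytes

# For comparison: a same-shape float64 array is eight times larger
float_bytes = np.empty((N, N), dtype=np.float64).nbytes

print(bool_bytes)   # N * N bytes
print(float_bytes)  # 8 * N * N bytes
```

Scaling `N` up shows how quickly a full-size temporary of a wider dtype would dominate the footprint.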

Edit: The first method releases its RAM once the process is done, whereas the function method retains all of it.
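If peak memory is the concern, one possible variation on the row-by-row idea is to fill the matrix in blocks of rows and threshold uniform samples instead of calling `np.random.choice`. This is only a sketch, not what was benchmarked above: the function name `random_bool_matrix` and the parameters `p_true`, `chunk_rows` and `seed` are made up for illustration. Note that `p_true` is the probability of True, so matching the command above (`p=[p, 1-p]` with `a=[False, True]`) would mean passing `1 - p`.

```python
import numpy as np

def random_bool_matrix(n, p_true, chunk_rows=1024, seed=None):
    """Return an n x n bool array whose entries are True with probability p_true."""
    rng = np.random.default_rng(seed)
    out = np.empty((n, n), dtype=bool)  # final result: n * n bytes
    for start in range(0, n, chunk_rows):
        stop = min(start + chunk_rows, n)
        # Only a (stop - start) x n float64 temporary exists at any time
        out[start:stop] = rng.random((stop - start, n)) < p_true
    return out

# Small demo; sizes are illustrative only
mask = random_bool_matrix(2000, p_true=0.9, chunk_rows=256, seed=0)
print(mask.shape, mask.dtype, mask.mean())
```

A larger `chunk_rows` means fewer Python-level iterations but a bigger temporary per step, so it is a knob to tune against the available RAM.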