Create large random boolean matrix with numpy

孤街浪徒 2021-01-03 20:06

I am trying to create a huge boolean matrix which is randomly filled with True and False with a given probability p. At f

3 Answers
  •  有刺的猬
    2021-01-03 21:00

    Another possibility is to generate the matrix in batches, i.e. compute many sub-arrays and stack them together at the very end. However, avoid updating a single array (mask) in a for loop, as the OP is doing, since that would force the whole array to be loaded in main memory during every indexing update.

    Instead, for example, to get a 30000x30000 matrix, create 90000 separate 100x100 arrays (a 300x300 grid of blocks), fill each 100x100 array in a for loop, and finally stack these 90000 arrays together into one giant array. A 30000x30000 boolean array takes about 900 MB at one byte per element, so this should not need more than 4GB of RAM, and it should be fast as well.

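    Minimal example of the stacking step (here the same 2x2 array a is tiled just to show the mechanics; in practice each block would be generated independently):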

    In [9]: a
    Out[9]: 
    array([[0, 1],
           [2, 3]])
    
    In [10]: np.hstack([np.vstack([a]*5)]*5)
    Out[10]: 
    array([[0, 1, 0, 1, 0, 1, 0, 1, 0, 1],
           [2, 3, 2, 3, 2, 3, 2, 3, 2, 3],
           [0, 1, 0, 1, 0, 1, 0, 1, 0, 1],
           [2, 3, 2, 3, 2, 3, 2, 3, 2, 3],
           [0, 1, 0, 1, 0, 1, 0, 1, 0, 1],
           [2, 3, 2, 3, 2, 3, 2, 3, 2, 3],
           [0, 1, 0, 1, 0, 1, 0, 1, 0, 1],
           [2, 3, 2, 3, 2, 3, 2, 3, 2, 3],
           [0, 1, 0, 1, 0, 1, 0, 1, 0, 1],
           [2, 3, 2, 3, 2, 3, 2, 3, 2, 3]])
    
    In [11]: np.hstack([np.vstack([a]*5)]*5).shape
    Out[11]: (10, 10)
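
The block-wise idea described above could be sketched roughly as follows. This is only an illustration under some assumptions: the function name `random_bool_matrix` and its parameters (`n`, `p`, `block`, `seed`) are made up for this example and do not come from the question, and `np.block` is used here in place of the nested `np.hstack`/`np.vstack` calls to assemble the grid of blocks.

```python
import numpy as np

def random_bool_matrix(n, p, block=100, seed=None):
    """Build an n x n boolean matrix from independently drawn blocks.

    Hypothetical helper for illustration: each entry is True with
    probability p. Requires n to be a multiple of block.
    """
    if n % block != 0:
        raise ValueError("n must be a multiple of block")
    rng = np.random.default_rng(seed)
    k = n // block  # number of blocks along each axis
    # Draw each block separately; only one small float64 temporary
    # (block x block) exists at a time, and the kept blocks are bool.
    grid = [
        [rng.random((block, block)) < p for _ in range(k)]
        for _ in range(k)
    ]
    # Assemble the k x k grid of blocks into a single array.
    return np.block(grid)

# Small demonstration; 30000 with block=100 would give 300 x 300 blocks.
mask = random_bool_matrix(1000, p=0.3, block=100, seed=0)
print(mask.shape, mask.dtype)  # (1000, 1000) bool
```

Note that the list of blocks and the assembled result exist at the same time, so peak memory is roughly twice the size of the final array (about 1.8 GB for 30000x30000 booleans), which still fits the budget mentioned above.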
    
