Is numpy.transpose reordering data in memory?

余生分开走 2021-01-01 16:11

To speed up reductions such as np.std and np.sum along one axis of a huge n-dimensional NumPy array, it is often recommended to apply them along the last axis.

When I

2 answers
  •  挽巷 (original poster)
    2021-01-01 16:47

    To elaborate on larsman's answer, here are some timings:

    # normal C (row-major) order array
    >>> %%timeit a = np.random.randn(500, 400)
    >>> np.sum(a, axis=1)
    1000 loops, best of 3: 272 us per loop
    
    # transposing and summing along the first axis makes no real difference 
    # to performance
    >>> %%timeit a = np.random.randn(500, 400)
    >>> np.sum(a.T, axis=0)
    1000 loops, best of 3: 269 us per loop
    
    # however, converting to Fortran (column-major) order does improve speed...
    >>> %%timeit a = np.asfortranarray(np.random.randn(500,400))
    >>> np.sum(a, axis=1)
    10000 loops, best of 3: 114 us per loop
    
    # ... but only if you don't count the conversion in the timed operations
    >>> %%timeit a = np.random.randn(500, 400)
    >>> np.sum(np.asfortranarray(a), axis=1)
    1000 loops, best of 3: 599 us per loop
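
The four measurements above can be reproduced outside IPython with a rough sketch like the one below. It uses the standard `timeit` module instead of the `%%timeit` magic. The names `cases` and `results` and the repeat counts are illustrative choices, not part of the original answer, and the absolute numbers will differ by machine and NumPy version.

```python
import timeit

import numpy as np

a = np.random.randn(500, 400)   # C (row-major) order
f = np.asfortranarray(a)        # column-major copy, made once, outside the timing

# Hypothetical labels for the four cases timed in the answer.
cases = {
    "C order, axis=1": lambda: np.sum(a, axis=1),
    "transposed view, axis=0": lambda: np.sum(a.T, axis=0),
    "Fortran order, axis=1": lambda: np.sum(f, axis=1),
    "convert + sum, axis=1": lambda: np.sum(np.asfortranarray(a), axis=1),
}

results = {}
for name, fn in cases.items():
    # Best of 3 repeats of 100 calls each, converted to microseconds per call.
    best = min(timeit.repeat(fn, number=100, repeat=3)) / 100
    results[name] = best * 1e6
    print(f"{name:26s} {results[name]:8.1f} us per call")
```

Whether the Fortran-order case comes out faster depends on the hardware and the NumPy build, so treat the printed figures as a local measurement rather than a general rule.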
    

    In summary, it might make sense to convert your arrays to Fortran order if you're going to apply a lot of operations over the columns, but the conversion itself is costly and almost certainly not worth it for a single operation.
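
As for the question in the title: the likely reason transposing makes no difference in the timings above is that `a.T` does not move any data. It returns a view with the strides swapped. A small check, assuming a float64 array of the same shape as in the timings (the variable names are illustrative):

```python
import numpy as np

a = np.random.randn(500, 400)   # float64, C order
t = a.T                         # same result as np.transpose(a)

# The transpose is a view on the same buffer; only the strides are swapped.
shares = np.shares_memory(a, t)
print(shares)                   # True
print(a.strides, t.strides)     # (3200, 8) (8, 3200)
print(t.flags['F_CONTIGUOUS'])  # True: the view looks column-major

# Data is only reordered in memory when a copy is forced explicitly.
c = np.ascontiguousarray(t)
print(np.shares_memory(a, c))   # False
```

So `np.sum(a.T, axis=0)` walks through memory in the same order as `np.sum(a, axis=1)`, while `np.asfortranarray` and `np.ascontiguousarray` pay for an actual copy when the layout does not already match.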
