Fastest approach to read thousands of images into one big numpy array

2020-12-03 03:15

I'm trying to find the fastest approach to read a bunch of images from a directory into a numpy array. My end goal is to compute statistics such as the max, min, and nth percentile across all images.

2 Answers
  • 2020-12-03 03:30

    In this case, most of the time will be spent reading the files from disk, and I wouldn't worry too much about the time to populate a list.

    In any case, here is a script comparing four methods, without the overhead of reading an actual image from disk; it just copies an object that is already in memory.

    import numpy as np
    import time
    from functools import wraps
    
    
    x, y = 512, 512
    img = np.random.randn(x, y)
    n = 1000
    
    
    def timethis(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            r = func(*args, **kwargs)
            end = time.perf_counter()
            print('{}.{} : {} milliseconds'.format(func.__module__, func.__name__, (end - start)*1e3))
            return r
        return wrapper
    
    
    @timethis
    def static_list(n):
        imgs = [None]*n
        for i in range(n):
            imgs[i] = img
        return imgs
    
    
    @timethis
    def dynamic_list(n):
        imgs = []
        for i in range(n):
            imgs.append(img)
        return imgs
    
    
    @timethis
    def list_comprehension(n):
        return [img for i in range(n)]
    
    
    @timethis
    def numpy_flat(n):
        imgs = np.ndarray((x*n, y))
        for i in range(n):
            imgs[x*i:(i+1)*x, :] = img
        return imgs
    
    static_list(n)
    dynamic_list(n)
    list_comprehension(n)
    numpy_flat(n)
    

    The results show:

    __main__.static_list : 0.07004200006122119 milliseconds
    __main__.dynamic_list : 0.10294799994881032 milliseconds
    __main__.list_comprehension : 0.05021800006943522 milliseconds
    __main__.numpy_flat : 309.80870099983804 milliseconds
    

    Obviously your best bet is the list comprehension; however, even populating a numpy array takes just 310 ms for 1000 images (from memory). So again, the overhead will be the disk read.
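    For the question's actual end goal, a minimal sketch of the whole pipeline (with fabricated arrays standing in for the real per-file `imread` calls, and smaller sizes so it runs quickly) might look like:

    ```python
    import numpy as np

    # Fabricated stand-in for reading from disk: 100 random 256x256 "images".
    # (The real loop would call an image reader per file instead.)
    imgs = [np.random.randn(256, 256) for _ in range(100)]

    # One copy stacks the list into a single (n, 256, 256) array...
    stack = np.stack(imgs)

    # ...after which per-pixel statistics across all images are one call each.
    pix_max = stack.max(axis=0)
    pix_min = stack.min(axis=0)
    pix_p95 = np.percentile(stack, 95, axis=0)
    print(stack.shape, pix_max.shape)  # (100, 256, 256) (256, 256)
    ```

    `np.stack` does the same single contiguous copy as `np.array` on a list of equal-shape arrays, so the list-comprehension-then-convert pattern benchmarked above maps directly onto this.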

    Why is numpy slower?

    It is because of the way numpy stores arrays in memory: the data must be copied into one contiguous block. If we modify the Python list functions to convert the list to a numpy array, the times are similar.

    The modified functions now return numpy arrays:

    @timethis
    def static_list(n):
        imgs = [None]*n
        for i in range(n):
            imgs[i] = img
        return np.array(imgs)
    
    
    @timethis
    def dynamic_list(n):
        imgs = []
        for i in range(n):
            imgs.append(img)
        return np.array(imgs)
    
    
    @timethis
    def list_comprehension(n):
        return np.array([img for i in range(n)])
    

    and the timing results:

    __main__.static_list : 303.32892100022946 milliseconds
    __main__.dynamic_list : 301.86925499992867 milliseconds
    __main__.list_comprehension : 300.76925699995627 milliseconds
    __main__.numpy_flat : 305.9309459999895 milliseconds
    

    So the extra time is just a numpy thing, and it is roughly the same whichever way the list is built: it is the cost of copying all the data into one contiguous array.
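    That the conversion really copies (rather than referencing the list's arrays) can be checked directly with `np.shares_memory`; a small sketch:

    ```python
    import numpy as np

    img = np.zeros((512, 512))
    imgs = [img] * 10               # ten references to the same array

    arr = np.array(imgs)            # copies all ten into one contiguous block
    print(arr.shape)                # (10, 512, 512)

    # The copy is real: the result shares no memory with the source image,
    # so writing into it leaves the original untouched.
    arr[0, 0, 0] = 1.0
    print(np.shares_memory(arr, img), img[0, 0])  # False 0.0
    ```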

  • 2020-12-03 03:35

    Part A : Accessing and assigning NumPy arrays

    Going by the way elements are stored in row-major order for NumPy arrays, you are doing the right thing when storing those elements along the last axis per iteration. These would occupy contiguous memory locations and as such would be the most efficient for accessing and assigning values into. Thus initializations like np.ndarray((512*25,512), dtype='uint16') or np.ndarray((25,512,512), dtype='uint16') would work the best as also mentioned in the comments.
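    The contiguity claim can be verified directly from the arrays' flags and strides (using `np.empty` here, which allocates the same layout as `np.ndarray`):

    ```python
    import numpy as np

    a = np.empty((25, 512, 512), dtype='uint16')
    b = np.empty((512 * 25, 512), dtype='uint16')

    # Both suggested layouts are C-contiguous (row-major)...
    print(a.flags['C_CONTIGUOUS'], b.flags['C_CONTIGUOUS'])  # True True

    # ...and the strides of the 3-D layout show why: advancing along axis 0
    # jumps exactly one full 512*512*2-byte image, so each assignment
    # imgs[i] = img writes one contiguous block of memory.
    print(a.strides)  # (524288, 1024, 2)
    ```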

    After wrapping those approaches as functions for timing, and feeding in random arrays instead of images -

    N = 512
    n = 25
    a = np.random.randint(0,255,(N,N))
    
    def app1():
        imgs = np.empty((N,N,n), dtype='uint16')
        for i in range(n):
            imgs[:,:,i] = a
            # Storing along the first two axes
        return imgs
    
    def app2():
        imgs = np.empty((N*n,N), dtype='uint16')
        for num in range(n):    
            imgs[num*N:(num+1)*N, :] = a
            # Storing along the last axis
        return imgs
    
    def app3():
        imgs = np.empty((n,N,N), dtype='uint16')
        for num in range(n):    
            imgs[num,:,:] = a
            # Storing along the last two axes
        return imgs
    
    def app4():
        imgs = np.empty((N,n,N), dtype='uint16')
        for num in range(n):    
            imgs[:,num,:] = a
            # Storing along the first and last axes
        return imgs
    

    Timings -

    In [45]: %timeit app1()
        ...: %timeit app2()
        ...: %timeit app3()
        ...: %timeit app4()
        ...: 
    10 loops, best of 3: 28.2 ms per loop
    100 loops, best of 3: 2.04 ms per loop
    100 loops, best of 3: 2.02 ms per loop
    100 loops, best of 3: 2.36 ms per loop
    

    Those timings confirm the performance theory proposed at the start. I expected the last setup to land between app3 and app1, but apparently the effect of moving the storage axis from last to first isn't linear; more investigation on this one could be interesting.

    To clarify schematically, suppose we are storing two image arrays, denoted by x (image 1) and o (image 2). We would have:

    App1 :

    [[[x o]
      [x o]
      [x o]
      [x o]
      [x o]]
    
     [[x o]
      [x o]
      [x o]
      [x o]
      [x o]]
    
     [[x o]
      [x o]
      [x o]
      [x o]
      [x o]]]
    

    Thus, in memory space, it would be : [x,o,x,o,x,o..] following row-major order.

    App2 :

    [[x x x x x]
     [x x x x x]
     [x x x x x]
     [o o o o o]
     [o o o o o]
     [o o o o o]]
    

    Thus, in memory space, it would be : [x,x,x,x,x,x...o,o,o,o,o..].

    App3 :

    [[[x x x x x]
      [x x x x x]
      [x x x x x]]
    
     [[o o o o o]
      [o o o o o]
      [o o o o o]]]
    

    Thus, in memory space, it would be the same as the previous one: [x,x,x,x,x,x...o,o,o,o,o..].
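    The schematics can be checked on toy arrays by raveling in C (row-major) order; here x is stored as 1s and o as 2s:

    ```python
    import numpy as np

    x = np.full((3, 5), 1)    # "image 1"
    o = np.full((3, 5), 2)    # "image 2"

    # App1 layout: images along the last axis -> interleaved in memory.
    app1 = np.empty((3, 5, 2), dtype=int)
    app1[:, :, 0] = x
    app1[:, :, 1] = o
    print(app1.ravel()[:6])   # [1 2 1 2 1 2] -- alternating x, o

    # App3 layout: images along the first axis -> contiguous blocks.
    app3 = np.empty((2, 3, 5), dtype=int)
    app3[0] = x
    app3[1] = o
    print(app3.ravel())       # fifteen 1s followed by fifteen 2s
    ```

    The interleaved App1 layout is why each per-iteration write there touches strided, non-adjacent memory, matching the ~14x slowdown in the timings above.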


    Part B : Reading image from disk as arrays

    Now, for the part on reading the images from disk, I have seen OpenCV's imread to be much faster than scikit-image's io.imread.

    As a test, I downloaded the Mona Lisa image from its wiki page and compared image-reading performance -

    import cv2               # OpenCV
    from skimage import io   # scikit-image
    
    In [521]: %timeit io.imread('monalisa.jpg')
    100 loops, best of 3: 3.24 ms per loop
    
    In [522]: %timeit cv2.imread('monalisa.jpg')
    100 loops, best of 3: 2.54 ms per loop
    
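    One caveat when mixing the two readers (a general OpenCV behaviour, not specific to this test): `cv2.imread` returns channels in BGR order, while `skimage.io.imread` returns RGB, so per-channel statistics would silently differ between the two. Converting is just a reversed slice on the channel axis; a sketch with a fabricated pixel array:

    ```python
    import numpy as np

    # A fabricated 2x2 RGB image (the convention skimage.io.imread uses).
    rgb = np.array([[[255, 0, 0], [0, 255, 0]],
                    [[0, 0, 255], [10, 20, 30]]], dtype=np.uint8)

    # cv2.imread would hand back the same pixels with channels reversed (BGR);
    # a reversed slice on the last axis converts between the two conventions.
    bgr = rgb[..., ::-1]
    print(bgr[0, 0])   # [  0   0 255] -- the red pixel, seen as BGR
    ```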