Resizing numpy.memmap arrays

眼角桃花 2021-02-05 06:02

I'm working with a bunch of large numpy arrays, and as these started to chew up too much memory lately, I wanted to replace them with numpy.memmap instances. The p

2 answers
  •  北荒 (original poster)
    2021-02-05 06:27

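    The issue is that the OWNDATA flag is False for a memmap: its buffer belongs to the underlying mmap rather than to the array, and ndarray.resize only works in place on arrays that own their data. You can get an array that owns its data by requiring the flag with np.require: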

    >>> a = np.require(np.memmap('bla.bin', dtype=int), requirements=['O'])
    >>> a.shape
    (10,)
    >>> a.flags
      C_CONTIGUOUS : True
      F_CONTIGUOUS : True
      OWNDATA : True
      WRITEABLE : True
      ALIGNED : True
      UPDATEIFCOPY : False
    >>> a.resize(20, refcheck=False)
    >>> a.shape
    (20,)
    

    The caveat is that np.require may have to copy the array to satisfy the requirement. For a memmap that copy lives in ordinary memory, so resizing it does not change the file on disk.
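To make the copy caveat concrete, here is a minimal sketch. It assumes a scratch file created in a temporary directory (the path is illustrative, not from the original post) and an explicit int64 dtype so the byte counts are predictable.

```python
import os
import tempfile

import numpy as np

# Hypothetical scratch file; the name and location are illustrative only.
path = os.path.join(tempfile.mkdtemp(), "bla.bin")

# Create a 10-element file-backed array (80 bytes on disk for int64).
a = np.memmap(path, dtype=np.int64, mode="w+", shape=(10,))
a.flush()
print(a.flags.owndata)            # False: the buffer belongs to the mmap

# Requiring OWNDATA makes NumPy copy the data into ordinary memory.
b = np.require(a, requirements=["O"])
print(b.flags.owndata)            # True
print(np.shares_memory(a, b))     # False: b is detached from the file

# The copy can be resized in place, but the file is not affected.
b.resize(20, refcheck=False)
b[3] = 7
print(b.shape)                    # (20,)
print(os.path.getsize(path))      # 80, unchanged
print(int(a[3]))                  # 0, the write went to the copy only
```

In other words, this route gives you a resizable array at the cost of holding all of the data in RAM, which may defeat the reason for using a memmap in the first place.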

    Edit to address saving:

    If you want to save the resized array to disk, you can write it out as a .npy file with np.save and reopen it as a numpy.memmap whenever you need it:

    >>> a[9] = 1
    >>> np.save('bla.npy',a)
    >>> b = np.lib.format.open_memmap('bla.npy', dtype=int, mode='r+')
    >>> b
    memmap([0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0])
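The same round trip can be sketched as a standalone script. This version uses np.load with mmap_mode, which reads the shape and dtype from the .npy header; the scratch path and the stand-in array are assumptions for illustration.

```python
import os
import tempfile

import numpy as np

# Hypothetical scratch path, used only for this illustration.
npy_path = os.path.join(tempfile.mkdtemp(), "bla.npy")

# An in-memory array standing in for a resized array.
a = np.zeros(20, dtype=np.int64)
a[9] = 1

# np.save writes a small header (dtype, shape, order) before the raw data.
np.save(npy_path, a)

# Reopen the file as a memmap; shape and dtype come from the header.
b = np.load(npy_path, mmap_mode="r+")
print(type(b).__name__)   # memmap
print(b.shape)            # (20,)
print(int(b[9]))          # 1

# Writes through the memmap reach the file once flushed.
b[0] = 5
b.flush()
del b
print(int(np.load(npy_path)[0]))   # 5
```

Because the header records the shape, there is no need to pass dtype or shape again when reopening, unlike with a raw .bin file.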
    

    Edit to offer another method:

    You may get close to what you're looking for by resizing the base mmap (a.base or a._mmap, whose size is given in bytes, hence 20*8 for twenty 8-byte integers) and "reloading" the memmap:

    >>> a = np.memmap('bla.bin', dtype=int)
    >>> a
    memmap([0, 0, 0, 0, 0, 0, 0, 0, 0, 0])
    >>> a[3] = 7
    >>> a
    memmap([0, 0, 0, 7, 0, 0, 0, 0, 0, 0])
    >>> a.flush()
    >>> a = np.memmap('bla.bin', dtype=int)
    >>> a
    memmap([0, 0, 0, 7, 0, 0, 0, 0, 0, 0])
    >>> a.base.resize(20*8)
    >>> a.flush()
    >>> a = np.memmap('bla.bin', dtype=int)
    >>> a
    memmap([0, 0, 0, 7, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0])
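Resizing the mmap object in place may not work in every environment, since mmap.resize depends on platform support and can be refused while NumPy still holds a buffer on the mapping. A variant that avoids this, sketched below under the assumption that you can briefly drop the mapping, is to flush, extend the file itself, and map it again. The helper name grow_memmap and the scratch path are hypothetical.

```python
import os
import tempfile

import numpy as np


def grow_memmap(path, new_len, dtype=np.int64):
    """Extend the file at `path` to hold `new_len` items and remap it.

    Hypothetical helper for illustration; not part of NumPy.
    """
    new_bytes = new_len * np.dtype(dtype).itemsize
    if new_bytes < os.path.getsize(path):
        raise ValueError("this sketch only grows files")
    # Extend the file; on most platforms the added bytes read as zeros.
    with open(path, "r+b") as f:
        f.truncate(new_bytes)
    # Map the file again; the shape is inferred from the new file size.
    return np.memmap(path, dtype=dtype, mode="r+")


path = os.path.join(tempfile.mkdtemp(), "bla.bin")

a = np.memmap(path, dtype=np.int64, mode="w+", shape=(10,))
a[3] = 7
a.flush()   # push pending writes to disk
del a       # drop the old mapping before changing the file size

a = grow_memmap(path, 20)
print(a.shape)     # (20,)
print(int(a[3]))   # 7
```

The existing data is never copied into memory here; only the file length changes, and the new memmap sees the old contents followed by the added space.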
    
