Can NumPy ensure that an array is (nonstrictly) increasing along one axis?

北海茫月 2021-01-21 07:36

Is there a function in NumPy that guarantees, or rather fixes up, an array so that it is (nonstrictly) increasing along one particular axis? For example, I have the following 2D array:

2 Answers
  • 2021-01-21 08:00

    pandas offers you the df.cummax function:

    import pandas as pd
    pd.DataFrame(X).cummax(axis=1).values
    
    array([[1, 2, 2, 4, 5],
           [0, 3, 3, 5, 5]])
    

    It's useful to know that there's a first-class function on hand in case your data is already loaded into a DataFrame.
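
For the case where the data already lives in a DataFrame, a minimal self-contained sketch might look like the following. The row and column labels are hypothetical and only there to show that `cummax` keeps them; the values are the sample array from the other answer.

```python
import pandas as pd

# Hypothetical labels; the values match the sample array X used in this thread.
df = pd.DataFrame(
    [[1, 2, 1, 4, 5],
     [0, 3, 1, 5, 4]],
    index=["a", "b"],
    columns=list("pqrst"),
)

# Running maximum along each row (axis=1); index and columns are preserved.
fixed = df.cummax(axis=1)
print(fixed)

# Convert back to a plain NumPy array if labels are not needed.
arr = fixed.to_numpy()
```

Staying in pandas avoids a round trip through `.values` when the labels matter downstream.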

  • 2021-01-21 08:04

    Use np.maximum.accumulate to compute a running (accumulated) maximum along that axis, which makes the array nonstrictly increasing (non-decreasing) along it -

    np.maximum.accumulate(X,axis=1)
    

    Sample run -

    In [233]: X
    Out[233]: 
    array([[1, 2, 1, 4, 5],
           [0, 3, 1, 5, 4]])
    
    In [234]: np.maximum.accumulate(X,axis=1)
    Out[234]: 
    array([[1, 2, 2, 4, 5],
           [0, 3, 3, 5, 5]])
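
One way to confirm that the result really is non-decreasing is to check the differences between neighbouring elements. The helper name `is_nondecreasing` below is an assumption made for illustration, not a NumPy function.

```python
import numpy as np

def is_nondecreasing(a, axis):
    # Hypothetical helper: True when no element is smaller than its
    # predecessor along `axis`.
    return bool(np.all(np.diff(a, axis=axis) >= 0))

X = np.array([[1, 2, 1, 4, 5],
              [0, 3, 1, 5, 4]])
fixed = np.maximum.accumulate(X, axis=1)

print(is_nondecreasing(X, axis=1))      # False
print(is_nondecreasing(fixed, axis=1))  # True
```

Equal neighbours pass the check, which is why the result is nonstrictly rather than strictly increasing.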
    

    For memory efficiency, we can write the result back into the input in place by passing the input array as the out argument.
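
A minimal sketch of the in-place variant, assuming the input may be overwritten and that its dtype can hold the result (true here, since the output of a running maximum has the same dtype as the input):

```python
import numpy as np

X = np.array([[1, 2, 1, 4, 5],
              [0, 3, 1, 5, 4]])

# Passing X as `out` overwrites X with the running maximum along axis 1
# instead of allocating a new array for the result.
result = np.maximum.accumulate(X, axis=1, out=X)

print(result is X)  # True
print(X)
```

The original values are lost after this call, so keep a copy first if they are still needed.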

    Runtime tests

    Case #1 : Array as input

    In [254]: X = np.random.rand(1000,1000)
    
    In [255]: %timeit np.maximum.accumulate(X,axis=1)
    1000 loops, best of 3: 1.69 ms per loop
    
    # @cᴏʟᴅsᴘᴇᴇᴅ's pandas soln using df.cummax
    In [256]: %timeit pd.DataFrame(X).cummax(axis=1).values
    100 loops, best of 3: 4.81 ms per loop
    

    Case #2 : Dataframe as input

    In [257]: df = pd.DataFrame(np.random.rand(1000,1000))
    
    In [258]: %timeit np.maximum.accumulate(df.values,axis=1)
    1000 loops, best of 3: 1.68 ms per loop
    
    # @cᴏʟᴅsᴘᴇᴇᴅ's pandas soln using df.cummax
    In [259]: %timeit df.cummax(axis=1)
    100 loops, best of 3: 4.68 ms per loop
    