How to fill elements between intervals of a list

无人及你 2021-02-14 18:19

I have a list like this:

list_1 = [np.nan, np.nan, 1, np.nan, np.nan, np.nan, 0, np.nan, 1, np.nan, 0, 1, np.nan, 0, np.nan,  1, np.nan]

So the

6 Answers
  • 2021-02-14 18:27

    Here's a NumPy based one -

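    def fill_inbetween(a):
        m1 = a==1
        m2 = a==0
        # +1 at every 1, -1 at every 0 (assumes 1s and 0s alternate, starting with a 1)
        id_ar = m1.astype(int)-m2
        # Running sum is nonzero from each 1 up to, but not including, its closing 0
        idc = id_ar.cumsum()
        # Zero out everything after the last 1, so a trailing unclosed interval stays unfilled
        idc[len(m1)-m1[::-1].argmax():] =  0
        return np.where(idc.astype(bool), 1, a)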
    

    Sample run -

    In [44]: a # input as array
    Out[44]: 
    array([nan, nan,  1., nan, nan, nan,  0., nan,  1., nan,  0.,  1., nan,
            0., nan,  1., nan])
    
    In [45]: fill_inbetween(a)
    Out[45]: 
    array([nan, nan,  1.,  1.,  1.,  1.,  0., nan,  1.,  1.,  0.,  1.,  1.,
            0., nan,  1., nan])
    

    Benchmarking on NumPy solutions with array input

    To keep things simple, we will just scale up the given sample 10,000x by tiling and test the NumPy-based solutions.

    Other NumPy solutions -

    #@yatu's soln
    def func_yatu(a):
        ix0 = (a == 0).cumsum()
        ix1 = (a == 1).cumsum()
        dec = (ix1 - ix0).astype(float)
        ix = len(a)-(a[::-1]==1).argmax()
        last = ix1[-1]-ix0[-1]
        if last > 0:
            dec[ix:] = a[ix:]
        out = np.where(dec==1, dec, a)
        return out
    
    # @FBruzzesi's soln (with the output returned in a separate array)
    def func_FBruzzesi(a, value=1):
        ones = np.squeeze(np.argwhere(a==1))
        zeros = np.squeeze(np.argwhere(a==0))   
        if ones[0]>zeros[0]:
            zeros = zeros[1:]   
        out = a.copy()
        for i,j in zip(ones,zeros):
            out[i+1:j] = value
        return out
    
    # @Ehsan's soln (with the output returned in a separate array)
    def func_Ehsan(list_1):
        zeros_ind = np.where(list_1 == 0)[0]
        ones_ind = np.where(list_1 == 1)[0]
        ones_ind = ones_ind[:zeros_ind.size]        
        indexer = np.r_[tuple([np.s_[i:j] for (i,j) in zip(ones_ind,zeros_ind)])]
        out = list_1.copy()
        out[indexer] = 1
        return out
    

    Timings -

    In [48]: list_1 = [np.nan, np.nan, 1, np.nan, np.nan, np.nan, 0, np.nan, 1, np.nan, 0, 1, np.nan, 0, np.nan,  1, np.nan]
        ...: a = np.array(list_1)
    
    In [49]: a = np.tile(a,10000)
    
    In [50]: %timeit func_Ehsan(a)
        ...: %timeit func_FBruzzesi(a)
        ...: %timeit func_yatu(a)
        ...: %timeit fill_inbetween(a)
    4.86 s ± 325 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
    253 ms ± 29.4 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
    3.39 ms ± 205 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
    2.01 ms ± 168 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
    

    Copying the array accounts for very little of the runtime, so it can be ignored -

    In [51]: %timeit a.copy()
    78.3 µs ± 571 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
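
The timings above only measure speed. Before comparing speed it may be worth checking that the functions agree on the original, untiled sample. The sketch below does that for two of them; the names `sample`, `expected`, `results` and `all_match` are illustrative and not part of the original answer. Agreement on the sample is not guaranteed on the tiled input, where a 1 at the end of one tile is still unclosed when the next tile's 1 arrives.

```python
import numpy as np

def fill_inbetween(a):
    m1 = a == 1
    m2 = a == 0
    idc = (m1.astype(int) - m2).cumsum()
    idc[len(m1) - m1[::-1].argmax():] = 0
    return np.where(idc.astype(bool), 1, a)

def func_yatu(a):
    ix0 = (a == 0).cumsum()
    ix1 = (a == 1).cumsum()
    dec = (ix1 - ix0).astype(float)
    ix = len(a) - (a[::-1] == 1).argmax()
    if ix1[-1] - ix0[-1] > 0:
        dec[ix:] = a[ix:]
    return np.where(dec == 1, dec, a)

nan = np.nan
sample = np.array([nan, nan, 1, nan, nan, nan, 0, nan, 1, nan, 0, 1, nan, 0, nan, 1, nan])
# Output reported in the answers for this sample
expected = np.array([nan, nan, 1, 1, 1, 1, 0, nan, 1, 1, 0, 1, 1, 0, nan, 1, nan])

results = {f.__name__: f(sample) for f in (fill_inbetween, func_yatu)}
# equal_nan=True treats NaNs in matching positions as equal
all_match = all(np.array_equal(r, expected, equal_nan=True) for r in results.values())
print(all_match)
```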
    
  • 2021-02-14 18:28

    Here's a NumPy-based approach using np.cumsum:

    a = np.array([np.nan, np.nan, 1, np.nan, np.nan, np.nan, 0, np.nan, 
                  1, np.nan, 0, 1, np.nan, 0, np.nan,  1, np.nan])
    
    ix0 = (a == 0).cumsum()
    ix1 = (a == 1).cumsum()
    dec = (ix1 - ix0).astype(float)
    # Only necessary if the seq can end with an unclosed interval
    ix = len(a)-(a[::-1]==1).argmax()
    last = ix1[-1]-ix0[-1]
    if last > 0:
        dec[ix:] = a[ix:]
    # -----
    out = np.where(dec==1, dec, a)
    

    print(out)
    array([nan, nan,  1.,  1.,  1.,  1.,  0., nan,  1.,  1.,  0.,  1.,  1.,
            0., nan,  1., nan])
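
To see why the difference of the two cumulative sums marks the intervals, it can help to print it for the sample. This is only an illustration; the name `depth` is not from the original answer, and the reading below assumes 1s and 0s alternate, starting with a 1.

```python
import numpy as np

nan = np.nan
a = np.array([nan, nan, 1, nan, nan, nan, 0, nan, 1, nan, 0, 1, nan, 0, nan, 1, nan])

ix1 = (a == 1).cumsum()  # number of 1s seen so far
ix0 = (a == 0).cumsum()  # number of 0s seen so far
depth = ix1 - ix0        # 1 while an interval is open, 0 otherwise

print(depth.tolist())
# [0, 0, 1, 1, 1, 1, 0, 0, 1, 1, 0, 1, 1, 0, 0, 1, 1]
```

The last two entries are 1 because the final 1 is never closed, which is the case the `if last > 0` branch handles.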
    
  • 2021-02-14 18:28

    You can retrieve the indices of the ones and zeros using np.argwhere and then fill in the values within each slice:

    import numpy as np
    
    a = np.array([np.nan, np.nan, 1, np.nan, np.nan, np.nan, 0, np.nan, 1, np.nan, 0, 1, np.nan, 0, np.nan,  1, np.nan])
    
    ones = np.squeeze(np.argwhere(a==1))
    zeros = np.squeeze(np.argwhere(a==0))
    
    if ones[0]>zeros[0]:
        zeros = zeros[1:]
    
    value = -999
    for i,j in zip(ones,zeros):
        a[i+1:j] = value
    
    a
    array([  nan,   nan,    1., -999., -999., -999.,    0.,   nan,    1.,
           -999.,    0.,    1., -999.,    0.,   nan,    1.,   nan])
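
One possible variation, sketched here under the assumption that 1s and 0s alternate after the first 1: np.flatnonzero always returns 1-D index arrays, so indexing with `[0]` still works when there is only a single 1 or a single 0. The function name `fill_between_pairs` is hypothetical, and unlike the snippet above it writes into a copy and drops every 0 that precedes the first 1.

```python
import numpy as np

def fill_between_pairs(a, value=1):
    """Fill the positions strictly between each 1 and the following 0."""
    ones = np.flatnonzero(a == 1)
    zeros = np.flatnonzero(a == 0)
    out = a.copy()
    if ones.size == 0:
        return out
    # Ignore zeros that come before the first 1
    zeros = zeros[zeros > ones[0]]
    # zip stops at the shorter array, so a final unclosed 1 is left alone
    for i, j in zip(ones, zeros):
        out[i + 1:j] = value
    return out

nan = np.nan
a = np.array([nan, nan, 1, nan, nan, nan, 0, nan, 1, nan, 0, 1, nan, 0, nan, 1, nan])
filled = fill_between_pairs(a)
print(filled)
```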
    
  • 2021-02-14 18:34

    Assuming each 1 is followed by a 0 (except possibly the last 1):

    list_1 = np.array([np.nan, np.nan, 1, np.nan, np.nan, np.nan, 0, np.nan, 1, np.nan, 0, 1, np.nan, 0, np.nan,  1, np.nan])
    zeros_ind = np.where(list_1 == 0)[0]
    ones_ind = np.where(list_1 == 1)[0]
    ones_ind = ones_ind[:zeros_ind.size]
    
    #create a concatenated list of ranges of indices you desire to slice
    indexer = np.r_[tuple([np.s_[i:j] for (i,j) in zip(ones_ind,zeros_ind)])]
    #slice using numpy indexing
    list_1[indexer] = 1
    

    Output:

    [nan nan  1.  1.  1.  1.  0. nan  1.  1.  0.  1.  1.  0. nan  1. nan]
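
As a small illustration of what the indexer contains, here is np.r_ applied to the three (one, zero) index pairs from the sample, written out by hand. Each slice i:j includes the index of the 1 itself, so that position is simply overwritten with 1 again.

```python
import numpy as np

# Index pairs taken from the sample: 1s at 2, 8, 11 and their closing 0s at 6, 10, 13
ones_ind = np.array([2, 8, 11])
zeros_ind = np.array([6, 10, 13])

# np.r_ expands each slice into a range and concatenates the ranges
indexer = np.r_[tuple(np.s_[i:j] for i, j in zip(ones_ind, zeros_ind))]
print(indexer.tolist())
# [2, 3, 4, 5, 8, 9, 11, 12]
```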
    
  • 2021-02-14 18:39

    Here's an approach that uses a flag, replace, to decide whether an element should be overwritten. The loop walks the list from index 0 to its end: when it finds a 1, replace becomes True and the elements that follow are overwritten; when it finds the next 0, replace becomes False and elements are left alone until another 1 appears.

    replace = False
    for i in range(len(interval)):
        if interval[i] == 1:
            replace = True
        elif interval[i] == 0:
            replace = False
        elif replace:
            # NaN after a 1 and before the next 0: fill it in.
            # Elements after a final 1 with no closing 0 are filled as well.
            interval[i] = 1
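
The flag-based approach described above keeps replacing after a final 1 that never meets a closing 0. If only closed intervals should be filled, as in the NumPy outputs in the other answers, one possible pure-Python sketch is to remember where the open 1 was and fill only once its 0 is found. The function name `fill_closed_intervals` and the variable `start` are illustrative, not from the original answer.

```python
import math

def fill_closed_intervals(values):
    """Return a copy with 1s filled in strictly between each 1 and its closing 0."""
    out = list(values)
    start = None  # index of the most recent 1 that has not been closed yet
    for i, v in enumerate(values):
        if v == 1:
            if start is None:
                start = i
        elif v == 0 and start is not None:
            # Interval is closed: fill everything between the 1 and this 0
            for k in range(start + 1, i):
                out[k] = 1
            start = None
    return out

nan = float("nan")
sample = [nan, nan, 1, nan, nan, nan, 0, nan, 1, nan, 0, 1, nan, 0, nan, 1, nan]
result = fill_closed_intervals(sample)
print(result)
```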
    
  • 2021-02-14 18:39

    Pandas solution:

    s = pd.Series(list_1)
    s1 = s.eq(1)
    s0 = s.eq(0)
    m = (s1 | s0).where(s1.cumsum().ge(1),False).cumsum().mod(2).eq(1)
    s.loc[m & s.isna()] = 1
    print(s.tolist())
    #[nan, nan, 1.0, 1.0, 1.0, 1.0, 0.0, nan, 1.0, 1.0, 0.0, 1.0, 1.0, 0.0, nan, 1.0, 1.0]
    

    But if the list contains only 1, 0, or NaN, you can do:

    s = pd.Series(list_1)
    s.fillna(s.ffill().where(lambda x: x.eq(1))).tolist()
    

    Output:

    [nan,
     nan,
     1.0,
     1.0,
     1.0,
     1.0,
     0.0,
     nan,
     1.0,
     1.0,
     0.0,
     1.0,
     1.0,
     0.0,
     nan,
     1.0,
     1.0]
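
The one-liner can be unpacked into its intermediate steps, which may make it easier to see why only NaNs that follow a 1 get filled. The names `carried`, `only_ones` and `filled` are illustrative. Note that this approach also fills the NaN after the final, unclosed 1.

```python
import numpy as np
import pandas as pd

nan = np.nan
s = pd.Series([nan, nan, 1, nan, nan, nan, 0, nan, 1, nan, 0, 1, nan, 0, nan, 1, nan])

carried = s.ffill()                         # each NaN takes the last seen 1 or 0
only_ones = carried.where(carried.eq(1))    # keep carried 1s, turn everything else into NaN
filled = s.fillna(only_ones)                # original NaNs become 1 only where a 1 was carried

print(filled.tolist())
```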
    