How to fill elements between intervals of a list


I have a list like this:

list_1 = [np.NaN, np.NaN, 1, np.NaN, np.NaN, np.NaN, 0, np.NaN, 1, np.NaN, 0, 1, np.NaN, 0, np.NaN,  1, np.NaN]

So the

Benchmarking on NumPy solutions with array input

To keep things simple, we will scale up the given sample 10,000x by tiling and test only the NumPy-based solutions.

Other NumPy solutions -

#@yatu's soln
def func_yatu(a):
    ix0 = (a == 0).cumsum()
    ix1 = (a == 1).cumsum()
    dec = (ix1 - ix0).astype(float)
    ix = len(a)-(a[::-1]==1).argmax()
    last = ix1[-1]-ix0[-1]
    if last > 0:
        dec[ix:] = a[ix:]
    out = np.where(dec==1, dec, a)
    return out

# @FBruzzesi's soln (with the output returned in a separate array)
def func_FBruzzesi(a, value=1):
    ones = np.squeeze(np.argwhere(a==1))
    zeros = np.squeeze(np.argwhere(a==0))   
    if ones[0]>zeros[0]:
        zeros = zeros[1:]   
    out = a.copy()
    for i,j in zip(ones,zeros):
        out[i+1:j] = value
    return out

# @Ehsan's soln (with the output returned in a separate array)
def func_Ehsan(list_1):
    zeros_ind = np.where(list_1 == 0)[0]
    ones_ind = np.where(list_1 == 1)[0]
    ones_ind = ones_ind[:zeros_ind.size]        
    indexer = np.r_[tuple([np.s_[i:j] for (i,j) in zip(ones_ind,zeros_ind)])]
    out = list_1.copy()
    out[indexer] = 1
    return out

Timings -

In [48]: list_1 = [np.NaN, np.NaN, 1, np.NaN, np.NaN, np.NaN, 0, np.NaN, 1, np.NaN, 0, 1, np.NaN, 0, np.NaN,  1, np.NaN]
    ...: a = np.array(list_1)

In [49]: a = np.tile(a,10000)

In [50]: %timeit func_Ehsan(a)
    ...: %timeit func_FBruzzesi(a)
    ...: %timeit func_yatu(a)
    ...: %timeit fill_inbetween(a)
4.86 s ± 325 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
253 ms ± 29.4 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
3.39 ms ± 205 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
2.01 ms ± 168 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

The copy itself takes a negligible share of the runtime, so it can be ignored -

In [51]: %timeit a.copy()
78.3 µs ± 571 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
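
One caveat about this input is worth checking before comparing outputs: the sample ends with an unclosed 1, so tiling places that 1 directly before the first 1 of the next copy, with no 0 in between. The sketch below counts such repeated 1s on a 3x tiling. The names `sample`, `tiled` and `markers` are illustrative, and it uses `np.nan` because the `np.NaN` alias was removed in NumPy 2.0. Solutions that assume strictly alternating 1s and 0s may return different arrays on the tiled data, so the timings are best read as speed measurements on this particular input.

```python
import numpy as np

sample = np.array([np.nan, np.nan, 1, np.nan, np.nan, np.nan, 0, np.nan, 1,
                   np.nan, 0, 1, np.nan, 0, np.nan, 1, np.nan])
tiled = np.tile(sample, 3)  # small tiling factor, just for inspection

# Keep only the 1/0 markers, in order, and look at neighbouring pairs.
markers = tiled[~np.isnan(tiled)]
repeated_ones = int(np.sum((markers[:-1] == 1) & (markers[1:] == 1)))

n_ones = int((tiled == 1).sum())
n_zeros = int((tiled == 0).sum())

print(n_ones, n_zeros)   # 12 9
print(repeated_ones)     # 2 (one per boundary between copies)
```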

我在风中等你

2021-02-14 18:28

Here's a numpy based approach using np.cumsum:

import numpy as np

a = np.array([np.NaN, np.NaN, 1, np.NaN, np.NaN, np.NaN, 0, np.NaN, 
              1, np.NaN, 0, 1, np.NaN, 0, np.NaN,  1, np.NaN])

ix0 = (a == 0).cumsum()
ix1 = (a == 1).cumsum()
dec = (ix1 - ix0).astype(float)
# Only necessary if the seq can end with an unclosed interval
ix = len(a)-(a[::-1]==1).argmax()
last = ix1[-1]-ix0[-1]
if last > 0:
    dec[ix:] = a[ix:]
# -----
out = np.where(dec==1, dec, a)

out
array([nan, nan,  1.,  1.,  1.,  1.,  0., nan,  1.,  1.,  0.,  1.,  1.,
        0., nan,  1., nan])
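
To see why the difference of the two running counts marks the open intervals, here is a minimal sketch on a shorter, made-up input (the names `ones_seen`, `zeros_seen`, `open_count` and `tail_start` are illustrative). It assumes the 1s and 0s alternate; the last 1 has no closing 0, which is the case the tail reset above handles.

```python
import numpy as np

a = np.array([np.nan, 1, np.nan, np.nan, 0, np.nan, 1, np.nan])

ones_seen = (a == 1).cumsum()         # running count of 1s
zeros_seen = (a == 0).cumsum()        # running count of 0s
open_count = ones_seen - zeros_seen   # 1 while an interval is open

print(ones_seen)    # [0 1 1 1 1 1 2 2]
print(zeros_seen)   # [0 0 0 0 1 1 1 1]
print(open_count)   # [0 1 1 1 0 0 1 1]

# Index just past the last 1: everything from here on belongs to the
# unclosed interval and is restored from the input.
tail_start = len(a) - (a[::-1] == 1).argmax()
print(tail_start)   # 7
```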

说谎

2021-02-14 18:28

You can retrieve the indices of the ones and zeros using np.argwhere and then fill the values within each slice:

import numpy as np

a = np.array([np.NaN, np.NaN, 1, np.NaN, np.NaN, np.NaN, 0, np.NaN, 1, np.NaN, 0, 1, np.NaN, 0, np.NaN,  1, np.NaN])

ones = np.squeeze(np.argwhere(a==1))
zeros = np.squeeze(np.argwhere(a==0))

if ones[0]>zeros[0]:
    zeros = zeros[1:]

value = -999
for i,j in zip(ones,zeros):
    a[i+1:j] = value

a
array([  nan,   nan,    1., -999., -999., -999.,    0.,   nan,    1.,
       -999.,    0.,    1., -999.,    0.,   nan,    1.,   nan])
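
A small edge case, sketched under the assumption that inputs with a single 1 or a single 0 can occur: `np.squeeze` collapses a one-row `np.argwhere` result to a 0-d array, which cannot be indexed with `[0]`. `np.flatnonzero` always returns a 1-D index array and could be swapped in. This is a suggested variation, not part of the answer above.

```python
import numpy as np

a = np.array([np.nan, 1, np.nan, 0, np.nan])  # exactly one 1 and one 0

squeezed = np.squeeze(np.argwhere(a == 1))
flat = np.flatnonzero(a == 1)

print(squeezed.shape)  # ()   -> squeezed[0] would raise IndexError
print(flat.shape)      # (1,) -> flat[0] is a valid index
```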

暗喜

2021-02-14 18:34

Assuming each 1 is followed by a 0 (except possibly the last 1):

list_1 = np.array([np.NaN, np.NaN, 1, np.NaN, np.NaN, np.NaN, 0, np.NaN, 1, np.NaN, 0, 1, np.NaN, 0, np.NaN,  1, np.NaN])
zeros_ind = np.where(list_1 == 0)[0]
ones_ind = np.where(list_1 == 1)[0]
ones_ind = ones_ind[:zeros_ind.size]

#create a concatenated list of ranges of indices you desire to slice
indexer = np.r_[tuple([np.s_[i:j] for (i,j) in zip(ones_ind,zeros_ind)])]
#slice using numpy indexing
list_1[indexer] = 1

Output:

[nan nan  1.  1.  1.  1.  0. nan  1.  1.  0.  1.  1.  0. nan  1. nan]
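
As a rough illustration of what the indexer contains, the sketch below hard-codes the index positions from the sample (after the unpaired trailing 1 has been dropped) and expands the slices with `np.r_`. The variable `slices` is introduced here only for readability.

```python
import numpy as np

ones_ind = np.array([2, 8, 11])    # positions of the paired 1s in the sample
zeros_ind = np.array([6, 10, 13])  # positions of the 0s that close them

# One slice per (1, 0) pair; np.r_ concatenates them into a flat index array.
slices = tuple(np.s_[i:j] for i, j in zip(ones_ind, zeros_ind))
indexer = np.r_[slices]

print(indexer.tolist())  # [2, 3, 4, 5, 8, 9, 11, 12]
```

Each slice starts at the 1 itself and stops just before the 0, so assigning 1 through the indexer leaves the 0s untouched.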

花落未央

2021-02-14 18:39
Here is a loop-based version. A flag, replace, tracks whether the current element should be filled. Iterating over the list, it becomes True when a 1 is found and False at the next 0. While it is True, each element is set to 1, and nothing is replaced again until another 1 appears. Here list_1 is the plain Python list from the question.
```
replace = False
for i in range(len(list_1)):
    if list_1[i] == 1:
        replace = True       # an interval opens at a 1
    elif list_1[i] == 0:
        replace = False      # and closes at the next 0
    if replace:
        list_1[i] = 1        # fill while the interval is open
```

一整个雨季

2021-02-14 18:39

Pandas solution:

import pandas as pd

s = pd.Series(list_1)
s1 = s.eq(1)
s0 = s.eq(0)
m = (s1 | s0).where(s1.cumsum().ge(1),False).cumsum().mod(2).eq(1)
s.loc[m & s.isna()] = 1
print(s.tolist())
#[nan, nan, 1.0, 1.0, 1.0, 1.0, 0.0, nan, 1.0, 1.0, 0.0, 1.0, 1.0, 0.0, nan, 1.0, 1.0]
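
The mask is easier to follow step by step. The sketch below uses a made-up input with a leading 0, and the intermediate names (`is_marker`, `after_first_one`, `toggles`, `inside`) are illustrative. Every marker after the first 1 toggles the state, so an odd running count means an interval is open.

```python
import numpy as np
import pandas as pd

s = pd.Series([np.nan, 0, 1, np.nan, 0, np.nan, 1, np.nan])

is_marker = s.eq(1) | s.eq(0)                        # positions holding a 1 or a 0
after_first_one = s.eq(1).cumsum().ge(1)             # ignore 0s before the first 1
toggles = is_marker.where(after_first_one, False)
inside = toggles.cumsum().mod(2).eq(1)               # odd count -> interval is open

print(toggles.tolist())  # [False, False, True, False, True, False, True, False]
print(inside.tolist())   # [False, False, True, True, False, False, True, True]

s.loc[inside & s.isna()] = 1   # fill only the NaN slots inside an interval
```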

But if the list contains only 1, 0 or NaN, you can do:

s = pd.Series(list_1)
s.fillna(s.ffill().where(lambda x: x.eq(1))).tolist()

Output:

[nan, nan, 1.0, 1.0, 1.0, 1.0, 0.0, nan, 1.0, 1.0, 0.0, 1.0, 1.0, 0.0, nan, 1.0, 1.0]
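
The one-liner can be unpacked into three steps. This is a minimal sketch on a shorter, made-up input, with illustrative names (`carried`, `only_ones`, `filled`): forward-fill carries the most recent marker forward, `where` keeps only the carried 1s, and `fillna` uses those to fill the gaps.

```python
import numpy as np
import pandas as pd

s = pd.Series([np.nan, 1, np.nan, 0, np.nan, 1, np.nan])

carried = s.ffill()                        # [nan, 1, 1, 0, 0, 1, 1]
only_ones = carried.where(carried.eq(1))   # [nan, 1, 1, nan, nan, 1, 1]
filled = s.fillna(only_ones)               # [nan, 1, 1, 0, nan, 1, 1]

print(filled.tolist())
```

NaNs that follow a 0 stay NaN because the carried value there is 0, while a trailing unclosed 1 is carried to the end of the series.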
