Replace a list of numbers with flat sub-ranges


Given a list of numbers, like this:

lst = [0, 10, 15, 17]

I'd like a list that has elements from i -> i + 3 for all

1 Answer
  • 2021-01-18 23:53

    Approach #1: Add np.arange(4) to the list with broadcasting to build every range, then use np.unique to flatten the result and drop the duplicates -

    import numpy as np
    np.unique(np.asarray(lst)[:,None] + np.arange(4))
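
To make the broadcasting step concrete, here is a small walk-through on the sample list from the question. It is only an illustrative sketch; the names `grid` and `flat` are introduced here and are not part of the original answer.

```python
import numpy as np

lst = [0, 10, 15, 17]  # sample input from the question

# A (4, 1) column plus a length-4 row broadcasts to a (4, 4) grid,
# where row i holds lst[i], lst[i] + 1, lst[i] + 2, lst[i] + 3.
grid = np.asarray(lst)[:, None] + np.arange(4)

# np.unique flattens the grid, sorts it, and drops repeated values
# (17 and 18 appear in both of the last two rows).
flat = np.unique(grid)

print(grid)
print(flat.tolist())
# [0, 1, 2, 3, 10, 11, 12, 13, 15, 16, 17, 18, 19, 20]
```

Because np.unique sorts its output, this version does not depend on the input order, but the result is always returned in ascending order.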
    

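    Approach #2: Build the same broadcasted sum, then mask out the values that would overlap the next range instead of calling np.unique. This relies on lst being sorted with no repeated values -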

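    def mask_app(lst, interval_len=4):
        arr = np.array(lst)
        r = np.arange(interval_len)
        # Row i holds arr[i], arr[i] + 1, ..., arr[i] + interval_len - 1
        ranged_vals = arr[:,None] + r
        # Gap between each start value and the next one
        a_diff = arr[1:] - arr[:-1]
        # Keep an offset only if it falls before the next start; the last row is kept whole
        valid_mask = np.vstack((a_diff[:,None] > r, np.ones(interval_len,dtype=bool)))
        return ranged_vals[valid_mask]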
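
The masking idea may be easier to follow on the sample list from the question. The snippet below is a hedged step-by-step sketch of the same logic, assuming a sorted input without duplicates; the names `keep` and `result` are illustrative only.

```python
import numpy as np

lst = [0, 10, 15, 17]  # assumed sorted and free of duplicates
interval_len = 4

arr = np.array(lst)
r = np.arange(interval_len)
a_diff = np.diff(arr)  # gaps to the next start: [10, 5, 2]

# Row i keeps offset k only while k < gap, i.e. before the next range begins.
keep = a_diff[:, None] > r
# The last start has no successor, so its whole range is kept.
valid_mask = np.vstack((keep, np.ones(interval_len, dtype=bool)))

# Boolean indexing flattens the kept values in row-major order.
result = (arr[:, None] + r)[valid_mask]

print(valid_mask.astype(int))
print(result.tolist())
# [0, 1, 2, 3, 10, 11, 12, 13, 15, 16, 17, 18, 19, 20]
```

Only the row for 15 loses entries here: its gap to 17 is 2, so offsets 2 and 3 (values 17 and 18) are left to the next row.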
    

    Runtime test

    Original approach -

    from collections import OrderedDict
    def org_app(lst):
        return list(OrderedDict.fromkeys([y for x in lst for y in range(x, x + 4)]).keys())
    

    Timings -

    In [409]: n = 10000
    
    In [410]: lst = np.unique(np.random.randint(0,4*n,(n))).tolist()
    
    In [411]: %timeit org_app(lst)
         ...: %timeit np.unique(np.asarray(lst)[:,None] + np.arange(4))
         ...: %timeit mask_app(lst, interval_len = 4)
         ...: 
    10 loops, best of 3: 32.7 ms per loop
    1000 loops, best of 3: 1.03 ms per loop
    1000 loops, best of 3: 671 µs per loop
    
    In [412]: n = 100000
    
    In [413]: lst = np.unique(np.random.randint(0,4*n,(n))).tolist()
    
    In [414]: %timeit org_app(lst)
         ...: %timeit np.unique(np.asarray(lst)[:,None] + np.arange(4))
         ...: %timeit mask_app(lst, interval_len = 4)
         ...: 
    1 loop, best of 3: 350 ms per loop
    100 loops, best of 3: 14.7 ms per loop
    100 loops, best of 3: 9.73 ms per loop
    

    The bottleneck in the two posted approaches seems to be the conversion of the list to an array, though that cost pays off well afterwards. To give a sense of the time spent on the conversion for the last dataset -

    In [415]: %timeit np.array(lst)
    100 loops, best of 3: 5.6 ms per loop
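
Outside IPython, one way to check how much of the runtime goes to the list-to-array conversion is to time the same function on a list and on an array that has already been converted. This is only a sketch: the helper name `unique_app`, the seeded generator, and the repeat count are assumptions made for illustration, and the numbers will differ by machine.

```python
import timeit
import numpy as np

n = 100000
rng = np.random.default_rng(0)  # seeded generator, assumed here for repeatability
arr = np.unique(rng.integers(0, 4 * n, n))
lst = arr.tolist()

def unique_app(values):
    # Hypothetical wrapper around Approach #1; accepts a list or an array.
    return np.unique(np.asarray(values)[:, None] + np.arange(4))

repeats = 20
t_convert = timeit.timeit(lambda: np.array(lst), number=repeats) / repeats
t_from_list = timeit.timeit(lambda: unique_app(lst), number=repeats) / repeats
t_from_array = timeit.timeit(lambda: unique_app(arr), number=repeats) / repeats

print(f"list -> array conversion: {t_convert * 1e3:.2f} ms")
print(f"unique_app on a list:     {t_from_list * 1e3:.2f} ms")
print(f"unique_app on an array:   {t_from_array * 1e3:.2f} ms")
```

If the data can be kept as a NumPy array from the start, the conversion cost drops out entirely.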
    