Remove empty strings from a list of strings

孤城傲影 2020-11-22 04:33

I want to remove all empty strings from a list of strings in python.

My idea looks like this:

while '' in str_list:
    str_list.remove('')


        
12 answers
  • 2020-11-22 05:11

    Keep in mind that some approaches can unintentionally remove white space inside a string that you want to keep. If you have this list

    ['hello world', ' ', '', 'hello']

    you probably want ['hello world', 'hello'].

    First strip every item, which turns any whitespace-only string into an empty string:

    space_to_empty = [x.strip() for x in _text_list]
    

    Then remove the empty strings from the list:

    space_clean_list = [x for x in space_to_empty if x]
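
    The two steps can also be combined into a single pass. This is a minimal sketch, assuming a hypothetical input list named text_list. The second variant only tests the stripped value and keeps the original string, so it differs from the two-step approach when an item has leading or trailing spaces:

```python
# Hypothetical input; the names below are illustrative, not from the answer.
text_list = ['hello world', ' ', '', ' hello ']

# Strip every item, then keep only the non-empty results.
stripped_and_filtered = [s for s in (x.strip() for x in text_list) if s]

# Test the stripped value but keep the original string untouched.
filtered_only = [x for x in text_list if x.strip()]

print(stripped_and_filtered)  # ['hello world', 'hello']
print(filtered_only)          # ['hello world', ' hello ']
```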
    
  • 2020-11-22 05:11

    As Aziz Alto reported, filter(None, lstr) does not remove whitespace-only strings such as ' '. If you are sure lstr contains only strings, you can use filter(str.strip, lstr) instead (on Python 3, wrap it in list(), because filter returns an iterator):

    >>> lstr = ['hello', '', ' ', 'world', ' ']
    >>> lstr
    ['hello', '', ' ', 'world', ' ']
    >>> ' '.join(lstr).split()
    ['hello', 'world']
    >>> list(filter(str.strip, lstr))
    ['hello', 'world']
    

    Timing comparison on my PC:

    >>> from timeit import timeit
    >>> timeit('" ".join(lstr).split()', "lstr=['hello', '', ' ', 'world', ' ']", number=10000000)
    3.356455087661743
    >>> timeit('filter(str.strip, lstr)', "lstr=['hello', '', ' ', 'world', ' ']", number=10000000)
    5.276503801345825
    

    For removing both '' and whitespace-only strings such as ' ', ' '.join(lstr).split() was the fastest solution in this test.

    As pointed out in a comment, the situation is different if your strings contain spaces:

    >>> lstr = ['hello', '', ' ', 'world', '    ', 'see you']
    >>> lstr
    ['hello', '', ' ', 'world', '    ', 'see you']
    >>> ' '.join(lstr).split()
    ['hello', 'world', 'see', 'you']
    >>> list(filter(str.strip, lstr))
    ['hello', 'world', 'see you']
    

    You can see that filter(str.strip, lstr) preserves strings that contain spaces, whereas ' '.join(lstr).split() splits them.
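
    The comparison can be repeated on Python 3 with a small script. This is only a sketch: the function names join_split and filter_strip are made up for illustration, and the timings depend on the machine and on the input, so none are quoted here:

```python
from timeit import timeit

lstr = ['hello', '', ' ', 'world', '    ', 'see you']

def join_split(items):
    # Splits on any whitespace, so 'see you' becomes two items.
    return ' '.join(items).split()

def filter_strip(items):
    # list() is needed on Python 3, where filter() returns an iterator.
    return list(filter(str.strip, items))

for func in (join_split, filter_strip):
    seconds = timeit(lambda: func(lstr), number=100_000)
    print(f'{func.__name__}: {func(lstr)} ({seconds:.3f} s for 100,000 calls)')
```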

  • 2020-11-22 05:17

    Use filter:

    newlist = list(filter(lambda x: len(x) > 0, oldlist))
    

    As has been pointed out, the drawback of filter is that it is slower than the alternatives; the lambda also adds the cost of a Python-level function call for every item.

    Or you can go for the simplest, plainly iterative approach:

    # I am assuming listtext is the original list containing (possibly) empty items
    newlist = []
    for item in listtext:
        if item:
            newlist.append(str(item))
    # You can remove str() based on the content of your original list
    

    This is the most intuitive of the methods and runs in decent time.
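
    The remark about str() matters when the list holds more than strings. A short sketch with a hypothetical mixed input shows how the truthiness test and the len()-based filter behave differently:

```python
listtext = ['a', '', None, 0, 'b', 42]  # hypothetical mixed input

# Truthiness test: drops '', None and 0; str() normalises what is left.
newlist = [str(item) for item in listtext if item]
print(newlist)  # ['a', 'b', '42']

# The len()-based filter only works for sized items such as strings.
try:
    list(filter(lambda x: len(x) > 0, listtext))
except TypeError as exc:
    print('len() filter failed:', exc)
```

    If the list is known to contain only strings, both forms give the same result and str() can be dropped.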

  • 2020-11-22 05:21

    I would use filter:

    # Any one of these works (they are equivalent for a list of strings):
    str_list = filter(None, str_list)
    str_list = filter(bool, str_list)
    str_list = filter(len, str_list)
    str_list = filter(lambda item: item, str_list)
    

    In Python 3, filter returns an iterator, so it should be wrapped in a call to list():

    str_list = list(filter(None, str_list))
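
    To illustrate why the list() call matters on Python 3, here is a small sketch (variable names are illustrative). The filter object is lazy and can be consumed only once:

```python
str_list = ['a', '', 'b', '', 'c']

lazy = filter(None, str_list)   # Python 3: a lazy filter object, not a list
first_pass = list(lazy)         # consumes the iterator
second_pass = list(lazy)        # already exhausted, so this is empty

print(first_pass)   # ['a', 'b', 'c']
print(second_pass)  # []
```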
    
  • 2020-11-22 05:22

    Summing up the best answers:

    1. Eliminate empties WITHOUT stripping:

    That is, all-space strings are retained:

    slist = list(filter(None, slist))
    

    PROs:

    • simplest;
    • fastest (see benchmarks below).

    2. To eliminate empties after stripping ...

    2.a ... when strings do NOT contain spaces between words:

    slist = ' '.join(slist).split()
    

    PROs:

    • small code
    • fast (BUT not the fastest with big datasets due to memory use, contrary to @paolo-melchiorre's results)

    2.b ... when strings contain spaces between words:

    slist = list(filter(str.strip, slist))
    

    PROs:

    • fastest;
    • readable code.
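
    Cases 1 and 2.b can be wrapped in one function. This is a sketch only: remove_empty and its strip parameter are hypothetical names, not part of any library:

```python
def remove_empty(slist, strip=False):
    """Hypothetical helper: drop empty strings from slist.

    With strip=True, whitespace-only strings are dropped as well;
    the strings that remain are returned unchanged.
    """
    predicate = str.strip if strip else None
    return list(filter(predicate, slist))

slist = ['hello', '', ' ', 'see you', '   ']
print(remove_empty(slist))              # ['hello', ' ', 'see you', '   ']
print(remove_empty(slist, strip=True))  # ['hello', 'see you']
```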

    Benchmarks on a 2018 machine:

    ## Build test-data
    #
    import random, string
    nwords = 10000
    maxlen = 30
    null_ratio = 0.1
    rnd = random.Random(0)                  # deterministic results
    words = [' ' * rnd.randint(0, maxlen)
             if rnd.random() > (1 - null_ratio)
             else
             ''.join(rnd.choices(string.ascii_letters, k=rnd.randint(0, maxlen)))
             for _i in range(nwords)
            ]
    
    ## Test functions
    #
    def nostrip_filter(slist):
        return list(filter(None, slist))
    
    def nostrip_comprehension(slist):
        return [s for s in slist if s]
    
    def strip_filter(slist):
        return list(filter(str.strip, slist))
    
    def strip_filter_map(slist): 
        return list(filter(None, map(str.strip, slist))) 
    
    def strip_filter_comprehension(slist):  # wastes memory (builds an intermediate list)
        return list(filter(None, [s.strip() for s in slist]))
    
    def strip_filter_generator(slist):
        return list(filter(None, (s.strip() for s in slist)))
    
    def strip_join_split(slist):  # words without(!) spaces
        return ' '.join(slist).split()
    
    ## Benchmarks
    #
    %timeit nostrip_filter(words)
    142 µs ± 16.8 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
    
    %timeit nostrip_comprehension(words)
    263 µs ± 19.1 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
    
    %timeit strip_filter(words)
    653 µs ± 37.5 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
    
    %timeit strip_filter_map(words)
    642 µs ± 36 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
    
    %timeit strip_filter_comprehension(words)
    693 µs ± 42.2 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
    
    %timeit strip_filter_generator(words)
    750 µs ± 28.6 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
    
    %timeit strip_join_split(words)
    796 µs ± 103 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
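
    The benchmark data contains only letter-only or all-space strings, so every stripping variant returns the same list there. On other input they are not interchangeable. The sketch below repeats three of the functions on a hypothetical sample with padded and multi-word strings:

```python
def strip_filter(slist):
    # Tests the stripped value but keeps the original string.
    return list(filter(str.strip, slist))

def strip_filter_map(slist):
    # Keeps the stripped strings.
    return list(filter(None, map(str.strip, slist)))

def strip_join_split(slist):
    # Splits on every run of whitespace, including inside items.
    return ' '.join(slist).split()

sample = [' padded ', '', '   ', 'two words']  # hypothetical input

print(strip_filter(sample))      # [' padded ', 'two words']
print(strip_filter_map(sample))  # ['padded', 'two words']
print(strip_join_split(sample))  # ['padded', 'two', 'words']
```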
    
  • 2020-11-22 05:28

    For a list with a combination of whitespace-only and empty strings, use a simple list comprehension:

    >>> s = ['I', 'am', 'a', '', 'great', ' ', '', '  ', 'person', '!!', 'Do', 'you', 'think', 'its', 'a', '', 'a', '', 'joke', '', ' ', '', '?', '', '', '', '?']
    

    As you can see, this list mixes whitespace-only strings and empty strings. Using the snippet:

    >>> d = [x for x in s if x.strip()]
    >>> d
    ['I', 'am', 'a', 'great', 'person', '!!', 'Do', 'you', 'think', 'its', 'a', 'a', 'joke', '?', '?']
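
    The same comprehension also handles other kinds of whitespace, because str.strip() with no arguments removes tabs and newlines as well as spaces. A brief sketch with a hypothetical input:

```python
s = ['I', '\t', 'am', '\n', '', ' \r\n ', 'here']  # hypothetical input

# Items made up only of spaces, tabs or newlines strip down to '' and are dropped.
d = [x for x in s if x.strip()]
print(d)  # ['I', 'am', 'here']
```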
    