I want to remove all empty strings from a list of strings in Python.
My idea looks like this:
while '' in str_list:
    str_list.remove('')
>>> lstr = ['hello', '', ' ', 'world', ' ']
>>> lstr
['hello', '', ' ', 'world', ' ']
>>> ' '.join(lstr).split()
['hello', 'world']
>>> list(filter(None, lstr))
['hello', ' ', 'world', ' ']
Comparing the timings:
>>> from timeit import timeit
>>> timeit('" ".join(lstr).split()', "lstr=['hello', '', ' ', 'world', ' ']", number=10000000)
4.226747989654541
>>> timeit('filter(None, lstr)', "lstr=['hello', '', ' ', 'world', ' ']", number=10000000)
3.0278358459472656
Notice that filter(None, lstr) does not remove whitespace-only strings such as ' '; it only prunes away '', while ' '.join(lstr).split() removes both.
Using filter() with whitespace-only strings removed as well takes a lot more time:
>>> timeit('filter(None, [l.replace(" ", "") for l in lstr])', "lstr=['hello', '', ' ', 'world', ' ']", number=10000000)
18.101892948150635
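One caveat with the join/split approach is that it also breaks apart any element that contains internal whitespace. Testing s.strip() instead drops '' and ' ' while leaving every other element intact. The following is a minimal sketch of that difference; the function name drop_blank and the sample list are made up for illustration, and no timing claims are intended.

```python
def drop_blank(strings):
    """Return a new list without empty or whitespace-only strings."""
    # s.strip() is '' (falsy) for both '' and ' ', so those are skipped.
    return [s for s in strings if s.strip()]


sample = ['hello world', '', ' ', 'foo', ' ']  # hypothetical data

joined_split = ' '.join(sample).split()
kept_whole = drop_blank(sample)

print(joined_split)  # ['hello', 'world', 'foo']
print(kept_whole)    # ['hello world', 'foo']
```

If every element is a single word, both approaches give the same list.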
Using a list comprehension is the most Pythonic way:
>>> strings = ["first", "", "second"]
>>> [x for x in strings if x]
['first', 'second']
If the list must be modified in-place, because there are other references which must see the updated data, then use a slice assignment:
strings[:] = [x for x in strings if x]
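To illustrate why the slice assignment matters, here is a small sketch with a second reference to the same list. The names alias and separate_copy are only for demonstration.

```python
strings = ["first", "", "second"]
alias = strings  # a second reference to the same list object

# A plain comprehension builds a separate list; `alias` is unaffected by it.
separate_copy = [x for x in strings if x]

# Slice assignment replaces the contents of the existing list object,
# so every reference to it sees the filtered data.
strings[:] = [x for x in strings if x]

print(alias)                     # ['first', 'second']
print(alias is strings)          # True
print(separate_copy is strings)  # False
```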
Instead of if x, I would use if x != '' in order to eliminate only empty strings. Like this:
str_list = [x for x in str_list if x != '']
This preserves any None values in your list. If the list also contains integers and 0 is one of them, it is preserved as well.
For example:
>>> str_list = [None, '', 0, "Hi", '', "Hello"]
>>> [x for x in str_list if x != '']
[None, 0, 'Hi', 'Hello']
Depending on the size of your list, it may be more efficient to use list.remove() than to create a new list:
l = ["1", "", "3", ""]
while True:
    try:
        l.remove("")
    except ValueError:
        break
This has the advantage of not creating a new list, but the disadvantage of having to search from the beginning each time. Unlike the while '' in l loop proposed above, though, it searches only once per occurrence of '' rather than twice (there is certainly a way to keep the best of both methods, but it is more complicated).
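One possible way to combine both properties (no new list, a single pass) is to compact the list in place with a write index and then truncate the tail. This is only a sketch of one such approach, not necessarily what the answer had in mind; the function name remove_empty_in_place is made up here.

```python
def remove_empty_in_place(items):
    """Remove every '' from items in a single pass, without a new list."""
    write = 0  # index of the next slot that receives a kept element
    for item in items:
        if item != "":
            items[write] = item
            write += 1
    # Everything from `write` onward is leftover; cut it off.
    del items[write:]
    return items


l = ["1", "", "3", ""]
same = remove_empty_in_place(l)
print(l)          # ['1', '3']
print(same is l)  # True
```

The write index never runs ahead of the element being read, so overwriting earlier slots during the loop is safe.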
filter actually has a special option for this:
filter(None, sequence)
It filters out all elements that evaluate to False, so there is no need to pass an actual callable such as bool or len. In Python 3, filter returns an iterator, so wrap it in list() if you need a list.
It is just as fast as map(bool, ...).
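As a quick illustration of what "evaluate to False" means here, the sketch below runs filter(None, ...) over a made-up list that mixes strings with None and 0. Note that None and 0 are dropped along with '', while ' ' is kept because a non-empty string is truthy.

```python
values = ["first", "", None, 0, "second", " "]  # hypothetical mixed data

# In Python 3, filter() returns a lazy iterator, so materialize it.
truthy = list(filter(None, values))

print(truthy)  # ['first', 'second', ' ']
```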
The reply from @Ib33X is great. If you also want to remove strings that are empty after stripping, you need to use the strip method too; otherwise whitespace-only strings such as " " are kept. This can be done with:
strings = ["first", "", "second ", " "]
[x.strip() for x in strings if x.strip()]
The result is ["first", "second"].
If you want to use filter instead, you can write
list(filter(lambda item: item.strip(), strings))
This drops the same elements but keeps the remaining strings unstripped, so the result is ["first", "second "].
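The comprehension above calls strip() twice per element. A possible variation, sketched here with a made-up function name, strips each string once through a generator and then drops the empty results. It is shown next to the filter form for comparison.

```python
def strip_and_drop_blank(strings):
    """Strip each string once, then keep only the non-empty results."""
    stripped = (s.strip() for s in strings)
    return [s for s in stripped if s]


strings = ["first", "", "second ", " "]

cleaned = strip_and_drop_blank(strings)
# filter() only decides what to keep; it returns the original strings.
unstripped = list(filter(str.strip, strings))

print(cleaned)     # ['first', 'second']
print(unstripped)  # ['first', 'second ']
```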