In Python, I have a list that should have one and only one truthy value (that is, bool(value) is True). Is there a clever way to check for this?
A one-line answer that retains the short-circuiting behavior:
from itertools import ifilter, islice
def only1(l):
    return len(list(islice(ifilter(None, l), 2))) == 1
This will be significantly faster than the other alternatives here for very large iterables that have two or more true values relatively early.
ifilter(None, itr) gives an iterable that will only yield truthy elements (x is truthy if bool(x) returns True). islice(itr, 2) gives an iterable that will only yield the first two elements of itr. By converting this to a list and checking that the length is equal to one, we can verify that exactly one truthy element exists without needing to check any additional elements after we have found two.
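The code above targets Python 2, where ifilter lives in itertools. As a rough sketch of the same idea for Python 3 (assuming the built-in filter, which is lazy there, as a stand-in for ifilter), the following also records which elements actually get consumed. The noisy helper and the seen list are hypothetical names added only for this illustration:

```python
from itertools import islice

def only1(iterable):
    # filter(None, ...) lazily yields only the truthy elements;
    # islice(..., 2) stops after at most two of them.
    return len(list(islice(filter(None, iterable), 2))) == 1

def noisy(values, seen):
    # Hypothetical helper: records each element as it is consumed.
    for v in values:
        seen.append(v)
        yield v

seen = []
result = only1(noisy([0, 1, 0, 1, 0, 1], seen))
print(result)  # False
print(seen)    # [0, 1, 0, 1]
```

The trailing 0 and 1 are never touched, because consumption stops as soon as the second truthy element has been found.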
Here are some timing comparisons:
Setup code:
In [1]: from itertools import islice, ifilter
In [2]: def fj(l): return len(list(islice(ifilter(None, l), 2))) == 1
In [3]: def david(l): return sum(bool(e) for e in l) == 1
Exhibiting short-circuit behavior:
In [4]: l = range(1000000)
In [5]: %timeit fj(l)
1000000 loops, best of 3: 1.77 us per loop
In [6]: %timeit david(l)
1 loops, best of 3: 194 ms per loop
Large list where short-circuiting does not occur:
In [7]: l = [0] * 1000000
In [8]: %timeit fj(l)
100 loops, best of 3: 10.2 ms per loop
In [9]: %timeit david(l)
1 loops, best of 3: 189 ms per loop
Small list:
In [10]: l = [0]
In [11]: %timeit fj(l)
1000000 loops, best of 3: 1.77 us per loop
In [12]: %timeit david(l)
1000000 loops, best of 3: 990 ns per loop
So the sum() approach is faster for very small lists, but as the input list gets larger my version is faster even when short-circuiting is not possible. When short-circuiting is possible on a large input, the performance difference is clear.
if sum([bool(x) for x in list]) == 1
(Assuming all your values are booleanish.)
It would probably be faster to just sum the list directly:
sum(list) == 1
although this may cause problems depending on the data types in your list.
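To make the caveat about data types concrete, here is a small sketch comparing the two expressions on a few inputs. The function names and sample lists are made up for the illustration:

```python
def by_bool_sum(items):
    # Count truthy elements, whatever their type.
    return sum(bool(x) for x in items) == 1

def by_plain_sum(items):
    # Add the raw values; only meaningful for 0/1-like numbers.
    return sum(items) == 1

values_big = [0, 2, 0]        # one truthy value, but it sums to 2
values_pair = [0.5, 0.5]      # two truthy values that sum to 1.0
values_text = ["", "x", ""]   # one truthy value, but not summable

print(by_bool_sum(values_big), by_plain_sum(values_big))    # True False
print(by_bool_sum(values_pair), by_plain_sum(values_pair))  # False True

try:
    by_plain_sum(values_text)
except TypeError as exc:
    print("TypeError:", exc)
```

So the plain sum only agrees with the bool-based count when every element is effectively 0 or 1.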
I wanted to earn the necromancer badge, so I generalized Jon Clements' excellent answer, preserving the benefits of short-circuiting logic and fast predicate checking with any and all.
Thus here is:
N(trues) = n:
def n_trues(iterable, n=1):
    i = iter(iterable)
    return all(any(i) for j in range(n)) and not any(i)
N(trues) <= n:
def up_to_n_trues(iterable, n=1):
    i = iter(iterable)
    all(any(i) for j in range(n))
    return not any(i)
N(trues) >= n:
def at_least_n_trues(iterable, n=1):
    i = iter(iterable)
    return all(any(i) for j in range(n))
m <= N(trues) <= n:
def m_to_n_trues(iterable, m=1, n=1):
    i = iter(iterable)
    assert m <= n
    return at_least_n_trues(i, m) and up_to_n_trues(i, n - m)
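As a quick sanity check, here is how the four helpers above behave on one sample list. The definitions are repeated so the snippet runs on its own, and the data list is just an example input:

```python
def at_least_n_trues(iterable, n=1):
    i = iter(iterable)
    return all(any(i) for j in range(n))

def up_to_n_trues(iterable, n=1):
    i = iter(iterable)
    all(any(i) for j in range(n))  # consume up to n truthy values
    return not any(i)              # nothing truthy may remain

def n_trues(iterable, n=1):
    i = iter(iterable)
    return all(any(i) for j in range(n)) and not any(i)

def m_to_n_trues(iterable, m=1, n=1):
    i = iter(iterable)
    assert m <= n
    # Both calls share the same iterator, so the second continues
    # where the first one stopped.
    return at_least_n_trues(i, m) and up_to_n_trues(i, n - m)

data = [0, 1, 0, 1, 1, 0]  # three truthy values

print(n_trues(data, 3))           # True
print(n_trues(data, 2))           # False
print(up_to_n_trues(data, 2))     # False
print(up_to_n_trues(data, 5))     # True
print(at_least_n_trues(data, 4))  # False
print(m_to_n_trues(data, 2, 4))   # True
```

Each any(i) call consumes the shared iterator up to and including the next truthy element, which is what lets the calls be chained like this.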
>>> l = [0, 0, 1, 0, 0]
>>> has_one_true = len([ d for d in l if d ]) == 1
>>> has_one_true
True
import collections
def only_n(l, testval=True, n=1):
    counts = collections.Counter(l)
    return counts[testval] == n
Linear time. Uses the built-in Counter class, which is what you should be using to check counts.
Re-reading your question, it looks like you actually want to check that there is only one truthy value, rather than one True value. Try this:
import collections
def only_n(l, testval=True, coerce=bool, n=1):
    counts = collections.Counter((coerce(x) for x in l))
    return counts[testval] == n
While you can get better best case performance, nothing has better worst-case performance. This is also short and easy to read.
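The difference between the two Counter-based versions above is easiest to see on a list whose single truthy value is not literally True. A small sketch, with sample data chosen only for illustration:

```python
import collections

values = [0, "a", 0]  # exactly one truthy value, but no literal True

raw = collections.Counter(values)
coerced = collections.Counter(bool(x) for x in values)

print(raw[True])      # 0: "a" is truthy but is not equal to True
print(coerced[True])  # 1: bool("a") is True

# 1 == True and hash(1) == hash(True), so 1 is counted under the True key
# even without coercion.
print(collections.Counter([0, 1, 0])[True])  # 1
```

Without the coerce step, only values equal to True (such as 1 or 1.0) are counted, which is why the second version is the one that matches the question.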
Here's a version optimised for best-case performance:
import collections
import itertools
def only_n(l, testval=True, coerce=bool, n=1):
    counts = collections.Counter()
    def iterate_and_count():
        for x in itertools.imap(coerce, l):
            yield x
            if x == testval and counts[testval] > n:
                break
    counts.update(iterate_and_count())
    return counts[testval] == n
The worst case performance has a high k (as in O(kn+c)), but it is completely general.
Here's an ideone to experiment with performance: http://ideone.com/ZRrv2m
One that doesn't require imports:
def single_true(iterable):
    i = iter(iterable)
    return any(i) and not any(i)
Alternatively, perhaps a more readable version:
def single_true(iterable):
    iterator = iter(iterable)
    # consume from "iterator" until first true or it's exhausted
    has_true = any(iterator)
    # carry on consuming until another true value / exhausted
    has_another_true = any(iterator)
    # True if exactly one true found
    return has_true and not has_another_true
This:
Checks that i has any true value