I am looking to count the number of times the values in an array change in polarity (EDIT: Number of times the values in an array cross zero).
Suppose I have an array:<
Seems like, you want to group numbers by their sign. This could be done using built-in method groupby
:
In [2]: l = [80.6, 120.8, -115.6, -76.1, 131.3, 105.1, 138.4, -81.3, -95.3, 89.2, -154.1, 121.4, -85.1, 96.8, 68.2]
In [3]: from itertools import groupby
In [5]: list(groupby(l, lambda x: x < 0))
Out[5]:
[(False, <itertools._grouper at 0x7fc9022095f8>),
(True, <itertools._grouper at 0x7fc902209828>),
(False, <itertools._grouper at 0x7fc902209550>),
(True, <itertools._grouper at 0x7fc902209e80>),
(False, <itertools._grouper at 0x7fc902209198>),
(True, <itertools._grouper at 0x7fc9022092e8>),
(False, <itertools._grouper at 0x7fc902209240>),
(True, <itertools._grouper at 0x7fc902209908>),
(False, <itertools._grouper at 0x7fc9019a64e0>)]
Then you should use function len
which returns the number of groups:
In [7]: len(list(groupby(l, lambda x: x < 0)))
Out[7]: 9
Obviously, there will be at least one group (for a non-empty list), but if you want to count the number of points, where a sequence changes its polarity, you could just subtract one group. Do not forget about the empty-list case.
You should also take care about zero elements: shouldn't they be extracted into another group? If so, you could just change the key
argument (lambda function) of groupby
function.
Based on Scott's answer
The generator expression proposed by Scott uses enumerate
which returns tuples containing index and list item. List item are not used in the expression at all and discarded later. So better solution in terms of time would be
sum(1 for i in range(1, len(a)) if a[i-1]*a[i]<0)
If your list a
is really huge, range may throw an exception. You can replace it with itertools.islice
and itertools.count
.
In Python version 2.x, use xrange instead of Python 3's range
.
In Python 3, xrange
is no longer available.
I think a loop is a straight forward way to go:
a = [80.6, 120.8, -115.6, -76.1, 131.3, 105.1, 138.4, -81.3, -95.3, 89.2, -154.1, 121.4, -85.1, 96.8, 68.2]
def change_sign(v1, v2):
return v1 * v2 < 0
s = 0
for ind, _ in enumerate(a):
if ind+1 < len(a):
if change_sign(a[ind], a[ind+1]):
s += 1
print s # prints 8
You could use a generator expression but it gets ugly:
z_cross = sum(1 for ind, val in enumerate(a) if (ind+1 < len(a))
if change_sign(a[ind], a[ind+1]))
print z_cross # prints 8
EDIT:
@Alik pointed out that for huge lists the best option in space and time (at least out of the solutions we have considered) is not to call change_sign
in the generator expression but to simply do:
z_cross = sum(1 for i, _ in enumerate(a) if (i+1 < len(a)) if a[i]*a[i+1]<0)
Here's a numpy
solution. Numpy's methods are generally pretty fast and well-optimized, but if you're not already working with numpy
there's probably some overhead from converting the list to a numpy
array:
import numpy as np
my_list = [80.6, 120.8, -115.6, -76.1, 131.3, 105.1, 138.4, -81.3, -95.3, 89.2, -154.1, 121.4, -85.1, 96.8, 68.2]
(np.diff(np.sign(my_list)) != 0).sum()
Out[8]: 8
You can achieve it using list comprehension:
myList = [80.6, 120.8, -115.6, -76.1, 131.3, 105.1, 138.4, -81.3, -95.3, 89.2, -154.1, 121.4, -85.1, 96.8, 68.2]
len([x for i, x in enumerate(myList) if i > 0 and ((myList[i-1] > 0 and myList[i] < 0) or (myList[i-1] < 0 and myList[i] > 0))])
This produces the same result:
import numpy as np
my_array = np.array([80.6, 120.8, -115.6, -76.1, 131.3, 105.1, 138.4, -81.3, -95.3,
89.2, -154.1, 121.4, -85.1, 96.8, 68.2])
((my_array[:-1] * my_array[1:]) < 0).sum()
gives:
8
and seems to be the fastest solution:
%timeit ((my_array[:-1] * my_array[1:]) < 0).sum()
100000 loops, best of 3: 11.6 µs per loop
Compared to the fastest so far:
%timeit (np.diff(np.sign(my_array)) != 0).sum()
10000 loops, best of 3: 22.2 µs per loop
Also for larger arrays:
big = np.random.randint(-10, 10, size=10000000)
this:
%timeit ((big[:-1] * big[1:]) < 0).sum()
10 loops, best of 3: 62.1 ms per loop
vs:
%timeit (np.diff(np.sign(big)) != 0).sum()
1 loops, best of 3: 97.6 ms per loop