I want to calculate the average value of several lists in Python. These lists contain numbers as strings. An empty string isn't zero; it means a missing value.
The be
Here's some timing of OP's solution vs. aIKid's solution vs. gnibbler's solutions, using a list of 100,000 entries drawn from the digits 1..9 (plus the empty string) and 10 trials:
import timeit
setup = '''
from __main__ import f1, f2, f3, f4
import random
random.seed(0)
choices = ['1', '2', '3', '4', '5', '6', '7', '8', '9', '']
num = [random.choice(choices) for _ in range(10**5)]
'''
def f1(num):  # OP
    total = sum([int(n) if n else 0 for n in num])
    length = sum([1 if n else 0 for n in num])
    ave = float(total)/length if length > 0 else '-'
    return ave
def f2(num):  # aIKid
    total = sum(int(n) if n else 0 for n in num)
    length = sum(1 if n else 0 for n in num)
    ave = float(total)/length if length > 0 else '-'
    return ave
def f3(num):  # gnibbler 1
    L = [int(n) for n in num if n]
    ave = sum(L)/float(len(L)) if L else '-'
    return ave
def f4(num):  # gnibbler 2
    L = [float(n) for n in num if n]
    ave = sum(L)/float(len(L)) if L else '-'
    return ave
number = 10
things = ['f1(num)', 'f2(num)', 'f3(num)', 'f4(num)']
for thing in things:
    print(thing, timeit.timeit(thing, setup=setup, number=number))
Result (total seconds for the 10 runs):
f1(num) 1.8177659461490339 # OP
f2(num) 2.0769015213241513 # aIKid
f3(num) 1.6350571199344595 # gnibbler 1
f4(num) 0.807052779158564 # gnibbler 2
It looks like gnibbler's solution using float is the fastest here.
num = ['1', '2', '', '6']
L = [int(n) for n in num if n]
ave = sum(L)/float(len(L)) if L else '-'
or
num = ['1', '2', '', '6']
L = [float(n) for n in num if n]
avg = sum(L)/len(L) if L else '-'
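Since the question mentions several lists, either snippet can be wrapped in a function and applied to each list in turn. A minimal sketch, where the name average_of_strings, the missing parameter and the sample lists are made up for illustration:

```python
def average_of_strings(values, missing='-'):
    """Average the non-empty numeric strings in `values`.

    Returns `missing` (a hypothetical placeholder, '-' by default)
    when every entry is an empty string or the list itself is empty.
    """
    numbers = [float(v) for v in values if v]  # skip empty strings
    return sum(numbers) / len(numbers) if numbers else missing


# Sample data, invented for this example
lists = [['1', '2', '', '6'], ['', ''], ['4', '5']]
averages = [average_of_strings(lst) for lst in lists]
print(averages)
```

Returning a placeholder string mirrors the snippets above; returning None instead may be easier to handle if the averages are processed further.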
You can discard the square brackets. sum accepts generator expressions, too:
total = sum(int(n) if n else 0 for n in num)
length = sum(1 if n else 0 for n in num)
Since a generator yields each value only when it is needed, you avoid the cost of storing a whole list in memory, which matters most with large data sets.
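The two generator expressions each traverse num, so num has to be something that can be iterated twice, such as a list. If the values arrive as a one-shot iterator, a single loop can keep both the running total and the count. This is only a sketch; streaming_average and its missing parameter are hypothetical names:

```python
def streaming_average(values, missing='-'):
    """Average non-empty numeric strings in one pass, storing no list."""
    total = 0.0
    count = 0
    for v in values:
        if v:  # an empty string means a missing value, so skip it
            total += float(v)
            count += 1
    return total / count if count else missing


num = ['1', '2', '', '6']
print(streaming_average(num))        # works on a list
print(streaming_average(iter(num)))  # and on a one-shot iterator
```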
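A slightly different approach, using reduce:
from functools import reduce  # needed in Python 3; reduce is a builtin in Python 2
num = ['1', '2', '', '6']
total = reduce(lambda acc, x: acc + (float(x) if x else 0), num, 0.0)
length = reduce(lambda acc, x: acc + (1 if x else 0), num, 0)
average = total / length if length > 0 else '-'  # conditional expression avoids dividing by zero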
In Python 3.4 and later, use the statistics module:
from statistics import mean
num = ['1', '2', '', '6']
ave = mean(int(n) for n in num if n)  # skip empty strings rather than counting them as 0
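One caveat: statistics.mean raises StatisticsError when it receives no data, which happens if every entry is an empty string and the empty strings are filtered out. A possible guard, assuming the same '-' placeholder as the other answers (mean_or_placeholder is a made-up name):

```python
from statistics import StatisticsError, mean


def mean_or_placeholder(values, missing='-'):
    """Mean of the non-empty numeric strings, or `missing` if there are none."""
    try:
        # Filter out empty strings so missing values are not counted as zero
        return mean(float(v) for v in values if v)
    except StatisticsError:
        # Raised by mean() when there is no data to average
        return missing


print(mean_or_placeholder(['1', '2', '', '6']))
print(mean_or_placeholder(['', '']))
```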