I want to be able to \'build\' a numpy array on the fly, I do not know the size of this array in advance.
For example I want to do something like this:
For posterity, I think this is quicker:
a = np.array([np.array(list()) for _ in y])
You might even be able to pass in a generator (i.e. [] -> ()), in which case the inner list is never fully stored in memory.
Responding to comment below:
>>> import numpy as np
>>> y = range(10)
>>> a = np.array([np.array(list) for _ in y])
>>> a
array([array(<type 'list'>, dtype=object),
array(<type 'list'>, dtype=object),
array(<type 'list'>, dtype=object),
array(<type 'list'>, dtype=object),
array(<type 'list'>, dtype=object),
array(<type 'list'>, dtype=object),
array(<type 'list'>, dtype=object),
array(<type 'list'>, dtype=object),
array(<type 'list'>, dtype=object),
array(<type 'list'>, dtype=object)], dtype=object)
a = np.empty(0)
for x in y:
a = np.append(a, x)
Build a Python list and convert that to a Numpy array. That takes amortized O(1) time per append + O(n) for the conversion to array, for a total of O(n).
a = []
for x in y:
a.append(x)
a = np.array(a)
You can do this:
a = np.array([])
for x in y:
a = np.append(a, x)
Since y is an iterable I really do not see why the calls to append:
a = np.array(list(y))
will do and it's much faster:
import timeit
print timeit.timeit('list(s)', 's=set(x for x in xrange(1000))')
# 23.952975494633154
print timeit.timeit("""li=[]
for x in s: li.append(x)""", 's=set(x for x in xrange(1000))')
# 189.3826994248866