It appears that I have data in the format of a list of NumPy arrays (type() = np.ndarray):
[array([[ 0.00353654]]), array([[ 0.00353654]]), arra
You could use numpy.concatenate, which, as the name suggests, concatenates all the elements of such an input list into a single NumPy array, like so -
import numpy as np
out = np.concatenate(input_list).ravel()
If you wish the final output to be a list, you can extend the solution, like so -
out = np.concatenate(input_list).ravel().tolist()
Sample run -
In [24]: input_list
Out[24]:
[array([[ 0.00353654]]),
array([[ 0.00353654]]),
array([[ 0.00353654]]),
array([[ 0.00353654]]),
array([[ 0.00353654]]),
array([[ 0.00353654]]),
array([[ 0.00353654]]),
array([[ 0.00353654]]),
array([[ 0.00353654]]),
array([[ 0.00353654]]),
array([[ 0.00353654]]),
array([[ 0.00353654]]),
array([[ 0.00353654]])]
In [25]: np.concatenate(input_list).ravel()
Out[25]:
array([ 0.00353654, 0.00353654, 0.00353654, 0.00353654, 0.00353654,
0.00353654, 0.00353654, 0.00353654, 0.00353654, 0.00353654,
0.00353654, 0.00353654, 0.00353654])
Convert to list -
In [26]: np.concatenate(input_list).ravel().tolist()
Out[26]:
[0.00353654,
0.00353654,
0.00353654,
0.00353654,
0.00353654,
0.00353654,
0.00353654,
0.00353654,
0.00353654,
0.00353654,
0.00353654,
0.00353654,
0.00353654]
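As a side note, on newer NumPy versions (1.13+, as far as I remember) the ravel step can be folded into the concatenate call itself by passing axis=None, which flattens each input before joining them - a small sketch, reusing the same input_list:
out = np.concatenate(input_list, axis=None)                 # 1-D array
out_list = np.concatenate(input_list, axis=None).tolist()   # plain Python list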
I came across this same issue and found a solution that also handles arrays of variable length:
np.column_stack(input_list).ravel()
See numpy.column_stack for more info.
Example with variable-length arrays, based on your example data:
In [135]: input_list
Out[135]:
[array([[ 0.00353654, 0.00353654]]),
array([[ 0.00353654]]),
array([[ 0.00353654]]),
array([[ 0.00353654, 0.00353654, 0.00353654]])]
In [136]: [i.size for i in input_list] # variable size arrays
Out[136]: [2, 1, 1, 3]
In [137]: np.column_stack(input_list).ravel()
Out[137]:
array([ 0.00353654, 0.00353654, 0.00353654, 0.00353654, 0.00353654,
0.00353654, 0.00353654])
Note: Only tested on Python 2.7.12
Another simple approach is to use numpy.hstack() and then remove the singleton dimension with squeeze(), as in:
In [61]: np.hstack(list_of_arrs).squeeze()
Out[61]:
array([0.00353654, 0.00353654, 0.00353654, 0.00353654, 0.00353654,
0.00353654, 0.00353654, 0.00353654, 0.00353654, 0.00353654,
0.00353654, 0.00353654, 0.00353654])
This can also be done with
np.array(list_of_arrays).flatten().tolist()
resulting in
[0.00353654, 0.00353654, 0.00353654, 0.00353654, 0.00353654, 0.00353654, 0.00353654, 0.00353654, 0.00353654, 0.00353654, 0.00353654, 0.00353654, 0.00353654]
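Note that this approach relies on every array in the list having the same shape, so that np.array can stack them into one regular block. A minimal sketch with the data from the question, to make that explicit -
import numpy as np
list_of_arrays = [np.array([[0.00353654]])] * 13
stacked = np.array(list_of_arrays)   # shape (13, 1, 1), since every input is (1, 1)
stacked.flatten().tolist()           # 13 plain Python floats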
Update
As @aydow points out in the comments, using numpy.ndarray.ravel can be faster if one doesn't care whether the result is a copy or a view:
np.array(list_of_arrays).ravel()
Although, according to the docs,
When a view is desired in as many cases as possible,
arr.reshape(-1)
may be preferable.
In other words
np.array(list_of_arrays).reshape(-1)
My initial suggestion was to use numpy.ndarray.flatten, which returns a copy every time and therefore affects performance.
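If you want to verify the copy/view behaviour yourself, numpy.shares_memory is handy; a minimal sketch, assuming a small contiguous 2-D array a -
import numpy as np
a = np.random.rand(4, 2)
np.shares_memory(a, a.ravel())       # True  -> view, no data copied
np.shares_memory(a, a.reshape(-1))   # True  -> view
np.shares_memory(a, a.flatten())     # False -> always a copy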
Let's now see how the time complexity of the solutions listed above compares, using the perfplot package with a setup similar to the OP's:
import numpy as np
import perfplot

perfplot.show(
    setup=lambda n: np.random.rand(n, 2),  # 2-D array of n rows, similar to the OP's data
    kernels=[lambda a: a.ravel(),
             lambda a: a.flatten(),
             lambda a: a.reshape(-1)],
    labels=['ravel', 'flatten', 'reshape'],
    n_range=[2**k for k in range(16)],
    xlabel='N')
Here flatten demonstrates piecewise linear complexity, which is reasonably explained by it making a copy of the initial array, compared to the constant complexity of ravel and reshape, which return a view.
It's also worth noting that, quite predictably, converting the outputs with .tolist() evens out the performance of all three to equally linear.
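For reference, that last claim can be checked with the same perfplot call, just with .tolist() appended to each kernel - a sketch reusing the setup above:
perfplot.show(
    setup=lambda n: np.random.rand(n, 2),
    kernels=[lambda a: a.ravel().tolist(),
             lambda a: a.flatten().tolist(),
             lambda a: a.reshape(-1).tolist()],
    labels=['ravel.tolist', 'flatten.tolist', 'reshape.tolist'],
    n_range=[2**k for k in range(16)],
    xlabel='N')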
Another way to flatten the arrays is with itertools:
import itertools
import numpy as np

# Recreate the list of 2-D arrays from the question
a = [np.array([[0.00353654]])] * 13

# Each array is 2-D, so iterate over its elements with .flat,
# chain those iterators together and build a flat list from the result
flattened = list(itertools.chain.from_iterable(arr.flat for arr in a))
This solution should be very fast, see https://stackoverflow.com/a/408281/5993892 for more explanation.
If the resulting data structure should be a numpy array instead, use numpy.fromiter() to exhaust the iterator into an array:
# Exhaust the same chained iterator into a NumPy array of floats
flattened_array = np.fromiter(itertools.chain.from_iterable(arr.flat for arr in a), float)
Docs for itertools.chain.from_iterable(): https://docs.python.org/3/library/itertools.html#itertools.chain.from_iterable
Docs for numpy.fromiter(): https://docs.scipy.org/doc/numpy/reference/generated/numpy.fromiter.html