Question
I have the following matrices:
A.toarray()
array([[0, 0, 0, ..., 0, 0, 0],
[0, 0, 0, ..., 0, 0, 0],
[0, 0, 0, ..., 0, 0, 0],
...,
[0, 0, 0, ..., 0, 0, 0],
[0, 0, 0, ..., 0, 0, 0],
[0, 0, 0, ..., 0, 0, 0]], dtype=int64)
type(A)
scipy.sparse.csr.csr_matrix
A.shape
(878049, 942)
And matrix B:
B
array([2248, 2248, 2248, ..., 0, 0, 0])
type(B)
numpy.ndarray
B.shape
(878049,)
I would like to column-stack A and B into C. I tried the following:
C = sparse.column_stack([A,B])
Then:
/usr/local/lib/python3.5/site-packages/numpy/lib/shape_base.py in column_stack(tup)
315 arr = array(arr, copy=False, subok=True, ndmin=2).T
316 arrays.append(arr)
--> 317 return _nx.concatenate(arrays, 1)
318
319 def dstack(tup):
ValueError: all the input array dimensions except for the concatenation axis must match exactly
My problem is how to preserve the dimensions. Any idea how to column-stack them?
Update
I tried the following:
#Sorry for the name
C = np.vstack(( A.A.T, B)).T
and I got:
array([[ 0, 0, 0, ..., 0, 6],
[ 0, 0, 0, ..., 0, 6],
[ 0, 0, 0, ..., 0, 6],
...,
[ 0, 0, 0, ..., 0, 1],
[ 0, 0, 0, ..., 0, 1],
[ 0, 0, 0, ..., 0, 1]], dtype=int64)
Is this the correct way to column-stack them?
Answer 1:
Did you try the following?
C=np.vstack((A.T,B)).T
With sample values:
>>> import numpy as np
>>> A = np.array([[1, 2, 3], [4, 5, 6]])
>>> A.shape
(2, 3)
>>> B = np.array([7, 8])
>>> B.shape
(2,)
>>> C = np.vstack((A.T, B)).T
>>> C.shape
(2, 4)
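If A is a sparse matrix and you want the output to stay sparse, you could convert A to a dense array with A.A, stack, and convert the result back:
from scipy.sparse import csr_matrix
C = np.vstack((A.A.T, B)).T
D = csr_matrix(C)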
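Because A.A materializes the full dense array, it may be worth estimating its size before taking this route. The sketch below only does the arithmetic, using the shape and dtype reported in the question; the variable names are illustrative.

```python
import numpy as np

# Shape and dtype as reported in the question (A.shape and dtype=int64).
n_rows, n_cols = 878049, 942
itemsize = np.dtype(np.int64).itemsize  # bytes per element

# Memory needed to hold A as a dense array, before B is even appended.
dense_bytes = n_rows * n_cols * itemsize
print(f"{dense_bytes / 1e9:.1f} GB")  # prints "6.6 GB"
```

If that does not fit comfortably in RAM, stacking in sparse form is likely the safer choice.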
Answer 2:
Two issues:
- there isn't a sparse.column_stack
- you are mixing a sparse matrix and a dense array
Two smaller example arrays:
In [129]: A=sparse.csr_matrix([[1,0,0],[0,1,0]])
In [130]: B=np.array([1,2])
Using np.column_stack gives your error:
In [131]: np.column_stack((A,B))
...
ValueError: all the input array dimensions except for the concatenation axis must match exactly
But if I first turn A into a dense array, column_stack works fine:
In [132]: np.column_stack((A.A, B))
Out[132]:
array([[1, 0, 0, 1],
[0, 1, 0, 2]])
The equivalent with concatenate:
In [133]: np.concatenate((A.A, B[:,None]), axis=1)
Out[133]:
array([[1, 0, 0, 1],
[0, 1, 0, 2]])
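The B[:,None] indexing is what makes concatenate work here: it adds a trailing axis so that B has the same number of dimensions as the dense A. A minimal sketch with the same small arrays (A_dense and B_col are illustrative names, not from the session above):

```python
import numpy as np

A_dense = np.array([[1, 0, 0], [0, 1, 0]])  # dense stand-in for A.A
B = np.array([1, 2])

# B has shape (2,); indexing with None adds a trailing axis, giving (2, 1).
# B.reshape(-1, 1) produces the same column vector.
B_col = B[:, None]
print(B.shape, B_col.shape)

# Both inputs are now 2-D with 2 rows, so they can be joined along axis 1.
C = np.concatenate((A_dense, B_col), axis=1)
print(C.shape)
```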
There is a sparse.hstack. For that I need to turn B into a sparse matrix as well. The transpose works because B is now a 2-D matrix (as opposed to a 1-D array):
In [134]: sparse.hstack((A,sparse.csr_matrix(B).T))
Out[134]:
<2x4 sparse matrix of type '<class 'numpy.int32'>'
with 4 stored elements in COOrdinate format>
In [135]: _.A
Out[135]:
array([[1, 0, 0, 1],
[0, 1, 0, 2]], dtype=int32)
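As a self-contained script, the sparse route might look like the sketch below. It assumes you want the result in CSR rather than the COO format that hstack returned above, so it passes format="csr"; it also uses toarray(), the spelled-out form of .A, and only to inspect the small example.

```python
import numpy as np
from scipy import sparse

A = sparse.csr_matrix([[1, 0, 0], [0, 1, 0]])
B = np.array([1, 2])

# Turn the 1-D array into an (n, 1) sparse column so the row counts match.
B_col = sparse.csr_matrix(B.reshape(-1, 1))

# Stack horizontally without ever building a dense copy of A.
C = sparse.hstack((A, B_col), format="csr")

print(C.shape)      # (2, 4)
print(C.format)     # csr
print(C.toarray())  # dense view, for checking the small example only
```

On the matrix from the question, the same call would be expected to give a (878049, 943) sparse result, with B contributing only its nonzero entries.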
Source: https://stackoverflow.com/questions/38498299/how-to-column-stack-a-numpy-array-with-a-scipy-sparse-matrix