问题
I am reading about Product Quantization, from section II.A page 3 of PQ for NNS, that says:
..all subquantizers have the same finite number k* of reproduction values. In that case the number of centroids is (k*)^m
where m
is the number of subvectors.
However, I do not get k* at all! I mean in vector quantization we assign every vector to k centroids. In produce quantization, we assign every subvector to k centroids. How did k* come into play?
回答1:
I think k*
is the number of centroids in each subspace, and k
is the number of centroids in the whole space.
For example if the data is 2d, like (x, y)
, and we treat each dimension as a subspace, and do kmeans with say k*=3
respectively, we'll get 3 centroids in each subspace, {x1, x2, x3}
and {y1, y2, y3}
.
Then there'll be 3^2=9
possible centroids in the whole space, which are*(x1, y1)
, (x1, y2)
, (x1, y3)
, (x2, y1)
...
In this way we can get a large number of centroids (2^64
in the paper) using a small amount of memory, because we don't have to store all k*^m
centorids, we only need to store k*
centroids in each subspace.
Edit:
In above the example, the number of subspaces m=2
, number of centroids in each subspace k*=3
, number of centroids the whole subspace k=3^2
, number of dimensions of each subspace D*=1
, number of floating points to store mD*k*=Dk*=6
.
*the cartesian product of x and y
来源:https://stackoverflow.com/questions/38382217/k-reproduction-values