Question
Usually, algorithms such as SIFT, SURF and many others provide a set of k keypoints and the associated descriptors, each in d dimensions (for example, in SIFT each descriptor has d=128 dimensions).
So, in order to describe an image we need a k×d matrix (k descriptor vectors, each one in d dimensions). So far so good.
My question is: how can we describe an image through a single vector?
This would be really useful, since we could save a lot of space and because certain algorithms (like LSH) require a vector as input/query.
In some papers (for example this, section 6.5) this approach is described as "global descriptors".
Up to now, I have found only this paper, but it doesn't seem very accurate (and it's from 2009, so not very recent).
UPDATE: Other possible solutions (some suggested in the comments):
Visual bag of words
GIST descriptor
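To make the visual bag of words idea concrete, here is a minimal sketch in plain Python of how a k×d descriptor matrix could be collapsed into a single fixed-length vector. It assumes a codebook of "visual words" already exists; in practice such a codebook is typically learned by clustering (for example with k-means) the descriptors of a training set, whereas here it is just random toy data. The names `nearest_word`, `bow_vector` and `codebook` are hypothetical and chosen only for illustration, and the descriptors are random numbers rather than real SIFT output.

```python
import math
import random


def nearest_word(descriptor, codebook):
    """Return the index of the codebook entry closest to the descriptor
    (squared Euclidean distance)."""
    best_index, best_dist = 0, math.inf
    for i, word in enumerate(codebook):
        dist = sum((a - b) ** 2 for a, b in zip(descriptor, word))
        if dist < best_dist:
            best_index, best_dist = i, dist
    return best_index


def bow_vector(descriptors, codebook):
    """Map a k x d list of descriptors to one L1-normalized histogram
    whose length equals the number of visual words in the codebook."""
    hist = [0.0] * len(codebook)
    for desc in descriptors:
        hist[nearest_word(desc, codebook)] += 1.0
    total = sum(hist)
    # An image with no keypoints yields an all-zero vector.
    return [h / total for h in hist] if total else hist


# Toy data (assumption): random values stand in for real descriptors.
random.seed(0)
d = 128          # descriptor dimension, as in SIFT
n_words = 8      # codebook size; real systems use far more words
codebook = [[random.random() for _ in range(d)] for _ in range(n_words)]

image_a = [[random.random() for _ in range(d)] for _ in range(50)]  # k = 50
image_b = [[random.random() for _ in range(d)] for _ in range(30)]  # k = 30

vec_a = bow_vector(image_a, codebook)
vec_b = bow_vector(image_b, codebook)
print(len(vec_a), len(vec_b))  # 8 8: same length regardless of k
```

The point of the sketch is that both images end up with a vector of the same length even though they have different numbers of keypoints, which is what makes the result usable as an LSH input or query. The price is that the spatial layout of the keypoints is discarded.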
Source: https://stackoverflow.com/questions/37455280/global-vector-descriptor