Question
I am working on detecting similarity between two videos in Java. The user will supply two videos, and the software has to detect whether they are similar by checking the file content. I read that it is possible to compare each frame of the two videos. Can anyone please share any suitable algorithms (or code or methods) that can be implemented in Java?
Answer 1:
There is a huge variety of algorithms for determining similarity in images. A search for "image similarity algorithm" and "video similarity algorithm" in Google Scholar will produce a large number of related papers, and there are also a few related questions here on Stack Overflow.
A couple of important aspects that should be noted:
There is no universal definition of similarity - you need to define it with regard to your specific purpose. For example, an image with a red square and an image with a blue square could be considered similar because both have squares, or entirely dissimilar based on the color difference.
Similarity is not generally defined in absolute terms, i.e. as something that either exists or does not. Most similarity algorithms produce a relative indicator that has to be compared against a baseline to produce meaningful results. For example, if you have a corpus of images depicting squares of various colors, you might get high similarity values in absolute terms, but it's the minute differences caused by the color changes that should be focused on.
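To illustrate the point about baselines, here is a minimal sketch that expresses a raw similarity score relative to the scores observed on a reference corpus, using a plain z-score. The class name, the method name and the choice of a z-score are assumptions made for illustration; any other normalization that suits your data would serve equally well.

```java
final class BaselineRelativeScore {

    /**
     * Expresses a raw similarity value as the number of standard deviations
     * it lies above (positive) or below (negative) the mean of the baseline
     * scores. The baseline is assumed to hold scores computed with the same
     * similarity algorithm on pairs from a reference corpus.
     */
    static double zScore(double raw, double[] baseline) {
        if (baseline.length == 0) {
            throw new IllegalArgumentException("baseline must not be empty");
        }
        double mean = 0.0;
        for (double v : baseline) {
            mean += v;
        }
        mean /= baseline.length;

        double variance = 0.0;
        for (double v : baseline) {
            variance += (v - mean) * (v - mean);
        }
        double std = Math.sqrt(variance / baseline.length);

        // A baseline without any spread gives no scale to measure against;
        // returning 0 here is just a convention chosen for this sketch.
        if (std == 0.0) {
            return 0.0;
        }
        return (raw - mean) / std;
    }
}
```

With a baseline of 0.90, 0.92 and 0.94, a raw score of 0.92 is unremarkable even though it looks high in absolute terms, while 0.99 stands out clearly from the corpus.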
Disclaimer: before using any algorithm found through a search engine, you should investigate its legal status. Video similarity is a rather hot research area, and quite a few algorithms are probably encumbered by patents. Using them for academic research might be acceptable, but for anything else you should ask a lawyer.
EDIT:
I am not certain what you need, but I can offer a few general tips:
Investigate if the video metadata, such as length and resolution, may be useful. For example, would it make sense to actually compare the content of a 30-second clip to a 3-hour film?
Consider if you can get away with using image-based similarity on a random sample of corresponding frames from the same timestamps in each file. Examining each and every frame in detail would probably be a waste of time and CPU cycles in most cases.
Consider using a tiered similarity measurement architecture, where simpler and less-expensive methods are used to weed out the obvious cases, before the real CPU hogs step in. For example, computing the average color and other simple metrics for a frame is probably much easier than contour detection or face recognition.
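The tips above can be sketched roughly as follows. This assumes the frames have already been decoded into BufferedImage objects by a video library of your choice (decoding is not shown), and every name and threshold in the class is hypothetical and would need tuning against real data. A duration check and an average-color comparison act as cheap filters, and only the pairs that pass reach a somewhat more expensive grid comparison.

```java
import java.awt.image.BufferedImage;

final class TieredFrameSimilarity {

    // Hypothetical thresholds, chosen only for illustration.
    static final double MAX_DURATION_RATIO = 1.5;
    static final double MAX_AVG_COLOR_DISTANCE = 40.0;
    static final int GRID = 16;

    /** Tier 0: metadata. Rejects pairs whose lengths differ too much. */
    static boolean durationsComparable(long aMillis, long bMillis) {
        long shorter = Math.min(aMillis, bMillis);
        long longer = Math.max(aMillis, bMillis);
        return shorter > 0 && (double) longer / shorter <= MAX_DURATION_RATIO;
    }

    /** Mean red, green and blue values over all pixels of a frame. */
    static double[] averageColor(BufferedImage img) {
        long r = 0, g = 0, b = 0;
        int w = img.getWidth(), h = img.getHeight();
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                int rgb = img.getRGB(x, y);
                r += (rgb >> 16) & 0xFF;
                g += (rgb >> 8) & 0xFF;
                b += rgb & 0xFF;
            }
        }
        double n = (double) w * h;
        return new double[] { r / n, g / n, b / n };
    }

    /** Tier 1: Euclidean distance between the average colors of two frames. */
    static double averageColorDistance(BufferedImage a, BufferedImage b) {
        double[] ca = averageColor(a);
        double[] cb = averageColor(b);
        double dr = ca[0] - cb[0], dg = ca[1] - cb[1], db = ca[2] - cb[2];
        return Math.sqrt(dr * dr + dg * dg + db * db);
    }

    /** Gray level (0..255) at the center of grid cell (gx, gy). */
    static double grayAt(BufferedImage img, int gx, int gy) {
        int x = (int) ((gx + 0.5) * img.getWidth() / GRID);
        int y = (int) ((gy + 0.5) * img.getHeight() / GRID);
        int rgb = img.getRGB(x, y);
        return (((rgb >> 16) & 0xFF) + ((rgb >> 8) & 0xFF) + (rgb & 0xFF)) / 3.0;
    }

    /** Tier 2: mean absolute gray difference over a GRID x GRID sample grid. */
    static double gridDifference(BufferedImage a, BufferedImage b) {
        double sum = 0.0;
        for (int gy = 0; gy < GRID; gy++) {
            for (int gx = 0; gx < GRID; gx++) {
                sum += Math.abs(grayAt(a, gx, gy) - grayAt(b, gx, gy));
            }
        }
        return sum / (GRID * GRID);
    }

    /** Similarity in [0, 1]; 0 when the cheap tier already rejects the pair. */
    static double similarity(BufferedImage a, BufferedImage b) {
        if (averageColorDistance(a, b) > MAX_AVG_COLOR_DISTANCE) {
            return 0.0; // obvious mismatch, skip the more expensive tier
        }
        return 1.0 - gridDifference(a, b) / 255.0;
    }

    /** Helper for experiments: a frame filled with a single RGB color. */
    static BufferedImage solidFrame(int w, int h, int rgb) {
        BufferedImage img = new BufferedImage(w, h, BufferedImage.TYPE_INT_RGB);
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                img.setRGB(x, y, rgb);
            }
        }
        return img;
    }
}
```

For whole videos, one option would be to call similarity on a sample of frames taken at matching timestamps and aggregate the results, for example by averaging. The grid comparison is itself still a crude measure; it only stands in for whatever heavier analysis (contour detection, face recognition) your use case requires.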
That said, I do not believe that you will be able to get a definite answer here. You will have to experiment and see which approaches work best for your actual use cases.
Source: https://stackoverflow.com/questions/9894361/detecting-similarity-in-two-video-files