Finding an interesting frame in a video

囚心锁ツ 2021-01-31 05:10

Does anyone know of an algorithm that I could use to find an "interesting" representative thumbnail for a video?

I have say 30 bitmaps and I would like to choose the

7 answers
  •  失恋的感觉
    2021-01-31 05:42

    If the video contains structure, i.e. several shots, then the standard techniques for video summarisation involve (a) shot detection, then (b) use the first, mid, or nth frame to represent each shot. See [1].

    However, let us assume you wish to find an interesting frame in a single continuous stream of frames taken from a single camera source, i.e. a shot. This is the "key frame detection" problem that is widely discussed in IR/CV (Information Retrieval, Computer Vision) texts. Some illustrative approaches:

    • In [2] a mean colour histogram is computed over all frames, and the key frame is the one whose histogram is closest to that mean, i.e. we select the best frame in terms of its colour distribution.
    • In [3] camera stillness is assumed to indicate frame importance, as suggested by Beds, above. We detect the still frames using optical flow and use those.
    • In [4] each frame is projected into some high-dimensional content space; we find the frames at the corners of that space and use them to represent the video.
    • In [5] frames are evaluated for importance using their length and novelty in content space.
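As a rough illustration of the stillness idea in [3], here is a minimal sketch. It does not compute optical flow; it swaps in plain frame differencing as a much cruder motion proxy, so treat it as an approximation of the idea rather than the method from the paper. The frame layout (a flat sequence of grayscale intensities) and the function names `mean_abs_difference`, `motion_scores` and `stillest_frame` are assumptions made for this example.

```python
from typing import List, Sequence

# Assumption: each frame is a flat sequence of grayscale intensities (0..255),
# and all frames have the same number of pixels.
GrayFrame = Sequence[int]


def mean_abs_difference(a: GrayFrame, b: GrayFrame) -> float:
    """Average absolute per-pixel difference between two frames."""
    if len(a) != len(b):
        raise ValueError("frames must have the same number of pixels")
    if not a:
        return 0.0
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def motion_scores(frames: Sequence[GrayFrame]) -> List[float]:
    """Score each frame by the mean difference to its temporal neighbours.

    A low score means the frame looks much like the frames around it,
    which is used here as a stand-in for "the camera is still".
    """
    diffs = [
        mean_abs_difference(frames[i], frames[i + 1])
        for i in range(len(frames) - 1)
    ]
    scores = []
    for i in range(len(frames)):
        neighbours = []
        if i > 0:
            neighbours.append(diffs[i - 1])   # difference to previous frame
        if i < len(diffs):
            neighbours.append(diffs[i])       # difference to next frame
        scores.append(sum(neighbours) / len(neighbours) if neighbours else 0.0)
    return scores


def stillest_frame(frames: Sequence[GrayFrame]) -> int:
    """Index of the frame with the lowest motion score."""
    if not frames:
        raise ValueError("need at least one frame")
    scores = motion_scores(frames)
    return min(range(len(frames)), key=scores.__getitem__)
```

Frame differencing reacts to lighting changes and noise as well as motion, so a real implementation would probably want proper optical flow, as [3] does.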

    In general, this is a large field and there are lots of approaches. You can look at the academic conferences such as The International Conference on Image and Video Retrieval (CIVR) for the latest ideas. I find that [6] presents a useful detailed summary of video abstraction (key-frame detection and summarisation).

    For your "find the best of 30 bitmaps" problem I would use an approach like [2]. Compute a representation for each frame (e.g. a colour histogram), compute a mean histogram to represent all frames, and pick the frame whose histogram has the minimum distance to that mean. Choose a distance metric that suits your representation space; I would try Earth Mover's Distance.
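The histogram approach just described could be sketched roughly as follows. This is a minimal illustration, not the method from [2] itself. It assumes each bitmap has already been decoded into a flat sequence of (R, G, B) tuples, and it uses a simple L1 distance between histograms in place of Earth Mover's Distance to keep the example short. The names `colour_histogram`, `mean_histogram`, `l1_distance` and `pick_key_frame` are made up for this sketch.

```python
from typing import List, Sequence, Tuple

# Assumption: a frame is a flat sequence of (R, G, B) pixels, each channel 0..255.
Pixel = Tuple[int, int, int]
Frame = Sequence[Pixel]


def colour_histogram(frame: Frame, bins_per_channel: int = 4) -> List[float]:
    """Normalised joint RGB histogram with bins_per_channel**3 bins."""
    n = bins_per_channel
    hist = [0.0] * (n ** 3)
    for r, g, b in frame:
        # Map each 0..255 channel value to a bin index in 0..n-1.
        idx = ((r * n // 256) * n + (g * n // 256)) * n + (b * n // 256)
        hist[idx] += 1.0
    total = float(len(frame)) or 1.0  # avoid dividing by zero for empty frames
    return [count / total for count in hist]


def mean_histogram(histograms: Sequence[Sequence[float]]) -> List[float]:
    """Bin-wise average of several histograms."""
    count = len(histograms)
    return [sum(column) / count for column in zip(*histograms)]


def l1_distance(p: Sequence[float], q: Sequence[float]) -> float:
    """Sum of absolute bin differences (a simple stand-in for EMD)."""
    return sum(abs(a - b) for a, b in zip(p, q))


def pick_key_frame(frames: Sequence[Frame], bins_per_channel: int = 4) -> int:
    """Index of the frame whose histogram is closest to the mean histogram."""
    if not frames:
        raise ValueError("need at least one frame")
    hists = [colour_histogram(f, bins_per_channel) for f in frames]
    mean = mean_histogram(hists)
    distances = [l1_distance(h, mean) for h in hists]
    return min(range(len(frames)), key=distances.__getitem__)


if __name__ == "__main__":
    red, blue = (255, 0, 0), (0, 0, 255)
    frames = [[red] * 4, [red, red, blue, blue], [blue] * 4]
    print(pick_key_frame(frames))  # index of the most "average" frame
```

With 30 bitmaps this is cheap enough to run on every frame. Swapping `l1_distance` for another metric only requires changing one function.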

    1. M.S. Lew. Principles of Visual Information Retrieval. Springer Verlag, 2001.
    2. B. Gunsel, Y. Fu, and A.M. Tekalp. Hierarchical temporal video segmentation and content characterization. Multimedia Storage and Archiving Systems II, SPIE, 3229:46-55, 1997.
    3. W. Wolf. Key frame selection by motion analysis. In IEEE International Conference on Acoustics, Speech, and Signal Processing, pages 1228-1231, 1996.
    4. L. Zhao, W. Qi, S.Z. Li, S.Q. Yang, and H.J. Zhang. Key-frame extraction and shot retrieval using Nearest Feature Line. In IW-MIR, ACM MM, pages 217-220, 2000.
    5. S. Uchihashi. Video Manga: Generating semantically meaningful video summaries. In Proc. ACM Multimedia 99, Orlando, FL, Nov., pages 383-392, 1999.
    6. Y. Li, T. Zhang, and D. Tretter. An overview of video abstraction techniques. Technical report, HP Laboratory, July 2001.
