Decode video frames on iPhone GPU

天涯浪人 2020-12-30 04:46

I'm looking for the fastest way to decode the frames of a local MPEG-4 video on the iPhone. I'm only interested in the luminance values of the pixels in every 10th frame. I

3 answers
  • 2020-12-30 05:15

    vImage might be appropriate, assuming you can require iOS 5. Processing only every 10th frame seems well within reach of a CPU framework like vImage. However, any kind of true real-time processing is almost certainly going to require OpenGL.
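To give a feel for how little CPU work "luminance of every 10th frame" involves, here is a minimal sketch in plain C. It does not use vImage; it assumes the frames have already been decoded into tightly packed planar YUV 4:2:0 buffers (Y plane first), which is an assumption about the decoder output and not something stated above. The function names `mean_luma` and `sample_every_nth` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Mean of the 8-bit samples in one frame's Y (luminance) plane.
   stride is the number of bytes per row and may exceed width. */
double mean_luma(const uint8_t *y_plane, size_t width, size_t height,
                 size_t stride)
{
    assert(y_plane != NULL && width > 0 && height > 0 && stride >= width);
    uint64_t sum = 0;
    for (size_t row = 0; row < height; ++row) {
        const uint8_t *p = y_plane + row * stride;
        for (size_t col = 0; col < width; ++col)
            sum += p[col];
    }
    return (double)sum / (double)(width * height);
}

/* Hypothetical driver. Assumes `frames` holds frame_count tightly packed
   planar 4:2:0 frames, each laid out as a full-size Y plane followed by
   two quarter-size chroma planes. Visits frames 0, step, 2*step, ... and
   stores each mean luminance in `out`. Returns the number of results. */
size_t sample_every_nth(const uint8_t *frames, size_t frame_count,
                        size_t width, size_t height, size_t step,
                        double *out, size_t out_capacity)
{
    assert(frames != NULL && out != NULL && step > 0);
    assert(width % 2 == 0 && height % 2 == 0);
    size_t frame_bytes = width * height * 3 / 2; /* Y + two chroma planes */
    size_t n = 0;
    for (size_t i = 0; i < frame_count && n < out_capacity; i += step)
        out[n++] = mean_luma(frames + i * frame_bytes, width, height, width);
    return n;
}
```

Only the Y plane is ever touched, so the chroma data costs nothing beyond the decode itself. Calling `sample_every_nth` with `step` set to 10 matches the question's requirement.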

  • 2020-12-30 05:21

    Assuming the bottleneck of your application is the code that converts the video frames to a displayable format (like RGB), you might be interested in some code I shared that converts one .mp4 frame (encoded as YV12) to RGB using Qt and OpenGL. The application uploads the frame to the GPU and activates a GLSL fragment shader that does the YV12-to-RGB conversion, so the result can be displayed in a QImage.

    // GLSL fragment shader for desktop OpenGL (texture-rectangle extension).
    // It reads the luma and both chroma planes from one single-channel texture.
    static const char *p_s_fragment_shader =
        "#extension GL_ARB_texture_rectangle : enable\n"
        "uniform sampler2DRect tex;"
        "uniform float ImgHeight, chromaHeight_Half, chromaWidth;"
        "void main()"
        "{"
        "    vec2 t = gl_TexCoord[0].xy;" // get texcoord from fixed-function pipeline
        "    float CbY = ImgHeight + floor(t.y / 4.0);"
        "    float CrY = ImgHeight + chromaHeight_Half + floor(t.y / 4.0);"
        "    float CbCrX = floor(t.x / 2.0) + chromaWidth * floor(mod(t.y, 2.0));"
        "    float Cb = texture2DRect(tex, vec2(CbCrX, CbY)).x - .5;"
        "    float Cr = texture2DRect(tex, vec2(CbCrX, CrY)).x - .5;"
        "    float y = texture2DRect(tex, t).x;" // redundant texture read optimized away by texture cache
        "    float r = y + 1.28033 * Cr;"
        "    float g = y - .21482 * Cb - .38059 * Cr;"
        "    float b = y + 2.12798 * Cb;"
        "    gl_FragColor = vec4(r, g, b, 1.0);"
        "}";
    
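When porting or debugging a shader like this, a CPU reference for the same arithmetic can be handy for checking individual pixels. The following is a minimal sketch in C that applies the shader's coefficients to one 8-bit Y/Cb/Cr triple. The type `rgb8` and the functions `clamp_to_byte` and `yuv_to_rgb` are hypothetical names introduced here, and the 0..255 to 0..1 normalization is an assumption that mirrors how an 8-bit texture is typically sampled.

```c
#include <assert.h>
#include <stdint.h>

typedef struct { uint8_t r, g, b; } rgb8;

/* Round a value on the 0..255 scale to a byte, saturating at both ends. */
static uint8_t clamp_to_byte(float v)
{
    if (v < 0.0f) return 0;
    if (v > 255.0f) return 255;
    return (uint8_t)(v + 0.5f);
}

/* CPU counterpart of the shader math: samples are normalized to 0..1,
   chroma is centered by subtracting 0.5, then the same coefficients
   as in the shader are applied. */
rgb8 yuv_to_rgb(uint8_t y8, uint8_t cb8, uint8_t cr8)
{
    float y  = y8  / 255.0f;
    float cb = cb8 / 255.0f - 0.5f;
    float cr = cr8 / 255.0f - 0.5f;

    float r = y + 1.28033f * cr;
    float g = y - 0.21482f * cb - 0.38059f * cr;
    float b = y + 2.12798f * cb;

    rgb8 out = {
        clamp_to_byte(r * 255.0f),
        clamp_to_byte(g * 255.0f),
        clamp_to_byte(b * 255.0f)
    };
    return out;
}
```

Because an 8-bit chroma value of 128 normalizes to slightly more than 0.5, neutral chroma produces a very small color offset here, exactly as it would in the shader.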
  • 2020-12-30 05:34

    If you are willing to use an iOS 5-only solution, take a look at the ChromaKey sample app from the WWDC 2011 session on AVCaptureSession.

    That demo captures video from the built-in camera at 30 FPS and passes each frame to OpenGL as a texture. It then uses OpenGL to manipulate the frame, and optionally writes the result to an output video file.

    The code uses some serious low-level magic to bind a Core Video pixel buffer from an AVCaptureSession to OpenGL so that they share memory in the graphics hardware.

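    It should be fairly straightforward to change the code to use a movie file as input rather than the camera. Note that AVCaptureSession itself only handles capture devices, so the file would have to be read with something like AVAssetReader, which also delivers Core Video pixel buffers.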

    You could probably set things up to deliver frames in YUV (Y/CbCr) form rather than RGB, where the Y component is the luminance. Failing that, it would be a pretty simple matter to write a shader that converts each pixel's RGB value to a luminance value.
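The RGB-to-luminance conversion mentioned here is a single weighted sum per pixel. As a rough illustration of the arithmetic such a shader would perform, here is a minimal CPU sketch in C. It assumes Rec. 709 weights (Rec. 601 would use 0.299, 0.587 and 0.114 instead) and a tightly packed 8-bit RGBA buffer. The function names `rgb_to_luma` and `rgba_to_luma_plane` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Luminance of one 8-bit RGB pixel using Rec. 709 weights (an assumption;
   the right weights depend on how the video was encoded). */
uint8_t rgb_to_luma(uint8_t r, uint8_t g, uint8_t b)
{
    float luma = 0.2126f * r + 0.7152f * g + 0.0722f * b;
    return (uint8_t)(luma + 0.5f); /* round to nearest; result stays <= 255 */
}

/* Convert a tightly packed RGBA buffer (4 bytes per pixel) to a plane of
   luminance bytes. The alpha channel is ignored. */
void rgba_to_luma_plane(const uint8_t *rgba, size_t pixel_count,
                        uint8_t *luma_out)
{
    assert(rgba != NULL && luma_out != NULL);
    for (size_t i = 0; i < pixel_count; ++i) {
        const uint8_t *px = rgba + 4 * i;
        luma_out[i] = rgb_to_luma(px[0], px[1], px[2]);
    }
}
```

In a fragment shader the same weighted sum is usually written as a dot product of the RGB vector with the weight vector, which is why the GPU version is so cheap.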

    You should be able to do all this on every frame, not just every 10th frame.
