rgb to yuv420 algorithm efficiency

北荒 2021-01-30 10:05

I wrote an algorithm to convert an RGB image to YUV420. I spent a long time trying to make it faster, but I haven't found any other way to boost its efficiency, so now I turn to

5 Answers
  •  庸人自扰
    2021-01-30 10:22

    I suspect the lookup tables are superfluous: the corresponding multiplications should be faster than a memory access, especially in an inner loop like this one.
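
    To make the comparison concrete, here is a minimal sketch of the two ways to compute the luma term. The table-based variant is only an assumption about what the question's code did (the question text is cut off), and the names `LumaTables`, `luma_lut`, `luma_mul` and `check_luma_variants` are hypothetical. It uses `std::uint8_t` instead of `boost::uint8_t` to stay dependency-free.

    ```cpp
    #include <array>
    #include <cassert>
    #include <cstdint>

    // Hypothetical lookup tables: one precomputed product per channel value.
    struct LumaTables {
      std::array<int, 256> mul_r, mul_g, mul_b;
      LumaTables() {
        for (int v = 0; v < 256; ++v) {
          mul_r[v] = 66 * v;
          mul_g[v] = 129 * v;
          mul_b[v] = 25 * v;
        }
      }
    };

    // Table variant: three loads from memory, two adds, a shift and an offset.
    inline std::uint8_t luma_lut(const LumaTables &t, std::uint8_t r,
                                 std::uint8_t g, std::uint8_t b) {
      return static_cast<std::uint8_t>(((t.mul_r[r] + t.mul_g[g] + t.mul_b[b]) >> 8) + 16);
    }

    // Multiply variant: three integer multiplies replace the three loads.
    inline std::uint8_t luma_mul(std::uint8_t r, std::uint8_t g, std::uint8_t b) {
      return static_cast<std::uint8_t>(((66 * r + 129 * g + 25 * b) >> 8) + 16);
    }

    // Confirms that both variants agree on a coarse grid of colours.
    inline void check_luma_variants() {
      const LumaTables t;
      for (int r = 0; r < 256; r += 5)
        for (int g = 0; g < 256; g += 5)
          for (int b = 0; b < 256; b += 5)
            assert(luma_lut(t, r, g, b) == luma_mul(r, g, b));
    }
    ```

    Both functions return the same byte, so the only open question is speed, and that is best settled by timing both on the target machine rather than by guessing.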

    I would also apply a few small changes (as others have already suggested):

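    #include <cstddef>            // size_t
    #include <boost/cstdint.hpp>  // boost::uint8_t

    // Converts packed 24-bit RGB to planar YUV420p (Y plane, then U, then V).
    // destination must hold width*height*3/2 bytes; width and height must be even.
    void Bitmap2Yuv420p( boost::uint8_t *destination, boost::uint8_t *rgb,
                         const int &width, const int &height ) {
      const size_t image_size = width * height;
      size_t upos = image_size;       // next write index in the U plane
      size_t vpos = upos + upos / 4;  // next write index in the V plane
      for( size_t i = 0; i < image_size; ++i ) {
        boost::uint8_t r = rgb[3*i  ];
        boost::uint8_t g = rgb[3*i+1];
        boost::uint8_t b = rgb[3*i+2];
        destination[i] = ( ( 66*r + 129*g + 25*b ) >> 8 ) + 16;
        // One chroma sample per 2x2 block: even row and even column
        // (i % 2 is the column parity because width is even).
        if (!((i / width) % 2) && !(i % 2)) {
          destination[upos++] = ( ( -38*r + -74*g + 112*b ) >> 8) + 128;
          destination[vpos++] = ( ( 112*r + -94*g + -18*b ) >> 8) + 128;
        }
      }
    }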
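
    The index arithmetic above implies a particular buffer layout: a full-size Y plane followed by two quarter-size chroma planes. A minimal sketch of that layout follows, assuming even dimensions; `Yuv420pLayout` and `yuv420p_layout` are hypothetical names, not part of the code above.

    ```cpp
    #include <cassert>
    #include <cstddef>

    // Byte offsets of the three planes inside one contiguous YUV420p buffer.
    struct Yuv420pLayout {
      std::size_t y_offset;  // full-resolution luma plane
      std::size_t u_offset;  // quarter-size U plane, directly after Y
      std::size_t v_offset;  // quarter-size V plane, directly after U
      std::size_t total;     // bytes the destination buffer must hold
    };

    inline Yuv420pLayout yuv420p_layout(int width, int height) {
      // 2x2 chroma subsampling only divides evenly for even dimensions.
      assert(width > 0 && height > 0 && width % 2 == 0 && height % 2 == 0);
      const std::size_t image_size =
          static_cast<std::size_t>(width) * static_cast<std::size_t>(height);
      Yuv420pLayout layout;
      layout.y_offset = 0;
      layout.u_offset = image_size;                   // same as upos
      layout.v_offset = image_size + image_size / 4;  // same as vpos
      layout.total = image_size + image_size / 2;     // Y + U + V
      return layout;
    }
    ```

    For a 640x480 frame this gives a U offset of 307200, a V offset of 384000 and a total of 460800 bytes, which is the size the caller has to allocate for `destination`.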
    

    EDIT

    You should also rearrange the code so that the if() can be removed. Small, simple inner loops without branches are fast. Here it may be a good idea to write the Y plane first, then the U and V planes, like this:

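    void Bitmap2Yuv420p( boost::uint8_t *destination, boost::uint8_t *rgb,
                         const int &width, const int &height ) {
      const size_t image_size = width * height;
      boost::uint8_t *dst_y = destination;
      boost::uint8_t *dst_u = destination + image_size;
      boost::uint8_t *dst_v = destination + image_size + image_size/4;
    
      // Y plane: one sample per pixel
      for( size_t i = 0; i < image_size; ++i ) {
        *dst_y++ = ( ( 66*rgb[3*i] + 129*rgb[3*i+1] + 25*rgb[3*i+2] ) >> 8 ) + 16;
      }
    #if 1
      // U plane: one sample per 2x2 block, taken from its top-left pixel
      for( size_t y=0; y<height; y+=2 ) {
        for( size_t x=0; x<width; x+=2 ) {
          const size_t i = y*width + x;
          *dst_u++ = ( ( -38*rgb[3*i] + -74*rgb[3*i+1] + 112*rgb[3*i+2] ) >> 8 ) + 128;
        }
      }
      // V plane
      for( size_t y=0; y<height; y+=2 ) {
        for( size_t x=0; x<width; x+=2 ) {
          const size_t i = y*width + x;
          *dst_v++ = ( ( 112*rgb[3*i] + -94*rgb[3*i+1] + -18*rgb[3*i+2] ) >> 8 ) + 128;
        }
      }
    #else // also try this version:
      // U+V planes in a single pass
      for( size_t y=0; y<height; y+=2 ) {
        for( size_t x=0; x<width; x+=2 ) {
          const size_t i = y*width + x;
          *dst_u++ = ( ( -38*rgb[3*i] + -74*rgb[3*i+1] + 112*rgb[3*i+2] ) >> 8 ) + 128;
          *dst_v++ = ( ( 112*rgb[3*i] + -94*rgb[3*i+1] + -18*rgb[3*i+2] ) >> 8 ) + 128;
        }
      }
    #endif
    }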
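
    Before timing the variants against each other, it is worth confirming that they write the same bytes. The following is a minimal sketch of such a check, not the code from this answer: it swaps `boost::uint8_t` for `std::uint8_t`, returns a `std::vector` instead of writing through a pointer, and the names `convert_branchy` and `convert_planar` are hypothetical. It assumes even width and height.

    ```cpp
    #include <cassert>
    #include <cstddef>
    #include <cstdint>
    #include <vector>

    typedef std::uint8_t u8;

    // Fixed-point terms with the same coefficients as above. The >> on a
    // negative sum relies on an arithmetic right shift (guaranteed since
    // C++20, and what mainstream compilers did before that).
    inline u8 to_y(int r, int g, int b) { return static_cast<u8>(((66 * r + 129 * g + 25 * b) >> 8) + 16); }
    inline u8 to_u(int r, int g, int b) { return static_cast<u8>(((-38 * r - 74 * g + 112 * b) >> 8) + 128); }
    inline u8 to_v(int r, int g, int b) { return static_cast<u8>(((112 * r - 94 * g - 18 * b) >> 8) + 128); }

    // Single loop with a per-pixel branch, as in the first version.
    std::vector<u8> convert_branchy(const std::vector<u8> &rgb, int width, int height) {
      assert(width > 0 && height > 0 && width % 2 == 0 && height % 2 == 0);
      const std::size_t w = static_cast<std::size_t>(width);
      const std::size_t image_size = w * static_cast<std::size_t>(height);
      assert(rgb.size() == 3 * image_size);
      std::vector<u8> out(image_size * 3 / 2);
      std::size_t upos = image_size;
      std::size_t vpos = image_size + image_size / 4;
      for (std::size_t i = 0; i < image_size; ++i) {
        const int r = rgb[3 * i], g = rgb[3 * i + 1], b = rgb[3 * i + 2];
        out[i] = to_y(r, g, b);
        if ((i / w) % 2 == 0 && (i % w) % 2 == 0) {  // even row, even column
          out[upos++] = to_u(r, g, b);
          out[vpos++] = to_v(r, g, b);
        }
      }
      return out;
    }

    // Luma pass followed by a branch-free chroma pass, as in the second version.
    std::vector<u8> convert_planar(const std::vector<u8> &rgb, int width, int height) {
      assert(width > 0 && height > 0 && width % 2 == 0 && height % 2 == 0);
      const std::size_t w = static_cast<std::size_t>(width);
      const std::size_t h = static_cast<std::size_t>(height);
      const std::size_t image_size = w * h;
      assert(rgb.size() == 3 * image_size);
      std::vector<u8> out(image_size * 3 / 2);
      for (std::size_t i = 0; i < image_size; ++i)
        out[i] = to_y(rgb[3 * i], rgb[3 * i + 1], rgb[3 * i + 2]);
      u8 *dst_u = out.data() + image_size;
      u8 *dst_v = dst_u + image_size / 4;
      for (std::size_t y = 0; y < h; y += 2) {
        for (std::size_t x = 0; x < w; x += 2) {
          const std::size_t i = y * w + x;  // top-left pixel of the 2x2 block
          *dst_u++ = to_u(rgb[3 * i], rgb[3 * i + 1], rgb[3 * i + 2]);
          *dst_v++ = to_v(rgb[3 * i], rgb[3 * i + 1], rgb[3 * i + 2]);
        }
      }
      return out;
    }
    ```

    Comparing the two returned vectors with `==` on a handful of test images is enough to catch indexing mistakes; once they agree, any difference in a benchmark is down to loop structure alone.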
    
