RGB to YUV420 algorithm efficiency

北荒 2021-01-30 10:05

I wrote an algorithm to convert an RGB image to YUV420. I spent a long time trying to make it faster, but I haven't found any other way to boost its efficiency, so now I turn to

5 answers

庸人自扰 (original poster)

2021-01-30 10:22

I guess the lookup tables are superfluous: the respective multiplications should be faster than a memory access, especially in such an inner loop.
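To make the comparison concrete, here is a minimal sketch of the two ways to compute the luma value. The question's tables are not shown here, so the layout below (one 256-entry table per channel holding the premultiplied coefficient) is an assumption, and the names `LumaTables`, `luma_lookup`, and `luma_multiply` are hypothetical. Both variants return the same value; which one is faster on a given machine should be measured rather than assumed.

```cpp
#include <array>
#include <cstdint>

// Hypothetical table layout: one premultiplied entry per possible channel value.
struct LumaTables {
    std::array<int, 256> r{}, g{}, b{};
    LumaTables() {
        for (int v = 0; v < 256; ++v) {
            r[v] = 66 * v;
            g[v] = 129 * v;
            b[v] = 25 * v;
        }
    }
};

// Table-based luma: three table reads and two additions per pixel.
inline std::uint8_t luma_lookup(const LumaTables &t, std::uint8_t r,
                                std::uint8_t g, std::uint8_t b) {
    return static_cast<std::uint8_t>(((t.r[r] + t.g[g] + t.b[b]) >> 8) + 16);
}

// Direct luma: three integer multiplications, no table reads.
inline std::uint8_t luma_multiply(std::uint8_t r, std::uint8_t g, std::uint8_t b) {
    return static_cast<std::uint8_t>(((66 * r + 129 * g + 25 * b) >> 8) + 16);
}
```

The tables occupy 3 KB as sketched, so they compete with the image data for cache space, while the multiplications work entirely in registers.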

I would also apply a few small changes (as others have already suggested):

void Bitmap2Yuv420p( boost::uint8_t *destination, boost::uint8_t *rgb,
                     const int &width, const int &height ) {
  const size_t image_size = width * height;
  const size_t upos = image_size;
  const size_t vpos = upos + upos / 4;
  for( size_t i = 0; i < image_size; ++i ) {
    boost::uint8_t r = rgb[3*i  ];
    boost::uint8_t g = rgb[3*i+1];
    boost::uint8_t b = rgb[3*i+2];
    destination[i] = ( ( 66*r + 129*g + 25*b ) >> 8 ) + 16;
    if (!((i / width) % 2) && !(i % 2)) {
      destination[upos++] = ( ( -38*r + -74*g + 112*b ) >> 8) + 128;
      destination[vpos++] = ( ( 112*r + -94*g + -18*b ) >> 8) + 128;
    }
  }
}
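The `if` in the loop above keeps one chroma sample per 2x2 block of pixels: the pixel in an even row and an even column. The sketch below lists which pixel indices that selects. It assumes an even `width` and `height` (with an odd width, `i % 2` no longer matches the column parity), and the helper name `chroma_sample_indices` is made up for illustration.

```cpp
#include <cstddef>
#include <vector>

// Returns the pixel indices whose U and V values are kept by the 4:2:0 loop.
// Assumes width and height are even.
std::vector<std::size_t> chroma_sample_indices(std::size_t width, std::size_t height) {
    std::vector<std::size_t> picked;
    for (std::size_t i = 0; i < width * height; ++i) {
        const bool even_row = (i / width) % 2 == 0;
        const bool even_col = (i % width) % 2 == 0;
        if (even_row && even_col) picked.push_back(i);
    }
    return picked;
}
```

For a 4x4 image this selects indices 0, 2, 8, and 10, that is, width*height/4 samples per chroma plane. This is why the destination buffer has to hold width*height*3/2 bytes in total.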

EDIT

You should also rearrange the code so that you can remove the if(). Small, simple inner loops without branches are fast. Here, it may be a good idea to write the Y plane first, then the U and V planes, like this:

void Bitmap2Yuv420p( boost::uint8_t *destination, boost::uint8_t *rgb,
                     const int &width, const int &height ) {
  const size_t image_size = width * height;
  boost::uint8_t *dst_y = destination;
  boost::uint8_t *dst_u = destination + image_size;
  boost::uint8_t *dst_v = destination + image_size + image_size/4;

  // Y plane
  for( size_t i = 0; i < image_size; ++i ) {
    *dst_y++ = ( ( 66*rgb[3*i] + 129*rgb[3*i+1] + 25*rgb[3*i+2] ) >> 8 ) + 16;
  }
#if 1
  // U plane
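  for( size_t y=0; y<height; y+=2 ) {
    for( size_t x=0; x<width; x+=2 ) {
      const size_t i = y*width + x;
      *dst_u++ = ( ( -38*rgb[3*i] + -74*rgb[3*i+1] + 112*rgb[3*i+2] ) >> 8 ) + 128;
    }
  }
  // V plane
  for( size_t y=0; y<height; y+=2 ) {
    for( size_t x=0; x<width; x+=2 ) {
      const size_t i = y*width + x;
      *dst_v++ = ( ( 112*rgb[3*i] + -94*rgb[3*i+1] + -18*rgb[3*i+2] ) >> 8 ) + 128;
    }
  }
#else // also try this version:
  // U+V planes
  for( size_t y=0; y<height; y+=2 ) {
    for( size_t x=0; x<width; x+=2 ) {
      const size_t i = y*width + x;
      *dst_u++ = ( ( -38*rgb[3*i] + -74*rgb[3*i+1] + 112*rgb[3*i+2] ) >> 8 ) + 128;
      *dst_v++ = ( ( 112*rgb[3*i] + -94*rgb[3*i+1] + -18*rgb[3*i+2] ) >> 8 ) + 128;
    }
  }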
#endif
}
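For trying the branch-free layout without Boost, a self-contained sketch using `std::uint8_t` follows. It is an adaptation under assumptions rather than the code above verbatim: the name `rgb_to_yuv420p` is hypothetical, the input is packed 8-bit RGB, `width` and `height` are even, and chroma is taken from the top-left pixel of each 2x2 block without averaging. It also relies on `>>` being an arithmetic shift for negative `int` values, which the standard guarantees only since C++20 but which mainstream compilers do in practice.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Converts packed RGB (3 bytes per pixel) to planar YUV 4:2:0 (Y, then U, then V).
// Assumes width and height are even.
std::vector<std::uint8_t> rgb_to_yuv420p(const std::uint8_t *rgb,
                                         std::size_t width, std::size_t height) {
    const std::size_t image_size = width * height;
    std::vector<std::uint8_t> out(image_size + image_size / 2);
    std::uint8_t *dst_y = out.data();
    std::uint8_t *dst_u = dst_y + image_size;
    std::uint8_t *dst_v = dst_u + image_size / 4;

    // Y plane: one sample per pixel.
    for (std::size_t i = 0; i < image_size; ++i) {
        const int r = rgb[3 * i], g = rgb[3 * i + 1], b = rgb[3 * i + 2];
        *dst_y++ = static_cast<std::uint8_t>(((66 * r + 129 * g + 25 * b) >> 8) + 16);
    }
    // U and V planes: one sample per 2x2 block, no per-pixel branch.
    for (std::size_t y = 0; y < height; y += 2) {
        for (std::size_t x = 0; x < width; x += 2) {
            const std::size_t i = y * width + x;
            const int r = rgb[3 * i], g = rgb[3 * i + 1], b = rgb[3 * i + 2];
            *dst_u++ = static_cast<std::uint8_t>(((-38 * r - 74 * g + 112 * b) >> 8) + 128);
            *dst_v++ = static_cast<std::uint8_t>(((112 * r - 94 * g - 18 * b) >> 8) + 128);
        }
    }
    return out;
}
```

For a 2x2 image whose top-left pixel is pure red (255, 0, 0), the Y plane starts with 81 and the single U and V samples are 90 and 239. Returning a `std::vector` keeps the sketch simple; in a per-frame loop, writing into a caller-provided buffer as in the original signature avoids the allocation.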
