How to normalize pixel values of a UIImage in Swift?


Question


We are attempting to normalize a UIImage so that it can be passed correctly into a Core ML model.

To retrieve the RGB values of each pixel, we first initialize a [CGFloat] array called rawData with one entry per component, so every pixel has positions for red, green, blue, and the alpha value. bitmapInfo describes how the raw pixel values of the original UIImage are laid out, and it is passed as the bitmapInfo parameter of context, a CGContext variable. We later use the context variable to draw a CGImage, and then convert the normalized CGImage back into a UIImage.
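
In other words, the buffer interleaves the four components of every pixel (a sketch of the layout we are assuming):

// rawData holds width * height * 4 entries, four per pixel, interleaved:
// [ R, G, B, A,   R, G, B, A,   R, G, B, A,   ... ]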

Using a nested for loop that iterates over the x and y coordinates, we find the minimum and maximum component values (read from the CGFloat rawData array) across all pixels. A bound variable is set to terminate the loop early; otherwise it would raise an out-of-range error.

range indicates the range of possible RGB values (i.e., the difference between the maximum and minimum color values).

Using the equation to normalize each pixel value:

A = image
curPixel = current pixel component (R, G, B, or alpha)
normalizedPixel = (curPixel - minPixel(A)) / range

A second nested for loop, designed like the one above, then walks through rawData and rescales each pixel's components according to this normalization.
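
For example, applied to a toy array of component values (a minimal sketch; the array and its values are purely illustrative):

import CoreGraphics

let components: [CGFloat] = [12, 200, 255, 0, 87]
let minPixel = components.min() ?? 0
let maxPixel = components.max() ?? 1
let range = maxPixel - minPixel                      // 255 - 0 = 255
let normalized = components.map { ($0 - minPixel) / range }
// normalized is now [0.047..., 0.784..., 1.0, 0.0, 0.341...]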

Most of our code comes from:

  1. UIImage to UIColor array of pixel colors
  2. Change color of certain pixels in a UIImage
  3. https://gist.github.com/pimpapare/e8187d82a3976b851fc12fe4f8965789

We use CGFloat instead of UInt8 because the normalized pixel values should be real numbers between 0 and 1, not just 0 or 1.
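
For example, with minPixel = 7 and maxPixel = 207, a component value of 107 normalizes to

(107 - 7) / (207 - 7) = 100 / 200 = 0.5

which a UInt8 could only truncate to 0.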

func normalize() -> UIImage?{

    let colorSpace = CGColorSpaceCreateDeviceRGB()

    guard let cgImage = cgImage else {
        return nil
    }

    let width = Int(size.width)
    let height = Int(size.height)

    var rawData = [CGFloat](repeating: 0, count: width * height * 4)
    let bytesPerPixel = 4
    let bytesPerRow = bytesPerPixel * width
    let bytesPerComponent = 8

    let bitmapInfo = CGImageAlphaInfo.premultipliedLast.rawValue | CGBitmapInfo.byteOrder32Big.rawValue & CGBitmapInfo.alphaInfoMask.rawValue

    let context = CGContext(data: &rawData,
                            width: width,
                            height: height,
                            bitsPerComponent: bytesPerComponent,
                            bytesPerRow: bytesPerRow,
                            space: colorSpace,
                            bitmapInfo: bitmapInfo)

    let drawingRect = CGRect(origin: .zero, size: CGSize(width: width, height: height))
    context?.draw(cgImage, in: drawingRect)

    let bound = rawData.count

    //find minimum and maximum
    var minPixel: CGFloat = 1.0
    var maxPixel: CGFloat = 0.0

    for x in 0..<width {
        for y in 0..<height {

            let byteIndex = (bytesPerRow * x) + y * bytesPerPixel

            if(byteIndex > bound - 4){
                break
            }
            minPixel = min(CGFloat(rawData[byteIndex]), minPixel)
            minPixel = min(CGFloat(rawData[byteIndex + 1]), minPixel)
            minPixel = min(CGFloat(rawData[byteIndex + 2]), minPixel)

            minPixel = min(CGFloat(rawData[byteIndex + 3]), minPixel)


            maxPixel = max(CGFloat(rawData[byteIndex]), maxPixel)
            maxPixel = max(CGFloat(rawData[byteIndex + 1]), maxPixel)
            maxPixel = max(CGFloat(rawData[byteIndex + 2]), maxPixel)

            maxPixel = max(CGFloat(rawData[byteIndex + 3]), maxPixel)
        }
    }

    let range = maxPixel - minPixel
    print("minPixel: \(minPixel)")
    print("maxPixel : \(maxPixel)")
    print("range: \(range)")

    for x in 0..<width {
        for y in 0..<height {
            let byteIndex = (bytesPerRow * x) + y * bytesPerPixel

            if(byteIndex > bound - 4){
                break
            }
            rawData[byteIndex] = (CGFloat(rawData[byteIndex]) - minPixel) / range
            rawData[byteIndex+1] = (CGFloat(rawData[byteIndex+1]) - minPixel) / range
            rawData[byteIndex+2] = (CGFloat(rawData[byteIndex+2]) - minPixel) / range

            rawData[byteIndex+3] = (CGFloat(rawData[byteIndex+3]) - minPixel) / range

        }
    }

    let cgImage0 = context!.makeImage()
    return UIImage.init(cgImage: cgImage0!)
}

Before normalization, we expect the pixel values to range from 0 to 255, and after normalization, from 0 to 1.

The normalization formula is able to map pixel values to values between 0 and 1. But when we print the pixel values before normalization (by simply adding print statements as we loop through them) to verify that we are reading the raw pixel values correctly, we found that the range of those values is off. For example, one pixel has the value 3.506e+305 (far larger than 255). We think we are reading the raw pixel values wrong from the beginning.

We are not familiar with image processing in Swift, and we are not sure whether the whole normalization process is right. Any help would be appreciated!


Answer 1:


A couple of observations:

  1. Your rawData is a floating-point (CGFloat) array, but your context isn’t populating it with floating-point data; it is filling it with UInt8 data. If you want a floating-point buffer, build a floating-point context with CGBitmapInfo.floatComponents and tweak the context parameters accordingly. (Note that this and the following snippets reference cgImage, scale, and imageOrientation directly, as UIImage instance methods would; see the sketch after this list.) E.g.:

    func normalize() -> UIImage? {
        let colorSpace = CGColorSpaceCreateDeviceRGB()
    
        guard let cgImage = cgImage else {
            return nil
        }
    
        let width = cgImage.width
        let height = cgImage.height
    
        var rawData = [Float](repeating: 0, count: width * height * 4)
        let bytesPerPixel = 16
        let bytesPerRow = bytesPerPixel * width
        let bitsPerComponent = 32
    
        let bitmapInfo = CGImageAlphaInfo.premultipliedLast.rawValue | CGBitmapInfo.floatComponents.rawValue | CGBitmapInfo.byteOrder32Little.rawValue
    
        guard let context = CGContext(data: &rawData,
                                      width: width,
                                      height: height,
                                      bitsPerComponent: bitsPerComponent,
                                      bytesPerRow: bytesPerRow,
                                      space: colorSpace,
                                      bitmapInfo: bitmapInfo) else { return nil }
    
        let drawingRect = CGRect(origin: .zero, size: CGSize(width: width, height: height))
        context.draw(cgImage, in: drawingRect)
    
        var maxValue: Float = 0
        var minValue: Float = 1
    
        for pixel in 0 ..< width * height {
            let baseOffset = pixel * 4
            for offset in baseOffset ..< baseOffset + 3 {
                let value = rawData[offset]
                if value > maxValue { maxValue = value }
                if value < minValue { minValue = value }
            }
        }
        let range = maxValue - minValue
        guard range > 0 else { return nil }
    
        for pixel in 0 ..< width * height {
            let baseOffset = pixel * 4
            for offset in baseOffset ..< baseOffset + 3 {
                rawData[offset] = (rawData[offset] - minValue) / range
            }
        }
    
        return context.makeImage().map { UIImage(cgImage: $0, scale: scale, orientation: imageOrientation) }
    }
    
  2. But this begs the question of why you’d bother with floating-point data at all. If you were returning this floating-point data back to your ML model, then I could imagine it might be useful, but here you’re just creating a new image. Given that, you also have the opportunity to just retrieve the UInt8 data, do the floating-point math, and then update the UInt8 buffer and create the image from that. Thus:

    func normalize() -> UIImage? {
        let colorSpace = CGColorSpaceCreateDeviceRGB()
    
        guard let cgImage = cgImage else {
            return nil
        }
    
        let width = cgImage.width
        let height = cgImage.height
    
        var rawData = [UInt8](repeating: 0, count: width * height * 4)
        let bytesPerPixel = 4
        let bytesPerRow = bytesPerPixel * width
        let bitsPerComponent = 8
    
        let bitmapInfo = CGImageAlphaInfo.premultipliedLast.rawValue
    
        guard let context = CGContext(data: &rawData,
                                      width: width,
                                      height: height,
                                      bitsPerComponent: bitsPerComponent,
                                      bytesPerRow: bytesPerRow,
                                      space: colorSpace,
                                      bitmapInfo: bitmapInfo) else { return nil }
    
        let drawingRect = CGRect(origin: .zero, size: CGSize(width: width, height: height))
        context.draw(cgImage, in: drawingRect)
    
        var maxValue: UInt8 = 0
        var minValue: UInt8 = 255
    
        for pixel in 0 ..< width * height {
            let baseOffset = pixel * 4
            for offset in baseOffset ..< baseOffset + 3 {
                let value = rawData[offset]
                if value > maxValue { maxValue = value }
                if value < minValue { minValue = value }
            }
        }
        let range = Float(maxValue - minValue)
        guard range > 0 else { return nil }
    
        for pixel in 0 ..< width * height {
            let baseOffset = pixel * 4
            for offset in baseOffset ..< baseOffset + 3 {
                rawData[offset] = UInt8(Float(rawData[offset] - minValue) / range * 255)
            }
        }
    
        return context.makeImage().map { UIImage(cgImage: $0, scale: scale, orientation: imageOrientation) }
    }
    

    It just depends upon whether you really need this floating-point buffer for your ML model (in which case you might return the array of floats in the first example, rather than creating a new image) or whether the goal was just to create the normalized UIImage.

    I benchmarked this, and it was a tad faster on my iPhone XS Max than the floating-point rendition, and it takes a quarter of the memory (each pixel is 4 bytes as UInt8 versus 16 bytes as four Floats, so a 2000×2000 px image takes 16 MB with UInt8 but 64 MB with Float).

  3. Finally, I should mention that vImage has a highly optimized function, vImageContrastStretch_ARGB8888, that does something very similar to what we’ve done above. Just import Accelerate and then you can do something like:

    func normalize3() -> UIImage? {
        let colorSpace = CGColorSpaceCreateDeviceRGB()
    
        guard let cgImage = cgImage else { return nil }
    
        var format = vImage_CGImageFormat(bitsPerComponent: UInt32(cgImage.bitsPerComponent),
                                          bitsPerPixel: UInt32(cgImage.bitsPerPixel),
                                          colorSpace: Unmanaged.passRetained(colorSpace),
                                          bitmapInfo: cgImage.bitmapInfo,
                                          version: 0,
                                          decode: nil,
                                          renderingIntent: cgImage.renderingIntent)
    
        var source = vImage_Buffer()
        var result = vImageBuffer_InitWithCGImage(
            &source,
            &format,
            nil,
            cgImage,
            vImage_Flags(kvImageNoFlags))
    
        guard result == kvImageNoError else { return nil }
    
        defer { free(source.data) }
    
        var destination = vImage_Buffer()
        result = vImageBuffer_Init(
            &destination,
            vImagePixelCount(cgImage.height),
            vImagePixelCount(cgImage.width),
            32,
            vImage_Flags(kvImageNoFlags))
    
        guard result == kvImageNoError else { return nil }
    
        result = vImageContrastStretch_ARGB8888(&source, &destination, vImage_Flags(kvImageNoFlags))
        guard result == kvImageNoError else { return nil }
    
        defer { free(destination.data) }
    
        return vImageCreateCGImageFromBuffer(&destination, &format, nil, nil, vImage_Flags(kvImageNoFlags), nil).map {
            UIImage(cgImage: $0.takeRetainedValue(), scale: scale, orientation: imageOrientation)
        }
    }
    

    While this employs a slightly different algorithm, it’s worth considering because, in my benchmarking on my iPhone XS Max, it was over five times as fast as the floating-point rendition.
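
One structural note: all three normalize methods above reference cgImage, scale, and imageOrientation without a receiver, so they are written as UIImage instance methods. Here is a minimal sketch of the enclosing extension and a call site (the extension declaration and the usage line are assumptions; only the method bodies come from the answer):

    import UIKit

    extension UIImage {
        // Paste any of the three normalize() implementations from above
        // here; `cgImage`, `scale`, and `imageOrientation` then resolve to
        // this image instance's own properties.
    }

    // Hypothetical call site:
    // let normalized = UIImage(named: "photo")?.normalize()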


A few unrelated observations:

  1. Your code snippet is normalizing the alpha channel, too. I’m not sure you’d want to do that. Usually the colors and the alpha channel are independent. Above, I assume you really wanted to normalize just the color channels. If you want to normalize the alpha channel, too, you might track a separate min-max range for the alpha channel and process it separately (a sketch of this follows the list). But it doesn’t make much sense to normalize the alpha channel with the same range of values as the color channels (or vice versa).

  2. Rather than using the UIImage width and height, I’m using the values from the CGImage. This is an important distinction in case your images have a scale other than 1.

  3. You might want to consider an early exit if, for example, the range is already 0-255 (i.e., no normalization is needed).
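
Regarding the first and third observations, here is a minimal sketch of normalizing the color and alpha channels with independent ranges, including the early exits (this assumes the UInt8 RGBA buffer from the second example; the function name is hypothetical):

    func normalizeChannelsSeparately(_ rawData: inout [UInt8], pixelCount: Int) {
        var minColor: UInt8 = 255, maxColor: UInt8 = 0
        var minAlpha: UInt8 = 255, maxAlpha: UInt8 = 0

        // Scan color components and alpha with independent min-max ranges.
        for pixel in 0 ..< pixelCount {
            let base = pixel * 4
            for offset in base ..< base + 3 {              // R, G, B
                minColor = min(minColor, rawData[offset])
                maxColor = max(maxColor, rawData[offset])
            }
            minAlpha = min(minAlpha, rawData[base + 3])    // A
            maxAlpha = max(maxAlpha, rawData[base + 3])
        }

        let colorRange = Float(maxColor) - Float(minColor)
        let alphaRange = Float(maxAlpha) - Float(minAlpha)

        // Early exits: skip a channel group that is constant or that
        // already spans the full 0...255 range.
        let stretchColor = colorRange > 0 && (minColor > 0 || maxColor < 255)
        let stretchAlpha = alphaRange > 0 && (minAlpha > 0 || maxAlpha < 255)

        for pixel in 0 ..< pixelCount {
            let base = pixel * 4
            if stretchColor {
                for offset in base ..< base + 3 {
                    rawData[offset] = UInt8(Float(rawData[offset] - minColor) / colorRange * 255)
                }
            }
            if stretchAlpha {
                rawData[base + 3] = UInt8(Float(rawData[base + 3] - minAlpha) / alphaRange * 255)
            }
        }
    }

In the second example, this would replace the two scan/update loops, called as normalizeChannelsSeparately(&rawData, pixelCount: width * height) before makeImage().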



Source: https://stackoverflow.com/questions/55433107/how-to-normalize-pixel-values-of-an-uiimage-in-swift
