Question
We are attempting to normalize a UIImage so that it can be passed correctly into a CoreML model.
The way we retrieve the RGB values from each pixel is by first initializing a [CGFloat] array called rawData, with one position per pixel for each of the red, green, blue, and alpha values. In bitmapInfo, we describe how the raw pixel values of the original UIImage are laid out; this is used to fill the bitmapInfo parameter of context, a CGContext variable. We later use the context variable to draw a CGImage, which we then convert back into a (normalized) UIImage.
Using a nested for-loop that iterates through the x and y coordinates, we find the minimum and maximum color values (read from the rawData array of CGFloat values) across all pixels and all channels.
A bound variable is set to terminate the for-loop early; otherwise it would produce an out-of-range error.
range indicates the span of possible RGB values (i.e., the difference between the maximum color value and the minimum).
Each pixel value is then normalized using the equation

NormalizedPixel = (curPixel - minPixel(A)) / range

where A is the image and curPixel is the current pixel component (R, G, B, or alpha). A second nested for-loop, designed like the one above, walks through the rawData array and modifies each pixel's components according to this normalization (a standalone sketch of the formula follows).
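As a standalone sketch of this formula (our own illustration with made-up component values, separate from the full method below):

import CoreGraphics

// Min-max normalization of one pixel's components, using the equation above.
// The sample values are invented; in the real code they come from rawData.
let components: [CGFloat] = [10, 128, 200, 255]   // R, G, B, A
let minPixel = components.min()!                  // 10
let maxPixel = components.max()!                  // 255
let range = maxPixel - minPixel                   // 245
let normalized = components.map { ($0 - minPixel) / range }
// normalized ≈ [0.0, 0.482, 0.776, 1.0]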
Most of our code is from:
- UIImage to UIColor array of pixel colors
- Change color of certain pixels in a UIImage
- https://gist.github.com/pimpapare/e8187d82a3976b851fc12fe4f8965789
We use CGFloat instead of UInt8 because the normalized pixel values should be real numbers between 0 and 1, not just the integers 0 or 1.
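To make the concern concrete (our own illustration, not from the referenced sources): converting a normalized fraction back into an integer type throws the fraction away.

// Integer storage cannot represent a normalized fraction; UInt8.init truncates.
let value: UInt8 = 128
let minVal: UInt8 = 10
let maxVal: UInt8 = 255
let fraction = (Double(value) - Double(minVal)) / Double(maxVal - minVal)
// fraction ≈ 0.482, representable as a floating point number
let truncated = UInt8(fraction)
// truncated == 0: every value strictly between 0 and 1 collapses to 0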
func normalize() -> UIImage? {
    let colorSpace = CGColorSpaceCreateDeviceRGB()

    guard let cgImage = cgImage else {
        return nil
    }

    let width = Int(size.width)
    let height = Int(size.height)

    var rawData = [CGFloat](repeating: 0, count: width * height * 4)
    let bytesPerPixel = 4
    let bytesPerRow = bytesPerPixel * width
    let bytesPerComponent = 8

    let bitmapInfo = CGImageAlphaInfo.premultipliedLast.rawValue | CGBitmapInfo.byteOrder32Big.rawValue & CGBitmapInfo.alphaInfoMask.rawValue

    let context = CGContext(data: &rawData,
                            width: width,
                            height: height,
                            bitsPerComponent: bytesPerComponent,
                            bytesPerRow: bytesPerRow,
                            space: colorSpace,
                            bitmapInfo: bitmapInfo)

    let drawingRect = CGRect(origin: .zero, size: CGSize(width: width, height: height))
    context?.draw(cgImage, in: drawingRect)

    let bound = rawData.count

    // find minimum and maximum
    var minPixel: CGFloat = 1.0
    var maxPixel: CGFloat = 0.0

    for x in 0..<width {
        for y in 0..<height {
            let byteIndex = (bytesPerRow * x) + y * bytesPerPixel
            if byteIndex > bound - 4 {
                break
            }
            minPixel = min(CGFloat(rawData[byteIndex]), minPixel)
            minPixel = min(CGFloat(rawData[byteIndex + 1]), minPixel)
            minPixel = min(CGFloat(rawData[byteIndex + 2]), minPixel)
            minPixel = min(CGFloat(rawData[byteIndex + 3]), minPixel)

            maxPixel = max(CGFloat(rawData[byteIndex]), maxPixel)
            maxPixel = max(CGFloat(rawData[byteIndex + 1]), maxPixel)
            maxPixel = max(CGFloat(rawData[byteIndex + 2]), maxPixel)
            maxPixel = max(CGFloat(rawData[byteIndex + 3]), maxPixel)
        }
    }

    let range = maxPixel - minPixel
    print("minPixel: \(minPixel)")
    print("maxPixel : \(maxPixel)")
    print("range: \(range)")

    for x in 0..<width {
        for y in 0..<height {
            let byteIndex = (bytesPerRow * x) + y * bytesPerPixel
            if byteIndex > bound - 4 {
                break
            }
            rawData[byteIndex] = (CGFloat(rawData[byteIndex]) - minPixel) / range
            rawData[byteIndex + 1] = (CGFloat(rawData[byteIndex + 1]) - minPixel) / range
            rawData[byteIndex + 2] = (CGFloat(rawData[byteIndex + 2]) - minPixel) / range
            rawData[byteIndex + 3] = (CGFloat(rawData[byteIndex + 3]) - minPixel) / range
        }
    }

    let cgImage0 = context!.makeImage()
    return UIImage.init(cgImage: cgImage0!)
}
Before normalization, we expect the pixel values to range from 0 to 255; after normalization, they should range from 0 to 1.
The normalization formula does map pixel values to values between 0 and 1. But when we print the pixel values before normalization (simply adding print statements as we loop through them) to verify that we are reading the raw pixel values correctly, we found that the range of those values is off. For example, one pixel has the value 3.506e+305 (far larger than 255). We think we are reading the raw pixel values incorrectly from the start.
We are not familiar with image processing in Swift, and we are not sure whether the whole normalization process is right. Any help would be appreciated!
Answer 1:
A couple of observations:
Your rawData is a floating point (CGFloat) array, but your context isn't populating it with floating point data; it is filling it with UInt8 data. If you want a floating point buffer, build a floating point context with CGBitmapInfo.floatComponents and tweak the context parameters accordingly (bytesPerPixel becomes 16, because each pixel is four Float components of four bytes each, and bitsPerComponent becomes 32). E.g.:

func normalize() -> UIImage? {
    let colorSpace = CGColorSpaceCreateDeviceRGB()

    guard let cgImage = cgImage else {
        return nil
    }

    let width = cgImage.width
    let height = cgImage.height

    var rawData = [Float](repeating: 0, count: width * height * 4)
    let bytesPerPixel = 16
    let bytesPerRow = bytesPerPixel * width
    let bitsPerComponent = 32

    let bitmapInfo = CGImageAlphaInfo.premultipliedLast.rawValue | CGBitmapInfo.floatComponents.rawValue | CGBitmapInfo.byteOrder32Little.rawValue

    guard let context = CGContext(data: &rawData,
                                  width: width,
                                  height: height,
                                  bitsPerComponent: bitsPerComponent,
                                  bytesPerRow: bytesPerRow,
                                  space: colorSpace,
                                  bitmapInfo: bitmapInfo) else { return nil }

    let drawingRect = CGRect(origin: .zero, size: CGSize(width: width, height: height))
    context.draw(cgImage, in: drawingRect)

    var maxValue: Float = 0
    var minValue: Float = 1

    for pixel in 0 ..< width * height {
        let baseOffset = pixel * 4
        for offset in baseOffset ..< baseOffset + 3 {
            let value = rawData[offset]
            if value > maxValue { maxValue = value }
            if value < minValue { minValue = value }
        }
    }

    let range = maxValue - minValue
    guard range > 0 else { return nil }

    for pixel in 0 ..< width * height {
        let baseOffset = pixel * 4
        for offset in baseOffset ..< baseOffset + 3 {
            rawData[offset] = (rawData[offset] - minValue) / range
        }
    }

    return context.makeImage().map { UIImage(cgImage: $0, scale: scale, orientation: imageOrientation) }
}
But this begs the question of why you'd bother with floating point data. If you were returning this floating point data back to your ML model, then I can imagine it might be useful, but you're just creating a new image. Because of that, you also have the opportunity to just retrieve the UInt8 data, do the floating point math, then update the UInt8 buffer, and create the image from that. Thus:

func normalize() -> UIImage? {
    let colorSpace = CGColorSpaceCreateDeviceRGB()

    guard let cgImage = cgImage else {
        return nil
    }

    let width = cgImage.width
    let height = cgImage.height

    var rawData = [UInt8](repeating: 0, count: width * height * 4)
    let bytesPerPixel = 4
    let bytesPerRow = bytesPerPixel * width
    let bitsPerComponent = 8

    let bitmapInfo = CGImageAlphaInfo.premultipliedLast.rawValue

    guard let context = CGContext(data: &rawData,
                                  width: width,
                                  height: height,
                                  bitsPerComponent: bitsPerComponent,
                                  bytesPerRow: bytesPerRow,
                                  space: colorSpace,
                                  bitmapInfo: bitmapInfo) else { return nil }

    let drawingRect = CGRect(origin: .zero, size: CGSize(width: width, height: height))
    context.draw(cgImage, in: drawingRect)

    var maxValue: UInt8 = 0
    var minValue: UInt8 = 255

    for pixel in 0 ..< width * height {
        let baseOffset = pixel * 4
        for offset in baseOffset ..< baseOffset + 3 {
            let value = rawData[offset]
            if value > maxValue { maxValue = value }
            if value < minValue { minValue = value }
        }
    }

    let range = Float(maxValue - minValue)
    guard range > 0 else { return nil }

    for pixel in 0 ..< width * height {
        let baseOffset = pixel * 4
        for offset in baseOffset ..< baseOffset + 3 {
            rawData[offset] = UInt8(Float(rawData[offset] - minValue) / range * 255)
        }
    }

    return context.makeImage().map { UIImage(cgImage: $0, scale: scale, orientation: imageOrientation) }
}
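Note that these renditions are written as instance methods, so they assume they live in a UIImage extension; that is why cgImage, scale, and imageOrientation are referenced without a receiver. A hypothetical call site (the asset name is a placeholder, not from the original question):

import UIKit

extension UIImage {
    // normalize() as defined above goes here
}

// "photo" is a made-up asset name for illustration.
if let image = UIImage(named: "photo"),
   let normalized = image.normalize() {
    // feed `normalized` to the CoreML model, display it, etc.
}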
It just depends upon whether you really needed this floating point buffer for your ML model (in which case, you might return the array of floats in the first example, rather than creating a new image) or whether the goal was just to create the normalized UIImage.

I benchmarked this, and the UInt8 rendition was a tad faster on an iPhone XS Max than the floating point rendition, and it takes a quarter of the memory (e.g. a 2000×2000px image takes 16 MB with UInt8, at 4 bytes per pixel, but 64 MB with Float, at 16 bytes per pixel).
Finally, I should mention that vImage has a highly optimized function, vImageContrastStretch_ARGB8888, that does something very similar to what we've done above. Just import Accelerate and then you can do something like:

func normalize3() -> UIImage? {
    let colorSpace = CGColorSpaceCreateDeviceRGB()

    guard let cgImage = cgImage else { return nil }

    var format = vImage_CGImageFormat(bitsPerComponent: UInt32(cgImage.bitsPerComponent),
                                      bitsPerPixel: UInt32(cgImage.bitsPerPixel),
                                      colorSpace: Unmanaged.passRetained(colorSpace),
                                      bitmapInfo: cgImage.bitmapInfo,
                                      version: 0,
                                      decode: nil,
                                      renderingIntent: cgImage.renderingIntent)

    var source = vImage_Buffer()
    var result = vImageBuffer_InitWithCGImage(&source,
                                              &format,
                                              nil,
                                              cgImage,
                                              vImage_Flags(kvImageNoFlags))

    guard result == kvImageNoError else { return nil }

    defer { free(source.data) }

    var destination = vImage_Buffer()
    result = vImageBuffer_Init(&destination,
                               vImagePixelCount(cgImage.height),
                               vImagePixelCount(cgImage.width),
                               32,
                               vImage_Flags(kvImageNoFlags))

    guard result == kvImageNoError else { return nil }

    result = vImageContrastStretch_ARGB8888(&source, &destination, vImage_Flags(kvImageNoFlags))
    guard result == kvImageNoError else { return nil }

    defer { free(destination.data) }

    return vImageCreateCGImageFromBuffer(&destination, &format, nil, nil, vImage_Flags(kvImageNoFlags), nil).map {
        UIImage(cgImage: $0.takeRetainedValue(), scale: scale, orientation: imageOrientation)
    }
}
While this employs a slightly different algorithm, it’s worth considering, because in my benchmarking, on my iPhone XS Max it was over 5 times as fast as the floating point rendition.
A few unrelated observations:
Your code snippet is normalizing the alpha channel, too. I'm not sure you'd want to do that. Usually the color and alpha channels are independent. Above, I assume you really wanted to normalize just the color channels. If you want to normalize the alpha channel too, you might track a separate min-max range for the alpha channel and process it separately, as sketched below. It doesn't make much sense to normalize the alpha channel with the same range of values as the color channels (or vice versa).
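A minimal sketch of that idea, adapting the scanning loop from the UInt8 rendition above (the helper function and its name are ours, purely illustrative):

// Track the color range and the alpha range separately in an RGBA8888 buffer.
func colorAndAlphaRanges(of rawData: [UInt8], width: Int, height: Int)
    -> (color: ClosedRange<UInt8>, alpha: ClosedRange<UInt8>) {
    var minColor: UInt8 = 255, maxColor: UInt8 = 0
    var minAlpha: UInt8 = 255, maxAlpha: UInt8 = 0
    for pixel in 0 ..< width * height {
        let baseOffset = pixel * 4
        for offset in baseOffset ..< baseOffset + 3 {   // R, G, B only
            minColor = min(minColor, rawData[offset])
            maxColor = max(maxColor, rawData[offset])
        }
        let alpha = rawData[baseOffset + 3]             // alpha gets its own range
        minAlpha = min(minAlpha, alpha)
        maxAlpha = max(maxAlpha, alpha)
    }
    return (minColor...maxColor, minAlpha...maxAlpha)
}

Each channel family can then be normalized against its own range.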
Rather than using the UIImage width and height, I'm using the values from the CGImage. This is an important distinction in case your images might not have a scale of 1.

You might want to consider an early exit if, for example, the range was already 0-255 (i.e. no normalization was needed); see the sketch below.
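A minimal sketch of that early exit, placed right after the scanning loop in the UInt8 rendition (returning the image unchanged is one reasonable choice):

// If the components already span the full 0...255 range, the normalization
// would be an identity transform, so skip the per-pixel rewrite.
if minValue == 0 && maxValue == 255 {
    return self
}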
Source: https://stackoverflow.com/questions/55433107/how-to-normalize-pixel-values-of-an-uiimage-in-swift