MLKit Text detection on iOS working for photos taken from Assets.xcassets, but not the same photo taken on camera/uploaded from camera roll

鱼传尺愫 2021-01-22 04:00

I'm using Google's Text detection API from MLKit to detect text from images. It seems to work perfectly on screenshots but when I try to use it on images taken in the app (usi

1 Answer
    小蘑菇 (original poster)
    2021-01-22 04:35

    If your image picker is working fine, the problem may be the image orientation. As a quick test, capture several images in different orientations and see whether detection works for any of them.

    My problem was that text recognition worked on images picked from the gallery but not on images taken with the camera. That was an orientation issue.
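One way to check the orientation hypothesis is to log the picked image's `imageOrientation` before handing it to the detector. The following is a minimal sketch, not part of the original answer; the helper name `describeOrientation` is hypothetical.

```swift
import UIKit

/// Hypothetical helper: returns a readable name for a UIImage orientation,
/// so it can be logged while debugging.
func describeOrientation(_ orientation: UIImage.Orientation) -> String {
    switch orientation {
    case .up: return "up"
    case .down: return "down"
    case .left: return "left"
    case .right: return "right"
    case .upMirrored: return "upMirrored"
    case .downMirrored: return "downMirrored"
    case .leftMirrored: return "leftMirrored"
    case .rightMirrored: return "rightMirrored"
    @unknown default: return "unknown"
    }
}
```

Calling `print(describeOrientation(pickedImage.imageOrientation))` in the picker callback shows what the detector receives. Screenshots and asset-catalog images usually report `up`, while portrait photos from the back camera typically report `right`, which would be consistent with the behavior described in the question.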

    Solution 1

    Before converting to a vision image, fix the image orientation as follows.

    let fixedImage = pickedImage.fixImageOrientation()
    

    Add this extension.

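    extension UIImage {
        /// Redraws the image into a new bitmap context. Drawing applies the
        /// image's `imageOrientation`, so the result is stored upright.
        func fixImageOrientation() -> UIImage {
            UIGraphicsBeginImageContext(self.size)
            self.draw(at: .zero)
            let fixedImage = UIGraphicsGetImageFromCurrentImageContext()
            UIGraphicsEndImageContext()
            // Fall back to the original image if the context produced nothing.
            return fixedImage ?? self
        }
    }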
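Redrawing works because `draw` applies the image's `imageOrientation` while rendering, so the new bitmap is upright and reports `.up`. As a hedged alternative sketch (not from the original answer), the same idea can be written with `UIGraphicsImageRenderer` (iOS 10+), which makes it easy to keep the image's scale and to skip the redraw when it is not needed. The method name `normalizedOrientation` is hypothetical.

```swift
import UIKit

extension UIImage {
    /// Hypothetical alternative to `fixImageOrientation()`: returns a copy
    /// whose pixel data is upright and whose `imageOrientation` is `.up`.
    func normalizedOrientation() -> UIImage {
        // Nothing to do when the bitmap is already stored upright.
        guard imageOrientation != .up else { return self }

        let format = UIGraphicsImageRendererFormat()
        format.scale = scale  // keep the original pixel density

        let renderer = UIGraphicsImageRenderer(size: size, format: format)
        return renderer.image { _ in
            // Drawing honors `imageOrientation`, which rotates the pixels.
            draw(in: CGRect(origin: .zero, size: size))
        }
    }
}
```

It would be used the same way as the extension above, for example `let fixedImage = pickedImage.normalizedOrientation()`.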
    

    Solution 2

    The Firebase documentation provides a function that maps every device orientation and camera position to the matching image orientation.

    func imageOrientation(
        deviceOrientation: UIDeviceOrientation,
        cameraPosition: AVCaptureDevice.Position
    ) -> VisionDetectorImageOrientation {
        switch deviceOrientation {
        case .portrait:
            return cameraPosition == .front ? .leftTop : .rightTop
        case .landscapeLeft:
            return cameraPosition == .front ? .bottomLeft : .topLeft
        case .portraitUpsideDown:
            return cameraPosition == .front ? .rightBottom : .leftBottom
        case .landscapeRight:
            return cameraPosition == .front ? .topRight : .bottomRight
        case .faceDown, .faceUp, .unknown:
            return .leftTop
        @unknown default:
            // Orientation cases added in future SDKs get the same fallback.
            return .leftTop
        }
    }
    

    Create the metadata:

    let cameraPosition = AVCaptureDevice.Position.back  // Set to the capture device you used.
    let metadata = VisionImageMetadata()
    metadata.orientation = imageOrientation(
        deviceOrientation: UIDevice.current.orientation,
        cameraPosition: cameraPosition
    )
    

    Attach the metadata to the vision image.

    let image = VisionImage(buffer: sampleBuffer)
    image.metadata = metadata
    
