How can I use the Keras OCR example?

前端 未结 4 2100
囚心锁ツ
囚心锁ツ 2020-12-25 13:09

I found examples/image_ocr.py which seems to for OCR. Hence it should be possible to give the model an image and receive text. However, I have no idea how to do so. How do I

4条回答
  •  隐瞒了意图╮
    2020-12-25 13:48

    Now I have a model.h5. What's next?

    First I should comment that the model.h5 contains the weights of your network, if you wish to save the architecture of your network as well you should save it as a json like this example:

    model_json = model_json = model.to_json()
    with open("model_arch.json", "w") as json_file:
        json_file.write(model_json)
    

    Now, once you have your model and its weights you can load them on demand by doing the following:

    json_file = open('model_arch.json', 'r')
    loaded_model_json = json_file.read()
    json_file.close()
    loaded_model = model_from_json(loaded_model_json)
    # load weights into new model
    # if you already have a loaded model and dont need to save start from here
    loaded_model.load_weights("model.h5")    
    # compile loaded model with certain specifications
    sgd = SGD(lr=0.01)
    loaded_model.compile(loss="binary_crossentropy", optimizer=sgd, metrics=["accuracy"])
    

    Then, with that loaded_module you can proceed to predict the classification of certain input like this:

    prediction = loaded_model.predict(some_input, batch_size=20, verbose=0)
    

    Which will return the classification of that input.

    About the Side Questions:

    1. CTC seems to be a term they are defining in the paper you refered, extracting from it says:

    In what follows, we refer to the task of labelling un- segmented data sequences as temporal classification (Kadous, 2002), and to our use of RNNs for this pur- pose as connectionist temporal classification (CTC).

    1. To compensate the rotation of a document, images, or similar you could either generate more data from your current one by applying such transformations (take a look at this blog post that explains a way to do that ), or you could use a Convolutional Neural Network approach, which also is actually what that Keras example you are using does, as we can see from that git:

    This example uses a convolutional stack followed by a recurrent stack and a CTC logloss function to perform optical character recognition of generated text images.

    You can check this tutorial that is related to what you are doing and where they also explain more about Convolutional Neural Networks.

    1. Well this one is a broad question but to detect lines you could use the Hough Line Transform, or also Canny Edge Detection could be good options.

    Edit: The error you are getting is because it is expected more parameters instead of 1, from the keras docs we can see:

    predict(self, x, batch_size=32, verbose=0)
    

    Raises ValueError: In case of mismatch between the provided input data and the model's expectations, or in case a stateful model receives a number of samples that is not a multiple of the batch size.

提交回复
热议问题