I'm trying to train tesseract to recognize numbers from real images of gas meters.
The images that I use for training are made with a camera, for this reason there are
I would try this simple ImageMagick command first:
convert \
original.jpg \
-threshold 50% \
result.jpg
(Play a bit with the 50% parameter -- try both smaller and larger values.)
Thresholding leaves only two possible values for each color channel: zero or the maximum. Values below the threshold are set to 0, and values above it are set to 255 (or 65535 when working at 16-bit depth).
Depending on your original.jpg, the result may be a very high-contrast image that is ready for OCR.
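To compare several threshold values side by side, a small loop around the same command may help. This is only a sketch: the function name threshold_sweep and the result_<pct>.jpg output names are made up for illustration, and it assumes `convert` is on your PATH as in the command above.

```shell
# Hypothetical helper (not part of ImageMagick): write one thresholded
# copy of the input image per percentage given on the command line.
threshold_sweep() {
    input=$1
    shift
    for pct in "$@"; do
        # The output name result_<pct>.jpg is an assumption for illustration.
        convert "$input" -threshold "${pct}%" "result_${pct}.jpg"
    done
}

# Example call (uncomment once original.jpg exists in the current directory):
# threshold_sweep original.jpg 30 40 50 60 70
```

You can then look through the result_*.jpg files and keep whichever threshold leaves the digits cleanest before passing that image to tesseract.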