How to input an image to a neural network?

Submitted by 99封情书 on 2019-11-28 16:42:05

The easiest solution would be to normalize all of your images, both for training and testing, to have the same resolution. Also the character in each image should be about the same size. It is also a good idea to use greyscale images, so each pixel would give you just one number. Then you could use each pixel value as one input to your network. For instance, if you have images of size 16x16 pixels, your network would have 16*16 = 256 input neurons. The first neuron would see the value of the pixel at (0,0), the second at (0,1), and so on. Basically you put the image values into one vector and feed this vector into the network. This should already work.
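For concreteness, here is a minimal sketch of that flattening step in Python (assuming Pillow and NumPy are available; the 16x16 size and the file name are just examples, not anything prescribed above):

```python
import numpy as np
from PIL import Image

# Load an image, convert it to greyscale ("L" mode), and resize it to a
# fixed 16x16 resolution so every image yields the same number of inputs.
img = Image.open("character.png").convert("L").resize((16, 16))

# Flatten the 16x16 pixel grid into a 256-element vector scaled to [0, 1]:
# element 0 is pixel (0, 0), element 1 is pixel (0, 1), and so on.
x = np.asarray(img, dtype=np.float32).flatten() / 255.0

print(x.shape)  # (256,) -- one value per input neuron
```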

By first extracting features (e.g., edges) from the image and then running the network on those features, you could perhaps speed up learning and also make recognition more robust. What you are doing in that case is incorporating prior knowledge: for character recognition you already know which features are relevant, so by extracting them as a preprocessing step, the network doesn't have to learn them itself. However, if you provide the wrong, i.e. irrelevant, features, the network will not be able to learn the image --> character mapping.
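As a rough illustration of such a preprocessing step, the sketch below computes an edge-magnitude map with a Sobel filter (using SciPy; the random array is a stand-in for a real greyscale image):

```python
import numpy as np
from scipy import ndimage

def edge_features(pixels):
    """Turn a 2-D greyscale array into a flat vector of edge magnitudes."""
    gx = ndimage.sobel(pixels, axis=0)       # gradient across rows
    gy = ndimage.sobel(pixels, axis=1)       # gradient across columns
    edges = np.hypot(gx, gy)                 # gradient magnitude per pixel
    return (edges / (edges.max() + 1e-8)).flatten()

# Stand-in for a real greyscale image, e.g. the 16x16 array from above.
pixels = np.random.rand(16, 16).astype(np.float32)
features = edge_features(pixels)  # feed this to the network instead of raw pixels
```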

The name for the problem you're trying to solve is "feature extraction". It's decidedly non-trivial and a subject of active research.

The naive way to go about this is simply to map each pixel of the image to a corresponding input neuron. Obviously, this only works for images that are all the same size, and is generally of limited effectiveness.

Beyond this, there is a host of things you can do... Gabor filters, Haar-like features, PCA and ICA, sparse features, just to name a few popular examples. My advice would be to pick up a textbook on neural networks and pattern recognition or, specifically, optical character recognition.
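As one example from that list, here is roughly what a PCA-based dimensionality reduction might look like with scikit-learn (the random data and the choice of 40 components are purely illustrative):

```python
import numpy as np
from sklearn.decomposition import PCA

# X holds one flattened 16x16 image per row; random data stands in here.
X = np.random.rand(1000, 256)

# Project each 256-pixel vector onto its 40 strongest principal components,
# so the network sees 40 inputs instead of 256.
pca = PCA(n_components=40)
X_reduced = pca.fit_transform(X)

print(X_reduced.shape)  # (1000, 40)
```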

user1391128

All these considerations about applying NNs to images are covered in our 2002 review paper (feature-based vs. pixel-based input, scale invariance, etc.).

Your biggest challenge is the so-called 'curse of dimensionality'.

I would compare the NN's performance with that of a support vector machine (though choosing the right kernel can be tricky).
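A comparison along those lines could be sketched with scikit-learn, something like the following (the built-in 8x8 digits dataset and the default hyperparameters are just stand-ins for your own setup):

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC

# Small built-in 8x8 digit images stand in for your own character data.
X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

for name, clf in [("MLP", MLPClassifier(max_iter=1000, random_state=0)),
                  ("SVM (RBF kernel)", SVC(kernel="rbf"))]:
    clf.fit(X_train, y_train)
    print(name, "accuracy:", clf.score(X_test, y_test))
```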

You can use the raw pixels as input. This is why it is often preferable to use input images at a smaller resolution: fewer pixels means fewer input dimensions.

The nice thing about ANNs is that they are, to some extent, capable of feature selection: they can ignore unimportant pixels by assigning near-zero weights to those input nodes.
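One rough way to see this effect is to inspect the magnitudes of the trained first-layer weights; the sketch below does so for a scikit-learn MLPClassifier (the dataset and training settings are illustrative, and weight magnitude is only a crude proxy for pixel importance):

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.neural_network import MLPClassifier

X, y = load_digits(return_X_y=True)
mlp = MLPClassifier(max_iter=1000, random_state=0).fit(X, y)

# coefs_[0] holds the weights from each input pixel to the first hidden
# layer, with shape (n_pixels, n_hidden_units).
importance = np.abs(mlp.coefs_[0]).sum(axis=1)

# Pixels whose outgoing weights stay near zero contribute little.
print("least informative pixels:", np.argsort(importance)[:10])
```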

Egon

Here are some steps: first, make sure your colour/greyscale image is converted to a binary image by applying a thresholding operation. Follow that with some sort of feature extraction. For OCR/NN work this example might help, although it is in Ruby: https://github.com/gbuesing/neural-net-ruby/blob/master/examples/mnist.rb
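A minimal sketch of that thresholding step, assuming Pillow and NumPy and a fixed threshold of 128 (Otsu's method would choose the threshold automatically; the file name is a placeholder):

```python
import numpy as np
from PIL import Image

# Load as greyscale, then binarize with a fixed threshold (128 is a common
# default for 8-bit images).
grey = np.asarray(Image.open("character.png").convert("L"))
binary = (grey > 128).astype(np.uint8)   # 1 = bright pixel, 0 = dark pixel

# If the characters are dark strokes on a light background, invert so the
# strokes become the foreground: binary = 1 - binary
```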
