Neural Network Error oscillating with each training example

悲哀的现实 2021-01-14 21:23

I've implemented a back-propagating neural network and trained it on my data. The data alternates between sentences in English and Afrikaans. The neural network is supposed to identify the language of each sentence, but the error oscillates with each training example.

1 Answer
  •  执笔经年
    2021-01-14 21:53

    I agree with the comments that this model is probably not the best fit for your classification problem, but if you are interested in getting it to work, I will give you the reason I think it oscillates and the way I would tackle the problem.

    From my understanding of your question and comments, I cannot see what the network actually "learns" in this setup. You feed letters in (is the input the number of times each letter occurs in the sentence?) and force it to map to an output. Say you use only English for now, and English corresponds to an output of 1. You "train" it on one sentence and, for argument's sake, it picks the letter "a" as the determining input, which is quite a common letter. It sets the weights such that when it sees "a" the output is 1, and all other letter inputs get weighted down so they don't influence the output. It might not be so black and white, but it could be doing something very similar. Now, every time you feed in another English sentence, it only has to see an "a" to give the correct output.

    Doing the same for Afrikaans with an output of 0, it maps "a" to 0. So every time you alternate between the two languages, it completely reassigns the weightings; you are not building on a structure. The back-propagated error is basically always a fixed value because there are no degrees of rightness or wrongness, it's one or the other. So I would expect it to oscillate exactly as you are seeing.
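
    To make the failure mode concrete, here is a minimal Python sketch (the letter count, learning rate, and single-weight setup are all made up for illustration) of a sigmoid unit trained one example at a time on a feature that both languages share:

    ```python
    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    # Suppose the network has latched onto the count of the letter "a"
    # (common to both languages) as its only informative input. The same
    # feature value must then map to 1 (English) and 0 (Afrikaans).
    x = np.array([3.0])   # hypothetical count of "a" in every sentence
    w = np.zeros(1)       # single weight
    lr = 1.0

    for step in range(8):
        y = 1.0 if step % 2 == 0 else 0.0   # alternate English / Afrikaans
        p = sigmoid(w @ x)
        w += lr * (y - p) * x               # per-example gradient step
        print(f"step {step}: target={y}, prediction={p:.2f}, w={w[0]:+.2f}")

    # Each update drags w the opposite way from the previous one, so the
    # prediction, and hence the error, oscillates instead of converging.
    ```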

    EDIT: I think this boils down to using the mere presence of letters to classify the language and expecting one of two polar outputs, rather than anything about the relationships between letters that actually characterize the language.

    On a conceptual level, I would have a full pre-processing stage to get some statistics. Off the top of my head (I don't know the language), I might calculate:

    - the ratio of the letter "a" to "c" occurring in the sentence
    - the ratio of the letter "d" to "p" occurring in the sentence
    - the average word length in the sentence
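
    As a rough illustration of that pre-processing stage (the letter pairs are the illustrative ones from the list above, not tuned for either language), something like:

    ```python
    import numpy as np

    # Hypothetical per-sentence statistics; the +1 in the denominators
    # avoids division by zero when a letter does not occur.
    def sentence_features(sentence: str) -> np.ndarray:
        s = sentence.lower()
        count = {c: s.count(c) for c in "acdp"}
        words = s.split()
        return np.array([
            count["a"] / (count["c"] + 1),    # ratio of "a" to "c"
            count["d"] / (count["p"] + 1),    # ratio of "d" to "p"
            sum(len(w) for w in words) / max(len(words), 1),  # avg word length
        ])

    print(sentence_features("Die kat sit op die mat"))
    print(sentence_features("The cat sits on the mat"))
    ```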

    Do this for 50 sentences of each language. Feed all the data in at once and train on the whole set (70% for training, 15% for validation, 15% for testing). You cannot train a network on a single example at a time (as I think you are doing?); it needs to see the whole picture. Now your output is not so black and white: it has the flexibility to map to a value between 0 and 1 rather than an absolute each time. Anything above 0.5 is English, anything below 0.5 is Afrikaans. Start with, say, 10 statistical parameters for the languages, 5 neurons in the hidden layer, and 1 neuron in the output layer.
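
    A minimal sketch of that setup, assuming scikit-learn is available. The feature matrix here is random stand-in data where your real per-sentence statistics would go:

    ```python
    import numpy as np
    from sklearn.model_selection import train_test_split
    from sklearn.neural_network import MLPClassifier

    # Stand-in data: 50 sentences per language, 10 statistics each.
    rng = np.random.default_rng(0)
    features = rng.random((100, 10))   # replace with real statistics
    labels = np.repeat([1, 0], 50)     # 1 = English, 0 = Afrikaans

    # 70% train, 15% validation, 15% test.
    X_train, X_rest, y_train, y_rest = train_test_split(
        features, labels, train_size=0.70, stratify=labels, random_state=0)
    X_val, X_test, y_val, y_test = train_test_split(
        X_rest, y_rest, train_size=0.50, stratify=y_rest, random_state=0)

    # 10 inputs -> 5 hidden neurons -> 1 sigmoid output, trained on the
    # whole training set at once rather than one sentence at a time.
    net = MLPClassifier(hidden_layer_sizes=(5,), activation="logistic",
                        max_iter=2000, random_state=0)
    net.fit(X_train, y_train)

    # predict_proba gives a value between 0 and 1; above 0.5 means English.
    probs = net.predict_proba(X_val)[:, 1]
    print("validation accuracy:", ((probs > 0.5) == y_val).mean())
    ```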
