
[Solved]: Tackling a neural network overfitting issue

Problem Detail: 

I'm trying to train a neural network on the following data (for a homework): number of features: 42; target: binary (0 or 1); number of samples: 111 individuals (69 cases + 42 controls).

However, I'm facing an overfitting issue: the training curve approaches the goal, but the validation curve shows no improvement. Training stops after fewer than 15 iterations because the maximum number of validation failures is reached, i.e. the validation error stops improving. I managed to get good results by using 444 (4 × 111) training examples, simply repeating the initial data four times, but I'm not sure this is clean. In other words, is it all right to repeat existing examples to get more training data?

Thanks in advance.

Asked By : ryuzakinho

Answered By : D.W.

Yes, it is all right to repeat training data, as long as each sample is repeated the same number of times: that leaves the empirical data distribution unchanged, so no modeling assumption is violated. That said, it smells to me like a kludge. If you have no theoretical understanding of why the approach was failing without that adjustment, then randomly tweaking things (like repeating points) in the hope that they happen to work does not seem like the right approach.
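As a minimal sketch of why equal repetition is harmless, using hypothetical stand-in data with the same shape as in the question (42 features, 69 cases + 42 controls): repeating every sample the same number of times leaves the class proportions, and hence the effective loss weighting, unchanged.

```python
import numpy as np

# Hypothetical stand-in data: 111 samples x 42 features, binary labels.
rng = np.random.default_rng(0)
X = rng.normal(size=(111, 42))
y = np.array([1] * 69 + [0] * 42)   # 69 cases, 42 controls

# Repeat every sample the same number of times (here 4x).
X_rep = np.tile(X, (4, 1))
y_rep = np.tile(y, 4)

assert X_rep.shape == (444, 42)
assert y_rep.mean() == y.mean()     # class balance is preserved
```

If instead you repeated only some samples (e.g., only the controls), you would be reweighting the data, which changes what the network optimizes.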

Be careful that you don't do cross-validation on the result after repeating samples. You need to separate the test set from the training set before repeating. If you repeat samples, then do cross-validation, there is a high likelihood that most or all samples in the test set will also appear in the training set (due to the repetition) and thus you'll be over-fitting -- the cross-validation results won't be meaningful in that case.
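A toy sketch of why the order matters, with sample IDs standing in for the 111 individuals and illustrative split sizes: repeating before splitting almost guarantees that test samples also appear in the training set, while splitting first keeps the two disjoint.

```python
import random

random.seed(0)

# Sample IDs standing in for the 111 individuals.
samples = list(range(111))

# WRONG order: repeat first, then split.
repeated = samples * 4                  # 444 entries, each ID appears 4 times
random.shuffle(repeated)
test_wrong = set(repeated[:44])         # hold out ~10%
train_wrong = set(repeated[44:])
leaked = test_wrong & train_wrong       # IDs present in both sets

# RIGHT order: split first, then repeat only the training portion.
random.shuffle(samples)
test_ids = set(samples[:11])
train_ids = samples[11:]
train_repeated = train_ids * 4          # oversample the training data only
assert test_ids.isdisjoint(train_repeated)

print(f"IDs leaked into the test set by repeat-then-split: {len(leaked)}")
```

With four copies of each sample, an ID stays out of the training set only if all four of its copies land in the small held-out slice, so essentially every test ID leaks under the wrong ordering.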

Best Answer from StackOverflow
