**Problem Detail:**

My neural net is having trouble switching the sign of a weight. The issue is that the deltas applied to the weight are proportional to that weight, so when it gets closer to zero, the deltas become smaller and are never sufficient to get it past that point. I tried adding a momentum term with 5% of the previous iteration contributing to the current iteration without success.

In my example I have a very simple neural net with one input and one output and no hidden layer. The transfer function in the neural net is a sigmoid, and the function I am trying to learn is also a sigmoid. Specifically: $y = S(10 x - 5)$ where S is the sigmoid function. The neural net gets to a point where its internal weights are essentially generating a function very close to $y = S(2x + 0)$. At this point it gets stuck because the fitted function is sometimes above and sometimes below the desired function, but the weight on $x$ needs to go up and the weight on the constant term needs to go down. I ran it for 1,000 epochs and 100,000 epochs and it stays in the same place. The initial weights were 0.6 and 0.2, so it made improvements over that.

How can a net get over this hump? Should I not be scaling the weight delta by the weight value?

Here's the reference I'm using for computing the weights for back-propagation. I am using a learning rate of 0.05 and apply the deltas after each training example. The x-values (input) used in training are 0.00, 0.01, 0.02, ... 0.99, 1.00. I always play them in order.

Thanks for your help.

Note: I originally posted this on Stack Overflow and was directed here.

###### Asked By : MattD

#### Answered By : BartoszKP

Your problem stems from the fact that the equations in the link you provide refer to *linear* neural networks. Your network is not linear, since it has a sigmoid activation function, so you need to include the gradient of that activation in the weight correction. Wikipedia explains how to do this fairly well. Here is an example Python script that learns your target function in 1000 epochs, reaching an average error on the order of $10^{-3}$ under exactly the conditions you describe. Increasing the number of epochs makes the error smaller.
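As a sketch of what including the gradient means here, assume the per-example error is the squared error $E = \frac{1}{2}(t - y)^2$ (an assumption, since the question does not state its loss), with output $y = S(wx + b)$, target $t$ and learning rate $\eta$. Using $S'(z) = S(z)(1 - S(z))$, the chain rule gives:

$$
\frac{\partial E}{\partial w} = -(t - y)\, y(1 - y)\, x, \qquad \frac{\partial E}{\partial b} = -(t - y)\, y(1 - y)
$$

Stepping against the gradient then yields the updates:

$$
w \leftarrow w + \eta\,(t - y)\, y(1 - y)\, x, \qquad b \leftarrow b + \eta\,(t - y)\, y(1 - y)
$$

Neither update is multiplied by the current value of $w$ or $b$, so a weight is free to pass through zero and change sign. The `propagateError` method in the script applies these same updates with $\eta = 0.05$.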

**Note:** this is **NOT** a recommended implementation of a neural network - just a demonstration script written on a whim.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class Network:
    def __init__(self, w, b):
        self.w = w
        self.b = b

    def __call__(self, x):
        return sigmoid(self.w * x + self.b)

    def propagateError(self, x, error, output):
        # output * (1 - output) is the derivative of the sigmoid at this output
        dEdW = error * output * (1 - output) * x
        dEdB = error * output * (1 - output)
        # Learning rate 0.05, applied after every training example
        self.w += 0.05 * dEdW
        self.b += 0.05 * dEdB


def target(x):
    return sigmoid(10 * x - 5)


n = Network(0.6, 0.2)

# Training
for it in range(1000):
    for i in range(100):
        x = 1.0 * i / 100
        output = n(x)
        t = target(x)
        error = t - output
        n.propagateError(x, error, output)

# Testing
te = 0
for i in range(100):
    x = 1.0 * i / 100
    output = n(x)
    t = target(x)
    error = t - output
    print("Desired:",t)
    print("Actual:",output)
    print("Error:",error)
    te += error  # signed error, so positive and negative values can cancel
    print("---")

print("Average error:",te/100.0)
```

Output:

```
('Desired:', 0.0066928509242848554)
('Actual:', 0.012425919415559344)
('Error:', -0.005733068491274489)
---
('Desired:', 0.007391541344281971)
('Actual:', 0.013550016530277079)
('Error:', -0.006158475185995108)
...
('Desired:', 0.9918374288468401)
('Actual:', 0.9855607114050042)
('Error:', 0.006276717441835888)
---
('Desired:', 0.9926084586557181)
('Actual:', 0.9867575913715647)
('Error:', 0.005850867284153405)
---
('Average error:', -0.0013496485567271077)
```
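The "Average error" above is a signed mean, so over- and undershoots partly cancel. A minimal sketch of how one might also report the mean absolute error follows. It repeats the same online update under the same conditions (initial weights 0.6 and 0.2, learning rate 0.05, inputs 0.00 to 0.99); the helper names `train` and `error_summary` are illustrative and not part of the original script.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def target(x):
    return sigmoid(10 * x - 5)


def train(w, b, epochs=1000, rate=0.05, points=100):
    """Online training with the sigmoid-aware update used in the answer."""
    for _ in range(epochs):
        for i in range(points):
            x = i / points
            y = sigmoid(w * x + b)
            # Error times the sigmoid derivative y * (1 - y)
            delta = (target(x) - y) * y * (1 - y)
            w += rate * delta * x
            b += rate * delta
    return w, b


def error_summary(w, b, points=100):
    """Return (signed mean error, mean absolute error) over the input grid."""
    errors = [target(i / points) - sigmoid(w * (i / points) + b)
              for i in range(points)]
    signed_mean = sum(errors) / points
    mean_abs = sum(abs(e) for e in errors) / points
    return signed_mean, mean_abs


w, b = train(0.6, 0.2)
signed_mean, mean_abs = error_summary(w, b)
print("w, b:", w, b)
print("Signed mean error:", signed_mean)
print("Mean absolute error:", mean_abs)
```

The mean absolute error is never smaller in magnitude than the signed mean, so it is the more conservative figure to watch when checking whether extra epochs actually improve the fit.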

###### Best Answer from StackOverflow

Question Source : http://cs.stackexchange.com/questions/23209
