Simplest perceptron update rule derivation

An updated version of this page is available on neural-networks.io.

Simple perceptron

Let’s consider a simple perceptron with N inputs x_1, ..., x_N, weights w_1, ..., w_N, and a transfer function given by f(x)=x to keep the maths simple.

Transfer function

With this transfer function, the output of the perceptron is given by:

 y= w_1.x_1 + w_2.x_2 + ... + w_N.x_N = \sum\limits_{i=1}^N w_i.x_i
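As a minimal sketch, the weighted sum above can be written in a few lines of Python. The function name `perceptron_output` and the sample values are illustrative assumptions, not part of the original page.

```python
def perceptron_output(weights, inputs):
    """Return y = sum_i w_i * x_i (identity transfer function f(x) = x)."""
    if len(weights) != len(inputs):
        raise ValueError("weights and inputs must have the same length")
    return sum(w * x for w, x in zip(weights, inputs))


# Illustrative values: 0.5*1 + (-1)*2 + 2*3 = 4.5
y = perceptron_output([0.5, -1.0, 2.0], [1.0, 2.0, 3.0])
print(y)  # 4.5
```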

Error (or loss)

In artificial neural networks, the error we want to minimize is:

 E=(y'-y)^2

with:

  • E the error
  • y' the expected output (from the training data set)
  • y the real output of the network (computed by the network)

In practice, and to simplify the maths, this error is divided by two (the factor cancels the 2 that appears when the square is differentiated):

 E=\frac{1}{2}(y'-y)^2
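The halved squared error is easy to compute by hand or in code. The snippet below is only a sketch; the name `half_squared_error` and the numbers are assumptions chosen for illustration.

```python
def half_squared_error(expected, actual):
    """Return E = 1/2 * (expected - actual)^2."""
    return 0.5 * (expected - actual) ** 2


# Expected output 1.0, real output 4.5: 0.5 * (-3.5)^2 = 6.125
print(half_squared_error(1.0, 4.5))  # 6.125
```

The error is zero only when the real output equals the expected output, and it grows quadratically with the gap between them.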

Gradient descent

The algorithm (gradient descent) used to train the network (i.e. updating the weights) is given by:

 w_i'= w_i-\eta.\frac{dE}{dw_i}

where:

  • w_i the weight before update
  • w_i' the weight after update
  • \eta the learning rate
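To get a feel for this update, here is a small sketch of gradient descent on a toy one-dimensional error, E(w) = (w - 3)^2, whose derivative is 2(w - 3). This toy error, the learning rate of 0.1 and the name `gradient_descent_step` are assumptions made for the example; they do not come from the perceptron itself.

```python
def gradient_descent_step(w, grad, eta):
    """One update: new weight = old weight - learning rate * gradient."""
    return w - eta * grad


w = 0.0    # initial weight (arbitrary starting point)
eta = 0.1  # learning rate (assumed value)
for _ in range(100):
    grad = 2.0 * (w - 3.0)  # dE/dw for the toy error E(w) = (w - 3)^2
    w = gradient_descent_step(w, grad, eta)

print(w)  # very close to 3, the minimum of the toy error
```

Each step moves the weight against the sign of the derivative, so the error shrinks as long as the learning rate is small enough.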

Differentiating the error

Let’s differentiate the error with respect to the weight w_i:

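 \frac{dE}{dw_i} = \frac{1}{2}\frac{d}{dw_i}(y'-y)^2

Thanks to the chain rule [ (f \circ g)'=(f' \circ g).g' ], and because the expected output y' does not depend on w_i, the previous equation can be rewritten as:

 \frac{dE}{dw_i} = \frac{2}{2}(y'-y)\frac{d}{dw_i}(y'-y) = -(y'-y)\frac{dy}{dw_i}

As  y= w_1.x_1 + w_2.x_2 + ... + w_N.x_N :

 \frac{dE}{dw_i} = -(y'-y)x_i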
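A quick way to check this kind of derivation is to compare the analytic derivative, minus (expected output minus real output) times x_i, against a central finite difference of the halved squared error. The sketch below does that; all function names, the step `h` and the sample values are assumptions for illustration.

```python
def output(weights, inputs):
    """Real output of the perceptron: weighted sum of the inputs."""
    return sum(w * x for w, x in zip(weights, inputs))


def error(weights, inputs, expected):
    """Halved squared error between expected and real output."""
    return 0.5 * (expected - output(weights, inputs)) ** 2


def analytic_gradient(weights, inputs, expected):
    """dE/dw_i = -(expected - real output) * x_i for every weight."""
    y = output(weights, inputs)
    return [-(expected - y) * x for x in inputs]


def numeric_gradient(weights, inputs, expected, h=1e-6):
    """Central finite-difference estimate of dE/dw_i for every weight."""
    grads = []
    for i in range(len(weights)):
        plus = list(weights)
        minus = list(weights)
        plus[i] += h
        minus[i] -= h
        diff = error(plus, inputs, expected) - error(minus, inputs, expected)
        grads.append(diff / (2 * h))
    return grads


weights = [0.5, -1.0, 2.0]
inputs = [1.0, 2.0, 3.0]
expected = 1.0

print(analytic_gradient(weights, inputs, expected))  # [3.5, 7.0, 10.5]
print(numeric_gradient(weights, inputs, expected))   # approximately the same
```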

Updating the weights

The weights can be updated with the following formula:

 w_i'= w_i-\eta.\frac{dE}{dw_i}

In conclusion:

 w_i'= w_i + \eta(y'-y)x_i
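Putting everything together, the update rule can be applied sample by sample in a small training loop. This is only a sketch under stated assumptions: the function name `train`, the learning rate, the number of epochs and the toy data set (generated from the hypothetical target weights 2 and -1) are all chosen for the example.

```python
def train(samples, n_inputs, eta=0.1, epochs=200):
    """Train a linear perceptron with the rule w_i <- w_i + eta * (expected - y) * x_i."""
    weights = [0.0] * n_inputs
    for _ in range(epochs):
        for inputs, expected in samples:
            # Real output of the network for this sample
            y = sum(w * x for w, x in zip(weights, inputs))
            # Gap between expected and real output
            delta = expected - y
            # Update every weight using its own input x_i
            weights = [w + eta * delta * x for w, x in zip(weights, inputs)]
    return weights


# Toy data set: expected output = 2*x_1 - 1*x_2 (assumed target weights)
samples = [
    ([1.0, 0.0], 2.0),
    ([0.0, 1.0], -1.0),
    ([1.0, 1.0], 1.0),
    ([2.0, 1.0], 3.0),
]

learned = train(samples, n_inputs=2)
print(learned)  # approaches [2.0, -1.0]
```

With noiseless data and a small enough learning rate, the weights settle on the values that generated the samples. A learning rate that is too large would make the updates overshoot and diverge.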
