**Updated version** of this page on neural-networks.io.

## Simple perceptron

Let’s consider the following simple perceptron, with a transfer function chosen to keep the maths simple:

## Transfer function

The transfer function is the identity, so the output of the network is simply the weighted sum of its inputs:

$$o = \sum_i w_i x_i$$
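This output can be sketched in a few lines of Python (a minimal illustration, assuming an identity transfer function, so the output is just the weighted sum of the inputs):

```python
# A sketch of the perceptron output, assuming an identity transfer
# function: the output is the weighted sum of the inputs.
def output(weights, inputs):
    return sum(w * x for w, x in zip(weights, inputs))

print(output([0.5, -0.25], [2.0, 4.0]))  # 0.5*2.0 + (-0.25)*4.0 = 0.0
```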

## Error (or loss)

In artificial neural networks, the error we want to minimize is:

$$E = (y - o)^2$$

with:

- $E$ the error
- $y$ the expected output (from the training data set)
- $o$ the actual output of the network

In practice, and to simplify the maths, this error is divided by two:

$$E = \frac{1}{2} (y - o)^2$$
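A quick Python sketch of this halved squared error (the function name `loss` is illustrative, not from the original page):

```python
def loss(expected, actual):
    # Squared error divided by two: the 1/2 cancels the factor 2
    # produced later when differentiating the square.
    return 0.5 * (expected - actual) ** 2

print(loss(1.0, 0.5))  # 0.5 * 0.5**2 = 0.125
```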

## Gradient descent

The gradient descent algorithm used to train the network (i.e. to update the weights) is given by:

$$w_i' = w_i - \alpha \frac{\partial E}{\partial w_i}$$

where:

- $w_i$ is the weight before the update
- $w_i'$ is the weight after the update
- $\alpha$ is the learning rate
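A single gradient descent step can be sketched as follows (names are illustrative):

```python
def gradient_descent_step(weight, gradient, learning_rate):
    # Move the weight a small step against the gradient of the error.
    return weight - learning_rate * gradient

print(gradient_descent_step(0.5, 2.0, 0.1))  # 0.5 - 0.1 * 2.0 = 0.3
```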

## Differentiating the error

Let’s differentiate the error with respect to a weight $w_i$:

$$\frac{\partial E}{\partial w_i} = \frac{\partial}{\partial w_i} \left( \frac{1}{2} (y - o)^2 \right)$$

Thanks to the chain rule ($\frac{\partial E}{\partial w_i} = \frac{\partial E}{\partial o} \cdot \frac{\partial o}{\partial w_i}$), the previous equation can be rewritten:

$$\frac{\partial E}{\partial w_i} = -(y - o) \, \frac{\partial o}{\partial w_i}$$

As $o = \sum_i w_i x_i$, the partial derivative $\frac{\partial o}{\partial w_i}$ is simply $x_i$:

$$\frac{\partial E}{\partial w_i} = -(y - o) \, x_i$$
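This derivative can be checked numerically. The sketch below (assuming an identity transfer function and the halved squared error) compares the analytic derivative $-(y - o)\,x_i$ with a finite-difference estimate:

```python
# Numerical check that dE/dw_i = -(y - o) * x_i, assuming an identity
# transfer function and the halved squared error E = 0.5 * (y - o)**2.
def output(weights, inputs):
    return sum(w * x for w, x in zip(weights, inputs))

def loss(weights, inputs, expected):
    return 0.5 * (expected - output(weights, inputs)) ** 2

weights, inputs, expected = [0.5, -0.25], [2.0, 4.0], 1.0
o = output(weights, inputs)
analytic = -(expected - o) * inputs[0]  # analytic derivative w.r.t. w_0

# Finite-difference estimate: bump w_0 by a small epsilon.
eps = 1e-6
bumped = [weights[0] + eps, weights[1]]
numeric = (loss(bumped, inputs, expected) - loss(weights, inputs, expected)) / eps

print(analytic, numeric)  # both close to -2.0
```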

## Updating the weights

The weights can be updated with the following formula:

$$w_i' = w_i - \alpha \frac{\partial E}{\partial w_i} = w_i + \alpha (y - o) \, x_i$$

In conclusion:

$$w_i' = w_i + \alpha (y - o) \, x_i$$
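Putting everything together, here is a sketch of the full training loop under the update rule $w_i' = w_i + \alpha (y - o) \, x_i$, assuming an identity transfer function; the target function and the samples are hypothetical:

```python
# A sketch of the full training loop, assuming an identity transfer
# function and the update rule w_i' = w_i + alpha * (y - o) * x_i.
# The target function and the samples below are hypothetical.
def train(samples, weights, alpha=0.1, epochs=200):
    for _ in range(epochs):
        for inputs, expected in samples:
            o = sum(w * x for w, x in zip(weights, inputs))  # network output
            # Apply the update rule to every weight.
            weights = [w + alpha * (expected - o) * x
                       for w, x in zip(weights, inputs)]
    return weights

# Samples generated from y = 2*x1 - x2; training should recover [2, -1].
samples = [([1.0, 0.0], 2.0), ([0.0, 1.0], -1.0),
           ([1.0, 1.0], 1.0), ([2.0, 1.0], 3.0)]
w = train(samples, [0.0, 0.0])
print(w)  # close to [2.0, -1.0]
```

With a small enough learning rate and consistent samples, the repeated updates shrink the error toward zero, which is exactly what the gradient descent derivation above promises.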