**Recap of the Perceptron**

You already know that the basic unit of a neural network is a network with just a single node, and this is referred to as the **perceptron**.

The perceptron is made up of inputs x_{1}, x_{2}, …, x_{n} and their corresponding weights w_{1}, w_{2}, …, w_{n}. A function known as the activation function takes these inputs, multiplies them by their corresponding weights, sums the results, and produces an output y.

Figure 1: A Perceptron (‘single-layer’ perceptron)
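As a minimal sketch of the perceptron just described (the step activation and the AND example below are illustrative choices, not from the figure):

```python
# A minimal perceptron: a weighted sum of the inputs, plus a bias,
# passed through a step activation function.

def perceptron(inputs, weights, bias=0.0):
    """Return 1 if the weighted sum plus bias is positive, else 0."""
    weighted_sum = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1 if weighted_sum > 0 else 0

# Example: weights and bias chosen so the perceptron computes logical AND
and_weights = [1.0, 1.0]
and_bias = -1.5
print(perceptron([1, 1], and_weights, and_bias))  # 1
print(perceptron([1, 0], and_weights, and_bias))  # 0
```

Note that a single perceptron like this can only separate its inputs with a straight line, which is exactly the limitation the multilayer perceptron addresses.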

**What is a Multilayer Perceptron?**

A multilayer perceptron is a class of neural network that is made up of at least three layers of nodes. So now you can see the difference. Also, each node of the multilayer perceptron, except the input nodes, is a neuron that uses a *non-linear activation function*.

The nodes of the multilayer perceptron are arranged in layers:

- The input layer
- The output layer
- Hidden layers: layers between the input and the output

Also note that the learning algorithm for the multilayer perceptron is known as backpropagation (explained here).

**How the Multilayer Perceptron Works**

In an MLP, the neurons use non-linear activation functions that are designed to model the behavior of neurons in the human brain.

A multilayer perceptron has a non-linear activation function in all its neurons and uses backpropagation for its training.

**About Activation Functions**

The neuron first combines its inputs with the weights and adds a bias; the activation function then maps this weighted sum to the output of the neuron.

One such activation function is the sigmoid function, which is used to determine the output of the neuron. An example of a sigmoid function is the **logistic function**, which produces an output ranging between 0 and 1:

σ(x) = 1 / (1 + e^{-x})

Another example of a sigmoid function is the **hyperbolic tangent** activation function, which produces an output ranging between -1 and 1:

tanh(x) = (e^{x} - e^{-x}) / (e^{x} + e^{-x})
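Both sigmoid functions can be evaluated directly with Python's standard library (the function names here are illustrative):

```python
import math

def logistic(x):
    """Logistic sigmoid: output in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def tanh(x):
    """Hyperbolic tangent: output in (-1, 1)."""
    return math.tanh(x)

print(logistic(0.0))  # 0.5
print(tanh(0.0))      # 0.0
```

Note the different output ranges: `logistic` squashes any input into (0, 1), while `tanh` squashes it into (-1, 1) and is centered at zero.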

**Applying Activation Function to MLP**

With an activation function, we can calculate the output of any neuron in the MLP. Assuming **w** denotes the vector of weights, **x** is the vector of inputs, *b* is the bias and ϕ is the activation function, then for the i^{th} neuron, the output *y* is given by:

y_{i} = ϕ(**w**_{i} · **x** + b_{i}) = ϕ(w_{i1}x_{1} + w_{i2}x_{2} + … + w_{in}x_{n} + b_{i})
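As a concrete sketch, a single neuron's output y = ϕ(**w** · **x** + b) can be computed like this (the weight, input, and bias values are made up for illustration, and tanh stands in for ϕ):

```python
import math

def neuron_output(weights, inputs, bias, activation=math.tanh):
    """Compute y = phi(w . x + b) for a single neuron."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(z)

# z = 0.5*1.0 + (-0.2)*2.0 + 0.1 = 0.2, so y = tanh(0.2) ≈ 0.197
y = neuron_output([0.5, -0.2], [1.0, 2.0], 0.1)
print(y)
```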

An MLP is made up of a set of nodes which form the input layer, one or more hidden layers, and an output layer.

**Layers of a Multilayer Perceptron (Hidden Layers)**

Remember that, from the definition of the multilayer perceptron, there must be one or more hidden layers. This means that, in general, an MLP has a minimum of three layers, since we also have the input and the output layer. This is illustrated in the figure below.

Also note that the function activating these hidden layers has to be a non-linear function (activation function), as discussed in the previous section.
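To illustrate this layer structure, here is a minimal sketch of a forward pass through a hypothetical 2-3-1 MLP (2 inputs, 3 hidden neurons, 1 output) with the logistic sigmoid as the non-linear activation; all weights and biases below are made-up values:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def layer_forward(inputs, weights, biases):
    """One layer: each row of `weights` feeds one neuron in the layer."""
    return [sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]

def mlp_forward(inputs, layers):
    """Apply each (weights, biases) layer in turn: input -> hidden -> output."""
    for weights, biases in layers:
        inputs = layer_forward(inputs, weights, biases)
    return inputs

# Hypothetical 2-3-1 network: hidden layer (3 neurons), then output layer (1 neuron)
layers = [
    ([[0.1, 0.2], [0.3, -0.1], [-0.2, 0.4]], [0.0, 0.1, -0.1]),  # hidden layer
    ([[0.5, -0.5, 0.3]], [0.2]),                                  # output layer
]
print(mlp_forward([1.0, 0.5], layers))  # a single value in (0, 1)
```

Note how the hidden layer's outputs become the inputs of the output layer; stacking more `(weights, biases)` pairs in `layers` would add more hidden layers.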

**Training/Learning in Multilayer Perceptrons**

The MLP is trained by continuous adjustment of the weights of the connections after each processing step. This adjustment is based on the error in the output (which is the difference between the expected result and the actual output). This continuous adjustment of the weights is a supervised learning process called ‘backpropagation’.

The backpropagation algorithm consists of two parts:

- forward pass
- backward pass

In the forward pass, the outputs corresponding to the given inputs are computed.

In the backward pass, partial derivatives of the cost function with respect to the different parameters are propagated back through the network.

The process continues until the error reaches its lowest value.
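The forward and backward passes described above can be sketched end to end on a tiny 2-2-1 network learning XOR. The network size, learning rate, epoch count, and squared-error cost below are illustrative choices, not prescribed by the text:

```python
import math
import random

random.seed(0)

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Illustrative 2-2-1 network: random initial weights and biases
w_h = [[random.uniform(-1, 1) for _ in range(2)] for _ in range(2)]  # hidden weights
b_h = [random.uniform(-1, 1) for _ in range(2)]                      # hidden biases
w_o = [random.uniform(-1, 1) for _ in range(2)]                      # output weights
b_o = random.uniform(-1, 1)                                          # output bias
lr = 0.5  # learning rate

data = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 0)]  # XOR

def forward(x):
    """Forward pass: compute hidden activations and the network output."""
    h = [sigmoid(w_h[j][0] * x[0] + w_h[j][1] * x[1] + b_h[j]) for j in range(2)]
    y = sigmoid(w_o[0] * h[0] + w_o[1] * h[1] + b_o)
    return h, y

def total_error():
    """Squared-error cost summed over the dataset."""
    return sum(0.5 * (forward(x)[1] - t) ** 2 for x, t in data)

error_before = total_error()
for epoch in range(5000):
    for x, t in data:
        h, y = forward(x)
        # Backward pass: partial derivatives of the error, output layer first
        delta_y = (y - t) * y * (1 - y)
        delta_h = [delta_y * w_o[j] * h[j] * (1 - h[j]) for j in range(2)]
        # Adjust each weight in the direction that reduces the error
        for j in range(2):
            w_o[j] -= lr * delta_y * h[j]
            w_h[j][0] -= lr * delta_h[j] * x[0]
            w_h[j][1] -= lr * delta_h[j] * x[1]
            b_h[j] -= lr * delta_h[j]
        b_o -= lr * delta_y
error_after = total_error()
print(error_before, error_after)  # the error typically drops during training
```

Each epoch performs one forward pass and one backward pass per example, repeatedly nudging the weights downhill on the error surface, which is exactly the loop described above.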

(A detailed lesson on backpropagation is found here.)

Another learning method for the multilayer perceptron is stochastic gradient descent, which is explained in detail in another tutorial.

Thanks for reading, and if you have any questions, drop them in the comment box below.