Natural Language Processing — Neural Networks and Neural Language Models Lecture series — The neural unit

In this post, we will be learning about the following:

  1. The concept of a neural network
  2. The neural computing unit in a neural network
  3. How the neural computing units actually work
  4. Activation functions

The concept of a neural network:

A neural network is simply a network of neural computing units, each of which takes in a vector of inputs and produces a single output value. Fig. 1 below shows a graphical representation of a neural network:

Fig. 1

The neural computing unit in a neural network:

A neural computing unit is the fundamental building block of a neural network.

Fig. 2

The neural computing unit takes in a vector of input values, performs some computation on them, and produces a single output value.

How neural computing units actually work:

When a neural computing unit receives a vector of input values, it computes a weighted sum of these values and then adds a bias term to the result. The result of this computation is then passed to a non-linear function, known as an activation function, which produces the unit's output value.
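The description above can be written compactly as a formula. This is only a sketch in notation chosen for illustration: n (the number of inputs), b (the bias term), f (the activation function) and y (the output) are assumed symbol names, while w, x and z follow the usage elsewhere in this post.

$$z = \sum_{i=1}^{n} w_i x_i + b, \qquad y = f(z)$$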

For instance, suppose a neural computing unit has the weights w = [0.2, 0.3, 0.2, 0.1] and a bias term of 0.5. When it receives the input vector x = [0.5, 0.1, 0.4, 0.2], it first computes the following weighted sum:

(0.2 * 0.5 + 0.3 * 0.1 + 0.2 * 0.4 + 0.1 * 0.2)

The result of the above weighted sum is 0.23.

The bias term of 0.5 is then added to the result of the weighted sum to give 0.73, which is then supplied to the activation function of the neural computing unit to produce an output value.
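The arithmetic above can be checked with a few lines of Python. This is a minimal sketch rather than a reference implementation; the function name `weighted_sum_plus_bias` and the variable names `b` and `z` are assumptions made for illustration.

```python
def weighted_sum_plus_bias(weights, inputs, bias):
    """Return the weighted sum of the inputs plus the bias term."""
    if len(weights) != len(inputs):
        raise ValueError("weights and inputs must have the same length")
    # Multiply each input by its weight, add the products, then add the bias.
    return sum(w * x for w, x in zip(weights, inputs)) + bias


w = [0.2, 0.3, 0.2, 0.1]  # weights from the example above
x = [0.5, 0.1, 0.4, 0.2]  # input vector from the example above
b = 0.5                   # bias term from the example above

z = weighted_sum_plus_bias(w, x, b)
print(round(z, 2))  # 0.73
```

The result is rounded before printing only because floating-point arithmetic can leave tiny rounding errors in the last decimal places.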

Activation functions:

As stated earlier, activation functions are non-linear functions that convert the values they receive into the output values of the neural computing units. The three popular activation functions that we will look at in this post are the sigmoid activation function, the tanh activation function and the rectified linear unit (ReLU) activation function.

The sigmoid activation function: The sigmoid activation function takes in an input value and returns an output value between 0 and 1.

The sigmoid activation function is defined as follows, where z represents the input value received by the function:

$$\sigma(z) = \frac{1}{1 + e^{-z}}$$
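As a minimal Python sketch, assuming the standard definition sigmoid(z) = 1 / (1 + e^(-z)), the function can be written with the standard library alone. The name `sigmoid` is chosen here for illustration.

```python
import math


def sigmoid(z):
    """Squash z into the range between 0 and 1."""
    # Note: math.exp can overflow for very large negative z; this simple
    # sketch does not guard against that case.
    return 1.0 / (1.0 + math.exp(-z))


print(sigmoid(0.0))  # 0.5
print(sigmoid(0.73))  # the value 0.73 from the worked example earlier
```

An input of 0 lands exactly in the middle of the output range, and larger inputs move the output closer to 1.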

The tanh activation function: The tanh activation function takes in an input value and returns an output value between -1 and +1.

The tanh activation function is defined as follows, where z represents the input value received by the function:

$$\tanh(z) = \frac{e^{z} - e^{-z}}{e^{z} + e^{-z}}$$
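The sketch below, assuming the standard definition tanh(z) = (e^z - e^(-z)) / (e^z + e^(-z)), spells the function out by hand and compares it with Python's built-in `math.tanh`. The name `tanh_manual` is hypothetical and used only for illustration; in practice `math.tanh` is the safer choice because the hand-written version can overflow for large inputs.

```python
import math


def tanh_manual(z):
    """Compute tanh directly from its exponential definition."""
    e_pos = math.exp(z)
    e_neg = math.exp(-z)
    return (e_pos - e_neg) / (e_pos + e_neg)


print(tanh_manual(0.0))  # 0.0
# The hand-written version agrees with the standard library.
print(abs(tanh_manual(0.73) - math.tanh(0.73)) < 1e-12)  # True
```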

The rectified linear unit (ReLU) activation function: The ReLU activation function takes in an input value and returns that same value if it is positive; otherwise it returns 0.

The ReLU activation function is defined as follows, where z represents the input value received by the function:

$$\mathrm{ReLU}(z) = \max(0, z)$$
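To tie the pieces together, here is a minimal Python sketch of ReLU, assuming the standard definition max(0, z), followed by a complete neural computing unit that applies it to the worked example from earlier in this post. The helper name `neural_unit` and its parameters are assumptions made for illustration, not part of any library.

```python
def relu(z):
    """Return z if it is positive, otherwise 0."""
    return max(0.0, z)


def neural_unit(weights, inputs, bias, activation):
    """Hypothetical helper: weighted sum, plus bias, then activation."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(z)


print(relu(-2.0))  # 0.0

# Weights, inputs and bias are taken from the worked example in this post.
output = neural_unit([0.2, 0.3, 0.2, 0.1], [0.5, 0.1, 0.4, 0.2], 0.5, relu)
print(round(output, 2))  # 0.73
```

Because the weighted sum plus bias (0.73) is positive, ReLU passes it through unchanged. Any other activation function that takes a single number can be passed in place of `relu`.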

In the next post, we will be talking about the XOR problem.