Natural Language Processing — Neural Networks and Neural Language Models Lecture series — The XOR problem part 2
In the previous post, we spoke about the XOR problem and we also mentioned how neural networks could be used to solve the XOR problem. In this post, we will be talking about how we can formulate a neural network structure to solve the XOR problem.
The neural network structure that we will be discussing was formulated by Goodfellow and it is demonstrated in Fig. 1 below:
The architecture of the neural network in Fig. 1 is made up of an input layer, consisting of x1 and x2, a middle layer, consisting of h1 and h2, and an output layer consisting of y1. Note that the middle layer consists of 2 ReLU-based units.
Now, consider the following input table in Fig. 2 below:
When x1 is 0 and x2 is 0, in the middle layer unit of h2, the weights [1,1] are applied to the input values to obtain the following mathematical expression:
0(1) +0(1) = 0
After the evaluation of the mathematical expression above, the bias term is also added to yield the following expression:
0 + -1(+1) = -1
However since ReLU-based units produce a value of zero for all values that are negative in nature, the output value for h2 is zero (0).
This same process is replicated for the middle layer unit of h1 to yield a value of zero (0). Since the middle layer yields the values of [0,0], applying the values of the weights [1, -2] and the bias expression 0(+1), the output of the entire neural network will be zero (0).
When this entire process is replicated for all of the other input values of the table depicted in Fig. 2, the corresponding output values produced are that of the XOR operation.
In this example, the weights of the various layers and the bias terms were chosen for the purposes of illustration. In a real scenario, the neural network would have had to learn the values of the weights that produce the correct expected outputs through a process known as back-propagation.
In the next tutorial post, we shall look at the concept of feed-forward neural networks.