In the previous post, we looked at what primal and dual problems were. In this post, we will look at a theorem that captures the inferences we can draw from the relationship between a primal linear programming problem and its dual. This theorem is called the ‘duality’ theorem.
The main objective of this post is to gain an understanding of the duality theorem.
Can we draw any conjectures from the relationship between primal problems and their corresponding dual problems?
The answer to this question is yes and the duality theorem aids us in doing…
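As a sketch of the setup the theorem speaks to, here is the standard symmetric form of a primal–dual LP pair (the symbols below are the usual textbook ones, not taken from this post):

```latex
\text{Primal:}\quad \max_{x}\; c^{\top}x \quad \text{s.t. } Ax \le b,\; x \ge 0
\qquad
\text{Dual:}\quad \min_{y}\; b^{\top}y \quad \text{s.t. } A^{\top}y \ge c,\; y \ge 0
```

Weak duality says that \(c^{\top}x \le b^{\top}y\) for any feasible pair \(x, y\); strong duality says that if either problem attains a finite optimum, so does the other, and the two optimal values coincide.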
In this post, we will be learning about the concept of feed-forward neural networks and the mathematical representation of this concept.
Feed-Forward Neural Network:
A feed-forward neural network is simply a multi-layer network of neural units in which the outputs of the units in each layer are passed forward to the units in the next layer. These networks do not contain any cycles; that is, outputs never flow backwards to an earlier layer.
To gain a refreshed understanding of what neural units are and how they work, you can read about them here.
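A minimal sketch of the forward pass described above, in plain Python. The weights and layer sizes here are made up purely for illustration; the point is that each layer's outputs feed only into the next layer, with no cycles:

```python
import math

def sigmoid(z):
    # Standard logistic activation applied by each unit.
    return 1.0 / (1.0 + math.exp(-z))

def layer(inputs, weights, biases):
    # One layer: each unit takes a weighted sum of the previous
    # layer's outputs, adds its bias, and applies the activation.
    return [sigmoid(sum(w * x for w, x in zip(ws, inputs)) + b)
            for ws, b in zip(weights, biases)]

def feed_forward(x, layers):
    # Outputs flow strictly forward, layer by layer.
    for weights, biases in layers:
        x = layer(x, weights, biases)
    return x

# Illustrative (made-up) parameters: 2 inputs -> 2 hidden units -> 1 output.
hidden = ([[1.0, -1.0], [0.5, 0.5]], [0.0, -0.5])
output = ([[1.0, 1.0]], [0.0])
y = feed_forward([1.0, 2.0], [hidden, output])
```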
In the previous post, I explained what the various KKT conditions were about, with the exception of the ‘complementary slackness’ condition. In the subsequent series of posts, I intend to break down this concept to enable you to have maximum understanding. To begin this quest, I will first be talking about the concept of ‘Primal’ and ‘Dual’ linear programming problems.
To understand the complementary slackness condition, I will first be explaining the concept of ‘Primal’ and ‘Dual’ problems.
What are the ‘primal’ and ‘dual’ natures of linear programming problems?
In the previous post, we spoke about the XOR problem and mentioned how neural networks could be used to solve it. In this post, we will be talking about how we can formulate a neural network structure that solves the XOR problem.
The neural network structure that we will be discussing was formulated by Goodfellow and it is demonstrated in Fig. 1 below:
The architecture of the neural network in Fig. 1 is made up of an input layer, consisting of x1 and x2, a middle layer, consisting of h1 and h2, and an output layer…
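Since Fig. 1 itself cannot be shown here, the forward pass of this architecture can be sketched in a few lines. The specific weights below are the solution commonly quoted from Goodfellow et al.'s Deep Learning book for this two-input, two-hidden-unit network; the figure may present a different but equivalent set of weights:

```python
def relu(z):
    # Rectified linear activation used by the hidden units h1 and h2.
    return max(0.0, z)

def xor_net(x1, x2):
    # Hidden layer: h1 and h2 each take a weighted sum of x1 and x2.
    # Weights/biases are the solution quoted in Goodfellow et al.
    h1 = relu(1.0 * x1 + 1.0 * x2 + 0.0)
    h2 = relu(1.0 * x1 + 1.0 * x2 - 1.0)
    # Output layer: a linear unit combining h1 and h2.
    return 1.0 * h1 - 2.0 * h2
```

Tracing the four inputs by hand shows the network reproduces the XOR truth table exactly, which a single perceptron cannot do.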
In the previous post, we looked at the various conditions that make up the Karush-Kuhn-Tucker (KKT) conditions. In this post, we will explore each of those conditions and explain what they mean.
Understand the main ideas behind the various KKT conditions.
What do the following KKT conditions mean:
The stationarity condition:
The stationarity condition states that the selected point must be a stationary point. A stationary point is a point at which the function stops increasing or decreasing, i.e. a point where its gradient is zero. When…
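In its standard textbook statement (symbols below are the conventional ones, not taken from this post), for a problem of the form \(\min f(x)\) subject to \(g_i(x) \le 0\) and \(h_j(x) = 0\), stationarity requires the gradient of the Lagrangian to vanish at the candidate point \(x^*\):

```latex
\nabla f(x^*) \;+\; \sum_{i} \mu_i \,\nabla g_i(x^*) \;+\; \sum_{j} \lambda_j \,\nabla h_j(x^*) \;=\; 0
```

Here the \(\mu_i\) and \(\lambda_j\) are the multipliers attached to the inequality and equality constraints respectively.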
In the previous post, we spoke about the Wolfe dual problem, which is simply another way of expressing the SVM optimisation problem. In this post, however, we will be looking at the conditions that need to be met before we can declare that the solution obtained from solving this optimisation problem is optimal. These conditions are referred to as the Karush-Kuhn-Tucker (KKT) conditions, so named because they were first published in 1951 by Harold W. Kuhn and Albert W. …
In the previous post, we spoke about how hard it was to solve an SVM optimisation problem in the form in which it was initially expressed, and about how to re-write the optimisation problem in a form that is much easier to solve. The alternative that we suggested involved looking at the SVM optimisation problem through the lens of a ‘min max’ problem instead of through the lens of just a ‘min’ problem. …
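For reference, the ‘min max’ form of the standard hard-margin SVM problem looks like this (conventional notation, with training points \(x_i\), labels \(y_i \in \{-1, +1\}\), and multipliers \(\alpha_i\); the post's own figures may use slightly different symbols):

```latex
\min_{w,\,b}\;\; \max_{\alpha \ge 0}\;\; \frac{1}{2}\lVert w \rVert^2 \;-\; \sum_{i} \alpha_i \bigl[\, y_i \,(w \cdot x_i + b) - 1 \,\bigr]
```

The inner max enforces the constraints: if any constraint is violated, the inner maximisation can drive the value to infinity, so the outer min is forced to respect them.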
In this post, we will be learning about the XOR problem and how it can be solved.
Consider the following image in Fig. 1:
The XOR problem is this: we can formulate perceptrons that take the values of x1 and x2 as inputs and generate the corresponding y outputs for the AND and OR operations, but it is impossible to formulate a perceptron that takes the values of x1 and x2 as inputs and generates the corresponding y outputs for the XOR operation.
Note: A perceptron is a neural unit that has…
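The AND and OR cases can be sketched concretely. A perceptron is just a linear-threshold unit, so it can only separate its inputs with a single line; the weights below are hand-picked for illustration:

```python
def perceptron(x1, x2, w1, w2, bias):
    # A single linear-threshold unit: it fires (outputs 1) iff the
    # weighted sum crosses zero, i.e. iff the input point lies on
    # one side of the line w1*x1 + w2*x2 + bias = 0.
    return 1 if w1 * x1 + w2 * x2 + bias > 0 else 0

# Hand-picked (illustrative) weights: one line separates the 1s from the 0s.
AND = lambda x1, x2: perceptron(x1, x2, 1, 1, -1.5)
OR  = lambda x1, x2: perceptron(x1, x2, 1, 1, -0.5)
```

For XOR, the points that should output 1 ((0,1) and (1,0)) and those that should output 0 ((0,0) and (1,1)) sit on opposite diagonals of the unit square, so no single line, and hence no choice of w1, w2, and bias, can separate them.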
In the previous post, I spoke about how to apply the concept of Lagrange multipliers to solve an SVM optimisation problem, and about how hard it was to solve the derived Lagrange function for an example SVM optimisation problem. In this post, we will be talking about why such a problem is hard to solve and how to go about solving it.
Understand how hard it is to solve a Lagrange function for an SVM optimisation problem and how to go about solving such a problem.
In the previous post, we talked about how to apply the concept of Lagrange multipliers to solve optimisation problems. In this post, we will be talking about how to apply the concept of Lagrange multipliers to solve an SVM optimisation problem.
Understand how to apply Lagrange multipliers to solve SVM optimisation problems.
How would you apply the concept of Lagrange multipliers to solve the SVM optimisation problem depicted in Fig. 1 below:
Well, the first step would be to introduce a Lagrange function, like the one depicted in Fig. 2 below:
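For the standard hard-margin SVM problem, the Lagrange function takes the following form (conventional notation, with training points \(x_i\), labels \(y_i \in \{-1, +1\}\), and one multiplier \(\alpha_i\) per margin constraint; the figure in the post may use slightly different symbols):

```latex
\mathcal{L}(w, b, \alpha) \;=\; \frac{1}{2}\lVert w \rVert^2 \;-\; \sum_{i=1}^{m} \alpha_i \bigl[\, y_i \,(w \cdot x_i + b) - 1 \,\bigr],
\qquad \alpha_i \ge 0
```

Each term subtracts a multiplier times a margin constraint, so the constraints of the original problem are folded into a single objective.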