Support Vector Machines — Lecture series — Lagrange multipliers part 2

David Sasu
May 7, 2021

In the previous post, I explained the main idea behind the concept of Lagrange multipliers. In this post, I will be diving deeper into how this concept is applied to solve optimisation problems.

Learning objective:

Understand how to apply Lagrange multipliers to solve optimisation problems.

Main question:

We know from the previous post that, according to the concept of Lagrange multipliers, the value of the unknown variable x that solves the optimisation problem can be found at the point where the gradient of the function being optimised, f, points in the same direction as the gradient of the function serving as the constraint, g. That is:
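
Written out (the original post shows this condition as an image), the equation is:

∇f(x) = α ∇g(x)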

NOTE: The “upside-down triangle” in the equation above means gradient.

But this equation brings about some questions that need to be answered. What is the meaning of the alpha sign in the equation, and how can we apply this equation to solve for the value of x?

Let’s begin with the first question: the meaning of the alpha sign in the equation. Consider the image in Fig. 1 below:

Fig. 1

NB: The image is not drawn to scale and I am using it only to explain a concept

From basic algebra, we understand that a gradient measures the change in the y-axis with respect to a corresponding change in the x-axis. In Fig. 1, the gradient of g(x) is indicated by the change from point A to point B, and the gradient of f(x) is indicated by the change from point A to point C.

We can observe from Fig. 1 that the gradients of g(x) and f(x) both point in the same direction. We can also observe that for the gradient of g(x) to equal the gradient of f(x), the gradient of g(x) has to be multiplied by a factor of 2. This factor of 2 plays the role of the alpha sign in the expression of the concept of Lagrange multipliers. In fact, the alpha sign is what is referred to as a “Lagrange multiplier”: by multiplying the gradient of the constraint by alpha, we scale it so that it matches the gradient of the function being optimised.

With that understanding, let’s look at the second question we posed before we began, which is how we can apply this to solve for the unknown value of x.

If we define a Lagrange function:

Fig. 2
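
Written out, for a single equality constraint g(x) = 0, the Lagrange function takes the conventional form (a reconstruction, since the original figure is an image):

L(x, α) = f(x) − α g(x)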

Then its gradient is:

Fig. 3
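
Written out (again reconstructing the standard form shown in the figure):

∇L(x, α) = ∇f(x) − α ∇g(x)

Note that setting this gradient to zero recovers both the condition ∇f(x) = α ∇g(x) and, from the partial derivative with respect to α, the constraint g(x) = 0 itself.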

Therefore, solving for the values of x and alpha that make the gradient of the Lagrange function zero allows us to find the maximum or the minimum of the function.

Okay, let’s put everything together:

  1. Construct the Lagrange function L by introducing one Lagrange multiplier per constraint.
  2. Get the gradient of the Lagrange function.
  3. Solve for the values of x and alpha at which the gradient of the Lagrange function is equal to zero.
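
The three steps above can be sketched in Python. This is a minimal illustration, not from the original post: the function f, the constraint g, and the candidate point are all example choices. It maximises f(x, y) = x + y on the unit circle, where solving the Lagrange conditions by hand gives x = y = 1/√2 with α = 1/√2, and then checks the Lagrange condition numerically.

```python
import math

# Illustrative example: maximise f(x, y) = x + y
# subject to g(x, y) = x^2 + y^2 - 1 = 0 (the unit circle).
def f(x, y):
    return x + y

def g(x, y):
    return x**2 + y**2 - 1

def grad(fn, x, y, h=1e-6):
    """Central-difference approximation of the gradient of fn at (x, y)."""
    return ((fn(x + h, y) - fn(x - h, y)) / (2 * h),
            (fn(x, y + h) - fn(x, y - h)) / (2 * h))

# Steps 1 and 2 by hand: L = f - alpha * g, and setting grad L = 0 gives
#   1 - 2*alpha*x = 0,  1 - 2*alpha*y = 0,  x^2 + y^2 = 1.
# Step 3: solving this system yields x = y = 1/sqrt(2), alpha = 1/sqrt(2).
x_star = y_star = 1 / math.sqrt(2)
alpha = 1 / math.sqrt(2)

gf = grad(f, x_star, y_star)
gg = grad(g, x_star, y_star)

# Check the Lagrange condition grad f = alpha * grad g at the solution,
# and that the solution satisfies the constraint g = 0.
print(gf, tuple(alpha * c for c in gg))  # the two vectors should match
print(abs(g(x_star, y_star)) < 1e-9)
```

Here the multiplier α = 1/√2 is exactly the scaling factor discussed earlier: it stretches the constraint's gradient until it coincides with the gradient of f.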

I shall be going over these steps again in subsequent posts, so if you don’t get it yet, don’t stress about it :). In the next post, I will be talking about the SVM Lagrangian problem.
