Support Vector Machines — Lecture series — The SVMs optimisation problem

In the previous post, we saw how to solve a constrained optimisation problem. In this post, we will look at how to apply that knowledge to find the optimal separating hyperplane when we are given a set of linearly separable data points.

Learning objective:

Understand how to formulate the task of finding the optimal separating hyperplane as an optimisation problem.

Main question:

We know from the previous lecture posts that the optimal separating hyperplane for a given set of linearly separable data points is the one defined by the normal vector w and bias b for which the geometric margin is largest. So the main question is: how can we use this knowledge to formulate an optimisation problem that finds the right values of w and b?

Well, since we are looking for the values of w and b that maximise the geometric margin M, we can confidently state that the first part of our optimisation problem statement can be written as:

    max_{w, b} M

However, we know that the hyperplane with the optimal geometric margin must respect every one of the data points, not just some subset of them. Hence a suitable constraint for the optimisation problem statement above is:

    γ_i ≥ M,  for i = 1, …, m

where γ_i = y_i (w · x_i + b) / ‖w‖ is the geometric margin of the i-th example (x_i, y_i) in our set of m points. Hence the complete optimisation problem statement is:

    max_{w, b} M
    subject to γ_i ≥ M,  for i = 1, …, m
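To make the constraint concrete, here is a small sketch (not part of the lecture) that computes the geometric margin γ_i of each point in a toy linearly separable dataset for a candidate hyperplane, and reads off the largest M that the constraints γ_i ≥ M would allow for that choice of w and b. The dataset and the values of w and b are illustrative assumptions.

```python
import numpy as np

# Toy linearly separable dataset: two points per class.
X = np.array([[3.0, 3.0],
              [4.0, 3.0],
              [1.0, 1.0],
              [0.0, 2.0]])
y = np.array([1, 1, -1, -1])  # class labels y_i in {+1, -1}

# A candidate hyperplane (illustrative values, not the optimum).
w = np.array([1.0, 1.0])  # normal vector
b = -4.0                  # bias

# Geometric margin of each example: gamma_i = y_i * (w . x_i + b) / ||w||
gammas = y * (X @ w + b) / np.linalg.norm(w)

# The constraint gamma_i >= M must hold for every i, so the largest
# feasible M for this particular (w, b) is the smallest gamma_i.
M = gammas.min()
print("per-example margins:", gammas)
print("largest feasible M for this (w, b):", M)
```

Solving the optimisation problem means searching over all (w, b) for the pair that makes this smallest per-example margin as large as possible.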

But in this form the optimisation problem is difficult to solve. In the next post, we will look at why that is and how we can get around it.