Support Vector Machines — Lecture series — An introduction to the problem of optimisation
In the previous post, we learnt about the geometric margin and how to derive it mathematically. In this post, we introduce the concept of an optimisation problem and show how it relates to the geometric margin.
Learning objective: understand what an optimisation problem is and how it relates to the geometric margin.
Consider the two images in Fig. 1 and Fig. 2 below:
In Fig. 1 and Fig. 2, we see two different hyperplanes, with different geometric margins, for the same linearly separable dataset. Which of the two hyperplanes best separates the data points?
We know from the previous posts that the best separating hyperplane is the one with the largest geometric margin. We also know that, for a given dataset, the size of the geometric margin is determined by the values of ‘w’ and ‘b’ in the geometric margin formula. The main question, then, is: how do we find the values of ‘w’ and ‘b’ that give the largest geometric margin?
To find ‘w’ and ‘b’, we need to solve what is called an optimisation problem.
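To see why this is a search over ‘w’ and ‘b’, here is a minimal sketch that compares two candidate hyperplanes on the same data. It assumes the usual definition of the geometric margin, the smallest value of y(w·x + b)/‖w‖ over all data points, since the formula from the previous post is not repeated here. The toy dataset and the two candidates are made up for illustration and are not the ones shown in Fig. 1 and Fig. 2.

```python
import math


def geometric_margin(w, b, points, labels):
    """Smallest signed distance from any point to the hyperplane w.x + b = 0.

    Assumes the standard definition: min over i of y_i * (w.x_i + b) / ||w||.
    """
    norm_w = math.sqrt(sum(wi * wi for wi in w))
    return min(
        y * (sum(wi * xi for wi, xi in zip(w, x)) + b) / norm_w
        for x, y in zip(points, labels)
    )


# Hypothetical, linearly separable toy dataset (labels are -1 or +1).
points = [(1.0, 1.0), (2.0, 0.0), (4.0, 4.0), (5.0, 3.0)]
labels = [-1, -1, 1, 1]

# Two hypothetical candidate hyperplanes, each given as (w, b).
hyperplanes = {
    "hyperplane A": ((1.0, 1.0), -5.0),
    "hyperplane B": ((1.0, 0.0), -3.0),
}

margins = {
    name: geometric_margin(w, b, points, labels)
    for name, (w, b) in hyperplanes.items()
}
best = max(margins, key=margins.get)

for name, margin in margins.items():
    print(f"{name}: geometric margin = {margin:.3f}")
print("Larger margin:", best)
```

Both candidates separate the data, but one leaves more room around the closest points. Picking the better of two candidates is easy; the optimisation problem is to find the best one among all possible values of ‘w’ and ‘b’.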
Given some mathematical function f(x), the goal of an optimisation problem is to find the value of x that gives either the maximum value or the minimum value of f.
Let’s take an example to make the concept of an optimisation problem clearer:
Suppose we have the function f(x) = x² (see the graph of this function in Fig. 3):
If our stated optimisation problem is to find the value of x that minimises f(x), then we can solve it by looking at the graph, finding the point where f(x) takes its lowest value, and returning the value of x at that point. For f(x) = x², the lowest value is f(x) = 0, which occurs at x = 0.
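Reading the minimum off the graph can be mimicked in code by evaluating f at many values of x and keeping the one with the smallest result. The sketch below does exactly that; the range from -3 to 3 and the step of 0.1 are arbitrary choices made for illustration, and this brute-force search is only a stand-in for the mathematical methods covered later.

```python
def f(x):
    """The example function f(x) = x squared."""
    return x ** 2


# Candidate x values from -3.0 to 3.0 in steps of 0.1 (an assumed range).
xs = [i / 10 for i in range(-30, 31)]

# Keep the candidate whose f(x) is smallest, like finding the lowest
# point on the plotted curve.
best_x = min(xs, key=f)

print("x that minimises f:", best_x)
print("minimum value of f:", f(best_x))
```

Changing `min` to `max` would turn this into a maximisation problem over the same candidates.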
In the next post, we shall look at how to solve optimisation problems mathematically, and then at how to solve them for the variables ‘w’ and ‘b’ in the geometric margin.