Support Vector Machine
- Finding a separating hyperplane
- Must correctly classify the data
- The closest points to the hyperplane should be as far from it as possible: maximize the margin width
- Advantages
- Effective in high-dimensional spaces
- Relatively insensitive to class imbalance, since the decision boundary depends only on the support vectors
- Disadvantages
- Data may not have a clear boundary
- A linear separating hyperplane doesn't exist when the data are not linearly separable
- sklearn.svm.LinearSVC: a parameter C dictates the tradeoff between margin width and correct classification. The larger C is, the narrower the margin.
- Use a kernel function to map data points to a higher-dimensional space, so that they can be separated. A feature transformation is needed.
- Polynomial
- Radial basis function (RBF), or Gaussian kernel
- Sigmoid kernel
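The effect of the kernel trick can be sketched with scikit-learn. This is a minimal example on assumed toy data (two concentric rings, which no linear hyperplane can separate); the data and parameter choices are illustrative, not from the notes.

```python
import numpy as np
from sklearn.svm import SVC, LinearSVC

rng = np.random.default_rng(0)

# Toy data (assumed for illustration): two concentric rings.
# No linear hyperplane separates the inner ring from the outer one.
n = 200
r = np.r_[rng.uniform(0.0, 1.0, n), rng.uniform(2.0, 3.0, n)]
theta = rng.uniform(0.0, 2 * np.pi, 2 * n)
X = np.c_[r * np.cos(theta), r * np.sin(theta)]
y = np.r_[np.zeros(n), np.ones(n)]

# A linear SVM fails here; an RBF-kernel SVM succeeds, because the
# kernel implicitly maps the points to a higher-dimensional space
# where they become linearly separable.
linear = LinearSVC(C=1.0).fit(X, y)
rbf = SVC(kernel="rbf", C=1.0).fit(X, y)
print("linear accuracy:", linear.score(X, y))
print("rbf accuracy:", rbf.score(X, y))
```

Swapping `kernel="rbf"` for `"poly"` or `"sigmoid"` selects the other kernels listed above.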
Problem Definition
For a binary classification task
- Notations
- Training set $\{(\mathbf{x}_i, y_i)\}_{i=1}^{n}$, where $\mathbf{x}_i \in \mathbb{R}^d$ and $y_i \in \{-1, +1\}$
- Derive a function $f(\mathbf{x}) = \operatorname{sign}(\mathbf{w}^\top \mathbf{x} + b)$.
- The hyperplane is given as $\mathbf{w}^\top \mathbf{x} + b = 0$
- Where $\mathbf{w}$ is a weight vector and $b$ is the bias.
- To find the maximum-margin hyperplane, we require
- Boundaries are given by $\mathbf{w}^\top \mathbf{x} + b = \pm 1$
- If $y_i = +1$, then $\mathbf{w}^\top \mathbf{x}_i + b \ge 1$
- If $y_i = -1$, then $\mathbf{w}^\top \mathbf{x}_i + b \le -1$
- i.e. $y_i(\mathbf{w}^\top \mathbf{x}_i + b) \ge 1$ for all $i$
- Margin width is given by $\dfrac{2}{\|\mathbf{w}\|}$
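The margin formula can be checked numerically. This is a sketch on assumed toy data (four hand-picked separable points); a linear-kernel `SVC` with a very large `C` approximates the hard-margin SVM described above.

```python
import numpy as np
from sklearn.svm import SVC

# Toy data (assumed for illustration): two linearly separable clusters.
X = np.array([[0.0, 0.0], [0.0, 1.0], [3.0, 3.0], [3.0, 4.0]])
y = np.array([-1, -1, 1, 1])

# Very large C approximates a hard margin (no violations allowed).
clf = SVC(kernel="linear", C=1e6).fit(X, y)
w, b = clf.coef_[0], clf.intercept_[0]

# Margin width 2 / ||w||.
width = 2.0 / np.linalg.norm(w)
print("margin width:", width)

# Support vectors lie on the boundaries w^T x + b = +1 or -1.
vals = X[clf.support_] @ w + b
print("decision values at support vectors:", vals)
```

For these points the support vectors are $(0, 1)$ and $(3, 3)$, so the computed width equals the distance between them, $\sqrt{13} \approx 3.61$, and the decision values at the support vectors come out as $\pm 1$.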