Support Vector Machine

  • Finding a separating hyperplane
    • Must correctly classify the data
    • Among all such hyperplanes, choose the one whose closest points are farthest from it — maximize the margin width
  • Advantages
    • Effective in high-dimensional spaces
    • Relatively insensitive to class imbalance, since the boundary depends only on the support vectors
  • Disadvantages
    • Data may not have a clear boundary
    • A separating linear hyperplane may not exist when the data are not linearly separable.
  • sklearn.svm.LinearSVC: the parameter C dictates the tradeoff between margin width and correct classification. The larger C is, the more heavily misclassification is penalized and the narrower the margin.
  • Use a kernel function to map data points into a higher-dimensional space where they become linearly separable. The kernel performs this feature transformation implicitly, without computing the mapped coordinates.
    • Polynomial
    • Radial basis function (RBF), or Gaussian kernel
    • Sigmoid kernel
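The C tradeoff and the kernel options above can be sketched with scikit-learn. The toy datasets and the specific C values below are illustrative assumptions, not from the notes:

```python
import numpy as np
from sklearn.svm import LinearSVC, SVC

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))

# Linearly separable labels: sign of x0 + x1 (an assumed toy target).
y_lin = (X[:, 0] + X[:, 1] > 0).astype(int)

# Larger C penalizes margin violations more heavily -> narrower margin.
for C in (0.01, 1.0, 100.0):
    clf = LinearSVC(C=C).fit(X, y_lin)
    print(f"LinearSVC C={C}: train accuracy {clf.score(X, y_lin):.2f}")

# Non-linear labels (points inside the unit circle) need a kernel.
y_ring = (X[:, 0] ** 2 + X[:, 1] ** 2 < 1).astype(int)
rbf = SVC(kernel="rbf").fit(X, y_ring)              # Gaussian kernel
poly = SVC(kernel="poly", degree=2).fit(X, y_ring)  # polynomial kernel
print(f"RBF  kernel: train accuracy {rbf.score(X, y_ring):.2f}")
print(f"Poly kernel: train accuracy {poly.score(X, y_ring):.2f}")
```

A linear model cannot separate the circular `y_ring` labels, but the RBF and degree-2 polynomial kernels can, since the circle boundary is quadratic in the inputs.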

Problem Definition

For a binary classification task

  • Notations
    • Training data $\{(\mathbf{x}_i, y_i)\}_{i=1}^{n}$, where $y_i \in \{-1, +1\}$
    • Derive a function $f(\mathbf{x}) = \mathbf{w}^\top \mathbf{x} + b$.
    • The hyperplane is given as $\mathbf{w}^\top \mathbf{x} + b = 0$
    • where $\mathbf{w}$ is the weight vector and $b$ is the bias.
  • To use the SVM, we have
    • Boundaries are given by $\mathbf{w}^\top \mathbf{x} + b = \pm 1$
    • If $y_i = +1$, then $\mathbf{w}^\top \mathbf{x}_i + b \ge +1$
    • If $y_i = -1$, then $\mathbf{w}^\top \mathbf{x}_i + b \le -1$
    • i.e. $y_i(\mathbf{w}^\top \mathbf{x}_i + b) \ge 1$ for all $i$
  • Margin width is given by $\frac{2}{\|\mathbf{w}\|}$
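The margin formula can be checked numerically with a hard-margin linear SVM. The four points below and the use of a very large C to approximate a hard margin are assumptions of this sketch:

```python
import numpy as np
from sklearn.svm import SVC

# Two classes separated by the vertical line x0 = 1 (assumed toy data).
X = np.array([[0.0, 0.0], [0.0, 1.0], [2.0, 0.0], [2.0, 1.0]])
y = np.array([-1, -1, 1, 1])

# A very large C approximates the hard-margin SVM.
clf = SVC(kernel="linear", C=1e6).fit(X, y)
w, b = clf.coef_[0], clf.intercept_[0]

# Every point satisfies y_i (w^T x_i + b) >= 1, up to solver tolerance.
print(y * (X @ w + b))

# Margin width 2 / ||w||; the classes are 2 units apart, so width ≈ 2.
print(2.0 / np.linalg.norm(w))
```

Here all four points lie exactly on the boundaries $\mathbf{w}^\top \mathbf{x} + b = \pm 1$, so all of them are support vectors.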