Machine Learning

  • Supervision: Training data objects and the features are accompanied by labels.
    • Classification (Categorical)
    • Regression (Numerical)
  • Unsupervised Learning

Given training objects , such that is the feature vector of the th object and is its label. A learning algorithm seeks a function , where is the feature space and is the label space.

  • Data Splitting: Training/Validation (Dev)/Test Set
  • Overfitting
    • Predicts too closely/exactly
    • Starts to “memorizing” instead of learning and generalizing
  • Cross-validation
    • Partition the dataset into subsets (usually 5 or 10)
    • Pick a different subset for testing each time