Machine Learning
- Supervised Learning: The training objects and their features are accompanied by labels.
  - Classification (categorical labels)
  - Regression (numerical labels)
- Unsupervised Learning: The training objects have no labels.
  - Clustering
  - Pattern / Association Mining
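Given training objects $\{(x_i, y_i)\}_{i=1}^{n}$, such that $x_i$ is the feature vector of the $i$-th object and $y_i$ is its label. A learning algorithm seeks a function $f : \mathcal{X} \to \mathcal{Y}$, where $\mathcal{X}$ is the feature space and $\mathcal{Y}$ is the label space.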
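To make the setup concrete, here is a minimal sketch of a learner that takes labeled training objects and returns a prediction function from feature vectors to labels. The toy dataset, the name `fit_1nn`, and the choice of a 1-nearest-neighbour rule are illustrative assumptions, not something the notes prescribe.

```python
from math import dist

# Hypothetical toy training set: each pair is (feature vector, label).
train = [
    ((1.0, 1.0), "a"),
    ((1.5, 2.0), "a"),
    ((5.0, 5.0), "b"),
    ((6.0, 4.5), "b"),
]

def fit_1nn(training_objects):
    """Return a function f that maps a feature vector to a label.

    Assumption for illustration: f copies the label of the closest
    training object (1-nearest-neighbour, Euclidean distance).
    """
    def f(x):
        _, label = min(training_objects, key=lambda pair: dist(pair[0], x))
        return label
    return f

f = fit_1nn(train)
print(f((1.2, 1.4)))  # label of the nearest training object
print(f((5.5, 5.0)))
```

Because the labels here are categorical, this is a classification example; with numerical labels the same shape of function would describe regression.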
- Data Splitting: Training / Validation (Dev) / Test Set
- Overfitting
  - The model fits the training data too closely/exactly
  - It starts “memorizing” the training data instead of learning and generalizing
- Cross-validation
  - Partition the dataset into k subsets (usually k = 5 or 10)
  - Use a different subset for testing each time and train on the remaining subsets
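The cross-validation procedure can be sketched as follows. This is a rough illustration using only the standard library; the helper names (`k_fold_splits`, `mean_label`, `cross_validate`), the fixed shuffle seed, and the mean-label baseline "model" are all assumptions made for the example.

```python
import random

def k_fold_splits(n, k, seed=0):
    """Yield (train_indices, test_indices) once per fold.

    Assumes n >= k so that no fold is empty.
    """
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    # Fold j takes every k-th shuffled index, so fold sizes differ by at most one.
    folds = [indices[j::k] for j in range(k)]
    for j in range(k):
        test_idx = folds[j]
        train_idx = [i for m, fold in enumerate(folds) if m != j for i in fold]
        yield train_idx, test_idx

def mean_label(train_labels):
    """Hypothetical baseline 'model': always predict the mean training label."""
    return sum(train_labels) / len(train_labels)

def cross_validate(labels, k=5):
    """Average the per-fold mean squared error of the baseline over k folds."""
    fold_errors = []
    for train_idx, test_idx in k_fold_splits(len(labels), k):
        prediction = mean_label([labels[i] for i in train_idx])
        errors = [(labels[i] - prediction) ** 2 for i in test_idx]
        fold_errors.append(sum(errors) / len(errors))
    return sum(fold_errors) / len(fold_errors)

labels = [float(v) for v in range(20)]  # made-up numerical labels
print(cross_validate(labels, k=5))
```

Each object lands in the test subset exactly once, so the averaged error uses every object for evaluation while never testing on data the model was fitted to in that fold.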