Linear Regression
- y=f(x,β)
- Ordinary Least Squares (OLS) method: choose β to minimize the sum of squared residuals
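- Setting the partial derivatives of the sum of squared residuals with respect to $\beta_0$ and $\beta_1$ to zero gives the fitted line $\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x$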
- For a straight-line fit, the residuals should be scattered randomly
around 0, with no visible pattern
- The coefficient of determination $R^2$ measures the proportion of the
variability of y that is accounted for by the model.
- Assumptions
- Normality: residuals are normally distributed; check with a Q-Q plot of
the residuals
- Homoscedasticity: residuals have constant variance; check with a plot of
residuals against fitted values
- Independence: the errors are independent of one another
- No outliers
- Box-Cox transformation: a power transformation of the (strictly positive)
dependent variable that stabilizes its variance and brings its distribution
closer to normal.
- Multicollinearity
- Measured by Variance Inflation Factor (VIF)
- Can be mitigated with Ridge and Lasso regression, which penalize large
coefficients (L2 and L1 penalties, respectively) and so reduce overfitting.
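
The quantities in these notes can be illustrated with a small pure-Python sketch. It is not a reference implementation: the function names (`fit_simple_ols`, `r_squared`, `vif_two_predictors`) and the toy data are made up for illustration. It covers only the single-predictor case for OLS, and the VIF helper handles only the special case of exactly two predictors, where VIF reduces to 1 / (1 - r^2), with r the correlation between them.

```python
from math import fsum


def mean(values):
    """Arithmetic mean of a non-empty sequence of numbers."""
    return fsum(values) / len(values)


def fit_simple_ols(x, y):
    """Closed-form OLS estimates (b0, b1) for the line y = b0 + b1 * x.

    These are the values obtained by setting the partial derivatives of
    the sum of squared residuals with respect to b0 and b1 to zero.
    """
    x_bar, y_bar = mean(x), mean(y)
    s_xy = fsum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = fsum((xi - x_bar) ** 2 for xi in x)
    b1 = s_xy / s_xx
    b0 = y_bar - b1 * x_bar
    return b0, b1


def r_squared(y, y_hat):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = mean(y)
    ss_res = fsum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))
    ss_tot = fsum((yi - y_bar) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


def vif_two_predictors(x1, x2):
    """VIF for the two-predictor special case (an assumption of this sketch).

    Regress x1 on x2 and return 1 / (1 - R^2). With exactly two
    predictors both share this value. Perfectly collinear inputs make
    R^2 equal to 1 and raise ZeroDivisionError.
    """
    b0, b1 = fit_simple_ols(x2, x1)
    r2 = r_squared(x1, [b0 + b1 * v for v in x2])
    return 1.0 / (1.0 - r2)


# Toy data (hypothetical), roughly following y = 2x.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

b0, b1 = fit_simple_ols(x, y)
y_hat = [b0 + b1 * xi for xi in x]
residuals = [yi - fi for yi, fi in zip(y, y_hat)]

print("intercept:", b0, "slope:", b1)
print("R^2:", r_squared(y, y_hat))
print("sum of residuals:", fsum(residuals))  # close to 0 for OLS with an intercept
```

With more than two predictors, each VIF requires regressing one predictor on all the others, which is easier with a linear-algebra library than by hand.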