Example with code
Preface
- It's tempting to think that "bias" refers to the feature selection process. It does not.
- Bias is the gap between the expected value of your model's prediction, $E(T)$, and the true value of a parameter $\theta$. (i.e. $E(T) = \theta + \mathrm{bias}(\theta)$, and if $\mathrm{bias}(\theta) = 0$, then $E(T) = \theta$ -- you have an unbiased model. The simulation sketch after this list checks this formula empirically.)
- Sometimes having bias is good. In machine learning, having a little bias can significantly reduce variance.
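A minimal simulation sketch, assuming numpy ($\theta$, the sample size, and the trial count are all illustrative choices). It checks $E(T) = \theta + \mathrm{bias}(\theta)$ empirically by comparing two estimators of a population variance: the maximum-likelihood estimator (divide by $n$, biased) and the sample variance (divide by $n-1$, unbiased).

```python
import numpy as np

rng = np.random.default_rng(0)
theta = 4.0        # true population variance, the parameter we estimate
n = 10             # small samples make the bias easy to see
trials = 100_000

biased, unbiased = [], []
for _ in range(trials):
    x = rng.normal(loc=0.0, scale=np.sqrt(theta), size=n)
    biased.append(np.var(x, ddof=0))    # divides by n:   bias = -theta/n
    unbiased.append(np.var(x, ddof=1))  # divides by n-1: bias = 0

print(f"true theta:      {theta:.3f}")
print(f"E(T), biased:    {np.mean(biased):.3f}  (theory: {theta - theta / n:.3f})")
print(f"E(T), unbiased:  {np.mean(unbiased):.3f}  (theory: {theta:.3f})")
```

With $n = 10$ the biased estimator averages about $\theta - \theta/n = 3.6$ while the unbiased one averages about $4.0$: same data, different $T$, different bias.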
High Bias, Low Variance Models
- You have a procedure to generate models. This procedure builds simple models with few features.
- You chose those few features. Why not other features? (It's tempting to call this choice "bias", but it's not -- bias is the gap between $E(T)$ and the true value, as defined in the preface.)
- Since there are few features, changes to the dataset will have limited effects on the parameters of the model -- different models built using this procedure will look similar (low variance)
- Your model may not have enough parameters to learn the real underlying relationship (high bias, underfitting) -- the sketch below simulates this
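A resampling sketch of this behavior under stated assumptions: the true function, noise level, and evaluation point `x0` are made up for illustration, and the "procedure" is simply fitting a straight line with `np.polyfit` to repeated independent draws of the data.

```python
import numpy as np

rng = np.random.default_rng(1)

def true_f(x):
    return np.sin(2.5 * x)           # the real underlying relationship

x0 = 0.9                             # point at which we compare predictions

preds, slopes = [], []
for _ in range(200):                 # 200 independently sampled datasets
    x = rng.uniform(-1, 1, size=50)
    y = true_f(x) + rng.normal(scale=0.2, size=50)
    coefs = np.polyfit(x, y, deg=1)  # simple model: y = a*x + b
    slopes.append(coefs[0])
    preds.append(np.polyval(coefs, x0))

preds = np.array(preds)
print(f"truth at x0:          {true_f(x0):.3f}")
print(f"mean prediction E(T): {np.mean(preds):.3f}  <- systematically off (high bias)")
print(f"std of predictions:   {np.std(preds):.3f}  <- small (low variance)")
print(f"std of fitted slopes: {np.std(slopes):.3f}  <- the 200 models look similar")
```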
Low Bias, High Variance Models
- This describes most machine learning models in use today, e.g. deep neural networks.
- You have a procedure to generate models. This procedure builds complex models with many features.
- You did not hand-pick features -- you just threw everything in (low bias)
- Since there are many features, changes to the dataset can have large effects on the parameters of the model -- different models built using this procedure may look very different (high variance)
- Your model may also learn noise that exists only in the training data, not in the real world (overfitting) -- the sketch below simulates this
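The same resampling experiment (same illustrative choices as above), but with a deliberately complex model: a degree-9 polynomial, i.e. 10 free parameters as a stand-in for "throw everything in". Averaged over datasets the prediction is nearly unbiased, but individual fits swing wildly because each one also learns its own dataset's noise.

```python
import numpy as np

rng = np.random.default_rng(1)

def true_f(x):
    return np.sin(2.5 * x)

x0 = 0.9
preds = []
for _ in range(200):                  # 200 independently sampled datasets
    x = rng.uniform(-1, 1, size=50)
    y = true_f(x) + rng.normal(scale=0.2, size=50)
    coefs = np.polyfit(x, y, deg=9)   # complex model: many parameters
    preds.append(np.polyval(coefs, x0))

preds = np.array(preds)
print(f"truth at x0:          {true_f(x0):.3f}")
print(f"mean prediction E(T): {np.mean(preds):.3f}  <- close to truth (low bias)")
print(f"std of predictions:   {np.std(preds):.3f}  <- large (high variance)")
```

Comparing the two printouts makes the tradeoff concrete: the simple model's predictions barely scatter but carry a large systematic error, while the complex model's predictions are centered on the truth but scatter widely from one dataset to the next.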