Example with code

Preface

  • It's tempting to think that "bias" refers to the feature selection process. It does not.
  • Bias is how far your model's expected prediction $E(T)$ is from the true value of a parameter $\theta$. (i.e. $E(T)=\theta+\mathrm{bias}(\theta)$, and if $\mathrm{bias}(\theta)=0$, then $E(T)=\theta$ -- you have an unbiased model)
  • Sometimes having bias is good. In machine learning, accepting a little bias can significantly reduce variance. (a sketch of what bias means follows this list)
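
A minimal sketch of the definition above, assuming NumPy is available (the population variance, sample size, and trial count are arbitrary choices for illustration). It compares the biased variance estimator (dividing by $n$) with the unbiased one (dividing by $n-1$): averaging the estimator $T$ over many samples approximates $E(T)$, and the gap from the true $\theta$ is the bias.

```python
import numpy as np

rng = np.random.default_rng(0)
true_var = 4.0   # theta: the true variance of the population
n = 10           # small sample size, so the bias is easy to see

# 100,000 independent samples of size n, one per row.
samples = rng.normal(loc=0.0, scale=np.sqrt(true_var), size=(100_000, n))

biased = np.var(samples, axis=1)            # divides by n   -> E(T) = theta * (n-1)/n
unbiased = np.var(samples, axis=1, ddof=1)  # divides by n-1 -> E(T) = theta

print(f"true variance theta: {true_var}")
print(f"E(T), biased T:      {biased.mean():.3f}  (bias ~ {biased.mean() - true_var:+.3f})")
print(f"E(T), unbiased T:    {unbiased.mean():.3f}  (bias ~ {unbiased.mean() - true_var:+.3f})")
```

With $n = 10$, the biased estimator should come in around $\theta \cdot (n-1)/n = 3.6$, i.e. roughly the $-0.4$ bias the formula predicts, while the unbiased one lands near $4.0$.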

High Bias, Low Variance Models

  • You have a procedure to generate models. This procedure builds simple models with few features.
  • You chose those few features. Why not other features? It's tempting to call this choice "bias", but it's not: bias is how far the model's expected prediction is from the actual value, as defined above.
  • Since there are few features, changes to the dataset will have limited effects on the parameters of the model -- different models built using this procedure will look similar (low variance)
  • Your model may not have enough parameters to learn the real underlying relationship (high bias, underfitting) -- see the sketch after this list.
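
A minimal sketch of this case, assuming NumPy; the quadratic target, noise level, and fixed input grid are arbitrary choices for illustration. A straight line (one slope, one intercept) is fit to data generated from a quadratic: across 1,000 resampled datasets the line's prediction at a fixed point barely moves (low variance), but the line systematically misses the curve (high bias).

```python
import numpy as np

rng = np.random.default_rng(1)
x = np.linspace(-1, 1, 50)   # fixed inputs; only the noise changes per dataset

def f(x):
    return x ** 2            # true underlying relationship (quadratic)

preds, fit_err = [], []
for _ in range(1000):
    y = f(x) + rng.normal(scale=0.1, size=x.size)  # a fresh dataset

    # Simple model: straight line y = a*x + b (too few parameters).
    a, b = np.polyfit(x, y, deg=1)
    preds.append(a * 0.9 + b)                      # prediction at x = 0.9
    fit_err.append(np.mean((f(x) - (a * x + b)) ** 2))

print(f"std of prediction at x=0.9:  {np.std(preds):.3f}   (low variance)")
print(f"mean squared error vs truth: {np.mean(fit_err):.3f}   (high bias)")
```

The line's error against the truth stays near roughly 0.09 no matter how many datasets you average, while its predictions barely move between datasets: consistently wrong in the same way is exactly what high bias means.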

Low Bias, High Variance Models

  • This represents most machine learning models today.
  • You have a procedure to generate models. This procedure builds complex models with many features.
  • You did not choose any features -- you just threw everything in. With enough parameters, the model is flexible enough to represent the true relationship, so its expected prediction can match the actual value (low bias).
  • Since there are many features, changes to the dataset can have large effects on the parameters of the model -- different models built using this procedure may look very different (high variance)
  • Your model may also learn noise that only exists in the training data, but not the real world (overfitting) -- see the sketch below.
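
A matching sketch, same setup and assumptions as above (the degree-15 polynomial stands in for a "complex model with many features"). The polynomial can track the quadratic (low bias), but its prediction at the same fixed point swings from one resampled dataset to the next (high variance), and its training error drops below the noise variance, meaning it is fitting noise (overfitting).

```python
import numpy as np

rng = np.random.default_rng(1)
x = np.linspace(-1, 1, 50)   # same fixed inputs as the previous sketch

def f(x):
    return x ** 2            # true underlying relationship (quadratic)

preds, train_err = [], []
for _ in range(1000):
    y = f(x) + rng.normal(scale=0.1, size=x.size)  # a fresh dataset

    # Complex model: degree-15 polynomial, every "feature" thrown in.
    coefs = np.polyfit(x, y, deg=15)
    preds.append(np.polyval(coefs, 0.9))           # prediction at x = 0.9
    train_err.append(np.mean((y - np.polyval(coefs, x)) ** 2))

print(f"std of prediction at x=0.9: {np.std(preds):.3f}   (high variance)")
print(f"mean training MSE:          {np.mean(train_err):.4f}  (below the 0.01 noise variance)")
```

On this toy setup, the polynomial's prediction spread at $x=0.9$ comes out several times larger than the line's, while its training error falls below the $0.1^2 = 0.01$ noise variance: the opposite trade from the previous sketch, and the reason a little bias can be worth paying for.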