Example
Linear Regression
Typical framing
$y = X\beta + \epsilon$ -- Target = data * params + error
$\epsilon_i \sim \mathcal{N}(0, \sigma^2)$ -- Error is Gaussian noise centered at 0
$y_i \sim \mathcal{N}(\beta^T x_i, \sigma^2)$ -- Each target is Gaussian around its linear prediction
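The framing above can be sketched with numpy: generate data from $y = X\beta + \epsilon$ and recover $\hat{\beta}$ by least squares (a minimal sketch on synthetic data; names like `beta_true` and the chosen sizes are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 100, 3
X = rng.normal(size=(n, p))
beta_true = np.array([2.0, -1.0, 0.5])  # illustrative "ground truth" params
sigma = 0.1

# y = X beta + eps, with eps ~ N(0, sigma^2)
y = X @ beta_true + rng.normal(scale=sigma, size=n)

# Least-squares estimate: beta_hat = argmin ||y - X beta||_2^2
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
print(beta_hat)  # close to beta_true, up to noise
```

With the noise scale this small, the recovered coefficients sit within a few hundredths of the true ones.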
Regularization
Added because the maximum-likelihood estimate $\hat{\beta}$ tends to have high variance: a small change in the training data can produce a large change in $\hat{\beta}$ (bad)
$\text{LASSO: } \hat{\beta}_{L1} = \argmin_\beta[\lVert y - X\beta \rVert_2^2 + \lambda \lVert \beta \rVert_1]$
$\text{RIDGE: } \hat{\beta}_{L2} = \argmin_\beta[\lVert y - X\beta \rVert_2^2 + \lambda \lVert \beta \rVert_2^2]$
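Ridge has the closed form $\hat{\beta}_{L2} = (X^TX + \lambda I)^{-1}X^Ty$, which makes the variance-reduction effect easy to demonstrate: increasing $\lambda$ shrinks the coefficient vector toward zero. A minimal numpy sketch (synthetic data; the `ridge` helper and the chosen $\lambda$ values are illustrative, and LASSO is omitted since it has no closed form):

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 50, 5
X = rng.normal(size=(n, p))
y = X @ np.array([3.0, 0.0, -2.0, 0.0, 1.0]) + rng.normal(scale=0.5, size=n)

def ridge(X, y, lam):
    # Closed form: beta = (X^T X + lam * I)^{-1} X^T y
    return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)

beta_ols = ridge(X, y, 0.0)   # lam = 0 recovers ordinary least squares
beta_l2 = ridge(X, y, 10.0)   # the penalty shrinks coefficients toward 0
print(np.linalg.norm(beta_l2), "<", np.linalg.norm(beta_ols))
```

The $\ell_2$ norm of the ridge solution is non-increasing in $\lambda$, which is exactly the variance control the note above motivates.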