Lecture 21: Bias Variance Tradeoff

  • the true relation between x and Y: Y is g(x) plus random error
  • each observation includes the random error \( \epsilon \), which we never see on its own
  • we only see \( Y \), the data

  • our prediction of Y is \( \hat{Y} \)
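In symbols, with g the true function (as used later in these notes):

\[ Y = g(x) + \epsilon, \qquad E[\epsilon] = 0, \quad \mathrm{Var}(\epsilon) = \sigma^2 \]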

Prediction Error

  • g is the true model; \( \epsilon \) is the random error
  • \( \hat{Y} \) (shown in red on the slide) is our prediction
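So for a fixed x, the prediction error is the gap between the observation and the prediction:

\[ Y - \hat{Y}(x) = g(x) + \epsilon - \hat{Y}(x) \]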

Model Risk

  • model risk is the expectation of the squared difference between \( Y \) and our prediction \( \hat{Y} \)

    • in practice, estimate it by taking a sample and averaging the squared errors
  • Chance Error

    • error due to randomness; it changes from sample to sample
  • Bias

    • systematic error that shows up when our model is a bad fit for the true relation
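Written out, the model risk at a fixed x is

\[ \text{model risk} = E\big[ (Y - \hat{Y}(x))^2 \big], \]

where the expectation is over both the random error in \( Y \) and the random sample used to fit \( \hat{Y} \).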

Observation Variance

  • \( \epsilon \) is random, with expectation zero and variance \( \sigma^2 \)
    • for a fixed x, g(x) is constant, so the variance of Y comes only from \( \epsilon \)
    • this \( \sigma^2 \) is called the observation variance
    • it comes from measurement error and missing information
    • it is irreducible error: no choice of model can remove it
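The one-line derivation, since g(x) is a constant for a fixed x:

\[ \mathrm{Var}(Y) = \mathrm{Var}\big( g(x) + \epsilon \big) = \mathrm{Var}(\epsilon) = \sigma^2 \]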

Chance Error

  • predictions vary a little from one fitted model to the next
  • because each model is fit on a random sample
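In symbols, the chance error is the deviation of a prediction from its average over samples:

\[ \hat{Y}(x) - E[\hat{Y}(x)] \]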

Model Variance

  • model variance measures how far predictions spread around the average prediction

  • a complex model can overfit the data, which inflates model variance
  • to reduce it, reduce model complexity
  • don't fit the noise in the sample
  • but lowering variance this way can increase bias
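Model variance is the expected squared chance error, i.e. the variance of the prediction over random samples:

\[ \text{model variance} = \mathrm{Var}\big( \hat{Y}(x) \big) = E\Big[ \big( \hat{Y}(x) - E[\hat{Y}(x)] \big)^2 \Big] \]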

Our Model Vs the Truth

  • on the slide, green is the true g; red is our fitted model, evaluated at a fixed x

Model Bias

  • model bias at a fixed x: the average model prediction minus the true g(x)
  • it is not random; it is a systematic property of the model
  • underfitting produces bias, e.g. a model that is too simple or built without domain knowledge
  • overfitting, by contrast, tends to have low bias (but high variance)

  • bias compares the average prediction \( E[\hat{Y}(x)] \) with the actual value \( g(x) \)
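In symbols:

\[ \text{model bias} = E[\hat{Y}(x)] - g(x) \]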

Decomposition of Error and Risk

  • model risk is the expected squared difference between \( Y \) and \( \hat{Y}(x) \)
  • it decomposes into the observation variance \( \sigma^2 \), the square of the model bias, and the model variance
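Putting the three pieces together:

\[ E\big[ (Y - \hat{Y}(x))^2 \big] = \sigma^2 + \big( E[\hat{Y}(x)] - g(x) \big)^2 + \mathrm{Var}\big( \hat{Y}(x) \big) \]

That is, model risk = observation variance + (model bias)^2 + model variance.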

Bias Variance Decomposition
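A minimal simulation sketch, not from the lecture: the true function true_g, the noise level, and the degree-2 polynomial model below are illustrative assumptions. It refits the model on many random samples and checks that, at one fixed x, the estimated model risk matches observation variance + bias^2 + model variance.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative assumptions (not from the lecture): the true function g,
# the noise level sigma, and a degree-2 polynomial as the fitted model.
def true_g(x):
    return np.sin(2 * x)

sigma = 0.5      # SD of the random error epsilon
n = 30           # size of each random sample
degree = 2       # complexity of the fitted model
x_fixed = 1.0    # the fixed x where we evaluate bias and variance
trials = 5000

preds = np.empty(trials)       # \hat{Y}(x_fixed) from each fitted model
sq_errors = np.empty(trials)   # squared error against a fresh observation

for t in range(trials):
    # draw a fresh random sample and fit the model on it
    x = rng.uniform(-2, 2, size=n)
    y = true_g(x) + rng.normal(0, sigma, size=n)
    coefs = np.polyfit(x, y, degree)
    preds[t] = np.polyval(coefs, x_fixed)

    # a new observation Y at x_fixed, with its own random error
    y_new = true_g(x_fixed) + rng.normal(0, sigma)
    sq_errors[t] = (y_new - preds[t]) ** 2

model_risk = sq_errors.mean()                    # E[(Y - Yhat)^2]
obs_var = sigma ** 2                             # observation variance
bias_sq = (preds.mean() - true_g(x_fixed)) ** 2  # (model bias)^2
model_var = preds.var()                          # model variance

print(f"model risk        : {model_risk:.3f}")
print(f"sum of components : {obs_var + bias_sq + model_var:.3f}")
```

With enough trials the two printed numbers should agree closely; raising degree typically shrinks the bias term and inflates the model variance term.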

Predicting by a Function with Parameters

  • the value of f is just our prediction of y
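In symbols, assuming the usual parametric setup for this part of the lecture, the prediction is the parametric function evaluated at x with the fitted parameters \( \hat{\theta} \):

\[ \hat{Y}(x) = f_{\hat{\theta}}(x) \]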