• the true relation between x and y is $$Y = g(x) + \epsilon$$
• $$\epsilon$$ is random error; we never observe it directly
• we only see $$Y$$, the data

• prediction is $$\hat{Y}$$
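A minimal simulation of this setup can make the pieces concrete. The particular $$g$$ and $$\sigma$$ below are assumptions for illustration; the notes don't specify them:

```python
import numpy as np

rng = np.random.default_rng(0)

def g(x):
    # hypothetical true model (an assumption for illustration)
    return 2 * x + 1

sigma = 0.5                                 # std dev of the noise epsilon
x = np.linspace(0, 1, 100)
eps = rng.normal(0, sigma, size=x.shape)    # random error: never observed
Y = g(x) + eps                              # the data we actually see
```

In practice only `Y` (and `x`) would be available; `g` and `eps` exist only inside the simulation.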

Prediction Error

• $$g$$ is the true model; $$\epsilon$$ is the random error
• $$\hat{Y}$$ (red in the figure) is our prediction

Model Risk

• model risk is the expected squared difference between the observation and the prediction: $$E[(Y - \hat{Y})^2]$$

• estimate it by taking a sample and averaging the squared differences
• Chance Error

• random variation from fitting on a random sample
• Bias

• systematic error when our model is wrong
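A sketch of estimating model risk by simulation at a single point. The true model, the fixed prediction `y_hat`, and $$\sigma$$ are all assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)

def g(x):
    # hypothetical true model (an assumption for illustration)
    return 2 * x + 1

sigma = 0.5
x0 = 0.5             # fixed query point
y_hat = 2.2          # some fixed prediction at x0 (assumption)

# Monte Carlo estimate of model risk E[(Y - y_hat)^2] at x0:
# draw many Y values and average the squared differences
Y = g(x0) + rng.normal(0, sigma, size=100_000)
risk = np.mean((Y - y_hat) ** 2)

# Theory: risk = sigma^2 + (y_hat - g(x0))^2 = 0.25 + 0.04 ≈ 0.29
```

Note the estimated risk exceeds $$\sigma^2$$ by exactly the squared bias of the prediction, previewing the decomposition at the end of these notes.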

Observation Variance

• $$\epsilon$$ is random with expectation zero and variance $$\sigma^2$$
• so $$\text{Var}(Y) = \text{Var}(\epsilon) = \sigma^2$$, since $$g(x)$$ is constant
• called observation variance
• comes from measurement error and missing information
• irreducible error: no model can remove it
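A quick check that the observation variance is irreducible: even the perfect prediction $$\hat{Y} = g(x)$$ still has risk $$\sigma^2$$. The value of $$g$$ at the fixed $$x$$ is an assumption here:

```python
import numpy as np

rng = np.random.default_rng(2)
sigma = 0.5
g_x0 = 2.0           # value of the true g at a fixed x (assumption)

# Y varies around g(x0) with variance sigma^2
Y = g_x0 + rng.normal(0, sigma, size=100_000)

# Even the perfect prediction y_hat = g(x0) cannot beat sigma^2:
perfect_risk = np.mean((Y - g_x0) ** 2)   # ≈ sigma^2 = 0.25
```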

Chance Error

• predictions vary a little from sample to sample
• comes from fitting the model on a random sample

Model Variance

• the variance of the prediction around its average: $$\text{Var}(\hat{Y}) = E[(\hat{Y} - E[\hat{Y}])^2]$$

• a complex model can overfit to the data, fitting the noise
• reducing model complexity reduces model variance
• the goal is to not fit the noise
• but too little complexity introduces bias
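The bullets above can be seen in a simulation: refit a simple and a complex model on many independent samples and compare how much their predictions vary at one point. The true function, noise level, and polynomial degrees are assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(3)

def g(x):
    # hypothetical true function (an assumption for illustration)
    return np.sin(2 * np.pi * x)

sigma, n, x0 = 0.3, 20, 0.5
preds = {1: [], 7: []}        # degree-1 (simple) vs degree-7 (complex) fits

for _ in range(500):          # many independent training samples
    x = rng.uniform(0, 1, n)
    y = g(x) + rng.normal(0, sigma, n)
    for deg in preds:
        # fit a polynomial of this degree, predict at the fixed x0
        preds[deg].append(np.polyval(np.polyfit(x, y, deg), x0))

var_simple = np.var(preds[1])
var_complex = np.var(preds[7])
# the complex model's predictions vary more across samples
```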

Our Model Vs the Truth

• in the figure, green is the true $$g$$; red is our fitted model, which is fixed once trained

Model Bias

• model bias is the average model prediction minus the true $$g(x)$$ at a fixed $$x$$: $$E[\hat{Y}(x)] - g(x)$$
• not random
• underfitting often comes from a too-simple model or lack of domain knowledge
• overfitting can also push predictions away from the truth

• compare the average prediction to the actual value $$g(x)$$
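Model bias can be estimated the same way: average the predictions over many refits and subtract the truth. Here a line is deliberately underfitting a sine wave; the function, noise level, and query point are assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(4)

def g(x):
    # hypothetical true function (an assumption for illustration)
    return np.sin(2 * np.pi * x)

sigma, n, x0 = 0.3, 20, 0.25    # x0 where a line badly underfits the sine
preds = []
for _ in range(2000):           # many independent training samples
    x = rng.uniform(0, 1, n)
    y = g(x) + rng.normal(0, sigma, n)
    preds.append(np.polyval(np.polyfit(x, y, 1), x0))  # too-simple line

# average prediction minus the truth: systematic, not random
bias = np.mean(preds) - g(x0)
```

No matter how many samples we average over, the bias does not shrink; only a better model class removes it.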

Decomposition of Error and Risk

• model risk is the expected squared difference $$E[(Y - \hat{Y})^2]$$
• it decomposes into the observation variance, the square of the model bias, and the model variance:

$$E[(Y - \hat{Y})^2] = \sigma^2 + (\text{model bias})^2 + \text{model variance}$$

• here the target $$f$$ is just $$Y$$
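The decomposition can be checked numerically by combining the pieces from the simulations above (same assumed sine function, noise level, and underfitting line):

```python
import numpy as np

rng = np.random.default_rng(5)

def g(x):
    # hypothetical true function (an assumption for illustration)
    return np.sin(2 * np.pi * x)

sigma, n, x0 = 0.3, 20, 0.25
preds = []
for _ in range(2000):           # refit the line on many independent samples
    x = rng.uniform(0, 1, n)
    y = g(x) + rng.normal(0, sigma, n)
    preds.append(np.polyval(np.polyfit(x, y, 1), x0))
preds = np.array(preds)

obs_var   = sigma ** 2                     # irreducible observation variance
bias_sq   = (np.mean(preds) - g(x0)) ** 2  # squared model bias
model_var = np.var(preds)                  # model variance

# Monte Carlo estimate of the full risk E[(Y - Yhat)^2] at x0,
# pairing each fitted prediction with an independent draw of Y
Y = g(x0) + rng.normal(0, sigma, size=preds.shape)
risk = np.mean((Y - preds) ** 2)

# risk ≈ obs_var + bias_sq + model_var
```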