- machine learning
- when the data is labeled you have
**supervised learning**
- when the target is quantitative do
**regression**
- when the target is categorical do
**classification**

- when the data is unlabeled you have
**unsupervised learning**
- dimensionality reduction
- clustering

- finally reinforcement learning

**binary** two classes
**multiclass** [cat, dog, car]
**structured prediction** ?

- try least squares regression

- two different coins, case by case

- single expression \( p^y(1-p)^{1-y} \)
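
A quick check, as a sketch, that the single expression reproduces the two cases: y = 1 picks out p and y = 0 picks out 1 - p. The function name `bernoulli_pmf` and the value 0.7 are made up for illustration.

```python
def bernoulli_pmf(y: int, p: float) -> float:
    """Probability of outcome y (1 or 0) when P(y = 1) = p."""
    return p ** y * (1 - p) ** (1 - y)

p = 0.7  # made-up probability of heads

# y = 1 leaves only the p factor, y = 0 leaves only the (1 - p) factor
print(bernoulli_pmf(1, p))
print(bernoulli_pmf(0, p))
```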

- to estimate the probability, find the value of p that maximizes the function
- take a log
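
A minimal sketch of the estimate, assuming a made-up list of coin flips: take the log of the product of \( p^y(1-p)^{1-y} \) terms and search a grid for the p that maximizes it. The grid search is only for illustration; for this model the maximizer is the sample mean.

```python
import math

def log_likelihood(p, flips):
    """Sum of log(p^y (1-p)^(1-y)) over observed flips (1 = heads, 0 = tails)."""
    return sum(y * math.log(p) + (1 - y) * math.log(1 - p) for y in flips)

flips = [1, 0, 1, 1, 0, 1, 1, 0, 1, 1]  # made-up data: 7 heads out of 10

# Candidate values strictly inside (0, 1), so log() is always defined
grid = [i / 1000 for i in range(1, 1000)]
p_hat = max(grid, key=lambda p: log_likelihood(p, flips))

print(p_hat)                    # grid maximizer
print(sum(flips) / len(flips))  # sample mean, for comparison
```

Minimizing the average of the negative log terms picks the same p, since dividing by the number of flips and flipping the sign does not move the optimum.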

- you can minimize the average of the negative log instead?

- what is the function in the two hard cases?

- as a loss function there is a penalty
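
To see the penalty concretely, here is a sketch of the negative log of \( p^y(1-p)^{1-y} \) used as a per-example loss. The name `log_loss` and the probabilities tried are assumptions for illustration.

```python
import math

def log_loss(y, p):
    """Negative log-likelihood of one label y (0 or 1) under predicted P(y = 1) = p."""
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

# True label is 1: the penalty grows as the predicted probability moves toward 0
for p in (0.99, 0.9, 0.5, 0.1, 0.01):
    print(p, round(log_loss(1, p), 3))
```

A confident wrong prediction is penalized far more than a hesitant one, and the loss is unbounded as p approaches the wrong extreme.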

- linear functions are not good for probabilities
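
A small sketch of the problem, assuming made-up data: fit a least-squares line to 0/1 labels and look at the range of its predictions. The helper `least_squares_line` is hypothetical and uses the closed-form slope and intercept for one feature.

```python
def least_squares_line(xs, ys):
    """Closed-form slope and intercept of the least-squares line through (xs, ys)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, my - slope * mx

xs = [0, 1, 2, 3, 4, 5, 6, 7]  # made-up feature values
ys = [0, 0, 0, 0, 1, 1, 1, 1]  # made-up binary labels

slope, intercept = least_squares_line(xs, ys)
preds = [slope * x + intercept for x in xs]

# On this data the line dips below 0 and rises above 1,
# so its outputs cannot be read as probabilities
print(min(preds), max(preds))
```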

- t can range from negative infinity to infinity
- exponentiate both sides (take e to the power of each side)
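
The step can be written out as follows, assuming t denotes the log odds \( \log\frac{p}{1-p} \) (the notes do not define t at this point, so that reading is an assumption):

\[
t = \log\frac{p}{1-p}
\;\Longrightarrow\;
e^{t} = \frac{p}{1-p}
\;\Longrightarrow\;
p = \frac{e^{t}}{1+e^{t}} = \frac{1}{1+e^{-t}}
\]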

- model probability on the real line
**sigmoid**, defined on whole line, smooth, increasing, elongated S
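
A minimal sketch of the sigmoid and its inverse, using the standard form \( 1/(1+e^{-t}) \). The function names are chosen for illustration, and this naive version can overflow in `math.exp` for very large negative t.

```python
import math

def sigmoid(t: float) -> float:
    """Map any real t to a value in (0, 1); smooth and increasing."""
    return 1.0 / (1.0 + math.exp(-t))

def logit(p: float) -> float:
    """Log odds of p, the inverse of sigmoid on (0, 1)."""
    return math.log(p / (1.0 - p))

# Values rise from near 0 to near 1, passing through 0.5 at t = 0
for t in (-6, -2, 0, 2, 6):
    print(t, round(sigmoid(t), 4))
```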

- features
- linear combo of features

- log odds
- probability is the sigmoid of the log odds
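
Putting the pieces together as a sketch: a linear combination of the features gives the log odds, and the sigmoid turns it into a probability. The weights, intercept, and feature values below are hypothetical, as is the name `predict_proba`.

```python
import math

def predict_proba(x, w, b):
    """P(Y = 1 | x): sigmoid of the linear combination w . x + b."""
    t = sum(wi * xi for wi, xi in zip(w, x)) + b  # log odds
    return 1.0 / (1.0 + math.exp(-t))

w = [0.8, -1.2]  # hypothetical weights
b = 0.1          # hypothetical intercept
x = [2.0, 0.5]   # hypothetical feature vector

# Log odds here are 0.8*2.0 - 1.2*0.5 + 0.1 = 1.1
print(predict_proba(x, w, b))
```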

- linear regression: continuous target
- categorical target: model the probability that Y is 1

- you need a little bit of uncertainty