# Lecture 13: Review Modeling and Optimization, Intro to Regression

## Human Contexts and Ethics

Imagine you are a Data Scientist on Twitter's "Trust and Safety" team.

1. Question/Problem Formation
• Fake News is a problem
• Doesn't have to be an Engineering-Focused problem!
2. Data Acquisition and Cleaning
• What data do we have, and what do we need to collect?
• President's Tweet
3. Exploratory Data Analysis
• Example: classify tweets as healthy or unhealthy
• Note biases and anomalies
4. Predictions and Inference
• What is the story? Consider the social good
• Think about who is listening and what kind of power you have

## Review Modeling and Optimization

### Models

• A model is a function $$f$$ that maps inputs $$X$$ to outputs $$Y$$.

• Parametric Models
• Have parameters, often represented as a vector
• Linear Models

• Non-Parametric Models
• Nearest Neighbor
• copy the prediction from the closest datapoint
• Really big! The "model" is the entire training set, so it grows with the size of the data
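A minimal sketch of the nearest-neighbor idea in PyTorch (the function name and the toy data are made up for illustration):

```python
import torch

def nearest_neighbor_predict(x_new, X, y):
    # 1-NN: find the closest training point and copy its prediction.
    # Note the "model" is the whole training set (X, y), so it
    # grows with the data.
    i = torch.argmin((X - x_new).abs())
    return y[i]

X = torch.tensor([1.0, 3.0, 7.0])     # training inputs
y = torch.tensor([10.0, 30.0, 70.0])  # training outputs
print(nearest_neighbor_predict(2.4, X, y))  # tensor(30.)
```

The closest training point to 2.4 is 3.0, so its label 30.0 is copied verbatim.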

• A Kernel Density Estimator has a parameter (the bandwidth), but it behaves more like a hyperparameter
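A sketch of a Gaussian KDE to make that concrete: the bandwidth is picked by hand rather than fit to the data, which is why it acts like a hyperparameter (the function name and data are illustrative):

```python
import math
import torch

def kde_density(x, data, bandwidth):
    # Average of Gaussian bumps centered at each data point.
    # bandwidth controls smoothness; it is chosen, not optimized,
    # so it behaves like a hyperparameter.
    z = (x - data) / bandwidth
    bumps = torch.exp(-0.5 * z**2) / (bandwidth * math.sqrt(2 * math.pi))
    return bumps.mean()

data = torch.tensor([0.0, 1.0, 1.5])
print(kde_density(1.0, data, bandwidth=0.5))
```

A smaller bandwidth gives a spikier estimate; a larger one smooths the bumps together.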

• Example: can predict midterm grades from homework scores
• A simple model is interpretable and summarizes the data
• A complex model can fit the data more closely but is harder to interpret

### Loss Functions

• Loss: how close is our model's prediction to the actual value?

• Average Loss: average the per-point losses over the whole dataset

• Solve it with optimization: find the $$\theta$$ (the parameters) that minimizes the loss

• $$f_\theta(x)$$ is our model; $$L(\theta)$$ is the loss function
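Concretely, in this notation the average L1 loss over $$n$$ data points is:

$$L(\theta) = \frac{1}{n} \sum_{i=1}^{n} \left| y_i - f_\theta(x_i) \right|$$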

• torch.nn.functional.l1_loss (F.l1_loss) is the equivalent built-in
• Keep everything as tensors so autograd can compute gradients
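For example (with made-up tensors), F.l1_loss defaults to the mean of the absolute errors:

```python
import torch
import torch.nn.functional as F

y_hat = torch.tensor([1.0, 2.0, 3.0])  # predictions
y = torch.tensor([2.0, 2.0, 5.0])      # actual values

# F.l1_loss averages |y_hat - y| by default (reduction="mean")
built_in = F.l1_loss(y_hat, y)
manual = (y_hat - y).abs().mean()
print(built_in, manual)  # both tensor(1.)
```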

• When building a model, subclass nn.Module: class ExponentialModel(nn.Module)
• Weights: self.w = nn.Parameter(torch.ones(2, 1))
• The initial weights are a 2×1 tensor of ones, [1, 1]
• forward defines how to make a prediction
• To evaluate:

    m = ExponentialModel()
    m(0) # returns tensor of 2
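Assembled into a runnable sketch. The notes don't give the body of forward; the form $$f(x) = w_0 + w_1 e^{x}$$ below is an assumption chosen so that m(0) returns 2 with the all-ones initial weights, and the lecture's actual model may differ:

```python
import torch
import torch.nn as nn

class ExponentialModel(nn.Module):
    def __init__(self):
        super().__init__()
        # initial weights: a 2x1 tensor of ones, [1, 1]
        self.w = nn.Parameter(torch.ones(2, 1))

    def forward(self, x):
        x = torch.as_tensor(x, dtype=torch.float32)
        # assumed form: f(x) = w0 + w1 * e^x
        return self.w[0] + self.w[1] * torch.exp(x)

m = ExponentialModel()
print(m(0))  # tensor([2.], grad_fn=<AddBackward0>)
```

Because the weights are nn.Parameter tensors, the output carries a grad_fn and autograd can differentiate the loss with respect to them.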


• In the 3D plot, the axes are $$w_0$$, $$w_1$$, and the loss; find the point that minimizes the loss.

• Example: the orange vs. yellow line and its location on the loss landscape

### Optimization of the Model

• Once you know your loss, compute the gradient (it tells you how to improve the loss)
• Gradient: maps the scalar loss to a vector; each index is the derivative with respect to one parameter
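For a two-parameter model, the gradient stacks the partial derivatives of the scalar loss:

$$\nabla_\theta L(\theta) = \begin{bmatrix} \partial L / \partial \theta_0 \\ \partial L / \partial \theta_1 \end{bmatrix}$$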

• Take the derivative and evaluate it at the current parameter values
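A minimal gradient-descent loop using autograd on the average L1 loss; the toy data, the one-parameter model y = w·x, and the learning rate are made up for illustration:

```python
import torch
import torch.nn.functional as F

# toy data generated by y = 2x
x = torch.tensor([1.0, 2.0, 3.0])
y = torch.tensor([2.0, 4.0, 6.0])

w = torch.tensor(0.5, requires_grad=True)  # single parameter
lr = 0.01

for _ in range(200):
    loss = F.l1_loss(w * x, y)   # average L1 loss
    loss.backward()              # autograd computes dL/dw into w.grad
    with torch.no_grad():
        w -= lr * w.grad         # step opposite the gradient
    w.grad.zero_()               # clear for the next iteration

print(w)  # close to 2.0
```

With L1 loss the gradient's magnitude doesn't shrink near the minimum, so the iterate oscillates within one step size of the optimum rather than converging exactly.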