Lecture 16: Least Squares Regression in SKLearn

  • Our data!
import numpy as np

# Prepend a column of ones so the model gets a bias (intercept) term
def add_ones_column(X):
    return np.hstack([np.ones((X.shape[0], 1)), X])

add_ones_column(X)

# Solve for theta_hat using the normal-equation solver defined earlier
theta_hat = least_squares_by_solve(add_ones_column(X), Y)

def model_append_ones(X):
    return add_ones_column(X) @ theta_hat

import plotly.graph_objects as go

def plot_plane(f, X, grid_points=30):
    # Grid of test points spanning the range of the data
    u = np.linspace(X[:, 0].min(), X[:, 0].max(), grid_points)
    v = np.linspace(X[:, 1].min(), X[:, 1].max(), grid_points)
    xu, xv = np.meshgrid(u, v)
    X_grid = np.vstack((xu.flatten(), xv.flatten())).transpose()
    z = f(X_grid)
    return go.Surface(x=xu, y=xv, z=z.reshape(xu.shape), opacity=0.8)

fig = go.Figure()
fig.add_trace(data_scatter)
fig.add_trace(plot_plane(model_append_ones, X))
fig.update_layout(margin=dict(l=0, r=0, t=0, b=0), 
                  height=600)

Scikit Learn

# The API
model = SuperCoolModelType(args)

# train
model.fit(df[['X1', 'X2']], df[['Y']])

# predict!
model.predict(df2[['X1', 'X2']])

## Linear Regression
from sklearn.linear_model import LinearRegression

model = LinearRegression(fit_intercept=True)  # fit a bias term, so the plane need not pass through the origin
model.fit(synth_data[["X1", "X2"]], synth_data[["Y"]])

# predict
synth_data['Y_hat'] = model.predict(synth_data[["X1", "X2"]])
synth_data

Looks good!
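As a quick check, the fitted parameters live on the model object (intercept_ and coef_ are LinearRegression's actual attributes):

# The learned parameters: one weight per feature, plus the intercept
model.intercept_  # bias term
model.coef_       # weights for X1 and X2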

Hyper-Parameters

Let's try kernel ridge regression:

from sklearn.kernel_ridge import KernelRidge
super_model = KernelRidge(kernel="rbf")
super_model.fit(synth_data[["X1", "X2"]], synth_data[["Y"]])

fig = go.Figure()
fig.add_trace(data_scatter)
fig.add_trace(plot_plane(super_model.predict, X))
fig.update_layout(margin=dict(l=0, r=0, t=0, b=0), 
                  height=600)

Curvy Dude!
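The RBF kernel's behavior is controlled by hyper-parameters: alpha (regularization strength) and gamma (kernel width), both real KernelRidge parameters. A minimal sketch of tuning them by cross-validation with sklearn's GridSearchCV; the grid values are illustrative:

from sklearn.model_selection import GridSearchCV

# Try every (alpha, gamma) combination and keep the best by 5-fold CV score
search = GridSearchCV(
    KernelRidge(kernel="rbf"),
    param_grid={"alpha": [0.01, 0.1, 1.0], "gamma": [0.1, 1.0, 10.0]},
    cv=5,
)
search.fit(synth_data[["X1", "X2"]], synth_data[["Y"]])
search.best_params_  # the (alpha, gamma) pair with the best CV score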

Feature Functions

  • A feature function φ maps each raw input into p features

  • Non-linear relationships become linear in the transformed features, so the linear model machinery still applies
  • Feature engineering covers:
    • non-linear transformations
    • encoding categorical variables as numbers
    • assembling the results into the covariate (design) matrix

One-Hot Encoding

  • Encode each category as its own 0/1 column instead of assigning integers (e.g., Alabama = 1, ..., Hawaii = 50), which would imply a false ordering; see the sketch after this list

  • Bag-of-words with n-grams for text features
  • The resulting matrix is high dimensional and sparse

  • n-grams preserve word order that single words lose, e.g., 2-grams like "book well", "well enjoy"
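A minimal sketch of both encodings; the state column and review string are made-up examples, while pd.get_dummies and CountVectorizer are the real pandas/sklearn tools:

import pandas as pd
from sklearn.feature_extraction.text import CountVectorizer

# One-hot encoding: one 0/1 column per category, no implied order
states = pd.DataFrame({"state": ["Alabama", "Hawaii", "Alabama"]})
pd.get_dummies(states["state"])

# Bag-of-words with 1-grams and 2-grams: high dimensional and sparse
vectorizer = CountVectorizer(ngram_range=(1, 2))
counts = vectorizer.fit_transform(["I enjoyed the book well"])
vectorizer.get_feature_names_out()  # includes 2-grams like "book well"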

Domain Knowledge

  • Use what you know about the domain: if you know the quantity spikes in winter, add an isWinter indicator feature so the model can capture it (one-liner below)
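A one-line sketch, assuming a hypothetical DataFrame df with a month column:

# 1 for December-February, 0 otherwise (hypothetical column names)
df["isWinter"] = df["month"].isin([12, 1, 2]).astype(int)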

Constant Feature

  • Adding a constant column of 1s gives the model a bias (intercept) parameter

  • Feature functions themselves have no learned parameters; all the weights stay in the linear model (see the usage sketch after the code below)
# Stack the raw feature with sinusoids of several frequencies and phases
def phi_periodic(X):
    return np.hstack([
        X,
        np.sin(X),
        np.sin(10 * X),
        np.sin(20 * X),
        np.sin(X + 1),
        np.sin(10 * X + 1),
        np.sin(20 * X + 1),
    ])
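A minimal sketch of fitting the LinearRegression model from earlier on these features; the 1-D array x and targets y are assumed to come from the lecture's data:

# phi_periodic has no parameters of its own; all seven weights
# are learned by the linear model. (x, y assumed from earlier.)
periodic_model = LinearRegression()
periodic_model.fit(phi_periodic(x.reshape(-1, 1)), y)
periodic_model.coef_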
    

  • After fitting, some features get substantial weight while others end up barely used (near-zero weights)