STATS 315 HW4
Homework 4: From Data To Model
In this homework we will focus on going from the data to a model that generalizes well.
import tensorflow as tf
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers
from matplotlib import pyplot as plt
from tensorflow.keras import regularizers
from sklearn.datasets import fetch_california_housing
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from tensorflow.keras.utils import to_categorical
from tensorflow.keras.datasets import mnist
Part 1: Warmup
We start off again with the California housing dataset.
california_housing = fetch_california_housing(return_X_y=True, as_frame=True)
X = california_housing[0]
y = california_housing[1]
X_train_unscaled, X_test_unscaled, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)
sc = StandardScaler()
X_train = sc.fit_transform(X_train_unscaled)
X_test = sc.transform(X_test_unscaled)
Question 1 (9 pts):
Recreate the regression model from HW3 using tf.keras.Sequential. You must use the same number of epochs and the same batch size; for the optimizer, you may use RMSprop with the same learning rate as in HW3.
"""A function that returns a compiled model that matches the regression
problem in HW3.
Args: None
Returns: A tf.keras.Sequential model that has been compiled"""
#your code here
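A hedged sketch of one possible implementation follows; the 64-unit hidden layer and the 0.001 learning rate are placeholders, and you should substitute the architecture, learning rate, and loss from your own HW3 answer.
def q1():
    # placeholder architecture; match your HW3 regression model
    model = keras.Sequential([
        layers.Dense(64, activation="relu", input_shape=(X_train.shape[1],)),
        layers.Dense(1),  # single linear output for regression
    ])
    model.compile(optimizer=keras.optimizers.RMSprop(learning_rate=0.001),
                  loss="mse")
    return model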
model_q1 = q1()
model_q1.fit(X_train, y_train, epochs=100, batch_size=1000)
Part 2: MNIST to Model
Now we shall switch to the MNIST dataset.
(X, y_int), _ = mnist.load_data()
X = X.reshape(X.shape[0], -1)
y_one_hot = to_categorical(y_int, num_classes=10)
X_train_unscaled, X_test_unscaled, y_train, y_test = train_test_split(X, y_one_hot, test_size=0.3, random_state=42)
sc = StandardScaler()
X_train = sc.fit_transform(X_train_unscaled)
X_test = sc.transform(X_test_unscaled)
X_train, X_valid, y_train, y_valid = train_test_split(X_train, y_train, test_size=0.2, random_state=42)
Question 2 (9 pts):
Create a model that still underfits the data after at least 50 epochs. (Use cross-entropy for the loss.)
"""A function that returns a compiled model that underfits the data.
Args: None
Returns: A tf.keras.Sequential model that has been compiled"""
#your code here
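One way to guarantee underfitting is to give the network far too little capacity; the single 4-unit hidden layer below is an illustrative placeholder choice.
def q2():
    # a deliberately tiny network that cannot fit MNIST well
    model = keras.Sequential([
        layers.Dense(4, activation="relu", input_shape=(X_train.shape[1],)),
        layers.Dense(10, activation="softmax"),
    ])
    model.compile(optimizer="rmsprop",
                  loss="categorical_crossentropy",  # labels are one-hot
                  metrics=["accuracy"])
    return model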
model_q2 = q2()
history_q2 = model_q2.fit(X_train, y_train, epochs=50, batch_size=2048, validation_data=(X_valid, y_valid))
val_loss = history_q2.history["val_loss"]
plt.plot(np.arange(1, len(val_loss) + 1), val_loss)
Question 3 (9 pts):
Take the model from question 2 and change its architecture and tweak its hyperparameters until it is able to overfit the data. (The resulting model does not need to resemble your answer to question 2.)
"""A function that returns a compiled model that overfits the data.
Args: None
Returns: A tf.keras.Sequential model that has been compiled"""
#your code here
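To overfit, do the opposite: give the network ample capacity and no regularization. The layer widths and the ep/bs values below are placeholders to tune until the validation loss clearly turns upward.
def q3():
    # a deliberately large, unregularized network
    model = keras.Sequential([
        layers.Dense(512, activation="relu", input_shape=(X_train.shape[1],)),
        layers.Dense(512, activation="relu"),
        layers.Dense(10, activation="softmax"),
    ])
    model.compile(optimizer="rmsprop",
                  loss="categorical_crossentropy",
                  metrics=["accuracy"])
    return model

ep, bs = 50, 256  # placeholder epoch and batch-size choices used in the fits below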
model_q3 = q3()
history_q3 = model_q3.fit(X_train, y_train, epochs=ep, batch_size=bs, validation_data=(X_valid, y_valid))
# plotting the validation loss
val_loss = history_q3.history["val_loss"]
plt.plot(np.arange(1, len(val_loss) + 1), val_loss)
Question 4 (9 pts):
From your model in question 3, what do you think is the best stopping point? Note your answer below and explain why you chose it. Then use Keras early stopping on your model from question 3 to have Keras find it automatically. Hint: see the TensorFlow docs to learn how to use the EarlyStopping callback.
Write what you think here
#your code goes here
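One possible setup is sketched below; the patience value is a placeholder, and restore_best_weights=True rolls the model back to the weights from the best epoch.
# stop once val_loss has not improved for `patience` consecutive epochs
es = keras.callbacks.EarlyStopping(monitor="val_loss", patience=5,
                                   restore_best_weights=True)
model_q4 = q3()  # same architecture as in question 3
history_q4 = model_q4.fit(X_train, y_train, epochs=ep, batch_size=bs,
                          validation_data=(X_valid, y_valid), callbacks=[es])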
#plotting the validation loss
val_loss = history_q4.history["val_loss"]
plt.plot(np.arange(1, len(val_loss) + 1), val_loss)
Question 5 (9 pts):
Starting with your model from question 3, try to regularize it by reducing the size of the network. Describe the process you went through to settle on your model.
Write here
"""A function that returns a compiled model that has been reduced in size.
Args: None
Returns: A tf.keras.Sequential model that has been compiled"""
#your code here
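A sketch of one possible answer; the 32-unit width is a placeholder you would reach by repeatedly shrinking the question 3 model while watching the validation loss.
def q5():
    # the question 3 model shrunk to a single small hidden layer
    model = keras.Sequential([
        layers.Dense(32, activation="relu", input_shape=(X_train.shape[1],)),
        layers.Dense(10, activation="softmax"),
    ])
    model.compile(optimizer="rmsprop",
                  loss="categorical_crossentropy",
                  metrics=["accuracy"])
    return model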
model_q5 = q5()
history_q5 = model_q5.fit(X_train, y_train, epochs=ep, batch_size=bs, validation_data=(X_valid, y_valid))
# plotting the validation loss compared to the original model
val_loss_q5 = history_q5.history["val_loss"]
val_loss_q3 = history_q3.history["val_loss"]
plt.plot(np.arange(1, len(val_loss_q3) + 1), val_loss_q3, label="original")
plt.plot(np.arange(1, len(val_loss_q5) + 1), val_loss_q5, label="reduced size")
plt.legend()
Question 6 (9 pts):
Starting with your model from question 3, try to regularize it by using L1 regularization. Describe the process you went through to settle on your model.
Write here
"""A function that returns a compiled model that has been regularized using L1
regularization.
Args: None
Returns: A tf.keras.Sequential model that has been compiled"""
#your code here
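A sketch assuming the question 3 architecture with an L1 penalty on each kernel; the 1e-4 coefficient is a placeholder to tune. Tracking categorical_crossentropy as a metric matters here: with a weight penalty, val_loss includes the penalty term, so the tracked metric is what stays comparable to the original model.
def q6():
    l1 = regularizers.l1(1e-4)  # placeholder penalty strength
    model = keras.Sequential([
        layers.Dense(512, activation="relu", kernel_regularizer=l1,
                     input_shape=(X_train.shape[1],)),
        layers.Dense(512, activation="relu", kernel_regularizer=l1),
        layers.Dense(10, activation="softmax"),
    ])
    model.compile(optimizer="rmsprop",
                  loss="categorical_crossentropy",
                  metrics=["categorical_crossentropy"])  # penalty-free curve
    return model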
model_q6 = q6()
history_q6 = model_q6.fit(X_train, y_train, epochs=ep, batch_size=bs, validation_data=(X_valid, y_valid))
val_loss_q3 = history_q3.history["val_loss"]
plt.plot(np.arange(1, len(val_loss_q3) + 1), val_loss_q3, label="original")
# with L1 regularization, val_loss includes the penalty term, so compare the
# tracked crossentropy metric instead
val_loss_q6 = history_q6.history["val_categorical_crossentropy"]
plt.plot(np.arange(1, len(val_loss_q6) + 1), val_loss_q6, label="L1 reg")
plt.legend()
Question 7 (9 pts):
Starting with your model from question 3, try to regularize it by using L2 regularization. Describe the process you went through to settle on your model.
Write here
"""A function that returns a compiled model that has been regularized using L2
regularization.
Args: None
Returns: A tf.keras.Sequential model that has been compiled"""
#your code here
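A sketch mirroring question 6 but with an L2 penalty; again, the 1e-4 coefficient is a placeholder to tune.
def q7():
    l2 = regularizers.l2(1e-4)  # placeholder penalty strength
    model = keras.Sequential([
        layers.Dense(512, activation="relu", kernel_regularizer=l2,
                     input_shape=(X_train.shape[1],)),
        layers.Dense(512, activation="relu", kernel_regularizer=l2),
        layers.Dense(10, activation="softmax"),
    ])
    model.compile(optimizer="rmsprop",
                  loss="categorical_crossentropy",
                  metrics=["categorical_crossentropy"])  # penalty-free curve
    return model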
model_q7 = q7()
history_q7 = model_q7.fit(X_train, y_train, epochs=ep, batch_size=bs, validation_data=(X_valid, y_valid))
val_loss_q3 = history_q3.history["val_loss"]
plt.plot(np.arange(1, len(val_loss_q3) + 1), val_loss_q3, label="original")
val_loss_q7 = history_q7.history["val_categorical_crossentropy"]
plt.plot(np.arange(1, len(val_loss_q7) + 1), val_loss_q7, label="L2 reg")
plt.legend()
Question 8 (9 pts):
Starting with your model from question 3, try to regularize it by using dropout. Describe the process you went through to settle on your model.
Write here
"""A function that returns a compiled model that has been regularized using
dropout.
Args: None
Returns: A tf.keras.Sequential model that has been compiled"""
#your code here
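A sketch that inserts a Dropout layer after each hidden layer of the question 3 model; the 0.5 rate is a placeholder to tune.
def q8():
    model = keras.Sequential([
        layers.Dense(512, activation="relu", input_shape=(X_train.shape[1],)),
        layers.Dropout(0.5),  # placeholder dropout rate
        layers.Dense(512, activation="relu"),
        layers.Dropout(0.5),
        layers.Dense(10, activation="softmax"),
    ])
    model.compile(optimizer="rmsprop",
                  loss="categorical_crossentropy",
                  metrics=["accuracy"])
    return model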
model_q8 = q8()
history_q8 = model_q8.fit(X_train, y_train, epochs=ep, batch_size=bs, validation_data=(X_valid, y_valid))
val_loss_q3 = history_q3.history["val_loss"]
plt.plot(np.arange(1, len(val_loss_q3) + 1), val_loss_q3, label="original")
val_loss_q8 = history_q8.history["val_loss"]
plt.plot(np.arange(1, len(val_loss_q8) + 1), val_loss_q8, label="dropout")
plt.legend()
Question 9 (9 pts):
Starting with your model from question 3, try to regularize it using a combination of the methods from questions 5-8. Make changes to the hyperparameters and have the model stop at a good epoch. Describe the process you went through to settle on your model.
Write here
"""A function that returns a compiled model that has been regularized using
various methods.
Args: None
Returns: A tf.keras.Sequential model that has been compiled"""
#your code here
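A sketch combining a smaller network, L2 penalties, and dropout; the fit below also reuses the EarlyStopping callback es from question 4. Every width, rate, and coefficient here is a placeholder to tune.
def q9():
    model = keras.Sequential([
        layers.Dense(128, activation="relu",
                     kernel_regularizer=regularizers.l2(1e-4),
                     input_shape=(X_train.shape[1],)),
        layers.Dropout(0.3),
        layers.Dense(10, activation="softmax"),
    ])
    model.compile(optimizer="rmsprop",
                  loss="categorical_crossentropy",
                  metrics=["categorical_crossentropy"])  # penalty-free curve
    return model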
model_q9 = q9()
history_q9 = model_q9.fit(X_train, y_train, epochs=50, batch_size=256, validation_data=(X_valid, y_valid), callbacks=[es])
val_loss_q3 = history_q3.history["val_loss"]
plt.plot(np.arange(1, len(val_loss_q3) + 1), val_loss_q3, label="original")
val_loss_q9 = history_q9.history["val_categorical_crossentropy"]
plt.plot(np.arange(1, len(val_loss_q9) + 1), val_loss_q9, label="dropout/L2 reg/reduced size/early stop")
plt.legend()
Question 10 (9 pts):
Take the models from questions 2 through 9 and find their test loss.
# the models whose test loss we will compute
models = [model_q2, model_q3, model_q4, model_q5, model_q6, model_q7, model_q8,
          model_q9]
#your code here
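One way to do this is with model.evaluate on the held-out test split; the loop below handles models compiled with or without extra metrics.
names = ["q2", "q3", "q4", "q5", "q6", "q7", "q8", "q9"]
for name, model in zip(names, models):
    result = model.evaluate(X_test, y_test, verbose=0)
    # evaluate returns a scalar loss, or [loss, *metrics] when metrics were set
    loss = result[0] if isinstance(result, list) else result
    print(f"model_{name} test loss: {loss:.4f}")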
Question 11 (10 pts):
If you had to use one of these models, which one would you use and why?
Write your answer here