# Homework 5
With this in mind, we decide to standardize or one-hot encode all features in this section.
- We encourage you, though, to try raw features on your own to see how their performance matches your expectations!
One additional step we perform is to standardize the output values.
- Note that we did not have to worry about this in a classification context, as all outputs were ±1.
- In a regression context, standardizing the output values can yield practical performance gains, again due to better numerical behavior of learning algorithms on data in a good magnitude range.
The metric we will use to measure the quality of our learned predictors is Root Mean Square Error (RMSE).
- This is a useful metric because it gives a sense of the deviation in the natural units of the predictor.
- RMSE is defined as follows:

$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left( y^{(i)} - f(x^{(i)}) \right)^2 }$

- where $f$ is our learned predictor; in this case, $f(x) = \theta \cdot x + \theta_0$.
- This gives a measure of how far the true values are from the predicted values; we are interested in this value, measured in units of mpg.
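As a quick sketch, the RMSE formula above can be computed with NumPy as follows (the `rmse` helper name is our own, not part of the provided code):

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root Mean Square Error between true and predicted values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    # Square the residuals, average them, then take the square root.
    return np.sqrt(np.mean((y_true - y_pred) ** 2))
```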
**Note**: One very important thing to keep in mind when employing standardization is that we need to reverse the standardization when we want to report results.
- If we standardize output values in the training set by subtracting $μ$ and dividing by $σ$, we need to take care to:
1. Perform standardization with the same values of $μ$ and $σ$ on the test set (Why?) before predicting outputs using our learned predictor.
2. Multiply the RMSE calculated on the test set by a factor of $σ$ to report test error. (Why?)
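The two steps above can be sketched as follows. Note that `std_y` here is our own simplified illustration, not necessarily the exact signature of the provided `hw5.std_y`:

```python
import numpy as np

def std_y(y):
    """Standardize outputs; return standardized values plus (mu, sigma).

    Illustrative sketch only; the provided hw5.std_y may differ in signature.
    """
    mu, sigma = np.mean(y), np.std(y)
    return (y - mu) / sigma, mu, sigma

# Hypothetical training outputs in mpg:
y_train = np.array([18.0, 25.0, 30.0, 15.0])
y_train_std, mu, sigma = std_y(y_train)

# Step 1: reuse the SAME training mu and sigma on the test set.
y_test = np.array([22.0, 28.0])
y_test_std = (y_test - mu) / sigma

# Step 2: an RMSE computed in standardized units converts back to mpg
# by multiplying by sigma (the units were divided out by sigma earlier).
rmse_std = 0.5            # hypothetical RMSE in standardized y units
rmse_mpg = rmse_std * sigma
```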
Given all of this, we now will try using:
- Two choices of feature set:
```python
[cylinders=standard, displacement=standard, horsepower=standard, weight=standard, acceleration=standard, origin=one_hot]
```
```python
[cylinders=one_hot, displacement=standard, horsepower=standard, weight=standard, acceleration=standard, origin=one_hot]
```
- Polynomial features of orders 1–3 (we will construct the polynomial features after having standardized the input data).
- Different choices of the regularization parameter, $λ$.
- Although, ideally, you would run a grid search over a large range of $λ$, we will ask you to look at the choices
1. $λ={0.0,0.01,0.02,⋯,0.1}$ for polynomial features of orders $1$ and $2$,
2. $λ={0,20,40,⋯,200}$ for polynomial features of order $3$
- as these are approximately the ranges where we found the optimal $λ$ to lie
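For concreteness, here is a sketch of what the `standard` and `one_hot` feature encodings compute, along with the two $λ$ grids above written in NumPy. The helper names and signatures are illustrative, not necessarily those used in the provided code:

```python
import numpy as np

def standard(x, mu, sigma):
    """Standardize a raw feature value using a given mean and std."""
    return (x - mu) / sigma

def one_hot(value, values):
    """Encode `value` as a one-hot vector over the discrete set `values`."""
    vec = np.zeros(len(values))
    vec[values.index(value)] = 1.0
    return vec

# The two regularization grids from the problem statement:
lams_orders_1_2 = np.linspace(0.0, 0.1, 11)  # 0.0, 0.01, ..., 0.1
lams_order_3 = np.arange(0, 201, 20)         # 0, 20, ..., 200
```

For example, if a discrete feature takes the three values `[1, 2, 3]`, then `one_hot(2, [1, 2, 3])` produces the vector `[0, 1, 0]`. (`np.linspace` is used for the fractional grid because float-step `np.arange` can misplace the endpoint.)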
- We will use $10$-fold cross-validation to try all possible combinations of these feature choices and test which is best.
- We have attached a code file with some predefined methods that will be useful to you here.
- Alternatively, a Google Colab link may be found here.
- If you choose to use the code file, a more detailed description of the roles of the files is below:
- The file `code_for_hw5.py` contains functions, some of which will need to be filled in with your definitions from this homework.
- Your functions are then called by `ridge_min`, defined for you, which takes a dataset $(X,y)$ and a hyperparameter $λ$ as input, and returns the $θ$ and $θ_0$ minimizing the ridge regression objective using SGD. This is the analogue of the `svm_min` function that you wrote for homework last week.
- The learning rate and number of iterations are fixed in this function, and should not be modified for the purpose of answering the questions below, although you should feel free to experiment with these if you are interested!
- This function will then further be called by `xval_learning_alg`, also defined for you in the same file, which returns the average RMSE across all (here, 10) splits of your data when performing cross-validation.
- Note that this RMSE is reported in standardized $y$ units; to convert it to RMSE in mpg ^[miles per gallon], you should multiply it by the sigma returned by the `hw5.std_y` function call.
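To make the cross-validation step concrete, here is a self-contained sketch of $k$-fold cross-validation returning average RMSE. The interface is our own simplification for illustration, not the actual signature of the provided `xval_learning_alg`:

```python
import numpy as np

def xval_rmse(X, y, fit, k=10):
    """Sketch of k-fold cross-validation returning the average RMSE.

    `fit` takes (X_train, y_train) and returns a predictor function;
    this interface is a simplification of the provided xval_learning_alg.
    """
    folds = np.array_split(np.arange(y.shape[0]), k)
    errs = []
    for i in range(k):
        test_idx = folds[i]
        # Train on all folds except the held-out one.
        train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        predict = fit(X[train_idx], y[train_idx])
        resid = y[test_idx] - predict(X[test_idx])
        errs.append(np.sqrt(np.mean(resid ** 2)))
    return float(np.mean(errs))
```

With a perfect predictor the returned average RMSE is zero; in practice `fit` would wrap the ridge-regression training step.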
- The file `auto.py` will be used to implement the auto data regression.
- The file contains code for creating the two feature sets that you are asked to work with here.
- After transforming those features further with `make_polynomial_feature_fun` and running the cross-validation function (both from `code_for_hw5.py`; the latter uses your implementations), you should be able to answer the following questions: