# Homework 5
With this in mind, we decide to standardize or one-hot encode all features in this section.
- We encourage you, though, to try raw features on your own to see how their performance matches your expectations!
One additional step we perform is to standardize the output values.
- Note that we did not have to worry about this in a classification context, as all outputs were ±1.
- In a regression context, standardizing the output values can yield practical performance gains, again due to better numerical behavior of learning algorithms on data in a good magnitude range.
The metric we will use to measure the quality of our learned predictors is Root Mean Square Error (RMSE).
- This is a useful metric because it gives a sense of the deviation in the natural units of the predictor.
- RMSE is defined as follows:

$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left( y^{(i)} - f(x^{(i)}) \right)^2 }$

- where $f$ is our learned predictor; in this case, $f(x) = \theta \cdot x + \theta_0$.
- This gives a measure of how far the true values are from the predicted values; we are interested in this value, measured in units of mpg.
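As a quick sketch, the RMSE formula above can be computed with NumPy as follows (the `rmse` helper name is our own, not part of the provided code):

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root Mean Square Error between true and predicted values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    # Square the residuals, average them, then take the square root.
    return np.sqrt(np.mean((y_true - y_pred) ** 2))
```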
**Note**: One very important thing to keep in mind when employing standardization is that we need to reverse the standardization when we want to report results.
- If we standardize output values in the training set by subtracting $μ$ and dividing by $σ$, we need to take care to:
1. Perform standardization with the same values of $μ$ and $σ$ on the test set (Why?) before predicting outputs using our learned predictor.
2. Multiply the RMSE calculated on the test set by a factor of $σ$ to report test error. (Why?)
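The two steps above can be sketched as follows. Note that `std_y` here is our own simplified illustration, not necessarily the exact signature of the provided `hw5.std_y`:

```python
import numpy as np

def std_y(y):
    """Standardize outputs; return standardized values plus (mu, sigma).

    Illustrative sketch only; the provided hw5.std_y may differ in signature.
    """
    mu, sigma = np.mean(y), np.std(y)
    return (y - mu) / sigma, mu, sigma

# Hypothetical training outputs in mpg:
y_train = np.array([18.0, 25.0, 30.0, 15.0])
y_train_std, mu, sigma = std_y(y_train)

# Step 1: reuse the SAME training mu and sigma on the test set.
y_test = np.array([22.0, 28.0])
y_test_std = (y_test - mu) / sigma

# Step 2: an RMSE computed in standardized units converts back to mpg
# by multiplying by sigma (the units were divided out by sigma earlier).
rmse_std = 0.5            # hypothetical RMSE in standardized y units
rmse_mpg = rmse_std * sigma
```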
Given all of this, we now will try using:
- Two choices of feature set:
```python
[cylinders=standard, displacement=standard, horsepower=standard, weight=standard, acceleration=standard, origin=one_hot]
```
```python
[cylinders=one_hot, displacement=standard, horsepower=standard, weight=standard, acceleration=standard, origin=one_hot]
```
- Polynomial features of orders 1–3 (we will construct the polynomial features after having standardized the input data).
- Different choices of the regularization parameter, $λ$.
- Although, ideally, you would run a grid search over a large range of $λ$, we will ask you to look at the choices
1. $λ={0.0,0.01,0.02,⋯,0.1}$ for polynomial features of orders $1$ and $2$,
2. $λ={0,20,40,⋯,200}$ for polynomial features of order $3$
- as these are approximately the ranges where we found the optimal $λ$ to lie
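For concreteness, here is a sketch of what the `standard` and `one_hot` feature encodings compute, along with the two $λ$ grids above written in NumPy. The helper names and signatures are illustrative, not necessarily those used in the provided code:

```python
import numpy as np

def standard(x, mu, sigma):
    """Standardize a raw feature value using a given mean and std."""
    return (x - mu) / sigma

def one_hot(value, values):
    """Encode `value` as a one-hot vector over the discrete set `values`."""
    vec = np.zeros(len(values))
    vec[values.index(value)] = 1.0
    return vec

# The two regularization grids from the problem statement:
lams_orders_1_2 = np.linspace(0.0, 0.1, 11)  # 0.0, 0.01, ..., 0.1
lams_order_3 = np.arange(0, 201, 20)         # 0, 20, ..., 200
```

For example, if a discrete feature takes the three values `[1, 2, 3]`, then `one_hot(2, [1, 2, 3])` produces the vector `[0, 1, 0]`. (`np.linspace` is used for the fractional grid because float-step `np.arange` can misplace the endpoint.)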
- We will use $10$-fold cross-validation to try all possible combinations of these feature choices and test which is best.
- We have attached a code file with some predefined methods that will be useful to you here.
- Alternatively, a Google Colab link may be found here.
- If you choose to use the code file, a more detailed description of the roles of the files is below:
- The file `code_for_hw5.py` contains functions, some of which will need to be filled in with your definitions from this homework.
- Your functions are then called by `ridge_min`, defined for you, which takes a dataset $(X,y)$ and a hyperparameter $λ$ as input, and returns the $θ$ and $θ_0$ minimizing the ridge regression objective using SGD. This is the analogue of the `svm_min` function that you wrote for homework last week.
- The learning rate and number of iterations are fixed in this function, and should not be modified for the purpose of answering the questions below, although you should feel free to experiment with these if you are interested!
- This function will then further be called by `xval_learning_alg`, also defined for you in the same file, which returns the average RMSE across all (here, 10) splits of your data when performing cross-validation.
- Note that this RMSE is reported in standardized $y$ units; to convert it to RMSE in mpg ^[miles per gallon], you should multiply it by the sigma returned by the `hw5.std_y` function call.
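To make the cross-validation step concrete, here is a self-contained sketch of $k$-fold cross-validation returning average RMSE. The interface is our own simplification for illustration, not the actual signature of the provided `xval_learning_alg`:

```python
import numpy as np

def xval_rmse(X, y, fit, k=10):
    """Sketch of k-fold cross-validation returning the average RMSE.

    `fit` takes (X_train, y_train) and returns a predictor function;
    this interface is a simplification of the provided xval_learning_alg.
    """
    folds = np.array_split(np.arange(y.shape[0]), k)
    errs = []
    for i in range(k):
        test_idx = folds[i]
        # Train on all folds except the held-out one.
        train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        predict = fit(X[train_idx], y[train_idx])
        resid = y[test_idx] - predict(X[test_idx])
        errs.append(np.sqrt(np.mean(resid ** 2)))
    return float(np.mean(errs))
```

With a perfect predictor the returned average RMSE is zero; in practice `fit` would wrap the ridge-regression training step.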
- The file `auto.py` will be used to implement the auto data regression.
- The file contains code for creating the two feature sets that you are asked to work with here.
- After transforming those features further with `make_polynomial_feature_fun` and running the cross-validation function (both from `code_for_hw5.py`; the latter uses your implementations), you should be able to answer the following questions: