# Homework 5
With this in mind, we will standardize or one-hot encode all features.
We encourage you to try raw features on your own to see how they perform.
One additional step is to standardize the output values.
In classification, we did not worry about this because all outputs were ±1.
In regression, standardizing outputs can give practical performance gains.
Learning algorithms generally behave better on data whose values fall in a moderate range, e.g., centered near zero with roughly unit scale.
The metric we will use is Root Mean Square Error (RMSE).
RMSE gives a sense of the typical deviation, in the natural units of the quantity being predicted.
RMSE is defined as:
```math
\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (y^{(i)} - f(x^{(i)}))^2 }
```
where $f$ is our learned predictor.
In this case: $f(x) = \theta \cdot x + \theta_0$.
This measures how far the true values are from the predicted values.
We are interested in this value, measured in units of mpg.
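To make this concrete, here is a minimal NumPy sketch of RMSE for a linear predictor; it assumes the column convention where data points are the columns of a $d \times n$ array, and the names `lin_reg` and `rmse` are illustrative rather than taken from the provided code.

```python
import numpy as np

def lin_reg(x, th, th0):
    # Linear predictor f(x) = th . x + th0, applied to every column of
    # a d x n data array x; th is d x 1 and th0 is a scalar.
    return np.dot(th.T, x) + th0

def rmse(x, y, th, th0):
    # Root mean square error between predictions and the true outputs y (1 x n).
    return np.sqrt(np.mean((y - lin_reg(x, th, th0)) ** 2))
```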
**Note:** When standardizing, we need to reverse the standardization to report results.
* If we standardize outputs in the training set by subtracting $\mu$ and dividing by $\sigma$, we need to:
  1. Perform the same standardization, with the training $\mu$ and $\sigma$, on the test set before predicting outputs.
  2. Multiply the RMSE computed on the test set by $\sigma$ to report the test error in the original units.
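A minimal sketch of this bookkeeping, assuming `y_train` and `y_test` are 1-D NumPy arrays and that `rmse_std` was computed on the standardized test outputs (all names here are illustrative):

```python
import numpy as np

# Statistics come from the *training* outputs only.
mu, sigma = y_train.mean(), y_train.std()

# 1. Standardize train and test outputs with the same mu and sigma.
y_train_std = (y_train - mu) / sigma
y_test_std = (y_test - mu) / sigma

# ... fit on the standardized training data, then compute rmse_std on
# the standardized test data ...

# 2. Scale back: the shift by mu cancels inside the squared differences,
# so multiplying the standardized RMSE by sigma recovers mpg units.
rmse_mpg = rmse_std * sigma
```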
We will try using:
**Two choices of feature set:**
```python
[cylinders=standard, displacement=standard, horsepower=standard,
weight=standard, acceleration=standard, origin=one_hot]
```
```python
[cylinders=one_hot, displacement=standard, horsepower=standard,
weight=standard, acceleration=standard, origin=one_hot]
```
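Here, `standard` means replacing a raw value by its z-score and `one_hot` means an indicator encoding of a discrete value. A minimal sketch of the two encodings (the helper names and signatures are hypothetical, not the homework's functions):

```python
import numpy as np

def standard(v, mu, sigma):
    # Z-score a raw feature value using training-set statistics.
    return (v - mu) / sigma

def one_hot(v, values):
    # Indicator encoding over the list of possible values, e.g.
    # one_hot(2, [1, 2, 3]) -> array([0., 1., 0.]).
    vec = np.zeros(len(values))
    vec[values.index(v)] = 1.0
    return vec
```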
**Polynomial features** (computed after standardizing) of orders 1–3; a sketch follows the $\lambda$ list below.
**Regularization parameter $\lambda$:**
* Orders 1 and 2: $\lambda \in \{0.0, 0.01, 0.02, \dots, 0.1\}$
* Order 3: $\lambda \in \{0, 20, 40, \dots, 200\}$
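A polynomial transform of order $k$ maps each feature vector to all monomials of its entries with degree 1 through $k$. Below is a minimal sketch of such a transform, in the style of the `make_polynomial_feature_fun` used later in `auto.py`; the details (column-data convention, omission of the constant term) are assumptions, not the course code.

```python
import itertools
import numpy as np

def make_polynomial_feature_fun(order):
    # Given an order k, return a function mapping a d x n data array to
    # its polynomial features: for every column, every product of entries
    # with total degree 1..k.  (Degree 0 is omitted here on the assumption
    # that the offset th0 plays that role.)
    def f(raw):
        d, n = raw.shape
        cols = []
        for j in range(n):
            x = raw[:, j]
            feats = [np.prod(x[list(combo)])
                     for k in range(1, order + 1)
                     for combo in itertools.combinations_with_replacement(range(d), k)]
            cols.append(feats)
        return np.array(cols).T  # shape: (number of monomials) x n
    return f
```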
We will use 10-fold cross-validation to evaluate every combination of feature set, polynomial order, and $\lambda$.
The file `code_for_hw5.py` provides supporting functions to be used together with your homework definitions.
`ridge_min` takes a dataset $(X, y)$ and a regularization parameter $\lambda$, and returns the $\theta$ and $\theta_0$ that minimize the ridge regression objective, using SGD.
The learning rate and number of iterations are fixed. You should not modify them.
This function is called by `xval_learning_alg`, which returns average RMSE across 10 splits.
The RMSE is reported in standardized $y$ units; multiply by $\sigma$ to convert to mpg.
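Putting the pieces together, here is a hedged sketch of the experiment loop. The container `feature_sets`, the scalar `sigma`, and the exact signatures of `make_polynomial_feature_fun` and `xval_learning_alg` are assumptions based on the descriptions above, not the provided code.

```python
import numpy as np

# Hypothetical lambda grids, matching the choices listed above.
lambdas = {1: np.arange(0.0, 0.101, 0.01),
           2: np.arange(0.0, 0.101, 0.01),
           3: np.arange(0, 201, 20)}

best_key, best_err = None, np.inf
for name, (X, y) in feature_sets.items():      # y standardized, X encoded
    for order in (1, 2, 3):
        X_poly = make_polynomial_feature_fun(order)(X)
        for lam in lambdas[order]:
            # Assumed signature: average RMSE over 10 folds, in
            # standardized y units.
            err = xval_learning_alg(X_poly, y, lam, 10)
            if err < best_err:
                best_key, best_err = (name, order, lam), err

print(best_key, best_err * sigma)  # convert the winning RMSE back to mpg
```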
The file `auto.py` runs the regression experiments on the Auto MPG data.
It creates the two feature sets, transforms them with `make_polynomial_feature_fun`,
and runs cross-validation using your functions.