Commit 53016fab authored by Jack Meyers

two models, neither are great so going to look at polynomial and additive

parent 1dade4b6
@@ -142,15 +142,33 @@ numeric_model = lm(AnnualRate ~ ., payroll_numeric)
summary(numeric_model)
```
While that model performed alright, the goal was to use some of the categorical predictors to also inform the regression. We ran a backward search using AIC with all of the factor predictors in order to find the factor variables which influence the regression.
While that model performed reasonably well, the goal was to also use some of the categorical predictors to inform the regression. We attempted to use all of the factor variables in the regression, but because of the size of the factor variables the full model was too much to fit, so we restricted it to a subset of employment and demographic factors.
```{r}
payroll_factor = payroll %>% select_if(~class(.) == 'factor')
payroll_factor$AnnualRate = payroll$AnnualRate
factor_model = lm(AnnualRate ~ ., payroll_factor)
summary(reduced_factor_model)
employment_demographic_factor_data = subset(payroll, select = c(`AnnualRate`, `DeptID`, `NameSuffix`, `ChkOption`, `JobIndicator`, `EthnicGrp`, `Sex`, `Full/Part`, `UnionDescr`, `Agency`, `City`))
factor_model = lm(AnnualRate ~ ., employment_demographic_factor_data)
summary(factor_model)
```
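One way to see why a regression on every factor variable grows so quickly is to count the levels of each factor, since each one contributes (levels - 1) dummy columns to the design matrix. The sketch below is illustrative only: `toy` and its values are made up and stand in for `payroll`, which is not reproduced here.

```r
# Toy data frame; the values are invented purely for illustration.
toy = data.frame(
  AnnualRate = c(52000, 61000, 48000, 75000, 66000, 58000),
  Sex = factor(c("F", "M", "F", "M", "F", "M")),
  City = factor(c("A", "B", "C", "D", "E", "F"))
)

# Number of levels in each factor column.
level_counts = sapply(Filter(is.factor, toy), nlevels)
level_counts

# Coefficients a full dummy-coded model needs:
# one intercept plus (levels - 1) per factor.
n_coef = 1 + sum(level_counts - 1)
n_coef

# Cross-check against the design matrix that lm() would build.
ncol(model.matrix(AnnualRate ~ ., toy))
```

In this toy case the model needs more coefficients than there are rows, so `lm()` could not estimate all of them. Running the same count on the real factor columns would show which ones dominate the model size.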
Overall, neither model looked good once we calculated the `LOOCV RMSE`, so we needed to change them.
```{r}
calc_loocv_rmse = function(model) {
  # LOOCV RMSE via the leave-one-out shortcut: residual / (1 - leverage)
  sqrt(mean((resid(model) / (1 - hatvalues(model))) ^ 2))
}
calc_loocv_rmse(factor_model)
calc_loocv_rmse(numeric_model)
```
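For a least-squares `lm` fit, dividing each residual by one minus its leverage gives the leave-one-out prediction error without refitting. The sketch below checks that shortcut against explicit refits. It uses the built-in `mtcars` data as a stand-in, since the payroll data is not shown here, and the formula `mpg ~ wt + hp` is chosen only for illustration.

```r
# Same helper as in the document: LOOCV RMSE from residuals and leverages.
calc_loocv_rmse = function(model) {
  sqrt(mean((resid(model) / (1 - hatvalues(model))) ^ 2))
}

# Stand-in model on a built-in dataset (not the payroll data).
fit = lm(mpg ~ wt + hp, data = mtcars)

# Brute force: refit n times, each time predicting the one held-out row.
loo_errors = sapply(seq_len(nrow(mtcars)), function(i) {
  fit_i = lm(mpg ~ wt + hp, data = mtcars[-i, ])
  mtcars$mpg[i] - predict(fit_i, newdata = mtcars[i, , drop = FALSE])
})
brute_force_rmse = sqrt(mean(loo_errors ^ 2))

# The two values should agree up to floating-point error.
all.equal(calc_loocv_rmse(fit), brute_force_rmse)
```

One caveat when applying this to a factor model: an observation that is the only member of its factor level has leverage 1, so the shortcut divides by zero for that row. A non-finite result from `calc_loocv_rmse(factor_model)` may point to sparse levels rather than to a poor fit.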
The next step was to improve the basic models by adding interactions and polynomial terms.
```{r}
```
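As a rough sketch of what adding interactions and polynomial terms could look like, the example below compares three formulas by LOOCV RMSE. It again uses `mtcars` as a stand-in; the predictors `wt` and `hp` and the chosen degree are assumptions for illustration, not the terms used for the payroll models.

```r
# Same LOOCV RMSE helper as defined in the document.
calc_loocv_rmse = function(model) {
  sqrt(mean((resid(model) / (1 - hatvalues(model))) ^ 2))
}

# Additive baseline: main effects only.
fit_additive = lm(mpg ~ wt + hp, data = mtcars)

# Interaction: wt * hp expands to wt + hp + wt:hp.
fit_interaction = lm(mpg ~ wt * hp, data = mtcars)

# Polynomial: degree-2 orthogonal polynomial in wt, plus hp.
fit_poly = lm(mpg ~ poly(wt, 2) + hp, data = mtcars)

# Compare the candidates on the same metric; lower is better.
rmse = sapply(
  list(additive = fit_additive,
       interaction = fit_interaction,
       polynomial = fit_poly),
  calc_loocv_rmse
)
rmse
```

Comparing on LOOCV RMSE rather than in-sample fit keeps added terms from being rewarded just for increasing model flexibility.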
**Exploring Collinearity and Correlation of Predictors**