
Penalized Regression in R

Last Updated on August 22, 2019

In this post, you will discover 3 recipes for penalized regression in R.

You can copy and paste the recipes in this post to jump-start your own problem, or use them to learn and practice penalized linear regression in R.

Discover how to prepare data, fit machine learning models, and evaluate their predictions in R with my new book, including 14 step-by-step tutorials, 3 projects, and full source code.

Let’s get started.

Penalized Regression

Photo by Bay Area Bias, some rights reserved.

Each example in this post uses the longley dataset provided in the datasets package that ships with R. The longley dataset describes 7 economic variables, observed annually from 1947 to 1962, used to predict the number of people employed each year.
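
As a quick orientation, the minimal sketch below (base R only) prints the dimensions and first rows of the dataset; columns 1 to 6 are used as predictors and column 7, Employed, as the response.

# load the dataset that ships with R
data(longley)
# 16 yearly observations of 7 variables (1947 to 1962)
print(dim(longley))
# peek at the first rows; Employed (column 7) is the response
print(head(longley))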

Ridge Regression

Ridge Regression creates a linear regression model that is penalized with the L2-norm, which is the sum of the squared coefficients. This has the effect of shrinking the coefficient values (and the complexity of the model), allowing coefficients with a minor contribution to the response to get close to zero.

# load the package
library(glmnet)
# load data
data(longley)
x <- as.matrix(longley[,1:6])
y <- as.matrix(longley[,7])
# fit model
fit <- glmnet(x, y, family="gaussian", alpha=0, lambda=0.001)
# summarize the fit
summary(fit)
# make predictions
predictions <- predict(fit, x, type="link")
# summarize accuracy
mse <- mean((y - predictions)^2)
print(mse)
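
The lambda above is fixed arbitrarily at 0.001. As a sketch of a more principled approach, the cv.glmnet function in the same package can choose lambda by cross-validation; lambda.min is the value with the lowest cross-validated error.

# a sketch: select lambda for ridge (alpha=0) by cross-validation
cv <- cv.glmnet(x, y, family="gaussian", alpha=0)
# lambda with the minimum mean cross-validated error
print(cv$lambda.min)
# refit the model using the selected lambda
fit <- glmnet(x, y, family="gaussian", alpha=0, lambda=cv$lambda.min)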


Learn about the glmnet function in the glmnet package.


Least Absolute Shrinkage and Selection Operator

Least Absolute Shrinkage and Selection Operator (LASSO) creates a regression model that is penalized with the L1-norm, which is the sum of the absolute coefficients. This has the effect of shrinking coefficient values (and the complexity of the model), allowing coefficients with a minor effect on the response to become exactly zero.

# load the package
library(lars)
# load data
data(longley)
x <- as.matrix(longley[,1:6])
y <- as.matrix(longley[,7])
# fit model
fit <- lars(x, y, type="lasso")
# summarize the fit
summary(fit)
# select the step with the minimum error (lars steps are 0-indexed)
best_step <- which.min(fit$RSS) - 1
# make predictions
predictions <- predict(fit, x, s=best_step, type="fit", mode="step")$fit
# summarize accuracy
mse <- mean((y - predictions)^2)
print(mse)
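
Note that picking the step with the minimum RSS favors the least-regularized end of the path. As a sketch of a more principled alternative, the package's cv.lars function can choose the shrinkage fraction by cross-validation:

# a sketch: choose the LASSO shrinkage fraction by 10-fold cross-validation
cv <- cv.lars(x, y, K=10, type="lasso", mode="fraction", plot.it=FALSE)
best_fraction <- cv$index[which.min(cv$cv)]
# predict using the cross-validated fraction
predictions <- predict(fit, x, s=best_fraction, type="fit", mode="fraction")$fit
print(mean((y - predictions)^2))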


Learn about the lars function in the lars package.

Elastic Net

Elastic Net creates a regression model that is penalized with both the L1-norm and the L2-norm. This has the effect of both shrinking coefficient values (as in ridge regression) and setting some coefficients to zero (as in LASSO).

# load the package
library(glmnet)
# load data
data(longley)
x <- as.matrix(longley[,1:6])
y <- as.matrix(longley[,7])
# fit model
fit <- glmnet(x, y, family="gaussian", alpha=0.5, lambda=0.001)
# summarize the fit
summary(fit)
# make predictions
predictions <- predict(fit, x, type="link")
# summarize accuracy
mse <- mean((y - predictions)^2)
print(mse)
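
The alpha=0.5 and lambda=0.001 above are fixed arbitrarily. As a sketch, both can be tuned by cross-validating lambda at a few alpha values with cv.glmnet and comparing the errors:

# a sketch: cross-validate lambda at several alpha values and compare
for (a in c(0, 0.25, 0.5, 0.75, 1)) {
  cv <- cv.glmnet(x, y, family="gaussian", alpha=a)
  cat("alpha =", a, "lambda.min =", cv$lambda.min, "cv error =", min(cv$cvm), "\n")
}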


Learn about the glmnet function in the glmnet package.

Summary

In this post you discovered 3 recipes for penalized regression in R.

Penalization is a powerful method for attribute selection and for improving the accuracy of predictive models. For more information, see Chapter 6 of Applied Predictive Modeling by Kuhn and Johnson, which provides an excellent introduction to linear regression with R for beginners.
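
As a final illustration of attribute selection, the sketch below fits a LASSO with glmnet (alpha=1; the lambda=0.1 is an arbitrary value for illustration, not a tuned one) and lists the predictors whose coefficients were driven exactly to zero:

# a sketch: fit a LASSO and list predictors with coefficients set to zero
library(glmnet)
data(longley)
x <- as.matrix(longley[,1:6])
y <- as.matrix(longley[,7])
fit <- glmnet(x, y, family="gaussian", alpha=1, lambda=0.1)
# intercept plus one coefficient per predictor
coefs <- as.matrix(coef(fit))
print(rownames(coefs)[as.vector(coefs == 0)])

Any predictor listed here has effectively been removed from the model by the penalty.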
