Last Updated on August 22, 2019

When you are building a predictive model, you need a way to evaluate the capability of the model on unseen data.

This is typically done by estimating accuracy using data that was not used to train the model, such as a test set, or using cross validation. The caret package in R provides a number of methods for estimating the accuracy of a machine learning algorithm.

In this post you will discover 5 approaches for estimating model performance on unseen data. You will also find recipes in R using the caret package for each method that you can copy and paste into your own project, right now.

Discover how to prepare data, fit machine learning models and evaluate their predictions in R with my new book, including 14 step-by-step tutorials, 3 projects, and full source code.

Let’s get started.

## Estimating Model Accuracy

We have considered model accuracy before in the configuration of test options in a test harness. You can read more in the post: How To Choose The Right Test Options When Evaluating Machine Learning Algorithms.

In this post you are going to discover 5 different methods that you can use to estimate model accuracy.

They are as follows and each will be described in turn:

- Data Split
- Bootstrap
- k-fold Cross Validation
- Repeated k-fold Cross Validation
- Leave One Out Cross Validation

Generally, I would recommend Repeated k-fold Cross Validation, but each method has its features and benefits, especially when the amount of data or space and time complexity are considered. Consider which approach best suits your problem.

## Data Split

Data splitting involves partitioning the data into an explicit training dataset used to prepare the model and an unseen test dataset used to evaluate the model's performance on unseen data.

It is useful when you have a very large dataset, so that the test dataset can provide a meaningful estimate of performance, or when you are using slow methods and need a quick approximation of performance.

The example below splits the iris dataset so that 80% is used for training a Naive Bayes model and 20% is used to evaluate the model's performance.

```r
# load the libraries
library(caret)
library(klaR)
# load the iris dataset
data(iris)
# define an 80%/20% train/test split of the dataset
split <- 0.80
trainIndex <- createDataPartition(iris$Species, p=split, list=FALSE)
data_train <- iris[trainIndex,]
data_test <- iris[-trainIndex,]
# train a naive bayes model
model <- NaiveBayes(Species~., data=data_train)
# make predictions
x_test <- data_test[,1:4]
y_test <- data_test[,5]
predictions <- predict(model, x_test)
# summarize results
confusionMatrix(predictions$class, y_test)
```
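If you need the accuracy as a number rather than a printed table, the `confusionMatrix()` result is a list that can be indexed directly. This is a small sketch that assumes the objects (`predictions`, `y_test`) from the data-split example above already exist in your session:

```r
# Sketch: pull numeric summaries out of the confusionMatrix() result
# (assumes `predictions` and `y_test` from the example above exist).
cm <- confusionMatrix(predictions$class, y_test)
cm$overall["Accuracy"]  # point estimate of test-set accuracy
cm$overall["Kappa"]     # agreement corrected for chance
cm$table                # the raw confusion matrix counts
```

Indexing the result like this is handy when you want to record accuracy across many runs programmatically.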

## Bootstrap

Bootstrap resampling involves taking random samples from the dataset (with replacement) against which to evaluate the model. In aggregate, the results provide an indication of the variance of the model's performance. Typically, a large number of resampling iterations are performed (thousands or tens of thousands).
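To make the "with replacement" part concrete, here is a minimal base R sketch of a single bootstrap resample. The rows that are never drawn (the "out-of-bag" rows) can serve as the evaluation set for that iteration:

```r
# One bootstrap resample of the iris rows (illustrative sketch only).
data(iris)
set.seed(7)                                # make the draw reproducible
n <- nrow(iris)
boot_idx <- sample(n, n, replace=TRUE)     # rows drawn with replacement
oob_idx  <- setdiff(seq_len(n), boot_idx)  # rows never drawn: out-of-bag
length(unique(boot_idx)) / n               # roughly 0.632 on average
```

Because sampling is with replacement, some rows appear multiple times in the resample while about a third are left out, which is what gives each iteration a "free" evaluation set.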

The following example uses the bootstrap with 100 resamples to prepare a Naive Bayes model.

```r
# load the library
library(caret)
# load the iris dataset
data(iris)
# define training control
train_control <- trainControl(method="boot", number=100)
# train the model
model <- train(Species~., data=iris, trControl=train_control, method="nb")
# summarize results
print(model)
```

## k-fold Cross Validation

The k-fold cross validation method involves splitting the dataset into k subsets. Each subset is held out in turn while the model is trained on all other subsets. This process is repeated until an accuracy estimate has been produced for each instance in the dataset, from which an overall accuracy estimate is computed.

It is a robust method for estimating accuracy, and the size of k can tune the amount of bias in the estimate, with popular values set to 3, 5, 7 and 10.
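You can see how the k subsets are formed using caret's `createFolds()`. This sketch only builds the fold index vectors, it does not train anything:

```r
# Sketch: build 10 fold index vectors with caret's createFolds().
library(caret)
data(iris)
set.seed(7)
folds <- createFolds(iris$Species, k=10)  # list of 10 held-out index sets
sapply(folds, length)                     # about 15 instances per fold
# every instance is held out in exactly one fold:
length(unique(unlist(folds))) == nrow(iris)
```

Each element of `folds` is the test set for one iteration; the model is trained on the remaining nine folds.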

The following example uses 10-fold cross validation to estimate the accuracy of Naive Bayes on the iris dataset.

```r
# load the library
library(caret)
# load the iris dataset
data(iris)
# define training control
train_control <- trainControl(method="cv", number=10)
# fix the parameters of the algorithm
grid <- expand.grid(.fL=c(0), .usekernel=c(FALSE))
# train the model
model <- train(Species~., data=iris, trControl=train_control, method="nb", tuneGrid=grid)
# summarize results
print(model)
```

## Repeated k-fold Cross Validation

The process of splitting the data into k folds can be repeated a number of times; this is called Repeated k-fold Cross Validation. The final model accuracy is taken as the mean across all repeats.

The following example uses 10-fold cross validation with 3 repeats to estimate the accuracy of Naive Bayes on the iris dataset.

```r
# load the library
library(caret)
# load the iris dataset
data(iris)
# define training control
train_control <- trainControl(method="repeatedcv", number=10, repeats=3)
# train the model
model <- train(Species~., data=iris, trControl=train_control, method="nb")
# summarize results
print(model)
```
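The averaged estimate that `print(model)` reports can be unpacked: caret stores one result row per fold per repeat in `model$resample`. This sketch assumes `model` was fit with the repeated cross validation control shown above:

```r
# Sketch: inspect the individual resampling results behind the mean
# (assumes `model` from the repeated k-fold example above exists).
nrow(model$resample)            # 10 folds x 3 repeats = 30 rows
head(model$resample)            # Accuracy and Kappa for each fold/repeat
mean(model$resample$Accuracy)   # the accuracy reported by print(model)
```

Looking at the spread of `model$resample$Accuracy` (for example with `sd()`) gives you a feel for how stable the estimate is across repeats.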

## Leave One Out Cross Validation

In Leave One Out Cross Validation (LOOCV), a single data instance is left out and a model is constructed on all other data instances in the training set. This is repeated for all data instances.
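The mechanics are easy to reproduce by hand. The following base R sketch runs LOOCV for a simple linear model on the built-in cars dataset; it uses a regression example for brevity and is an illustration of the procedure, not caret's implementation:

```r
# Manual LOOCV sketch: refit n times, each time predicting the single
# held-out row, then average the squared errors.
data(cars)                 # 50 rows: speed vs stopping distance
n <- nrow(cars)
errs <- sapply(seq_len(n), function(i) {
  fit <- lm(dist ~ speed, data=cars[-i, ])           # train on n-1 rows
  pred <- predict(fit, newdata=cars[i, , drop=FALSE])
  (pred - cars$dist[i])^2                            # held-out squared error
})
mean(errs)                 # LOOCV estimate of mean squared error
```

Note the cost: the model is refit once per instance, which is why LOOCV is usually reserved for small datasets or cheap models.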

The following example demonstrates LOOCV to estimate the accuracy of Naive Bayes on the iris dataset.

```r
# load the library
library(caret)
# load the iris dataset
data(iris)
# define training control
train_control <- trainControl(method="LOOCV")
# train the model
model <- train(Species~., data=iris, trControl=train_control, method="nb")
# summarize results
print(model)
```

## Summary

In this post you discovered 5 different methods that you can use to estimate the accuracy of your model on unseen data.

Those methods were: Data Split, Bootstrap, k-fold Cross Validation, Repeated k-fold Cross Validation, and Leave One Out Cross Validation.

You can learn more about the caret package in R at the caret package homepage and the caret package CRAN page. If you would like to master the caret package, I would recommend the book written by the author of the package, titled: Applied Predictive Modeling, especially Chapter 4 on overfitting models.
