Last Updated on October 28, 2019
White noise is an important concept in time series forecasting.
If a time series is white noise, it is a sequence of random numbers and cannot be predicted. If the series of forecast errors is not white noise, it suggests improvements could be made to the predictive model.
In this tutorial, you will discover white noise time series with Python.
After completing this tutorial, you will know:
- The definition of a white noise time series and why it matters.
- How to check if your time series is white noise.
- Statistics and diagnostic plots to identify white noise in Python.
Let’s get started.
- Updated Sept/2019: Updated examples to use latest API.
- Updated Oct/2019: Made the check for white noise clearer (thanks Samuel Corradi)
What is a White Noise Time Series?
A time series may be white noise.
A time series is white noise if the variables are independent and identically distributed with a mean of zero.
This means that all variables have the same variance (sigma^2) and each value has a zero correlation with all other values in the series.
If the variables in the series are drawn from a Gaussian distribution, the series is called Gaussian white noise.
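To make the distinction concrete, the sketch below draws two series that both fit the definition above, only one of which is Gaussian. It is a minimal illustration, not part of the original tutorial code: the uniform range of plus or minus sqrt(3) is an assumption chosen here so that both series have a variance of 1.

```python
from math import sqrt
from random import gauss, seed, uniform
from statistics import mean, pvariance

# seed random number generator
seed(1)
n = 1000
# Gaussian white noise: independent draws with mean 0 and variance 1
gaussian_noise = [gauss(0.0, 1.0) for _ in range(n)]
# non-Gaussian white noise: independent uniform draws on [-sqrt(3), sqrt(3)],
# which also have mean 0 and variance 1 (range chosen for this illustration)
half_width = sqrt(3.0)
uniform_noise = [uniform(-half_width, half_width) for _ in range(n)]
# both series should show a mean near 0 and a variance near 1
for name, values in (('gaussian', gaussian_noise), ('uniform', uniform_noise)):
    print('%s: mean=%.3f, variance=%.3f' % (name, mean(values), pvariance(values)))
```

Both series are white noise under the definition used here; only the first would be called Gaussian white noise.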
Why Does it Matter?
White noise is an important concept in time series analysis and forecasting.
It is important for two main reasons:
- Predictability: If your time series is white noise, then, by definition, it is random. You cannot reasonably model it and make predictions.
- Model Diagnostics: The series of errors from a time series forecast model should ideally be white noise.
Model Diagnostics is an important area of time series forecasting.
Time series data are expected to contain some white noise component on top of the signal generated by the underlying process.
For example:
y(t) = signal(t) + noise(t)
Once predictions have been made by a time series forecast model, they can be collected and analyzed. The series of forecast errors should ideally be white noise.
When forecast errors are white noise, it means that all of the signal information in the time series has been harnessed by the model in order to make predictions. All that is left are the random fluctuations that cannot be modeled.
A sign that the forecast errors are not white noise is an indication that further improvements to the forecast model may be possible.
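The idea can be sketched with a contrived series built as signal plus noise. Everything here is an assumption made for illustration: the signal is a hypothetical linear trend, "model A" always predicts the overall mean, and "model B" is an idealized model that recovers the signal exactly. The helper autocorr() is a plain sample autocorrelation written for this sketch, not a library function.

```python
from random import gauss, seed

def autocorr(values, lag):
    """Sample autocorrelation of a list of numbers at the given lag."""
    n = len(values)
    mean = sum(values) / n
    denom = sum((v - mean) ** 2 for v in values)
    numer = sum((values[i] - mean) * (values[i + lag] - mean) for i in range(n - lag))
    return numer / denom

# seed random number generator
seed(1)
n = 1000
# hypothetical signal: a slow linear trend (assumption for illustration)
signal = [0.01 * t for t in range(n)]
noise = [gauss(0.0, 1.0) for _ in range(n)]
# y(t) = signal(t) + noise(t)
y = [s + e for s, e in zip(signal, noise)]
# model A ignores the trend and always predicts the overall mean
mean_y = sum(y) / n
errors_a = [obs - mean_y for obs in y]
# model B captures the signal exactly, so its errors are just the noise
errors_b = [obs - s for obs, s in zip(y, signal)]
# lag-1 autocorrelation of each error series
r1_a = autocorr(errors_a, 1)
r1_b = autocorr(errors_b, 1)
print('lag-1 autocorrelation of errors, mean model: %.3f' % r1_a)
print('lag-1 autocorrelation of errors, ideal model: %.3f' % r1_b)
```

The errors of model A should be strongly autocorrelated because the trend is still in them, which is the cue that the model can be improved. The errors of model B should show a lag-1 autocorrelation close to zero.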
Is your Time Series White Noise?
Your time series is probably NOT white noise if the answer to one or more of the following questions is yes:
- Is the mean/level non-zero?
- Does the mean/level change over time?
- Does the variance change over time?
- Do values correlate with lag values?
Some tools that you can use to check if your time series is white noise are:
- Create a line plot. Check for gross features like a changing mean, variance, or obvious relationship between lagged variables.
- Calculate summary statistics. Check the mean and variance of the whole series against the mean and variance of meaningful contiguous blocks of values in the series (e.g. days, months, or years).
- Create an autocorrelation plot. Check for gross correlation between lagged variables.
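For contrast, the sketch below applies the summary-statistic and autocorrelation checks to a series that is clearly not white noise. The random walk, the block size of 250, and the autocorr() helper are all choices made for this illustration rather than part of the tutorial's own code.

```python
from random import gauss, seed
from statistics import mean, pvariance

def autocorr(values, lag):
    """Sample autocorrelation of a list of numbers at the given lag."""
    n = len(values)
    avg = sum(values) / n
    denom = sum((v - avg) ** 2 for v in values)
    numer = sum((values[i] - avg) * (values[i + lag] - avg) for i in range(n - lag))
    return numer / denom

# seed random number generator
seed(1)
# random walk: each value is the previous value plus Gaussian noise
walk = [0.0]
for _ in range(999):
    walk.append(walk[-1] + gauss(0.0, 1.0))
# compare mean and variance across contiguous blocks of values
block_size = 250
block_means = []
for start in range(0, len(walk), block_size):
    block = walk[start:start + block_size]
    block_means.append(mean(block))
    print('block %d: mean=%.3f, variance=%.3f' % (start // block_size, mean(block), pvariance(block)))
# check correlation between each value and the previous value
r1 = autocorr(walk, 1)
print('lag-1 autocorrelation: %.3f' % r1)
```

For a random walk, the block means typically drift apart and the lag-1 autocorrelation sits close to 1, so it fails the checks in the list above.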
Example of White Noise Time Series
In this section, we will create a Gaussian white noise series in Python and perform some checks.
It is helpful to create and review a white noise time series in practice. It provides a frame of reference, with example plots and statistics that you can compare against your own time series to check whether they are white noise.
Firstly, we can create a list of 1,000 random Gaussian variables using the gauss() function from the random module.
We will draw variables from a Gaussian distribution with a mean (mu) of 0.0 and a standard deviation (sigma) of 1.0.
Once created, we can wrap the list in a Pandas Series for convenience.
from random import gauss
from random import seed
from pandas import Series
from pandas.plotting import autocorrelation_plot
from matplotlib import pyplot
# seed random number generator
seed(1)
# create white noise series
series = [gauss(0.0, 1.0) for i in range(1000)]
series = Series(series)
Next, we can calculate and print some summary statistics, including the mean and standard deviation of the series.
# summary stats
print(series.describe())
Given that we defined the mean and standard deviation when drawing the random numbers, there should be no surprises.
count 1000.000000
mean -0.013222
std 1.003685
min -2.961214
25% -0.684192
50% -0.010934
75% 0.703915
max 2.737260
We can see that the mean is nearly 0.0 and the standard deviation is nearly 1.0. Some deviation from these values is expected given the small size of the sample.
If we had more data, it might be more interesting to split the series in half and calculate and compare the summary statistics for each half. We would expect to see a similar mean and standard deviation for each sub-series.
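Even with only 1,000 values, the split-half comparison can be sketched as follows. This version is a standalone illustration that uses the standard library's statistics module instead of Pandas; the variable names are chosen here and do not come from the listing above.

```python
from random import gauss, seed
from statistics import mean, stdev

# seed random number generator
seed(1)
# create white noise values
values = [gauss(0.0, 1.0) for _ in range(1000)]
# split the values into two contiguous halves
half = len(values) // 2
first, second = values[:half], values[half:]
# compare the summary statistics of each half
print('first half:  mean=%.3f, std=%.3f' % (mean(first), stdev(first)))
print('second half: mean=%.3f, std=%.3f' % (mean(second), stdev(second)))
```

For white noise, both halves should report a mean near 0.0 and a standard deviation near 1.0. A large gap between the halves would suggest the mean or variance changes over time.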
Now we can create some plots, starting with a line plot of the series.
# line plot
series.plot()
pyplot.show()
The line plot shows that the series does appear to be random.
We can also create a histogram and confirm the distribution is Gaussian.
# histogram plot
series.hist()
pyplot.show()
Indeed, the histogram shows the tell-tale bell-curve shape.
Finally, we can create a correlogram and check for any autocorrelation with lag variables.
# autocorrelation
autocorrelation_plot(series)
pyplot.show()
The correlogram does not show any obvious autocorrelation pattern.
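A few spikes rise above the 95% and 99% confidence levels, but these are a statistical fluke: with this many lags plotted, a small fraction is expected to cross the bands by chance even when the series is white noise.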
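One rough way to check that such spikes are consistent with chance is to count how many lags fall outside the 95% band. The sketch below makes several assumptions for illustration: it uses the common large-sample approximation of 1.96 / sqrt(n) for the band, it only looks at the first 100 lags (an arbitrary cutoff), and autocorr() is a helper written here rather than a library call.

```python
from math import sqrt
from random import gauss, seed

def autocorr(values, lag):
    """Sample autocorrelation of a list of numbers at the given lag."""
    n = len(values)
    mean = sum(values) / n
    denom = sum((v - mean) ** 2 for v in values)
    numer = sum((values[i] - mean) * (values[i + lag] - mean) for i in range(n - lag))
    return numer / denom

# seed random number generator
seed(1)
n = 1000
values = [gauss(0.0, 1.0) for _ in range(n)]
# approximate 95% band for the autocorrelation of white noise
bound_95 = 1.96 / sqrt(n)
# collect the lags whose autocorrelation falls outside the band
max_lag = 100
exceed = [lag for lag in range(1, max_lag + 1) if abs(autocorr(values, lag)) > bound_95]
print('95%% bound: +/-%.3f' % bound_95)
print('lags outside the bound: %d of %d' % (len(exceed), max_lag))
```

For white noise, roughly 5 in every 100 lags would be expected to land outside the 95% band by chance alone. A count far above that would be a reason to doubt that the series is white noise.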
For completeness, the full code listing is provided below.
from random import gauss
from random import seed
from pandas import Series
from pandas.plotting import autocorrelation_plot
from matplotlib import pyplot
# seed random number generator
seed(1)
# create white noise series
series = [gauss(0.0, 1.0) for i in range(1000)]
series = Series(series)
# summary stats
print(series.describe())
# line plot
series.plot()
pyplot.show()
# histogram plot
series.hist()
pyplot.show()
# autocorrelation
autocorrelation_plot(series)
pyplot.show()
Summary
In this tutorial, you discovered white noise time series in Python.
Specifically, you learned:
- A white noise time series is defined by a zero mean, constant variance, and zero correlation.
- If your time series is white noise, it cannot be predicted, and if your forecast residuals are not white noise, you may be able to improve your model.
- The statistics and diagnostic plots you can use on your time series to check if it is white noise.
Do you have any questions about this tutorial? Ask your questions in the comments below and I will do my best to answer.