Last Updated on November 28, 2019
What You Will Learn
Quick-reference guide to the 17 statistical hypothesis tests that you need in
applied machine learning, with sample code in Python.
Although there are hundreds of statistical hypothesis tests that you could use, there is only a small subset that you may need to use in a machine learning project.
In this post, you will discover a cheat sheet for the most popular statistical hypothesis tests for a machine learning project with examples using the Python API.
Each statistical test is presented in a consistent way, including:
 The name of the test.
 What the test is checking.
 The key assumptions of the test.
 How the test result is interpreted.
 Python API for using the test.
Note, when it comes to assumptions such as the expected distribution of data or sample size, the results of a given test are likely to degrade gracefully rather than become immediately unusable if an assumption is violated.
Generally, data samples need to be representative of the domain and large enough to expose their distribution to analysis.
In some cases, the data can be corrected to meet the assumptions, such as correcting a nearly normal distribution to be normal by removing outliers, or using a correction to the degrees of freedom in a statistical test when samples have differing variance, to name two examples.
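For example, the degrees-of-freedom correction for unequal variances is exposed directly in SciPy's ttest_ind via the equal_var argument. A minimal sketch is shown below; the two samples are illustrative values only, not data used elsewhere in this post.

# Sketch of Welch's t-test (degrees-of-freedom correction for unequal variances)
from scipy.stats import ttest_ind
# two illustrative samples with clearly different variances
sample1 = [0.8, 1.2, 0.5, 1.9, 1.1, 0.7, 1.4, 0.9]
sample2 = [2.1, 5.5, 0.8, 7.2, 2.9, 6.1, 1.4, 4.8]
# equal_var=False applies the Welch correction rather than assuming equal variances
stat, p = ttest_ind(sample1, sample2, equal_var=False)
print('stat=%.3f, p=%.3f' % (stat, p))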
Finally, there may be multiple tests for a given concern, e.g. normality. We cannot get crisp answers to questions with statistics; instead, we get probabilistic answers. As such, we can arrive at different answers to the same question by considering the question in different ways. Hence the need for multiple different tests for some questions we may have about data.
Discover statistical hypothesis testing, resampling methods, estimation statistics and nonparametric methods in my new book, with 29 step-by-step tutorials and full source code.
Let’s get started.
 Update Nov/2018: Added a better overview of the tests covered.
 Update Nov/2019: Added complete working examples of each test. Added time series tests.
Tutorial Overview
This tutorial is divided into 5 parts; they are:

Normality Tests
 Shapiro-Wilk Test
 D’Agostino’s K^2 Test
 Anderson-Darling Test

Correlation Tests
 Pearson’s Correlation Coefficient
 Spearman’s Rank Correlation
 Kendall’s Rank Correlation
 Chi-Squared Test

Stationarity Tests
 Augmented Dickey-Fuller
 Kwiatkowski-Phillips-Schmidt-Shin

Parametric Statistical Hypothesis Tests
 Student’s t-test
 Paired Student’s t-test
 Analysis of Variance Test (ANOVA)
 Repeated Measures ANOVA Test

Nonparametric Statistical Hypothesis Tests
 Mann-Whitney U Test
 Wilcoxon Signed-Rank Test
 Kruskal-Wallis H Test
 Friedman Test
1. Normality Tests
This section lists statistical tests that you can use to check if your data has a Gaussian distribution.
Shapiro-Wilk Test
Tests whether a data sample has a Gaussian distribution.
Assumptions
 Observations in each sample are independent and identically distributed (iid).
Interpretation
 H0: the sample has a Gaussian distribution.
 H1: the sample does not have a Gaussian distribution.
Python Code
# Example of the Shapiro-Wilk Normality Test
from scipy.stats import shapiro
data = [0.873, 2.817, 0.121, 0.945, 0.055, 1.436, 0.360, 1.478, 1.637, 1.869]
stat, p = shapiro(data)
print('stat=%.3f, p=%.3f' % (stat, p))
if p > 0.05:
    print('Probably Gaussian')
else:
    print('Probably not Gaussian')
More Information
D’Agostino’s K^2 Test
Tests whether a data sample has a Gaussian distribution.
Assumptions
 Observations in each sample are independent and identically distributed (iid).
Interpretation
 H0: the sample has a Gaussian distribution.
 H1: the sample does not have a Gaussian distribution.
Python Code
# Example of the D'Agostino's K^2 Normality Test
from scipy.stats import normaltest
data = [0.873, 2.817, 0.121, 0.945, 0.055, 1.436, 0.360, 1.478, 1.637, 1.869]
stat, p = normaltest(data)
print('stat=%.3f, p=%.3f' % (stat, p))
if p > 0.05:
    print('Probably Gaussian')
else:
    print('Probably not Gaussian')
More Information
Anderson-Darling Test
Tests whether a data sample has a Gaussian distribution.
Assumptions
 Observations in each sample are independent and identically distributed (iid).
Interpretation
 H0: the sample has a Gaussian distribution.
 H1: the sample does not have a Gaussian distribution.
Python Code
# Example of the Anderson-Darling Normality Test
from scipy.stats import anderson
data = [0.873, 2.817, 0.121, 0.945, 0.055, 1.436, 0.360, 1.478, 1.637, 1.869]
result = anderson(data)
print('stat=%.3f' % (result.statistic))
for i in range(len(result.critical_values)):
    sl, cv = result.significance_level[i], result.critical_values[i]
    if result.statistic < cv:
        print('Probably Gaussian at the %.1f%% level' % (sl))
    else:
        print('Probably not Gaussian at the %.1f%% level' % (sl))
More Information
2. Correlation Tests
This section lists statistical tests that you can use to check if two samples are related.
Pearson’s Correlation Coefficient
Tests whether two samples have a linear relationship.
Assumptions
 Observations in each sample are independent and identically distributed (iid).
 Observations in each sample are normally distributed.
 Observations in each sample have the same variance.
Interpretation
 H0: the two samples are independent.
 H1: there is a dependency between the samples.
Python Code
# Example of the Pearson's Correlation test
from scipy.stats import pearsonr
data1 = [0.873, 2.817, 0.121, 0.945, 0.055, 1.436, 0.360, 1.478, 1.637, 1.869]
data2 = [0.353, 3.517, 0.125, 7.545, 0.555, 1.536, 3.350, 1.578, 3.537, 1.579]
stat, p = pearsonr(data1, data2)
print('stat=%.3f, p=%.3f' % (stat, p))
if p > 0.05:
    print('Probably independent')
else:
    print('Probably dependent')
More Information
Spearman’s Rank Correlation
Tests whether two samples have a monotonic relationship.
Assumptions
 Observations in each sample are independent and identically distributed (iid).
 Observations in each sample can be ranked.
Interpretation
 H0: the two samples are independent.
 H1: there is a dependency between the samples.
Python Code
# Example of the Spearman's Rank Correlation Test
from scipy.stats import spearmanr
data1 = [0.873, 2.817, 0.121, 0.945, 0.055, 1.436, 0.360, 1.478, 1.637, 1.869]
data2 = [0.353, 3.517, 0.125, 7.545, 0.555, 1.536, 3.350, 1.578, 3.537, 1.579]
stat, p = spearmanr(data1, data2)
print('stat=%.3f, p=%.3f' % (stat, p))
if p > 0.05:
    print('Probably independent')
else:
    print('Probably dependent')
More Information
Kendall’s Rank Correlation
Tests whether two samples have a monotonic relationship.
Assumptions
 Observations in each sample are independent and identically distributed (iid).
 Observations in each sample can be ranked.
Interpretation
 H0: the two samples are independent.
 H1: there is a dependency between the samples.
Python Code
# Example of the Kendall's Rank Correlation Test
from scipy.stats import kendalltau
data1 = [0.873, 2.817, 0.121, 0.945, 0.055, 1.436, 0.360, 1.478, 1.637, 1.869]
data2 = [0.353, 3.517, 0.125, 7.545, 0.555, 1.536, 3.350, 1.578, 3.537, 1.579]
stat, p = kendalltau(data1, data2)
print('stat=%.3f, p=%.3f' % (stat, p))
if p > 0.05:
    print('Probably independent')
else:
    print('Probably dependent')
More Information
Chi-Squared Test
Tests whether two categorical variables are related or independent.
Assumptions
 Observations used in the calculation of the contingency table are independent.
 25 or more examples in the contingency table overall, with an expected count of at least 5 in each cell (a common rule of thumb).
Interpretation
 H0: the two samples are independent.
 H1: there is a dependency between the samples.
Python Code
# Example of the Chi-Squared Test
from scipy.stats import chi2_contingency
table = [[10, 20, 30],[6, 9, 17]]
stat, p, dof, expected = chi2_contingency(table)
print('stat=%.3f, p=%.3f' % (stat, p))
if p > 0.05:
    print('Probably independent')
else:
    print('Probably dependent')
More Information
3. Stationarity Tests
This section lists statistical tests that you can use to check if a time series is stationary or not.
Augmented Dickey-Fuller Unit Root Test
Tests whether a time series has a unit root, e.g. has a trend or more generally is autoregressive.
Assumptions
 Observations are temporally ordered.
Interpretation
 H0: a unit root is present (series is nonstationary).
 H1: a unit root is not present (series is stationary).
Python Code
# Example of the Augmented DickeyFuller unit root test
from statsmodels.tsa.stattools import adfuller
data = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
stat, p, lags, obs, crit, t = adfuller(data)
print(‘stat=%.3f, p=%.3f’ % (stat, p))
if p > 0.05:
print(‘Probably not Stationary’)
else:
print(‘Probably Stationary’)
More Information
Kwiatkowski-Phillips-Schmidt-Shin
Tests whether a time series is trend stationary or not.
Assumptions
 Observations are temporally ordered.
Interpretation
 H0: the time series is trend-stationary.
 H1: the time series is not trend-stationary (a unit root is present).
Python Code
# Example of the Kwiatkowski-Phillips-Schmidt-Shin test
from statsmodels.tsa.stattools import kpss
data = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
stat, p, lags, crit = kpss(data)
print('stat=%.3f, p=%.3f' % (stat, p))
if p > 0.05:
    print('Probably Stationary')
else:
    print('Probably not Stationary')
More Information
4. Parametric Statistical Hypothesis Tests
This section lists statistical tests that you can use to compare data samples.
Student’s t-test
Tests whether the means of two independent samples are significantly different.
Assumptions
 Observations in each sample are independent and identically distributed (iid).
 Observations in each sample are normally distributed.
 Observations in each sample have the same variance.
Interpretation
 H0: the means of the samples are equal.
 H1: the means of the samples are unequal.
Python Code
# Example of the Student's t-test
from scipy.stats import ttest_ind
data1 = [0.873, 2.817, 0.121, 0.945, 0.055, 1.436, 0.360, 1.478, 1.637, 1.869]
data2 = [1.142, 0.432, 0.938, 0.729, 0.846, 0.157, 0.500, 1.183, 1.075, 0.169]
stat, p = ttest_ind(data1, data2)
print('stat=%.3f, p=%.3f' % (stat, p))
if p > 0.05:
    print('Probably the same distribution')
else:
    print('Probably different distributions')
More Information
Paired Student’s t-test
Tests whether the means of two paired samples are significantly different.
Assumptions
 Observations in each sample are independent and identically distributed (iid).
 Observations in each sample are normally distributed.
 Observations in each sample have the same variance.
 Observations across each sample are paired.
Interpretation
 H0: the means of the samples are equal.
 H1: the means of the samples are unequal.
Python Code
# Example of the Paired Student's t-test
from scipy.stats import ttest_rel
data1 = [0.873, 2.817, 0.121, 0.945, 0.055, 1.436, 0.360, 1.478, 1.637, 1.869]
data2 = [1.142, 0.432, 0.938, 0.729, 0.846, 0.157, 0.500, 1.183, 1.075, 0.169]
stat, p = ttest_rel(data1, data2)
print('stat=%.3f, p=%.3f' % (stat, p))
if p > 0.05:
    print('Probably the same distribution')
else:
    print('Probably different distributions')
More Information
Analysis of Variance Test (ANOVA)
Tests whether the means of two or more independent samples are significantly different.
Assumptions
 Observations in each sample are independent and identically distributed (iid).
 Observations in each sample are normally distributed.
 Observations in each sample have the same variance.
Interpretation
 H0: the means of the samples are equal.
 H1: one or more of the means of the samples are unequal.
Python Code
# Example of the Analysis of Variance Test
from scipy.stats import f_oneway
data1 = [0.873, 2.817, 0.121, 0.945, 0.055, 1.436, 0.360, 1.478, 1.637, 1.869]
data2 = [1.142, 0.432, 0.938, 0.729, 0.846, 0.157, 0.500, 1.183, 1.075, 0.169]
data3 = [0.208, 0.696, 0.928, 1.148, 0.213, 0.229, 0.137, 0.269, 0.870, 1.204]
stat, p = f_oneway(data1, data2, data3)
print('stat=%.3f, p=%.3f' % (stat, p))
if p > 0.05:
    print('Probably the same distribution')
else:
    print('Probably different distributions')
More Information
Repeated Measures ANOVA Test
Tests whether the means of two or more paired samples are significantly different.
Assumptions
 Observations in each sample are independent and identically distributed (iid).
 Observations in each sample are normally distributed.
 Observations in each sample have the same variance.
 Observations across each sample are paired.
Interpretation
 H0: the means of the samples are equal.
 H1: one or more of the means of the samples are unequal.
Python Code
Currently not supported in SciPy.
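However, a repeated measures ANOVA is available in statsmodels via the AnovaRM class. A minimal sketch follows; the long-format DataFrame layout, column names, and values are illustrative assumptions, not part of the original tutorial.

# Sketch of a Repeated Measures ANOVA using statsmodels' AnovaRM
import pandas as pd
from statsmodels.stats.anova import AnovaRM
# long-format data: one row per subject per condition (illustrative values)
df = pd.DataFrame({
    'subject':   [1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4],
    'condition': ['A', 'B', 'C'] * 4,
    'value':     [0.87, 1.14, 0.20, 2.81, 0.43, 0.69, 0.12, 0.93, 0.92, 0.94, 0.72, 1.14],
})
# one within-subject factor ('condition'); prints the F statistic and p-value table
result = AnovaRM(data=df, depvar='value', subject='subject', within=['condition']).fit()
print(result)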
More Information
5. Nonparametric Statistical Hypothesis Tests
Mann-Whitney U Test
Tests whether the distributions of two independent samples are equal or not.
Assumptions
 Observations in each sample are independent and identically distributed (iid).
 Observations in each sample can be ranked.
Interpretation
 H0: the distributions of both samples are equal.
 H1: the distributions of both samples are not equal.
Python Code
# Example of the Mann-Whitney U Test
from scipy.stats import mannwhitneyu
data1 = [0.873, 2.817, 0.121, 0.945, 0.055, 1.436, 0.360, 1.478, 1.637, 1.869]
data2 = [1.142, 0.432, 0.938, 0.729, 0.846, 0.157, 0.500, 1.183, 1.075, 0.169]
stat, p = mannwhitneyu(data1, data2)
print('stat=%.3f, p=%.3f' % (stat, p))
if p > 0.05:
    print('Probably the same distribution')
else:
    print('Probably different distributions')
More Information
Wilcoxon Signed-Rank Test
Tests whether the distributions of two paired samples are equal or not.
Assumptions
 Observations in each sample are independent and identically distributed (iid).
 Observations in each sample can be ranked.
 Observations across each sample are paired.
Interpretation
 H0: the distributions of both samples are equal.
 H1: the distributions of both samples are not equal.
Python Code
# Example of the Wilcoxon Signed-Rank Test
from scipy.stats import wilcoxon
data1 = [0.873, 2.817, 0.121, 0.945, 0.055, 1.436, 0.360, 1.478, 1.637, 1.869]
data2 = [1.142, 0.432, 0.938, 0.729, 0.846, 0.157, 0.500, 1.183, 1.075, 0.169]
stat, p = wilcoxon(data1, data2)
print('stat=%.3f, p=%.3f' % (stat, p))
if p > 0.05:
    print('Probably the same distribution')
else:
    print('Probably different distributions')
More Information
Kruskal-Wallis H Test
Tests whether the distributions of two or more independent samples are equal or not.
Assumptions
 Observations in each sample are independent and identically distributed (iid).
 Observations in each sample can be ranked.
Interpretation
 H0: the distributions of all samples are equal.
 H1: the distributions of one or more samples are not equal.
Python Code
# Example of the Kruskal-Wallis H Test
from scipy.stats import kruskal
data1 = [0.873, 2.817, 0.121, 0.945, 0.055, 1.436, 0.360, 1.478, 1.637, 1.869]
data2 = [1.142, 0.432, 0.938, 0.729, 0.846, 0.157, 0.500, 1.183, 1.075, 0.169]
stat, p = kruskal(data1, data2)
print('stat=%.3f, p=%.3f' % (stat, p))
if p > 0.05:
    print('Probably the same distribution')
else:
    print('Probably different distributions')
More Information
Friedman Test
Tests whether the distributions of two or more paired samples are equal or not.
Assumptions
 Observations in each sample are independent and identically distributed (iid).
 Observations in each sample can be ranked.
 Observations across each sample are paired.
Interpretation
 H0: the distributions of all samples are equal.
 H1: the distributions of one or more samples are not equal.
Python Code
# Example of the Friedman Test
from scipy.stats import friedmanchisquare
data1 = [0.873, 2.817, 0.121, 0.945, 0.055, 1.436, 0.360, 1.478, 1.637, 1.869]
data2 = [1.142, 0.432, 0.938, 0.729, 0.846, 0.157, 0.500, 1.183, 1.075, 0.169]
data3 = [0.208, 0.696, 0.928, 1.148, 0.213, 0.229, 0.137, 0.269, 0.870, 1.204]
stat, p = friedmanchisquare(data1, data2, data3)
print('stat=%.3f, p=%.3f' % (stat, p))
if p > 0.05:
    print('Probably the same distribution')
else:
    print('Probably different distributions')
More Information
Further Reading
This section provides more resources on the topic if you are looking to go deeper.
Summary
In this tutorial, you discovered the key statistical hypothesis tests that you may need to use in a machine learning project.
Specifically, you learned:
 The types of tests to use in different circumstances, such as normality checking, relationships between variables, and differences between samples.
 The key assumptions for each test and how to interpret the test result.
 How to implement the test using the Python API.
Do you have any questions?
Ask your questions in the comments below and I will do my best to answer.
Did I miss an important statistical test or key assumption for one of the listed tests?
Let me know in the comments below.
Get a Handle on Statistics for Machine Learning!
Develop a working understanding of statistics
…by writing lines of code in Python
Discover how in my new Ebook:
Statistical Methods for Machine Learning
It provides selfstudy tutorials on topics like:
Hypothesis Tests, Correlation, Nonparametric Stats, Resampling, and much more…
Discover how to Transform Data into Knowledge
Skip the Academics. Just Results.
See What’s Inside