Hypothesis Testing



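import numpy as np

# Simulated population: 300 values drawn from a normal distribution
population = np.random.normal(loc=65, scale=3.5, size=300)
population_mean = np.mean(population)

# Random sample of 30 values, drawn without replacement
sample_1 = np.random.choice(population, size=30, replace=False)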

Hypothesis Test Errors

A type I error, also known as a false positive, is the error of rejecting a null hypothesis that is actually true. This can be viewed as a miss being registered as a hit. The acceptable rate of this type of error is called the significance level and is usually set to 0.05 (5%) or 0.01 (1%).

A type II error, also known as a false negative, is the error of not rejecting a null hypothesis when the alternative hypothesis is true. This can be viewed as a hit being registered as a miss.

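Depending on the purpose of the test, testers decide which type of error is of greater concern. By convention, the type I error rate is the one fixed in advance (through the significance level), so it is usually treated as the more important of the two.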
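
As a rough illustration of what the significance level controls, the sketch below repeatedly tests samples drawn from a population whose mean really is the hypothesized value, so every rejection is a type I error. The seed, the number of trials, and the variable names are assumptions made for this example only.

```python
import numpy as np
from scipy.stats import ttest_1samp

rng = np.random.default_rng(0)  # fixed seed so the run is repeatable
alpha = 0.05      # significance level
true_mean = 65    # the null hypothesis is true by construction
n_trials = 2000   # number of simulated experiments (arbitrary choice)

false_positives = 0
for _ in range(n_trials):
    sample = rng.normal(loc=true_mean, scale=3.5, size=30)
    _, p_val = ttest_1samp(sample, true_mean)
    if p_val <= alpha:
        false_positives += 1  # a true null was rejected: type I error

type_1_rate = false_positives / n_trials
print(type_1_rate)
```

Over many trials the observed rejection rate should settle near alpha, which is why alpha is described as the acceptable type I error rate.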

Sample Vs. Population Mean

In statistics, we often use the mean of a sample to estimate or infer the mean of the broader population from which the sample was taken. In other words, the sample mean is an estimate of the population mean.
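
A minimal sketch of this idea, reusing the population parameters from the top of these notes (the seed and the name estimation_error are assumptions for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)  # fixed seed so the run is repeatable

# Population of 300 values and its true mean
population = rng.normal(loc=65, scale=3.5, size=300)
population_mean = population.mean()

# A sample of 30 values drawn without replacement, and its mean
sample = rng.choice(population, size=30, replace=False)
sample_mean = sample.mean()

# How far the estimate lands from the true population mean
estimation_error = sample_mean - population_mean
print(population_mean, sample_mean, estimation_error)
```

The sample mean will rarely match the population mean exactly, but it should land close to it, and larger samples tend to land closer.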

Central Limit Theorem

The central limit theorem states that, as the sample size grows, the distribution of sample means approaches a normal distribution centered on the population mean. This holds regardless of the population's own distribution (uniform, binomial, etc.), provided the population has finite variance.

The central limit theorem allows us to perform tests, make inferences, and solve problems using the normal distribution, even when the population is not normally distributed.
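
The theorem can be checked empirically. The sketch below assumes an exponential (strongly right-skewed) population and compares its skewness with the skewness of many sample means; the population choice, sample size, and the helper skewness() are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)  # fixed seed so the run is repeatable

# A clearly non-normal, right-skewed population
population = rng.exponential(scale=2.0, size=100_000)

# Draw many samples of size 50 and record each sample's mean
sample_means = np.array([
    rng.choice(population, size=50).mean() for _ in range(2000)
])

def skewness(values):
    """Third standardized moment; roughly 0 for a symmetric distribution."""
    centered = values - values.mean()
    return np.mean(centered ** 3) / np.std(values) ** 3

print(population.mean(), sample_means.mean())
print(skewness(population), skewness(sample_means))
```

The mean of the sample means should sit close to the population mean, and the sample means should be far less skewed than the population they were drawn from.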

Hypothesis Test P-value

Statistical hypothesis tests return a p-value, which is the probability of observing results at least as extreme as those actually observed, assuming the null hypothesis is true. It is not the probability that the null hypothesis itself is true. If the p-value is less than or equal to the significance level, then the null hypothesis is rejected in favor of the alternative hypothesis. If the p-value is greater than the significance level, then the null hypothesis is not rejected.
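
To make the definition concrete, the sketch below computes a two-sided p-value for a one-sample t-test by hand and compares it with the value SciPy reports. The sample, the seed, and the names null_mean and p_manual are assumptions for illustration.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)  # fixed seed so the run is repeatable
sample = rng.normal(loc=66, scale=3.5, size=30)
null_mean = 65  # mean claimed by the null hypothesis

# t-statistic: distance between sample mean and null mean, in standard errors
standard_error = sample.std(ddof=1) / np.sqrt(len(sample))
t_stat = (sample.mean() - null_mean) / standard_error

# Two-sided p-value: probability, under the null, of a |t| at least this large
p_manual = 2 * stats.t.sf(abs(t_stat), df=len(sample) - 1)

# The same test through SciPy's built-in function
_, p_scipy = stats.ttest_1samp(sample, null_mean)
print(p_manual, p_scipy)
```

The two values should agree, since ttest_1samp() performs the same two-sided calculation by default.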

Univariate T-test

A univariate T-test (or 1 Sample T-test) is a type of hypothesis test that compares a sample mean to a hypothetical population mean. Its p-value is the probability of observing a sample mean at least this far from the hypothetical mean if the sample really came from a distribution with that mean.

This can be performed in Python using the ttest_1samp() function of the SciPy library. The code block shows how to call ttest_1samp(). It requires two inputs, a sample of values and an expected mean, and returns two outputs, the t-statistic and the p-value.

from scipy.stats import ttest_1samp
t_stat, p_val = ttest_1samp(example_distribution, expected_mean)
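
A small usage sketch, assuming a simulated sample whose true mean is 65. It runs the test once against that mean and once against a clearly wrong mean of 75 (both values and the seed are assumptions for illustration).

```python
import numpy as np
from scipy.stats import ttest_1samp

rng = np.random.default_rng(4)  # fixed seed so the run is repeatable
example_distribution = rng.normal(loc=65, scale=3.5, size=30)

# Expected mean equal to the mean the sample was generated with
t_near, p_near = ttest_1samp(example_distribution, 65)

# Expected mean far from the mean the sample was generated with
t_far, p_far = ttest_1samp(example_distribution, 75)

print(p_near, p_far)
```

The second p-value should be far smaller than the first, which is the signal used to reject the hypothesis that the sample came from a distribution with mean 75.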

from scipy.stats import ttest_ind, f_oneway
t_stat, pval = ttest_ind(a, b) # 2 Sample T-Test
f_stat, pval = f_oneway(a, b, c) # ANOVA
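
For a concrete run, the sketch below builds three simulated groups, two with the same mean and one shifted upward, then applies both tests. The group names a, b, c follow the snippet above; the data and seed are assumptions for illustration.

```python
import numpy as np
from scipy.stats import ttest_ind, f_oneway

rng = np.random.default_rng(5)  # fixed seed so the run is repeatable
a = rng.normal(loc=65, scale=3.5, size=40)
b = rng.normal(loc=65, scale=3.5, size=40)
c = rng.normal(loc=72, scale=3.5, size=40)  # shifted mean

# 2 Sample T-Test: do a and c share the same mean?
t_stat, p_two_sample = ttest_ind(a, c)

# ANOVA: do a, b and c all share the same mean?
f_stat, p_anova = f_oneway(a, b, c)

print(p_two_sample, p_anova)
```

ANOVA is generally preferred over running a separate t-test for each pair of groups, because every additional test adds another chance of a type I error.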

Tukey’s Range Hypothesis Tests

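Tukey’s Range hypothesis test can be used to check which pairs of datasets differ significantly in their means. It is typically run on three or more datasets, for example after an ANOVA has indicated that at least one mean differs.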

Tukey’s Range test can be performed in Python using the StatsModels library function pairwise_tukeyhsd(). The example code block shows how to call pairwise_tukeyhsd(). It accepts a list of data, a list of labels, and the desired significance level.

from statsmodels.stats.multicomp import pairwise_tukeyhsd
tukey_results = pairwise_tukeyhsd(data, labels, alpha=significance_level)
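
A sketch of how the data and labels might be prepared, assuming three simulated groups labelled "a", "b" and "c" (the groups, seed, and label names are illustrative assumptions). All values go into one flat array, with a parallel list naming the group each value belongs to.

```python
import numpy as np
from statsmodels.stats.multicomp import pairwise_tukeyhsd

rng = np.random.default_rng(6)  # fixed seed so the run is repeatable
a = rng.normal(loc=65, scale=3.5, size=40)
b = rng.normal(loc=65, scale=3.5, size=40)
c = rng.normal(loc=72, scale=3.5, size=40)  # shifted mean

# One flat array of values plus one label per value
data = np.concatenate([a, b, c])
labels = ["a"] * 40 + ["b"] * 40 + ["c"] * 40

tukey_results = pairwise_tukeyhsd(data, labels, alpha=0.05)
print(tukey_results)         # summary table, one row per pair of groups
print(tukey_results.reject)  # True where a pair differs significantly
```

With groups built this way, the pairs involving "c" should be flagged as significantly different, while the "a" versus "b" pair usually should not.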


def reject_null_hypothesis(p_value):
  """
  Returns the truthiness of whether the null hypothesis can be rejected

  Takes a p-value as its input and assumes p <= 0.05 is significant
  """
  return p_value <= 0.05

# hypothesis_tests = [....] some array

for p_value in hypothesis_tests:
  print(p_value, reject_null_hypothesis(p_value))

Binomial Test

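# binom_test was removed in SciPy 1.12; newer versions provide binomtest(x, n, p).pvalue
from scipy.stats import binom_test
pval = binom_test(x, n, p)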

where x is the number of “successes” (0.051 * 10000 in this case), n is the number of samples (10000 in this case), and p is the expected proportion of successes (0.06 in this case).
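
The sketch below runs the test with the numbers quoted above. It tries the newer binomtest() first and falls back to binom_test() on older SciPy installations; the fallback structure is an assumption made so the example runs on either version.

```python
x = round(0.051 * 10000)  # number of observed successes: 510
n = 10000                 # number of samples
p = 0.06                  # expected proportion of successes

try:
    # Newer SciPy: returns a result object with a pvalue attribute
    from scipy.stats import binomtest
    p_value = binomtest(x, n=n, p=p).pvalue
except ImportError:
    # Older SciPy: returns the p-value directly
    from scipy.stats import binom_test
    p_value = binom_test(x, n=n, p=p)

print(p_value)
print(p_value <= 0.05)  # whether the null hypothesis is rejected at the 5% level
```

The observed 510 successes sit well below the 600 expected under the null, so the p-value should come out small.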

Chi Square Test

from scipy.stats import chi2_contingency

# Contingency table
#        | harvester | leaf cutter
# -------+-----------+------------
# 1st gr |    30     |     10
# 2nd gr |    35     |      5
# 3rd gr |    28     |     12
# 4th gr |    20     |     20

X = [[30, 10],
     [35, 5],
     [28, 12],
     [20, 20]]
chi2, pval, dof, expected = chi2_contingency(X)
print(pval)
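
To show where the returned values come from, the sketch below recomputes the expected counts and the chi-square statistic by hand for the same table and compares them with SciPy's output. The names expected_manual and chi2_manual are assumptions for illustration.

```python
import numpy as np
from scipy.stats import chi2_contingency

X = np.array([[30, 10],
              [35, 5],
              [28, 12],
              [20, 20]])

chi2, pval, dof, expected = chi2_contingency(X)

# Expected counts under independence: row total * column total / grand total
row_totals = X.sum(axis=1)
col_totals = X.sum(axis=0)
expected_manual = np.outer(row_totals, col_totals) / X.sum()

# Chi-square statistic: sum of (observed - expected)^2 / expected
chi2_manual = ((X - expected_manual) ** 2 / expected_manual).sum()

print(expected_manual)
print(chi2, chi2_manual, dof, pval)
```

The degrees of freedom equal (rows - 1) * (columns - 1), which is 3 for this 4 x 2 table. The manual statistic matches SciPy's here because the continuity correction is only applied when there is a single degree of freedom.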