What distribution is used when the population standard deviation is unknown?

The t-distribution, also known as the Student’s t-distribution, is a probability distribution that shares the bell shape of the normal distribution but has heavier tails. It is used to estimate population parameters when the sample size is small or the population variance is unknown. Because of its heavier tails, a t-distribution assigns a greater chance to extreme values than a normal distribution does.

The t-distribution is the basis for computing t-tests in statistics.

Key Takeaways

  • The t-distribution is a continuous probability distribution of the z-score when the estimated standard deviation is used in the denominator rather than the true standard deviation.
  • The t-distribution, like the normal distribution, is bell-shaped and symmetric, but it has heavier tails, which means it produces values far from its mean more often than the normal distribution does.
  • T-tests, which are based on the t-distribution, are used in statistics to assess significance.

What Does a T-Distribution Tell You? 

Tail heaviness is determined by a parameter of the t-distribution called degrees of freedom. Smaller values give heavier tails, while larger values make the t-distribution resemble a standard normal distribution, which has a mean of 0 and a standard deviation of 1.
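The effect of degrees of freedom on the tails can be checked numerically. The following is a minimal Python sketch, using only the standard library, that compares the t density with the standard normal density at a point far from the mean. The helper name t_pdf and the evaluation point x = 4 are illustrative choices, not part of the original article.

```python
import math
from statistics import NormalDist

def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    # Work on the log scale so large df values do not overflow the gamma function.
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))

x = 4.0  # a point far out in the tail (arbitrary illustrative choice)
normal_density = NormalDist().pdf(x)

for df in (1, 5, 30, 1000):
    ratio = t_pdf(x, df) / normal_density
    print(f"df={df:>4}: t density is {ratio:.1f} times the normal density at x={x}")
```

The ratio shrinks toward 1 as the degrees of freedom grow, which is the convergence to the standard normal distribution described above.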

Image by Sabrina Jiang © Investopedia 2020

When a sample of n observations is taken from a normally distributed population having mean M and standard deviation D, the sample mean, m, and the sample standard deviation, d, will differ from M and D because of the randomness of the sample.

A Z-score can be calculated with the population standard deviation as Z = (x – M)/D, and this value has the normal distribution with mean 0 and standard deviation 1. For the sample mean, the corresponding Z-score is Z = (m – M)/{D/sqrt(n)}. But when the estimated standard deviation is used instead, a t-score is calculated as T = (m – M)/{d/sqrt(n)}. The difference between d and D makes its distribution a t-distribution with (n – 1) degrees of freedom rather than the normal distribution with mean 0 and standard deviation 1.
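As a small illustration of the t-score formula, here is a Python sketch that computes T and its degrees of freedom from a sample. The function name t_score and the data values are hypothetical, chosen only so the arithmetic is easy to follow.

```python
import math
import statistics

def t_score(sample, population_mean):
    """Return (T, degrees of freedom) for T = (m - M) / (d / sqrt(n))."""
    n = len(sample)
    m = statistics.mean(sample)    # sample mean
    d = statistics.stdev(sample)   # sample standard deviation (n - 1 denominator)
    return (m - population_mean) / (d / math.sqrt(n)), n - 1

sample = [10, 12, 9, 11, 13]       # hypothetical observations
t, df = t_score(sample, population_mean=10)
print(f"T = {t:.4f} with {df} degrees of freedom")
```

Note that statistics.stdev divides by n – 1, which matches the sample standard deviation d used in the formula.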

Example of How To Use a T-Distribution

Take the following example of how t-distributions are put to use in statistical analysis. First, remember that a confidence interval for the mean is a range of values, calculated from the data, meant to capture a “population” mean. This interval is m +/- t*d/sqrt(n), where t is a critical value from the t-distribution with (n – 1) degrees of freedom.

For instance, a 95% confidence interval for the mean return of the Dow Jones Industrial Average in the 27 trading days prior to 9/11/2001 is -0.33% +/- 2.055 * 1.07 / sqrt(27), which places the (persistent) mean return somewhere between -0.75% and +0.09%. The number 2.055, the number of standard errors to adjust by, is found from the t-distribution with 26 degrees of freedom.
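The arithmetic in this example can be reproduced with a short Python sketch. The function name t_confidence_interval is hypothetical, and the critical value 2.055 is taken from the text rather than computed.

```python
import math

def t_confidence_interval(m, d, n, t_crit):
    """Return (low, high) for the interval m +/- t_crit * d / sqrt(n)."""
    margin = t_crit * d / math.sqrt(n)
    return m - margin, m + margin

# Figures from the Dow Jones example: mean -0.33, standard deviation 1.07, n = 27.
low, high = t_confidence_interval(m=-0.33, d=1.07, n=27, t_crit=2.055)
print(f"{low:.2f}% to {high:.2f}%")  # prints "-0.75% to 0.09%"
```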

Because the t-distribution has fatter tails than a normal distribution, it can be used as a model for financial returns that exhibit excess kurtosis, which will allow for a more realistic calculation of Value at Risk (VaR) in such cases.

T-Distribution vs. Normal Distribution 

Normal distributions are used when the population distribution is assumed to be normal. The t-distribution also assumes a normally distributed population, but it has fatter tails and therefore higher kurtosis than the normal distribution. As a result, the probability of getting values very far from the mean is larger with a t-distribution than with a normal distribution.
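The kurtosis comparison can be made concrete with a standard result that is not stated in the original text: for a t-distribution with ν degrees of freedom, where ν is the symbol assumed here for that parameter, the excess kurtosis is finite only when ν > 4 and equals

```latex
\text{excess kurtosis}(t_\nu) = \frac{6}{\nu - 4}, \qquad \nu > 4
```

The normal distribution has an excess kurtosis of 0. With ν = 10 the value is 1, and it falls toward 0 as ν grows, in line with the t-distribution approaching the normal distribution.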

Normal vs. T-Distribution.

Limitations of Using a T-Distribution 

Using the t-distribution where the normal distribution applies can cost some exactness, because its fatter tails lead to wider intervals. This shortcoming arises only when perfect normality is needed. The t-distribution should be used only when the population standard deviation is not known. If the population standard deviation is known and the sample size is large enough, the normal distribution should be used for better results.

What Is the T-Distribution in Statistics?

The t-distribution is used in statistics to estimate population parameters and assess their significance when sample sizes are small or variances are unknown. Like the normal distribution, it is bell-shaped and symmetric. Unlike the normal distribution, it has heavier tails, which results in a greater chance of extreme values.

What is the appropriate distribution to use if population standard deviation is unknown?

A hypothesis test for a population mean when the population standard deviation, σ, is unknown is conducted in the same way as when the population standard deviation is known. The only difference is that the t-distribution is used instead of the standard normal distribution (z-distribution).
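To make the procedure concrete, here is a minimal Python sketch of a two-sided one-sample t-test using only the standard library. The function names, the sample data, and the hypothesized mean mu0 are all hypothetical. The p-value is obtained by numerically integrating the t density with Simpson's rule, which is an implementation choice made here to avoid external libraries, not a method prescribed by the text.

```python
import math
import statistics

def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))

def two_sided_p_value(t_stat, df, steps=2000):
    """P(|T| >= |t_stat|) via Simpson's rule on [0, |t_stat|]; steps must be even."""
    c = abs(t_stat)
    if c == 0.0:
        return 1.0
    h = c / steps
    total = t_pdf(0.0, df) + t_pdf(c, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = 2 * total * h / 3  # area over [-c, c], by symmetry
    return max(0.0, 1.0 - central_area)

def one_sample_t_test(sample, mu0):
    """Return (t statistic, degrees of freedom, two-sided p-value)."""
    n = len(sample)
    standard_error = statistics.stdev(sample) / math.sqrt(n)
    t_stat = (statistics.mean(sample) - mu0) / standard_error
    df = n - 1
    return t_stat, df, two_sided_p_value(t_stat, df)

sample = [2.1, 1.9, 2.4, 2.0, 2.6]  # hypothetical measurements
t_stat, df, p = one_sample_t_test(sample, mu0=2.0)
alpha = 0.05
print(f"t = {t_stat:.3f}, df = {df}, p = {p:.3f}")
print("reject H0" if p < alpha else "fail to reject H0")
```

With these made-up numbers the p-value comes out near 0.20, so the null hypothesis is not rejected at the 5% level.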

What table should be used if the population variance or standard deviation is unknown?

The z-test requires that the population variance be known. When the population variance is unknown, which is common in practice, the t-test is used instead, with critical values taken from a t-table rather than a z-table.

When the population standard deviation is not known what is used to estimate it?

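When the population standard deviation, σ, is unknown, the sample standard deviation is used to estimate σ in the confidence interval formula. The quantity 1.96σ/√n is often called the margin of error for a 95% estimate when σ is known. When σ is unknown, σ is replaced by the sample standard deviation and 1.96 by the corresponding critical value from the t-distribution.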
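The two margins of error can be compared with a short Python sketch. The function names and the sample are hypothetical, and the small lookup of two-sided 95% critical values is copied from a standard t-table for a few degrees of freedom only, so the sketch works only for the sample sizes listed.

```python
import math
import statistics
from statistics import NormalDist

# Two-sided 95% critical values from a standard t-table (degrees of freedom -> t).
T_CRIT_95 = {4: 2.776, 9: 2.262, 26: 2.056, 29: 2.045}

def margin_of_error_t(sample):
    """Margin of error when sigma is unknown: t * s / sqrt(n)."""
    n = len(sample)
    s = statistics.stdev(sample)  # sample standard deviation estimates sigma
    return T_CRIT_95[n - 1] * s / math.sqrt(n)

def margin_of_error_z(sigma, n):
    """Margin of error when sigma is known: z * sigma / sqrt(n), with z about 1.96."""
    z = NormalDist().inv_cdf(0.975)
    return z * sigma / math.sqrt(n)

sample = [10, 12, 9, 11, 13]  # hypothetical observations, n = 5
print(f"t-based margin: {margin_of_error_t(sample):.3f}")
print(f"z-based margin if sigma were known to equal s: "
      f"{margin_of_error_z(statistics.stdev(sample), len(sample)):.3f}")
```

For the same standard deviation, the t-based margin is wider because the t critical value exceeds 1.96, and the gap narrows as the sample size grows.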