Suppose \(X\) is a random variable with a distribution that may be known or unknown (it can be any distribution). Using a subscript that matches the random variable, suppose \(\mu_{x}\) is the mean of \(X\) and \(\sigma_{x}\) is the standard deviation of \(X\). If you draw random samples of size \(n\), then as \(n\) increases, the random variable \(\bar{X}\), which consists of sample means, tends to be normally distributed and

\[\bar{X} \sim N \left(\mu_{x}, \dfrac{\sigma_{x}}{\sqrt{n}}\right).\]

The central limit theorem for sample means says that if you keep drawing larger and larger samples (such as rolling one, two, five, and finally, ten dice) and calculating their means, the sample means form their own normal distribution (the sampling distribution). The normal distribution has the same mean as the original distribution and a variance that equals the original variance divided by the sample size, so its standard deviation is \(\dfrac{\sigma_{x}}{\sqrt{n}}\), the second parameter in the notation above. The variable \(n\) is the number of values that are averaged together, not the number of times the experiment is done.

To put it more formally, if you draw random samples of size \(n\), the distribution of the random variable \(\bar{X}\), which consists of sample means, is called the sampling distribution of the mean. The sampling distribution of the mean approaches a normal distribution as \(n\), the sample size, increases.

The random variable \(\bar{X}\) has a different \(z\)-score associated with it from that of the random variable \(X\). The mean \(\bar{x}\) is the value of \(\bar{X}\) in one sample.

\[z = \dfrac{\bar{x}-\mu_{x}}{\left(\dfrac{\sigma_{x}}{\sqrt{n}}\right)}\]

Howto: Find probabilities for means on the calculator

2nd DISTR
2:normalcdf

\(\text{normalcdf} \left(\text{lower value of the area}, \text{upper value of the area}, \text{mean}, \dfrac{\text{standard deviation}}{\sqrt{\text{sample size}}}\right)\)

where the mean and standard deviation are those of the original distribution and the sample size is \(n\).

Example \(\PageIndex{1}\)

An unknown distribution has a mean of 90 and a standard deviation of 15. Samples of size \(n = 25\) are drawn randomly from the population.

a. Find the probability that the sample mean is between 85 and 92.
b. Find the value that is two standard deviations above the expected value, 90, of the sample mean.
Answer

a. Let \(X =\) one value from the original unknown population. The probability question asks you to find a probability for the sample mean. Let \(\bar{X} =\) the mean of a sample of size 25. Since \(\mu_{x} = 90\), \(\sigma_{x} = 15\), and \(n = 25\),

\[\bar{X} \sim N\left(90, \dfrac{15}{\sqrt{25}}\right). \nonumber\]

Find \(P(85 < \bar{x} < 92)\). Draw a graph.

\[P(85 < \bar{x} < 92) = 0.6997 \nonumber\]

The probability that the sample mean is between 85 and 92 is 0.6997.

Figure \(\PageIndex{1}\).
On the calculator, \(\text{normalcdf}\left(85, 92, 90, \dfrac{15}{\sqrt{25}}\right) = 0.6997\). The parameter list is abbreviated (lower value, upper value, \(\mu\), \(\dfrac{\sigma}{\sqrt{n}}\)).
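For readers without the calculator, the same area can be computed in software. The following is a minimal sketch using Python's standard-library `statistics.NormalDist`. The helper name `normalcdf_mean` is an assumption made for this illustration (it is not a calculator or library function), and its arguments mirror the parameter list above.

```python
from math import sqrt
from statistics import NormalDist


def normalcdf_mean(lower, upper, mean, sd, n):
    """P(lower < sample mean < upper) under the CLT normal approximation.

    mean and sd describe the original population; n is the sample size.
    Hypothetical helper, named after the calculator's normalcdf.
    """
    # Sampling distribution of the mean: N(mean, sd / sqrt(n))
    sampling_dist = NormalDist(mu=mean, sigma=sd / sqrt(n))
    return sampling_dist.cdf(upper) - sampling_dist.cdf(lower)


# Example 1a: mean 90, standard deviation 15, samples of size 25
p = normalcdf_mean(85, 92, 90, 15, 25)
print(round(p, 4))  # 0.6997
```

The helper divides the population standard deviation by \(\sqrt{n}\) internally, so the caller passes the original \(\sigma\) and \(n\) separately rather than the standard error.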
b. To find the value that is two standard deviations above the expected value 90, use the formula:

\[ \begin{align*} \text{value} &= \mu_{x} + (\#\text{ of STDEVs})\left(\dfrac{\sigma_{x}}{\sqrt{n}}\right) \\[5pt] &= 90 + 2 \left(\dfrac{15}{\sqrt{25}}\right) = 96 \end{align*}\]

The value that is two standard deviations above the expected value is 96.

The standard error of the mean is

\[\dfrac{\sigma_{x}}{\sqrt{n}} = \dfrac{15}{\sqrt{25}} = 3. \nonumber\]

Recall that the standard error of the mean describes how far, on average, the sample mean will be from the population mean in repeated simple random samples of size \(n\).

Exercise \(\PageIndex{1}\)

An unknown distribution has a mean of 45 and a standard deviation of eight. Samples of size \(n = 30\) are drawn randomly from the population. Find the probability that the sample mean is between 42 and 50.

Answer

\(P(42 < \bar{x} < 50) = \text{normalcdf}\left(42, 50, 45, \dfrac{8}{\sqrt{30}}\right) = 0.9797\)

Example \(\PageIndex{2}\)

The length of time, in hours, it takes an "over 40" group of people to play one soccer match is normally distributed with a mean of two hours and a standard deviation of 0.5 hours. A sample of size \(n = 50\) is drawn randomly from the population. Find the probability that the sample mean is between 1.8 hours and 2.3 hours.

Answer

Let \(X =\) the time, in hours, it takes to play one soccer match. The probability question asks you to find a probability for the sample mean time, in hours, it takes to play one soccer match. Let \(\bar{X} =\) the mean time, in hours, it takes to play one soccer match.

If \(\mu_{x} =\) _________, \(\sigma_{x} =\) __________, and \(n =\) ___________, then \(\bar{X} \sim N\)(______, ______) by the central limit theorem for means.

\(\mu_{x} = 2, \sigma_{x} = 0.5, n = 50\), and \(\bar{X} \sim N \left(2, \dfrac{0.5}{\sqrt{50}}\right)\)

Find \(P(1.8 < \bar{x} < 2.3)\). Draw a graph.

\(P(1.8 < \bar{x} < 2.3) = 0.9977\)
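One way to see where this probability comes from is to convert both bounds to \(z\)-scores with the formula for sample means given earlier. The arithmetic below is a worked check, with the \(z\)-scores rounded to two decimal places and \(Z\) denoting a standard normal random variable:

\[ \begin{align*} z_{1} &= \dfrac{1.8 - 2}{\left(\dfrac{0.5}{\sqrt{50}}\right)} \approx -2.83 \\[5pt] z_{2} &= \dfrac{2.3 - 2}{\left(\dfrac{0.5}{\sqrt{50}}\right)} \approx 4.24 \\[5pt] P(1.8 < \bar{x} < 2.3) &= P(z_{1} < Z < z_{2}) \approx 0.9977 \end{align*}\]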
The probability that the mean time is between 1.8 hours and 2.3 hours is 0.9977.

Exercise \(\PageIndex{2}\)

The length of time taken on the SAT for a group of students is normally distributed with a mean of 2.5 hours and a standard deviation of 0.25 hours. A sample of size \(n = 60\) is drawn randomly from the population. Find the probability that the sample mean is between two hours and three hours.

Answer

\[P(2 < \bar{x} < 3) = \text{normalcdf}\left(2, 3, 2.5, \dfrac{0.25}{\sqrt{60}}\right) \approx 1 \nonumber\]

Calculator Skills

To find percentiles for means on the calculator, follow these steps.
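\(k = \text{invNorm} \left(\text{area to the left of } k, \text{mean}, \dfrac{\text{standard deviation}}{\sqrt{\text{sample size}}}\right)\)

where \(k\) is the percentile value being sought, the mean and standard deviation are those of the original distribution, and the sample size is \(n\).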
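A percentile for a sample mean can also be computed without the calculator. This is a minimal sketch using `inv_cdf` from Python's standard-library `statistics.NormalDist`. The helper name `invnorm_mean` is an assumption for illustration, and the numbers reuse the population from Example 1 (mean 90, standard deviation 15, \(n = 25\)) with an arbitrarily chosen 95th percentile.

```python
from math import sqrt
from statistics import NormalDist


def invnorm_mean(area_left, mean, sd, n):
    """Value k with P(sample mean < k) = area_left (CLT approximation).

    Hypothetical helper, named after the calculator's invNorm.
    """
    # Sampling distribution of the mean: N(mean, sd / sqrt(n))
    sampling_dist = NormalDist(mu=mean, sigma=sd / sqrt(n))
    return sampling_dist.inv_cdf(area_left)


# Illustrative 95th percentile for the Example 1 population
k = invnorm_mean(0.95, 90, 15, 25)
print(round(k, 2))  # 94.93
```

Under these assumptions, about 95% of samples of size 25 would have a mean below roughly 94.93.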
Example \(\PageIndex{3}\) In a recent study reported Oct. 29, 2012 on the Flurry Blog, the mean age of tablet users is 34 years. Suppose the standard deviation is 15 years. Take a sample of size \(n = 100\).
Answer
Exercise \(\PageIndex{3}\)

In an article on Flurry Blog, a gaming marketing gap for men between the ages of 30 and 40 is identified. You are researching a startup game targeted at the 35-year-old demographic. Your idea is to develop a strategy game that can be played by men from their late 20s through their late 30s. Based on the article’s data, industry research shows that the average strategy player is 28 years old with a standard deviation of 4.8 years. You take a sample of 100 randomly selected gamers. If your target market is 29- to 35-year-olds, should you continue with your development strategy?

Answer

You need to determine the probability that the mean age of a sample of 100 strategy gamers is between 29 and 35 years.

\[P(29 < \bar{x} < 35) = \text{normalcdf} \left(29, 35, 28,\dfrac{4.8}{\sqrt{100}}\right) = 0.0186\]

You can conclude there is approximately a 1.9% chance that the mean age of such a sample falls between 29 and 35.

Example \(\PageIndex{4}\)

The mean number of minutes for app engagement by a tablet user is 8.2 minutes. Suppose the standard deviation is one minute. Take a sample of 60.
Answer
Exercise \(\PageIndex{4}\)

Cans of a cola beverage claim to contain 16 ounces. The amounts in a sample are measured and the statistics are \(n = 34\), \(\bar{x} = 16.01\) ounces. If the cans are filled so that \(\mu = 16.00\) ounces (as labeled) and \(\sigma = 0.143\) ounces, find the probability that a sample of 34 cans will have an average amount greater than 16.01 ounces. Do the results suggest that cans are filled with an amount greater than 16 ounces?

Answer

We have \(P(\bar{x} > 16.01) = \text{normalcdf} \left(16.01,E99,16, \dfrac{0.143}{\sqrt{34}}\right) = 0.3417\). Since there is a 34.17% probability that the average sample weight is greater than 16.01 ounces, we should be skeptical of the company’s claimed volume. If I am a consumer, I should be glad that I am probably receiving free cola. If I am the manufacturer, I need to determine if my bottling processes are outside of acceptable limits.

Summary

In a population whose distribution may be known or unknown, if the size (\(n\)) of samples is sufficiently large, the distribution of the sample means will be approximately normal. The mean of the sample means will equal the population mean. The standard deviation of the distribution of the sample means, called the standard error of the mean, is equal to the population standard deviation divided by the square root of the sample size (\(n\)).

Formula Review
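The formulas used in this section are collected here. Each is restated from the text above, with the standard deviation as the second parameter of \(N\).

The central limit theorem for sample means:

\[\bar{X} \sim N\left(\mu_{x}, \dfrac{\sigma_{x}}{\sqrt{n}}\right) \nonumber\]

The \(z\)-score for a sample mean:

\[z = \dfrac{\bar{x}-\mu_{x}}{\left(\dfrac{\sigma_{x}}{\sqrt{n}}\right)} \nonumber\]

The standard error of the mean:

\[\dfrac{\sigma_{x}}{\sqrt{n}} \nonumber\]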
Glossary

Average
a number that describes the central tendency of the data; there are a number of specialized averages, including the arithmetic mean, weighted mean, median, mode, and geometric mean.

Central Limit Theorem
Given a random variable (RV) with known mean \(\mu\) and known standard deviation \(\sigma\), we are sampling with size \(n\), and we are interested in two new RVs: the sample mean, \(\bar{X}\), and the sample sum, \(\sum X\). If the size (\(n\)) of the sample is sufficiently large, then \(\bar{X} \sim N\left(\mu, \dfrac{\sigma}{\sqrt{n}}\right)\) and \(\sum X \sim N(n\mu, (\sqrt{n})(\sigma))\). If the size (\(n\)) of the sample is sufficiently large, then the distribution of the sample means and the distribution of the sample sums will approximate a normal distribution regardless of the shape of the population. The mean of the sample means will equal the population mean, and the mean of the sample sums will equal \(n\) times the population mean. The standard deviation of the distribution of the sample means, \(\dfrac{\sigma}{\sqrt{n}}\), is called the standard error of the mean.

Normal Distribution
a continuous random variable (RV) with pdf \(f(x) = \dfrac{1}{\sigma \sqrt{2 \pi}}e^{\dfrac{-(x-\mu)^{2}}{2 \sigma^{2}}}\), where \(\mu\) is the mean of the distribution and \(\sigma\) is the standard deviation; notation: \(X \sim N(\mu, \sigma)\). If \(\mu = 0\) and \(\sigma = 1\), the RV is called a standard normal distribution.

Standard Error of the Mean
the standard deviation of the distribution of the sample means, or \(\dfrac{\sigma}{\sqrt{n}}\).
How is the central limit theorem related to the normal distribution?

The central limit theorem says that the sampling distribution of the mean will be approximately normal, as long as the sample size is large enough. Whether the population has a normal, Poisson, binomial, or other distribution with a finite mean and standard deviation, the sampling distribution of the mean approaches a normal distribution as the sample size grows.
What does the central limit theorem tell us about the sampling distribution of the sample mean?

The central limit theorem states that the sampling distribution of the sample mean will be normal or nearly normal if the sample size is large enough.
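The central limit theorem's claims about the sample mean can be checked empirically. The sketch below is a simulation under stated assumptions, not a proof: it uses a fair six-sided die as the (non-normal) population, averages \(n = 10\) rolls per sample as in the dice illustration at the start of this section, and compares the mean and standard deviation of the sample means with \(\mu\) and \(\dfrac{\sigma}{\sqrt{n}}\). The seed, the number of samples, and all variable names are choices made for this illustration.

```python
import random
from math import sqrt
from statistics import fmean, pstdev

rng = random.Random(0)   # fixed seed so the run is repeatable

n = 10                   # dice averaged together in each sample
num_samples = 20_000     # how many sample means to collect

# Population: one fair six-sided die (uniform, not normal)
faces = range(1, 7)
mu = fmean(faces)        # population mean, 3.5
sigma = pstdev(faces)    # population standard deviation, sqrt(35/12)

# Each entry is the mean of one sample of n rolls
sample_means = [
    fmean(rng.randint(1, 6) for _ in range(n))
    for _ in range(num_samples)
]

mean_of_means = fmean(sample_means)
sd_of_means = pstdev(sample_means)
standard_error = sigma / sqrt(n)

print(f"mean of sample means: {mean_of_means:.3f} (population mean {mu})")
print(f"SD of sample means:   {sd_of_means:.3f} (sigma/sqrt(n) = {standard_error:.3f})")
```

Note that `n` is the number of rolls averaged in each sample, while `num_samples` is how many times the experiment is repeated; only `n` appears in the standard error.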