
Title slide

(org-show-animate '("Quantitative Methods, Part-II" "Introduction to Statistical Inference" "Vikas Rawal" "" "" ""))

Variable	Value
Standard deviation of population	130
Standard errors of samples of size
5	58
20	29
50	18
200	9
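The figures in the table are consistent with the usual formula for the standard error of a sample mean, \(\sigma/\sqrt{n}\). The sketch below assumes that this is what the table reports (the slide does not state the formula); the names `POPULATION_SD` and `standard_error` are illustrative only.

```python
import math

# Population standard deviation, taken from the table above.
POPULATION_SD = 130


def standard_error(sigma, n):
    """Standard error of the sample mean for a sample of size n."""
    return sigma / math.sqrt(n)


sample_sizes = [5, 20, 50, 200]
standard_errors = {n: standard_error(POPULATION_SD, n) for n in sample_sizes}

for n, se in standard_errors.items():
    print(f"n = {n:>3}: standard error = {se:.1f}")
```

Rounded to whole numbers, these values match the table. Multiplying the sample size by four halves the standard error.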

\(Z=\frac{x_{i}-\mu}{\sigma}\)
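A minimal sketch of the standardisation above. The numbers are hypothetical and chosen only for illustration, and the last step additionally assumes that the population is normally distributed, which the formula itself does not require.

```python
from statistics import NormalDist


def z_score(x, mu, sigma):
    """Distance of x from the mean mu, in units of the standard deviation sigma."""
    return (x - mu) / sigma


# Hypothetical values, not taken from the slides.
mu, sigma = 500, 130
x = 695

z = z_score(x, mu, sigma)

# Under the added assumption of a normal population, the share of the
# population lying below x is the standard normal CDF evaluated at z.
share_below = NormalDist().cdf(z)

print(f"z = {z:.2f}, share of population below x = {share_below:.3f}")
```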

William Gosset’s 1908 Biometrika paper, written under the pseudonym "Student".
t-distribution with df degrees of freedom
- If \(Z \sim N(0,1)\) and \(U \sim \chi^{2}_{df}\) are independent, then the ratio
  
  \(\frac{Z}{\sqrt{\frac{U}{df}}} \sim t_{df}\)
Properties:
- It is a bell-shaped distribution that is symmetric around 0.
- Its mean is equal to 0 for \(df > 1\), and its variance is equal to \(df/(df-2)\) for \(df > 2\).
- It has heavier tails than the normal distribution.
- For \(df=1\), it is a standard Cauchy distribution.
- As \(df \rightarrow \infty\), \(t_{df} \rightarrow N(0,1)\).
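The definition and two of the properties above can be checked by simulation. The sketch below builds draws from \(t_{df}\) directly from the ratio \(Z/\sqrt{U/df}\), generating the chi-square variable as a sum of df squared standard normals. The choice of df = 10, the seed and the number of draws are arbitrary.

```python
import math
import random


def draw_t(df, rng):
    """One draw from t_df, built as Z / sqrt(U / df)."""
    z = rng.gauss(0.0, 1.0)
    # A chi-square variable with df degrees of freedom is a sum of
    # df squared independent standard normals.
    u = sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(df))
    return z / math.sqrt(u / df)


rng = random.Random(12345)  # fixed seed so the run is reproducible
df = 10
draws = [draw_t(df, rng) for _ in range(50_000)]

mean = sum(draws) / len(draws)
variance = sum((t - mean) ** 2 for t in draws) / (len(draws) - 1)
tail_share = sum(abs(t) > 2 for t in draws) / len(draws)

print(f"simulated mean     = {mean:.3f} (theory: 0)")
print(f"simulated variance = {variance:.3f} (theory: {df / (df - 2):.3f})")
print(f"share with |t| > 2 = {tail_share:.3f} (standard normal: about 0.046)")
```

The share of draws beyond 2 in absolute value should exceed the standard normal figure, which is what "heavier tails" means in practice.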

Scale: discrete/quantitative
Use of randomisation in generating data
Population distribution
- Assumptions about the underlying population distribution may have to be made, usually on the basis of prior information or the sample data.
- For example, an assumption about the population variance.
Sample size

Null hypothesis (\(H_{0}\))

Statement that the parameter takes a particular value, usually one that represents no effect or no difference.

Alternative hypothesis (\(H_{a}\))

Statement that the parameter does not take this value.



The test statistic measures the difference between the point estimate of the parameter and the parameter value in \(H_{0}\), expressed in standard error units.
For example
- For testing that population mean takes a particular value (\(\mu_{0}\))
  
  \(t=\frac{\bar{y}-\mu_{0}}{\frac{s}{\sqrt{n}}}\)
- For testing that the population means of two groups (say, men and women) are equal
  
  \(t=\frac{\bar{y}_{a}-\bar{y}_{b}}{\sqrt{\frac{s_{a}^{2}}{n_{a}}+\frac{s_{b}^{2}}{n_{b}}}}\)
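A minimal sketch of both test statistics, using made-up data. For the two-group case it assumes the unpooled standard error \(\sqrt{s_{a}^{2}/n_{a}+s_{b}^{2}/n_{b}}\) for the difference in means; a pooled-variance version would give a different denominator. The function names and the data are hypothetical.

```python
import math
from statistics import mean, stdev


def one_sample_t(sample, mu0):
    """(sample mean - mu0) divided by the standard error s / sqrt(n)."""
    n = len(sample)
    return (mean(sample) - mu0) / (stdev(sample) / math.sqrt(n))


def two_sample_t(sample_a, sample_b):
    """Difference in sample means divided by its unpooled standard error.

    Assumption of this sketch: the group variances are not pooled.
    """
    se = math.sqrt(
        stdev(sample_a) ** 2 / len(sample_a)
        + stdev(sample_b) ** 2 / len(sample_b)
    )
    return (mean(sample_a) - mean(sample_b)) / se


# Hypothetical data, for illustration only.
group_a = [5.1, 4.9, 5.6, 5.8, 6.0]
group_b = [4.8, 5.0, 4.7, 5.3, 5.2]

t_one = one_sample_t(group_a, mu0=5.0)
t_two = two_sample_t(group_a, group_b)

print(f"one-sample t = {t_one:.3f}")
print(f"two-sample t = {t_two:.3f}")
```

In both cases the statistic answers the same question: how many standard errors separate the estimate from the value stated in \(H_{0}\).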

We construct the distribution of the test statistic assuming that \(H_{0}\) is true.
Given this distribution, we find the probability that the test statistic equals the observed value or a value even more extreme (in the direction predicted by \(H_{a}\)).
This is the p-value.
The smaller the p-value, the stronger the evidence against \(H_{0}\).
Essentially, we are saying that the data are very unusual if \(H_{0}\) is true.
The p-value at or below which we reject \(H_{0}\) (usually 0.05 or less) is called the \(\alpha\)-level of the test.
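The steps above can be sketched by simulation rather than with t-tables. The example assumes a normal population, a two-sided \(H_{a}\) and made-up data. Under those assumptions the distribution of the t statistic under \(H_{0}\) does not depend on the population standard deviation, so the simulation uses 1. All names and numbers are hypothetical.

```python
import math
import random


def t_statistic(sample, mu0):
    """(sample mean - mu0) in units of the standard error s / sqrt(n)."""
    n = len(sample)
    ybar = sum(sample) / n
    s = math.sqrt(sum((y - ybar) ** 2 for y in sample) / (n - 1))
    return (ybar - mu0) / (s / math.sqrt(n))


# Hypothetical data and hypothesised mean.
sample = [5.1, 4.9, 5.6, 5.8, 6.0]
mu0 = 5.0
alpha = 0.05

t_observed = t_statistic(sample, mu0)

# Construct the distribution of the test statistic assuming H0 is true:
# draw many samples of the same size from a normal population with mean mu0.
rng = random.Random(2024)  # fixed seed so the run is reproducible
n_sims = 20_000
null_ts = []
for _ in range(n_sims):
    simulated = [rng.gauss(mu0, 1.0) for _ in range(len(sample))]
    null_ts.append(t_statistic(simulated, mu0))

# Two-sided p-value: share of simulated statistics at least as extreme
# as the observed one, in either direction.
p_value = sum(abs(t) >= abs(t_observed) for t in null_ts) / n_sims
reject_h0 = p_value <= alpha

print(f"observed t = {t_observed:.3f}")
print(f"simulated two-sided p-value = {p_value:.3f}")
print(f"reject H0 at alpha = {alpha}: {reject_h0}")
```

A one-sided \(H_{a}\) would count only the simulated statistics in the predicted direction.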

	Reject \(H_{0}\)	Do not reject \(H_{0}\)
\(H_{0}\) is true	Type I Error
\(H_{0}\) is false		Type II Error
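The two error types in the table can be illustrated by repeating a test many times. The sketch below is an assumption-laden example, not part of the slides: it uses a two-sided z-test with a known population standard deviation, and the sample size, true mean under the alternative, seed and number of repetitions are arbitrary.

```python
import math
import random
from statistics import NormalDist


def rejects_h0(sample, mu0, sigma, critical):
    """Two-sided z-test with known sigma: True if H0 is rejected."""
    n = len(sample)
    z = (sum(sample) / n - mu0) / (sigma / math.sqrt(n))
    return abs(z) > critical


def rejection_rate(true_mu, mu0, sigma, n, critical, n_sims, rng):
    """Share of simulated samples for which H0 is rejected."""
    rejections = 0
    for _ in range(n_sims):
        sample = [rng.gauss(true_mu, sigma) for _ in range(n)]
        rejections += rejects_h0(sample, mu0, sigma, critical)
    return rejections / n_sims


# Hypothetical settings, for illustration only.
mu0, sigma, n, alpha, n_sims = 0.0, 1.0, 25, 0.05, 10_000
critical = NormalDist().inv_cdf(1 - alpha / 2)
rng = random.Random(7)  # fixed seed so the run is reproducible

# H0 is true: every rejection is a Type I error.
type_1_rate = rejection_rate(mu0, mu0, sigma, n, critical, n_sims, rng)

# H0 is false (true mean is 0.5): every non-rejection is a Type II error.
type_2_rate = 1 - rejection_rate(0.5, mu0, sigma, n, critical, n_sims, rng)

print(f"Type I error rate  = {type_1_rate:.3f} (alpha = {alpha})")
print(f"Type II error rate = {type_2_rate:.3f}")
```

The Type I error rate should be close to the chosen \(\alpha\)-level. The Type II error rate depends on how far the true mean lies from the value in \(H_{0}\) and on the sample size.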