Model Inference
Hypothesis Tests
Hypothesis tests are used to assess whether a claim is supported by data. This is conducted by collecting data, setting up the null and alternative hypotheses, and computing a test statistic.
The null hypothesis is the claim that is initially assumed to be true; it typically states that the parameter equals the hypothesized value.
The alternative hypothesis contradicts the null hypothesis.
We want to compare \(\mu\) against a hypothesized value \(\mu_0\):
Null Hypothesis | Alternative Hypothesis |
---|---|
\(H_0: \mu=\mu_0\) | \(H_a: \mu\ne\mu_0\) |
\(H_0: \mu\le\mu_0\) | \(H_a: \mu>\mu_0\) |
\(H_0: \mu\ge\mu_0\) | \(H_a: \mu<\mu_0\) |
Notice that there are 3 types of null and alternative hypotheses. The first (\(H_a:\mu\ne\mu_0\)) is a 2-sided hypothesis because the rejection region is split between the two tails of the distribution. The remaining two are 1-sided because the rejection region lies in only one tail.
Null Hypothesis | Alternative Hypothesis | Side |
---|---|---|
\(H_0: \mu=\mu_0\) | \(H_a: \mu\ne\mu_0\) | 2-sided |
\(H_0: \mu\le\mu_0\) | \(H_a: \mu>\mu_0\) | 1-sided |
\(H_0: \mu\ge\mu_0\) | \(H_a: \mu<\mu_0\) | 1-sided |
Test | # of Samples | Null Hypothesis | Distribution | DF | Test Statistic | Notes |
---|---|---|---|---|---|---|
t-test | 1 | \(\mu=\mu_0\) | \(t_{DF}\) | \(DF=n-1\) | \(\frac{\bar X-\mu_0}{\sqrt{s^2/n}}\) | \(n<30\) |
z-test | 1 | \(\mu=\mu_0\) | \(N(0,1)\) | None | \(\frac{\bar X-\mu_0}{\sqrt{\sigma^2/n}}\) | \(n>30\), \(\sigma^2\) known |
z-test | 1 | \(\mu=\mu_0\) | \(N(0,1)\) | None | \(\frac{\bar X-\mu_0}{\sqrt{s^2/n}}\) | \(n>30\), \(\sigma^2\) unknown |
paired t-test | 2 | \(\mu_1-\mu_2=D\) | \(t_{DF}\) | \(DF=n-1\) | \(\frac{\bar X_d-D}{\sqrt{s_d^2/n}}\) | \(n\) is the number of pairs |
independent t-test | 2 | \(\mu_1-\mu_2=D\) | \(t_{DF}\) | \(DF=n_1+n_2-2\) | \(\frac{\bar X_1-\bar X_2-D}{\sqrt{s_p^2(1/n_1+1/n_2)}}\) | \(s_p^2=\frac{(n_1-1)s_1^2+(n_2-1)s_2^2}{n_1+n_2-2}\) |
independent t-test | 2 | \(\mu_1-\mu_2=D\) | \(t_{DF}\) | \(DF=\frac{(s^2_1/n_1+s^2_2/n_2)^2}{(s^2_1/n_1)^2/(n_1-1)+(s^2_2/n_2)^2/(n_2-1)}\) | \(\frac{\bar X_1-\bar X_2-D}{\sqrt{s_1^2/n_1+s^2_2/n_2}}\) | variances not pooled (Welch) |
ANOVA | >2 | \(\mu_1=\mu_2=\cdots=\mu_k\) | \(F_{DF_1,DF_2}\) | \(DF_1=k-1\), \(DF_2=n-k\) | \(\frac{MSR}{MSE}\) | see the ANOVA table below |
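For instance, a one-sample t-test can be run directly in R. The minimal sketch below uses a made-up data vector `x` and hypothesized mean `mu0` (both invented for illustration), computing the statistic by hand and checking it against the built-in `t.test()`:

```r
# Made-up data and hypothesized mean (illustration only)
x   <- c(4.1, 5.3, 4.8, 5.9, 4.6, 5.2, 4.9, 5.5)
mu0 <- 5

n      <- length(x)
df     <- n - 1
t_stat <- (mean(x) - mu0) / sqrt(var(x) / n)  # (X-bar - mu_0) / sqrt(s^2 / n)

t_stat
t.test(x, mu = mu0)  # reports the same statistic and df
```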
The p-value approach is one of the most common ways to report significant results. The p-value is easy to interpret: it is the probability of observing our test statistic, or something more extreme, given that the null hypothesis is true. Depending on the type of test, the p-value is constructed as:
Alternative Hypothesis | p-value |
---|---|
\(\mu>\mu_0\) | \(P(X>T(x))=p\) |
\(\mu<\mu_0\) | \(P(X<T(x))=p\) |
\(\mu\ne\mu_0\) | \(2\times P(X>|T(x)|)=p\) |
If \(p < \alpha\), then you reject \(H_0\); otherwise, you will fail to reject \(H_0\).
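As a minimal sketch, each of these p-values can be computed in R from the t distribution, reusing the hypothetical `t_stat` and `df` from the example above:

```r
# Reusing t_stat and df from the one-sample example above
p_upper <- pt(t_stat, df, lower.tail = FALSE)           # H_a: mu > mu_0
p_lower <- pt(t_stat, df, lower.tail = TRUE)            # H_a: mu < mu_0
p_two   <- 2 * pt(abs(t_stat), df, lower.tail = FALSE)  # H_a: mu != mu_0
c(p_upper, p_lower, p_two)
```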
The confidence interval approach can evaluate a hypothesis test where the alternative hypothesis is \(\mu\ne\mu_0\). For this approach you construct a \((1-\alpha)100\%\) confidence interval as
\[ PE \pm CV \times SE \]
where PE is the point estimate, CV is the critical value based on \(\alpha\), and SE is the standard error. This yields a lower and upper bound, denoted \((LB, UB)\). The confidence interval provides a range of values intended to capture the parameter \(\mu\): if you repeat this process \(n\) times, \((1-\alpha)100\%\) of the \(n\) intervals will capture the true value of \(\mu\). If \(\mu_0 \in (LB,UB)\), then you fail to reject \(H_0\). If \(\mu_0\notin (LB,UB)\), then you reject \(H_0\).
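A minimal sketch of the \(PE \pm CV \times SE\) construction in R, reusing the hypothetical `x` and `n` from above with \(\alpha = 0.05\):

```r
alpha <- 0.05
pe <- mean(x)                        # point estimate
cv <- qt(1 - alpha / 2, df = n - 1)  # critical value
se <- sd(x) / sqrt(n)                # standard error
pe + c(-1, 1) * cv * se              # (LB, UB); reject H_0 if mu_0 falls outside
```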
These ideas extend to regression coefficients. To test \(H_0:\ \beta_j = \theta\), use
\[ \frac{\hat\beta_j-\theta}{\mathrm{se}(\hat\beta_j)} \sim t_{n-p^\prime} \]
where \(p^\prime\) is the number of estimated coefficients, and construct the \((1-\alpha)100\%\) confidence interval
\[ \hat\beta_j \pm CV \times \mathrm{se}(\hat\beta_j) \]
CV: critical value, \(P(X<CV) = 1-\alpha/2\)
\(\alpha\): significance level
SE: standard error, here \(\mathrm{se}(\hat\beta_j)\)
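In R, `summary()` reports these t statistics for \(\theta = 0\), and `confint()` builds the matching intervals. A short sketch using the built-in `mtcars` data (chosen purely for illustration):

```r
fit <- lm(mpg ~ wt + hp, data = mtcars)
summary(fit)$coefficients   # estimate, se(beta_j-hat), t value for H_0: beta_j = 0
confint(fit, level = 0.95)  # beta_j-hat +/- CV * se(beta_j-hat)
```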
Model Inference
We conduct model inference to determine whether one model explains significantly more variation than another. A common example is to compare a linear model (\(\hat Y=\hat\beta_0 + \hat\beta_1 X\)) to the mean of \(Y\) (\(\hat \mu_y\)). We assess the significance of the variation explained using an Analysis of Variance (ANOVA) table and an F-test.
Source | DF | SS | MS | F |
---|---|---|---|---|
Model | \(DFR=k-1\) | \(SSR\) | \(MSR=\frac{SSR}{DFR}\) | \(\hat F=\frac{MSR}{MSE}\) |
Error | \(DFE=n-k\) | \(SSE\) | \(MSE=\frac{SSE}{DFE}\) | |
Total | \(TDF=n-1\) | \(TSS=SSR+SSE\) | | |
Under \(H_0\),
\[ \hat F \sim F(DFR, DFE) \]
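A short sketch of this table in R, again using `mtcars` for illustration; calling `anova()` on a single fitted model prints the DF, SS, MS, and F for the model term against the residual error:

```r
fit <- lm(mpg ~ wt, data = mtcars)
anova(fit)  # rows: wt (Model) and Residuals (Error) with DF, SS, MS, F
```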
Given two nested models:
\[ M1:\ \hat y = \hat\beta_0 + \hat\beta_1 X_1 + \hat\beta_2 X_2 \]
\[ M2:\ \hat y = \hat\beta_0 + \hat\beta_1 X_1 \]
Let \(M1\) be the FULL (larger) model, and let \(M2\) be the RED (reduced, smaller) model. We can test \(H_0:\ \beta_2 = 0\) (the reduced model is adequate) with the partial F statistic:
\[ \hat F = \frac{[SSE(RED) - SSE(FULL)]/[DFE(RED)-DFE(FULL)]}{MSE(FULL)} \]
\[ \hat F \sim F[DFE(RED) - DFE(FULL), DFE(FULL)] \]
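In R, `anova()` applied to two nested models computes exactly this partial F statistic. A sketch using `mtcars`, with `wt` standing in for \(X_1\) and `hp` for \(X_2\):

```r
m2 <- lm(mpg ~ wt,      data = mtcars)  # RED:  beta_0 + beta_1 X_1
m1 <- lm(mpg ~ wt + hp, data = mtcars)  # FULL: beta_0 + beta_1 X_1 + beta_2 X_2
anova(m2, m1)  # partial F test of H_0: beta_2 = 0
```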
A one-sample test can be conducted with the intercept-only model \(Y=\beta_0\).
Test the following hypothesis: \(H_0:\ \beta_0 = 205\) OR \(H_0:\ \mu = 205\)
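A sketch in R using the `before` vector from the data-prep block further down; the intercept of an intercept-only model is the sample mean, so its confidence interval tests \(H_0:\ \mu = 205\):

```r
fit <- lm(before ~ 1)     # intercept-only model: Y = beta_0
coef(fit)                 # beta_0-hat is the sample mean of `before`
confint(fit)              # reject H_0: beta_0 = 205 if 205 falls outside
t.test(before, mu = 205)  # the equivalent one-sample t-test
```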
A two-sample test can be conducted as \(Y=\beta_0 + \beta_1 X\), where \(X\) is a binary indicator identifying which of the 2 samples each observation belongs to.
Test the following hypothesis: \(H_0:\ \beta_1 = 0\) OR \(H_0:\ \mu_1-\mu_2 = 0\)
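A sketch in R using `mtcars`, where the 0/1 transmission indicator `am` stands in for \(X\); the slope estimate equals the difference in sample means, and the result matches a pooled two-sample t-test:

```r
fit <- lm(mpg ~ am, data = mtcars)  # am: 0/1 indicator for the 2 groups
summary(fit)$coefficients           # t test of H_0: beta_1 = 0 (mu_1 = mu_2)
t.test(mpg ~ am, data = mtcars, var.equal = TRUE)  # matching pooled t-test
```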
A paired test can be conducted as \(Y_1 - Y_2=\beta_0\), fitting an intercept-only model to the within-pair differences.
Test the following hypothesis: \(H_0:\ \beta_0 = 0\) OR \(H_0:\ \mu_1 - \mu_2 = 0\). In R:
```r
# DATA PREP
before <- c(200.1, 190.9, 192.7, 213, 241.4, 196.9, 172.2, 185.5, 205.2, 193.7)
after  <- c(392.9, 393.2, 345.1, 393, 434, 427.9, 422, 383.9, 392.3, 352.2)
df <- data.frame(before, after)

# Intercept-only model on the differences: beta_0-hat = mean(after - before)
df |> lm(after - before ~ 1, data = _) |> (\(x) c(coef(x), confint(x)))()

## OR (note: t.test(before, after, ...) works with before - after, so the sign flips)
df |> with(t.test(before, after, paired = TRUE))
```
An ANOVA can be conducted as \(Y=\beta_0 + \beta_1 X_1 + \cdots + \beta_{k-1} X_{k-1}\), where \(X_1, \ldots, X_{k-1}\) are binary indicators for the \(k\) samples.
Test the following hypothesis: \(H_0:\ \beta_1 =\beta_2 =\cdots=\beta_{k-1} = 0\) OR \(H_0:\ \mu_1 =\mu_2 =\cdots=\mu_k\)
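A sketch in R with the built-in `iris` data; `Species` is a factor with \(k=3\) levels, which `lm()` expands into binary indicators, and `anova()` reports the F-test of equal means:

```r
fit <- lm(Sepal.Length ~ Species, data = iris)  # k = 3 species
anova(fit)  # F test of H_0: mu_1 = mu_2 = mu_3
```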