Understanding the Null Hypothesis for Linear Regression
Linear regression is a technique we can use to understand the relationship between one or more predictor variables and a response variable.
If we only have one predictor variable and one response variable, we can use simple linear regression, which uses the following formula to estimate the relationship between the variables:
ŷ = β₀ + β₁x
 ŷ: The estimated response value.
 β₀: The average value of y when x is zero.
 β₁: The average change in y associated with a one unit increase in x.
 x: The value of the predictor variable.
Simple linear regression uses the following null and alternative hypotheses:
 H₀: β₁ = 0
 Hₐ: β₁ ≠ 0
The null hypothesis states that the coefficient β₁ is equal to zero. In other words, the predictor variable, x, has no linear relationship with the response variable, y, in the population.
The alternative hypothesis states that β₁ is not equal to zero. In other words, there is a linear relationship between x and y in the population.
If we have multiple predictor variables and one response variable, we can use multiple linear regression, which uses the following formula to estimate the relationship between the variables:
ŷ = β₀ + β₁x₁ + β₂x₂ + … + βₖxₖ
 β₀: The average value of y when all predictor variables are equal to zero.
 βᵢ: The average change in y associated with a one unit increase in xᵢ, holding the other predictors fixed.
 xᵢ: The value of the i-th predictor variable.
Multiple linear regression uses the following null and alternative hypotheses:
 H₀: β₁ = β₂ = … = βₖ = 0
 Hₐ: at least one βᵢ ≠ 0
The null hypothesis states that all coefficients in the model are equal to zero. In other words, none of the predictor variables has a linear relationship with the response variable, y.
The alternative hypothesis states that not every coefficient is simultaneously equal to zero, that is, at least one coefficient differs from zero.
The following examples show how to decide to reject or fail to reject the null hypothesis in both simple linear regression and multiple linear regression models.
Example 1: Simple Linear Regression
Suppose a professor would like to use the number of hours studied to predict the exam score that students will receive in his class. He collects data for 20 students and fits a simple linear regression model.
The following screenshot shows the output of the regression model:
The fitted simple linear regression model is:
Exam Score = 67.1617 + 5.2503*(hours studied)
To determine if there is a statistically significant relationship between hours studied and exam score, we need to analyze the overall F-value of the model and the corresponding p-value:
 Overall F-value: 47.9952
 P-value: 0.000
Since this p-value is less than .05, we can reject the null hypothesis. In other words, there is a statistically significant relationship between hours studied and exam score received.
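As a worked check that is not part of the original output: in simple linear regression the overall F statistic equals the square of the t statistic for the slope, so the two tests give the same p-value. The degrees of freedom below assume the model was fit to all 20 students.

```latex
\[
F = \frac{\mathrm{SSR}/1}{\mathrm{SSE}/(n-2)} = t^2,
\qquad t = \frac{\hat{\beta}_1}{SE(\hat{\beta}_1)},
\qquad (\text{df}_1, \text{df}_2) = (1,\, n-2) = (1,\, 18)
\]
\[
F = 47.9952 \;\Longrightarrow\; |t| = \sqrt{47.9952} \approx 6.93
\]
```

A t statistic of this size on 18 degrees of freedom lies far in the tail, which is consistent with the reported p-value of 0.000.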
Example 2: Multiple Linear Regression
Suppose a professor would like to use the number of hours studied and the number of prep exams taken to predict the exam score that students will receive in his class. He collects data for 20 students and fits a multiple linear regression model.
The fitted multiple linear regression model is:
Exam Score = 67.67 + 5.56*(hours studied) – 0.60*(prep exams taken)
To determine if there is a jointly statistically significant relationship between the two predictor variables and the response variable, we need to analyze the overall F-value of the model and the corresponding p-value:
 Overall F-value: 23.46
 P-value: 0.00
Since this p-value is less than .05, we can reject the null hypothesis. In other words, hours studied and prep exams taken have a jointly statistically significant relationship with exam score.
Note: Although the p-value for prep exams taken (p = 0.52) is not significant on its own, prep exams taken and hours studied together have a significant relationship with exam score.
Additional Resources
 Understanding the F-Test of Overall Significance in Regression
 How to Read and Interpret a Regression Table
 How to Report Regression Results
 How to Perform Simple Linear Regression in Excel
 How to Perform Multiple Linear Regression in Excel
Null & Alternative Hypotheses  Definitions, Templates & Examples
Published on May 6, 2022 by Shaun Turney . Revised on June 22, 2023.
The null and alternative hypotheses are two competing claims that researchers weigh evidence for and against using a statistical test :
 Null hypothesis ( H 0 ): There’s no effect in the population .
 Alternative hypothesis ( H a or H 1 ) : There’s an effect in the population.
The null and alternative hypotheses offer competing answers to your research question . When the research question asks “Does the independent variable affect the dependent variable?”:
 The null hypothesis ( H 0 ) answers “No, there’s no effect in the population.”
 The alternative hypothesis ( H a ) answers “Yes, there is an effect in the population.”
The null and alternative are always claims about the population. That’s because the goal of hypothesis testing is to make inferences about a population based on a sample . Often, we infer whether there’s an effect in the population by looking at differences between groups or relationships between variables in the sample. It’s critical for your research to write strong hypotheses .
You can use a statistical test to decide whether the evidence favors the null or alternative hypothesis. Each type of statistical test comes with a specific way of phrasing the null and alternative hypothesis. However, the hypotheses can also be phrased in a general way that applies to any test.
The null hypothesis is the claim that there’s no effect in the population.
If the sample provides enough evidence against the claim that there’s no effect in the population ( p ≤ α), then we can reject the null hypothesis . Otherwise, we fail to reject the null hypothesis.
Although “fail to reject” may sound awkward, it’s the only wording that statisticians accept . Be careful not to say you “prove” or “accept” the null hypothesis.
Null hypotheses often include phrases such as “no effect,” “no difference,” or “no relationship.” When written in mathematical terms, they always include an equality (usually =, but sometimes ≥ or ≤).
You can never know with complete certainty whether there is an effect in the population. Some percentage of the time, your inference about the population will be incorrect. When you incorrectly reject the null hypothesis, it’s called a type I error . When you incorrectly fail to reject it, it’s a type II error.
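The claim that a true null hypothesis is still rejected some percentage of the time can be checked by simulation. The sketch below is only an illustration under assumed settings: it draws samples from a population where the null hypothesis is true (mean 0, known standard deviation 1), runs a two-sided z test on each, and counts how often the test rejects. The function names, sample size, and number of trials are hypothetical choices, not part of the original article.

```python
import random
from statistics import NormalDist, mean


def z_test_p_value(sample, mu0, sigma):
    """Two-sided p-value for H0: mu = mu0 when sigma is known."""
    z = (mean(sample) - mu0) / (sigma / len(sample) ** 0.5)
    return 2 * (1 - NormalDist().cdf(abs(z)))


def type_i_error_rate(alpha=0.05, n=30, trials=4000, seed=0):
    """Fraction of simulated studies that reject a null hypothesis that is true."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        # H0 is true by construction: the population mean really is 0.
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        if z_test_p_value(sample, mu0=0.0, sigma=1.0) <= alpha:
            rejections += 1
    return rejections / trials


rate = type_i_error_rate()
print(f"Estimated type I error rate: {rate:.3f}")
```

The estimated rate should land close to the chosen significance level α, which is what it means for α to control the type I error rate.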
Examples of null hypotheses
The table below gives examples of research questions and null hypotheses. There’s always more than one way to answer a research question, but these null hypotheses can help you get started.
Research question: Does tooth flossing affect the number of cavities?
 Null hypothesis (H₀), general: Tooth flossing has no effect on the number of cavities.
 Null hypothesis (H₀), test-specific (t test): The mean number of cavities per person does not differ between the flossing group (µ₁) and the non-flossing group (µ₂) in the population; µ₁ = µ₂.
Research question: Does the amount of text highlighted in the textbook affect exam scores?
 Null hypothesis (H₀), general: The amount of text highlighted in the textbook has no effect on exam scores.
 Null hypothesis (H₀), test-specific (linear regression): There is no relationship between the amount of text highlighted and exam scores in the population; β₁ = 0.
Research question: Does daily meditation decrease the incidence of depression?
 Null hypothesis (H₀), general: Daily meditation does not decrease the incidence of depression.*
 Null hypothesis (H₀), test-specific (two-proportions test): The proportion of people with depression in the daily-meditation group (p₁) is greater than or equal to the no-meditation group (p₂) in the population; p₁ ≥ p₂.
*Note that some researchers prefer to always write the null hypothesis in terms of “no effect” and “=”. It would be fine to say that daily meditation has no effect on the incidence of depression and p₁ = p₂.
The alternative hypothesis ( H a ) is the other answer to your research question . It claims that there’s an effect in the population.
Often, your alternative hypothesis is the same as your research hypothesis. In other words, it’s the claim that you expect or hope will be true.
The alternative hypothesis is the complement to the null hypothesis. Null and alternative hypotheses are exhaustive, meaning that together they cover every possible outcome. They are also mutually exclusive, meaning that only one can be true at a time.
Alternative hypotheses often include phrases such as “an effect,” “a difference,” or “a relationship.” When alternative hypotheses are written in mathematical terms, they always include an inequality (usually ≠, but sometimes < or >). As with null hypotheses, there are many acceptable ways to phrase an alternative hypothesis.
Examples of alternative hypotheses
The table below gives examples of research questions and alternative hypotheses to help you get started with formulating your own.
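Research question: Does tooth flossing affect the number of cavities?
 Alternative hypothesis (Hₐ), general: Tooth flossing has an effect on the number of cavities.
 Alternative hypothesis (Hₐ), test-specific (t test): The mean number of cavities per person differs between the flossing group (µ₁) and the non-flossing group (µ₂) in the population; µ₁ ≠ µ₂.
Research question: Does the amount of text highlighted in a textbook affect exam scores?
 Alternative hypothesis (Hₐ), general: The amount of text highlighted in the textbook has an effect on exam scores.
 Alternative hypothesis (Hₐ), test-specific (linear regression): There is a relationship between the amount of text highlighted and exam scores in the population; β₁ ≠ 0.
Research question: Does daily meditation decrease the incidence of depression?
 Alternative hypothesis (Hₐ), general: Daily meditation decreases the incidence of depression.
 Alternative hypothesis (Hₐ), test-specific (two-proportions test): The proportion of people with depression in the daily-meditation group (p₁) is less than the no-meditation group (p₂) in the population; p₁ < p₂.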
Null and alternative hypotheses are similar in some ways:
 They’re both answers to the research question.
 They both make claims about the population.
 They’re both evaluated by statistical tests.
However, there are important differences between the two types of hypotheses, summarized in the following table.
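Definition
 Null hypothesis (H₀): A claim that there is no effect in the population.
 Alternative hypothesis (Hₐ): A claim that there is an effect in the population.
Symbols used
 Null hypothesis (H₀): Equality symbol (=, ≥, or ≤)
 Alternative hypothesis (Hₐ): Inequality symbol (≠, <, or >)
When p ≤ α
 Null hypothesis (H₀): Rejected
 Alternative hypothesis (Hₐ): Supported
When p > α
 Null hypothesis (H₀): Failed to reject
 Alternative hypothesis (Hₐ): Not supported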
To help you write your hypotheses, you can use the template sentences below. If you know which statistical test you’re going to use, you can use the test-specific template sentences. Otherwise, you can use the general template sentences.
General template sentences
The only things you need to know to use these general template sentences are your dependent and independent variables. To write your research question, null hypothesis, and alternative hypothesis, fill in the following sentences with your variables:
Does independent variable affect dependent variable?
 Null hypothesis (H₀): Independent variable does not affect dependent variable.
 Alternative hypothesis (Hₐ): Independent variable affects dependent variable.
Test-specific template sentences
Once you know the statistical test you’ll be using, you can write your hypotheses in a more precise and mathematical way specific to the test you chose. The table below provides template sentences for common statistical tests.
Statistical test: t test (or ANOVA) with two groups
 Null hypothesis (H₀): The mean dependent variable does not differ between group 1 (µ₁) and group 2 (µ₂) in the population; µ₁ = µ₂.
 Alternative hypothesis (Hₐ): The mean dependent variable differs between group 1 (µ₁) and group 2 (µ₂) in the population; µ₁ ≠ µ₂.
Statistical test: ANOVA with three groups
 Null hypothesis (H₀): The mean dependent variable does not differ between group 1 (µ₁), group 2 (µ₂), and group 3 (µ₃) in the population; µ₁ = µ₂ = µ₃.
 Alternative hypothesis (Hₐ): The mean dependent variable of group 1 (µ₁), group 2 (µ₂), and group 3 (µ₃) are not all equal in the population.
Statistical test: Correlation
 Null hypothesis (H₀): There is no correlation between independent variable and dependent variable in the population; ρ = 0.
 Alternative hypothesis (Hₐ): There is a correlation between independent variable and dependent variable in the population; ρ ≠ 0.
Statistical test: Linear regression
 Null hypothesis (H₀): There is no relationship between independent variable and dependent variable in the population; β₁ = 0.
 Alternative hypothesis (Hₐ): There is a relationship between independent variable and dependent variable in the population; β₁ ≠ 0.
Statistical test: Two-proportions test
 Null hypothesis (H₀): The dependent variable expressed as a proportion does not differ between group 1 (p₁) and group 2 (p₂) in the population; p₁ = p₂.
 Alternative hypothesis (Hₐ): The dependent variable expressed as a proportion differs between group 1 (p₁) and group 2 (p₂) in the population; p₁ ≠ p₂.
Note: The template sentences above use “=” and “≠”, so they assume that you’re performing two-tailed tests. Two-tailed tests are appropriate for most studies.
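To show how one of these template pairs turns into a calculation, here is a minimal sketch of the two-proportions test of H₀: p₁ = p₂ against Hₐ: p₁ ≠ p₂, using the usual pooled z statistic. The counts are made up for illustration, and the function name is a hypothetical helper rather than anything from the original article.

```python
from statistics import NormalDist


def two_proportions_z_test(x1, n1, x2, n2):
    """Two-sided z test of H0: p1 = p2 against Ha: p1 != p2."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)  # common proportion assumed under H0
    se = (pooled * (1 - pooled) * (1 / n1 + 1 / n2)) ** 0.5
    z = (p1 - p2) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical data: 30 of 200 in group 1 and 50 of 200 in group 2.
z, p_value = two_proportions_z_test(30, 200, 50, 200)
alpha = 0.05
decision = "reject H0" if p_value <= alpha else "fail to reject H0"
print(f"z = {z:.2f}, p = {p_value:.4f}: {decision}")
```

Note that the decision is phrased as “reject” or “fail to reject” the null hypothesis, never as accepting it.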
Hypothesis testing is a formal procedure for investigating our ideas about the world using statistics. It is used by scientists to test specific predictions, called hypotheses , by calculating how likely it is that a pattern or relationship between variables could have arisen by chance.
Null and alternative hypotheses are used in statistical hypothesis testing . The null hypothesis of a test always predicts no effect or no relationship between variables, while the alternative hypothesis states your research prediction of an effect or relationship.
The null hypothesis is often abbreviated as H 0 . When the null hypothesis is written using mathematical symbols, it always includes an equality symbol (usually =, but sometimes ≥ or ≤).
The alternative hypothesis is often abbreviated as H a or H 1 . When the alternative hypothesis is written using mathematical symbols, it always includes an inequality symbol (usually ≠, but sometimes < or >).
A research hypothesis is your proposed answer to your research question. The research hypothesis usually includes an explanation (“ x affects y because …”).
A statistical hypothesis, on the other hand, is a mathematical statement about a population parameter. Statistical hypotheses always come in pairs: the null and alternative hypotheses. In a well-designed study, the statistical hypotheses correspond logically to the research hypothesis.
Turney, S. (2023, June 22). Null & Alternative Hypotheses  Definitions, Templates & Examples. Scribbr. Retrieved August 29, 2024, from https://www.scribbr.com/statistics/nullandalternativehypotheses/
9.1 Null and Alternative Hypotheses
The actual test begins by considering two hypotheses . They are called the null hypothesis and the alternative hypothesis . These hypotheses contain opposing viewpoints.
H₀, the null hypothesis: a statement of no difference between sample means or proportions, or no difference between a sample mean or proportion and a population mean or proportion. In other words, the difference equals 0.
Hₐ, the alternative hypothesis: a claim about the population that is contradictory to H₀ and what we conclude when we reject H₀.
Since the null and alternative hypotheses are contradictory, you must examine evidence to decide if you have enough evidence to reject the null hypothesis or not. The evidence is in the form of sample data.
After you have determined which hypothesis the sample supports, you make a decision. There are two options for a decision: reject H₀ if the sample information favors the alternative hypothesis, or do not reject H₀ (decline to reject H₀) if the sample information is insufficient to reject the null hypothesis.
Mathematical Symbols Used in H₀ and Hₐ:
 H₀: equal (=); Hₐ: not equal (≠), greater than (>), or less than (<)
 H₀: greater than or equal to (≥); Hₐ: less than (<)
 H₀: less than or equal to (≤); Hₐ: more than (>)
H 0 always has a symbol with an equal in it. H a never has a symbol with an equal in it. The choice of symbol depends on the wording of the hypothesis test. However, be aware that many researchers use = in the null hypothesis, even with > or < as the symbol in the alternative hypothesis. This practice is acceptable because we only make the decision to reject or not reject the null hypothesis.
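The pairing rule can be written down as a small lookup, which may help with the fill-in-the-symbol exercises that follow. This is only an illustrative sketch: the dictionary and function names are hypothetical, and it uses the convention in which the null symbol is the exact complement of the alternative symbol.

```python
# Null-hypothesis symbol that pairs with each alternative-hypothesis symbol.
NULL_SYMBOL_FOR = {"≠": "=", ">": "≤", "<": "≥"}


def state_hypotheses(parameter, alternative_symbol, value):
    """Return the (H0, Ha) statements for a claim such as 'mu < 5'."""
    null_symbol = NULL_SYMBOL_FOR[alternative_symbol]
    h0 = f"H0: {parameter} {null_symbol} {value}"
    ha = f"Ha: {parameter} {alternative_symbol} {value}"
    return h0, ha


# "Fewer than five years on average" is a claim about mu with the symbol <.
print(state_hypotheses("μ", "<", 5))
```

The starting point is always the alternative hypothesis, because the wording of the research claim (different from, more than, fewer than) determines its symbol.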
Example 9.1
 H₀: No more than 30 percent of the registered voters in Santa Clara County voted in the primary election. p ≤ 0.30
 Hₐ: More than 30 percent of the registered voters in Santa Clara County voted in the primary election. p > 0.30
A medical trial is conducted to test whether or not a new medicine reduces cholesterol by 25 percent. State the null and alternative hypotheses.
Example 9.2
We want to test whether the mean GPA of students in American colleges is different from 2.0 (out of 4.0). The null and alternative hypotheses are the following:
 H₀: μ = 2.0
 Hₐ: μ ≠ 2.0
We want to test whether the mean height of eighth graders is 66 inches. State the null and alternative hypotheses. Fill in the correct symbol (=, ≠, ≥, <, ≤, >) for the null and alternative hypotheses.
 H 0 : μ __ 66
 H a : μ __ 66
Example 9.3
We want to test if college students take fewer than five years to graduate from college, on the average. The null and alternative hypotheses are the following:
 H₀: μ ≥ 5
 Hₐ: μ < 5
We want to test if it takes fewer than 45 minutes to teach a lesson plan. State the null and alternative hypotheses. Fill in the correct symbol ( =, ≠, ≥, <, ≤, >) for the null and alternative hypotheses.
 H 0 : μ __ 45
 H a : μ __ 45
Example 9.4
An article on school standards stated that about half of all students in France, Germany, and Israel take advanced placement exams and a third of the students pass. The same article stated that 6.6 percent of U.S. students take advanced placement exams and 4.4 percent pass. Test if the percentage of U.S. students who take advanced placement exams is more than 6.6 percent. State the null and alternative hypotheses.
 H₀: p ≤ 0.066
 Hₐ: p > 0.066
On a state driver’s test, about 40 percent pass the test on the first try. We want to test if more than 40 percent pass on the first try. Fill in the correct symbol (=, ≠, ≥, <, ≤, >) for the null and alternative hypotheses.
 H 0 : p __ 0.40
 H a : p __ 0.40
Collaborative Exercise
Bring to class a newspaper, some news magazines, and some internet articles. In groups, find articles from which your group can write null and alternative hypotheses. Discuss your hypotheses with the rest of the class.
Want to cite, share, or modify this book? This book uses the Creative Commons Attribution License and you must attribute Texas Education Agency (TEA). The original material is available at: https://www.texasgateway.org/book/teastatistics . Changes were made to the original material, including updates to art, structure, and other content updates.
Access for free at https://openstax.org/books/statistics/pages/1introduction
 Authors: Barbara Illowsky, Susan Dean
 Publisher/website: OpenStax
 Book title: Statistics
 Publication date: Mar 27, 2020
 Location: Houston, Texas
 Book URL: https://openstax.org/books/statistics/pages/1introduction
 Section URL: https://openstax.org/books/statistics/pages/91nullandalternativehypotheses
© Apr 16, 2024 Texas Education Agency (TEA). The OpenStax name, OpenStax logo, OpenStax book covers, OpenStax CNX name, and OpenStax CNX logo are not subject to the Creative Commons license and may not be reproduced without the prior and express written consent of Rice University.
Simple Linear Regression
Father & son data
Pearson’s height data
     fheight   sheight
       <dbl>     <dbl>
1   65.04851  59.77827
2   63.25094  63.21404
3   64.95532  63.34242
4   65.75250  62.79238
5   61.13723  64.28113
6   63.02254  64.24221
An example of a simple linear regression model.
Breakdown of terms:
regression model: a model for the mean of a response given features
linear: model of the mean is linear in parameters of interest
simple: only a single feature
Slice-wise model
A simple linear regression model fits a line through the above scatter plot by modelling slices.
Conditional means
The average height of sons whose fathers’ heights fell within our slice [69.5, 70.5] is about 69.8 inches.
This height varies by slice…
At 65 inches it’s about 67.2 inches:
Multiple samples model (longevity example)
We’ve seen this slice-wise model before: the error in each slice is the same.
Regression as slice-wise model
In the longevity model there is no relation between the means in each slice! We needed to use a parameter for each Diet…
The regression model says that the mean in slice father is
\[\beta_0 + \beta_1 \cdot {\tt father}.\]
This ties together all (father, son) points in the scatterplot.
Chooses \((\beta_0, \beta_1)\) by jointly modeling the mean in each slice…
Height data as slices
What is a “regression” model?
Model of the relationships between some covariates / predictors / features and an outcome.
A regression model is a model of the average outcome given the covariates .
Mathematical formulation
For the height data, a mathematical model is
\[{\tt son} = f({\tt father}) + \varepsilon\]
\(f\) describes how mean of son varies with father
\(\varepsilon\) is the random variation within the slice.
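Linear regression models
A linear regression model says that the function \(f\) is a sum (linear combination) of functions of father.
Simple linear regression model:
\[f({\tt father}) = \beta_0 + \beta_1 \cdot {\tt father}\]
Parameters of \(f\) are \((\beta_0, \beta_1)\)
Could also be a sum (linear combination) of fixed functions of father, for example
\[f({\tt father}) = \beta_0 + \beta_1 \cdot {\tt father} + \beta_2 \cdot {\tt father}^2\]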
Statistical model
Symbol \(Y\) usually used for outcomes, \(X\) for covariates…
\[Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i\]
where \(\varepsilon_i \sim N(0, \sigma^2)\) are independent.
This specifies a distribution for the \(Y\)’s given the \(X\)’s, i.e. it is a statistical model.
Regression equation
The regression equation is our slice-wise model.
Formally, this is a model of the conditional mean function
\[E[Y|X] = \beta_0 + \beta_1 X\]
Book uses the notation \(\mu\{Y|X\}\).
Fitting the model
We will be using least squares regression. This measures the goodness of fit of a line by the sum of squared errors, \(SSE\).
Least squares regression chooses the line that minimizes
\[SSE(\beta_0, \beta_1) = \sum_{i=1}^n (Y_i - \beta_0 - \beta_1 X_i)^2.\]
In principle, we might measure goodness of fit differently:
For some other loss function \(L\) we might try to minimize
\[\sum_{i=1}^n L(Y_i - \beta_0 - \beta_1 X_i).\]
Why least squares?
With least squares, the minimizers have explicit formulae – not so important with today’s computer power.
Resulting formulae are linear in the outcome \(Y\) . This is important for inferential reasons. For only predictive power, this is also not so important.
If assumptions are correct, then this is maximum likelihood estimation .
Statistical theory tells us the maximum likelihood estimators (MLEs) are generally good estimators.
Choice of loss function
Suppose we try to minimize squared error over \(\mu\):
\[\sum_{i=1}^n (Y_i - \mu)^2.\]
We know (by calculus) that the minimizer is the sample mean.
If we minimize absolute error over \(\mu\)
\[\sum_{i=1}^n |Y_i - \mu|.\]
We know (similarly by calculus) that the minimizer(s) is (are) the sample median(s).
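The two minimizers can be checked numerically with a brute-force search over candidate values of \(\mu\). The sample and the search grid below are hypothetical, chosen only to make the point visible: one large observation pulls the mean but not the median.

```python
from statistics import mean, median


def squared_loss(mu, ys):
    """Sum of squared errors around the candidate centre mu."""
    return sum((y - mu) ** 2 for y in ys)


def absolute_loss(mu, ys):
    """Sum of absolute errors around the candidate centre mu."""
    return sum(abs(y - mu) for y in ys)


ys = [1.0, 2.0, 2.5, 4.0, 10.0]            # hypothetical sample
grid = [i / 100 for i in range(0, 1001)]   # candidate values 0.00, 0.01, ..., 10.00

best_squared = min(grid, key=lambda mu: squared_loss(mu, ys))
best_absolute = min(grid, key=lambda mu: absolute_loss(mu, ys))

print("squared-error minimizer:", best_squared, "sample mean:", mean(ys))
print("absolute-error minimizer:", best_absolute, "sample median:", median(ys))
```

With an odd number of observations the absolute-error minimizer is unique; with an even number, every value between the two middle observations minimizes it, which is why the text speaks of median(s).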
Visualizing the loss function
Let’s take a random scatter plot and view the loss function.
Let’s plot the loss as a function of the parameters. Note that the true intercept is 1.5 while the true slope is 0.1.
Let’s contrast this with the sum of absolute errors.
Geometry of least squares
Some things to note:
Minimizing sum of squares is the same as finding the point in the plane spanned by \(X\) and \(\pmb{1}\) closest to \(Y\).
The total dimension of the space is 1078.
The plane is 2-dimensional.
The axis marked “\(\perp\)” should be thought of as \((n-2)\)-dimensional, or 1076 in this case.
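Least squares
The (squared) lengths of the above vectors are important quantities in what follows.
Important lengths
There are three to note:
\[SSE = \sum_{i=1}^n (Y_i - \widehat{Y}_i)^2, \qquad SSR = \sum_{i=1}^n (\overline{Y} - \widehat{Y}_i)^2, \qquad SST = \sum_{i=1}^n (Y_i - \overline{Y})^2 = SSE + SSR.\]
An important summary of the fit is the ratio
\[R^2 = \frac{SSR}{SST} = 1 - \frac{SSE}{SST}.\]
Measures how much variability in \(Y\) is explained by \(X\).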
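As a small numerical illustration, the three sums of squares and their ratio can be computed directly from a fitted line. The data are hypothetical and the function name is an assumption made for this sketch; the fit uses the standard least squares formulas for a single predictor.

```python
def sums_of_squares(xs, ys):
    """Fit a least squares line and return (SSE, SSR, SST)."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    fitted = [intercept + slope * x for x in xs]
    sse = sum((y - f) ** 2 for y, f in zip(ys, fitted))  # unexplained variation
    ssr = sum((f - y_bar) ** 2 for f in fitted)          # variation explained by X
    sst = sum((y - y_bar) ** 2 for y in ys)              # total variation
    return sse, ssr, sst


xs = [1, 2, 3, 4, 5]                # hypothetical predictor values
ys = [2.1, 3.9, 6.2, 7.8, 10.1]     # hypothetical outcomes
sse, ssr, sst = sums_of_squares(xs, ys)
r_squared = ssr / sst
print(f"SSE={sse:.3f}  SSR={ssr:.3f}  SST={sst:.3f}  R^2={r_squared:.4f}")
```

The identity SST = SSE + SSR holds here up to rounding, which is the Pythagorean relationship between the squared lengths.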
Case study A: data suggesting the Big Bang
Let’s fit the linear regression model.
Let’s look at the summary :
Hubble’s model
Hubble’s theory of the Big Bang suggests that the correct slice-wise (i.e. regression) model is
\[\mu\{{\tt Distance} | {\tt Velocity}\} = \beta_1 \cdot {\tt Velocity}\]
To fit without an intercept
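Least squares estimators
There are explicit formulae for the least squares estimators, i.e. the minimizers of the error sum of squares.
For the slope, \(\hat{\beta}_1\), it can be shown that
\[\hat{\beta}_1 = \frac{\sum_{i=1}^n (X_i - \overline{X})(Y_i - \overline{Y})}{\sum_{i=1}^n (X_i - \overline{X})^2}\]
Knowing the slope estimate, the intercept estimate can be found easily:
\[\hat{\beta}_0 = \overline{Y} - \hat{\beta}_1 \cdot \overline{X}.\]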
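The explicit formulae for the slope and intercept can be sketched in a few lines and checked against the claim that they minimize the error sum of squares. The data and function names below are hypothetical; the check simply nudges the estimates in each direction and confirms that the SSE gets worse.

```python
def least_squares(xs, ys):
    """Return (intercept, slope) from the explicit least squares formulae."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def sse(intercept, slope, xs, ys):
    """Error sum of squares of the line intercept + slope * x."""
    return sum((y - intercept - slope * x) ** 2 for x, y in zip(xs, ys))


xs = [1, 2, 3, 4, 5]                # hypothetical predictor values
ys = [2.1, 3.9, 6.2, 7.8, 10.1]     # hypothetical outcomes
b0, b1 = least_squares(xs, ys)
best = sse(b0, b1, xs, ys)

# Every nearby line should have a strictly larger SSE.
nudged = [sse(b0 + d0, b1 + d1, xs, ys)
          for d0 in (-0.1, 0.1) for d1 in (-0.05, 0.05)]
print(f"intercept={b0:.2f}, slope={b1:.2f}, SSE={best:.3f}")
```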
Example: big_bang
 \(\hat{\beta}_0\) (Intercept): 0.399170439725205
 \(\hat{\beta}_1\) (Velocity): 0.00137240753172474
Estimate of \(\sigma^2\)
The estimate most commonly used is
\[\hat{\sigma}^2 = \frac{1}{n-2} \sum_{i=1}^n (Y_i - \hat{\beta}_0 - \hat{\beta}_1 X_i)^2 = \frac{SSE}{n-2} = MSE\]
We’ll use the common practice of replacing the quantity \(SSE(\hat{\beta}_0,\hat{\beta}_1)\), i.e. the minimum of this function, with just \(SSE\).
The term MSE above refers to mean squared error: a sum of squares divided by its degrees of freedom. The degrees of freedom of SSE, the error sum of squares, is therefore \(n-2\).
Mathematical aside
We divide by \(n-2\) because some calculations tell us:
\[\frac{SSE}{\sigma^2} \sim \chi^2_{n-2}\]
Above, the right hand side denotes a chi-squared distribution with \(n-2\) degrees of freedom.
Dividing by \(n-2\) gives an unbiased estimate of \(\sigma^2\) (assuming our modeling assumptions are correct).
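The unbiasedness claim can be illustrated by simulation: generate many data sets from a known line with known \(\sigma^2\), fit each one, and average the resulting SSE. The sketch below assumes a true intercept of 1.5, a true slope of 0.1, \(\sigma = 2\), and \(n = 10\); these settings and the function names are illustrative choices only.

```python
import random


def sse_of_fit(xs, ys):
    """Error sum of squares of the least squares line through (xs, ys)."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return sum((y - intercept - slope * x) ** 2 for x, y in zip(xs, ys))


def average_variance_estimates(n=10, sigma=2.0, trials=5000, seed=1):
    """Average of SSE/(n-2) and of SSE/n over many simulated data sets."""
    rng = random.Random(seed)
    xs = list(range(n))
    total = 0.0
    for _ in range(trials):
        ys = [1.5 + 0.1 * x + rng.gauss(0.0, sigma) for x in xs]
        total += sse_of_fit(xs, ys)
    mean_sse = total / trials
    return mean_sse / (n - 2), mean_sse / n


unbiased, biased = average_variance_estimates()
print(f"true sigma^2 = 4.0, SSE/(n-2) averages {unbiased:.2f}, SSE/n averages {biased:.2f}")
```

Dividing by \(n\) instead of \(n-2\) systematically underestimates \(\sigma^2\), because two degrees of freedom are used up estimating the intercept and slope.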
Inference for the simple linear regression model
Remember: \(X\) can be fixed or random in our model…
Case study B: predicting pH based on time after slaughter
In this study, researchers fixed \(X\) (Time) before measuring \(Y\) (pH)
Ultimate goal: how long after slaughter is pH around 6?
Inference for \(\beta_0\) or \(\beta_1\)
Recall our model
\[Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i\]
The errors \(\varepsilon_i\) are independent \(N(0, \sigma^2)\).
In our heights example, we might want to know if there really is a linear association between \({\tt son}=Y\) and \({\tt father}=X\). This can be answered with a hypothesis test of the null hypothesis \(H_0:\beta_1=0\). This assumes the model above is correct, AND \(\beta_1=0\).
Alternatively, we might want to have a range of values that we can be fairly certain \(\beta_1\) lies within. This is a confidence interval for \(\beta_1\) .
A mathematical aside
Let \(L\) be the subspace of \(\mathbb{R}^n\) spanned by \(\pmb{1}=(1, \dots, 1)\) and \({X}=(X_1, \dots, X_n)\), and let \(P_L\) denote orthogonal projection onto \(L\).
We can decompose \(Y\) as
\[{Y} = P_L{Y} + ({Y} - P_L{Y}) = \widehat{{Y}} + {e}\]
In our model, \(\mu=\beta_0 \pmb{1} + \beta_1 {X} \in L\) so that
\[\widehat{{Y}} = \mu + P_L\varepsilon, \qquad {e} = P_{L^{\perp}}{Y} = P_{L^{\perp}}\varepsilon\]
Our assumption that \(\varepsilon_i\)’s are independent \(N(0,\sigma^2)\) tells us that: \({e}\) and \(\widehat{{Y}}\) are independent; \(\widehat{\sigma}^2 = \|{e}\|^2 / (n-2) \sim \sigma^2 \cdot \chi^2_{n-2} / (n-2)\).
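Setup for inference
All of this implies
\[\frac{\hat{\beta}_1 - \beta_1}{\hat{\sigma} \sqrt{\frac{1}{\sum_{i=1}^n (X_i - \overline{X})^2}}} \sim t_{n-2}\]
The other quantity we need is the standard error or SE of \(\hat{\beta}_1\):
\[SE(\hat{\beta}_1) = \hat{\sigma} \sqrt{\frac{1}{\sum_{i=1}^n (X_i - \overline{X})^2}}\]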
Testing \(H_0:\beta_1=\beta_1^0\)
Suppose we want to test that \(\beta_1\) is some pre-specified value, \(\beta_1^0\) (this is often 0: i.e. is there a linear association)
Under \(H_0:\beta_1=\beta_1^0\)
\[T = \frac{\hat{\beta}_1 - \beta_1^0}{SE(\hat{\beta}_1)} \sim t_{n-2}.\]
Reject \(H_0:\beta_1=\beta_1^0\) if \(|T| > t_{n-2, 1-\alpha/2}\).
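The test can be carried out by hand on a small data set. The sketch below is an illustration with hypothetical data (five points, so \(n-2 = 3\) degrees of freedom); the critical value 3.182 is the tabulated \(t_{3, 0.975}\) and is passed in as a number rather than computed, since the standard library has no t quantile function. The function name is an assumption for this example.

```python
def slope_t_test(xs, ys, beta1_null=0.0):
    """Return (slope, SE(slope), T) for testing H0: beta_1 = beta1_null."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    sse = sum((y - intercept - slope * x) ** 2 for x, y in zip(xs, ys))
    sigma_hat = (sse / (n - 2)) ** 0.5       # square root of the MSE
    se_slope = sigma_hat / sxx ** 0.5        # SE of the slope estimate
    return slope, se_slope, (slope - beta1_null) / se_slope


xs = [1, 2, 3, 4, 5]                # hypothetical predictor values
ys = [2.1, 3.9, 6.2, 7.8, 10.1]     # hypothetical outcomes
slope, se_slope, t_stat = slope_t_test(xs, ys)

critical = 3.182                    # t_{3, 0.975} from a standard t table
reject = abs(t_stat) > critical
interval = (slope - critical * se_slope, slope + critical * se_slope)
print(f"T = {t_stat:.2f}, reject H0: {reject}, 95% CI for slope: {interval}")
```

The same critical value gives the 95% confidence interval for the slope, and the test rejects \(H_0:\beta_1=0\) exactly when that interval excludes 0.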
Let’s perform this test for the Big Bang data.
We see that R performs our \(t\) test in the second row of the Coefficients table.
It is clear that Distance is correlated with Velocity .
There seems to be some flaw in Hubble’s theory: we reject \(H_0:\beta_0=0\) at level 5%: \(p\) value is 0.0028!
Why reject for large |T|?
Logic is the same as other \(t\) tests: observing a large \(|T|\) is unlikely if \(\beta_1 = \beta_1^0\) (i.e. if \(H_0\) were true). \(\implies\) it is reasonable to conclude that \(H_0\) is false.
Common to report \(p\)-value:
\[P(|T_{n-2}| > |T_{obs}|) = 2 P(T_{n-2} > |T_{obs}|)\]
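Confidence interval for regression parameters
Applying the above to the parameter \(\beta_1\) yields a confidence interval of the form
\[\hat{\beta}_1 \pm SE(\hat{\beta}_1) \cdot t_{n-2, 1-\alpha/2}.\]
Earlier, we computed \(SE(\hat{\beta}_1)\) using this formula
\[SE(a_0 \hat{\beta}_0 + a_1 \hat{\beta}_1) = \hat{\sigma} \sqrt{\frac{a_0^2}{n} + \frac{(a_0 \overline{X} - a_1)^2}{\sum_{i=1}^n (X_i - \overline{X})^2}}\]
with \((a_0,a_1) = (0, 1)\).
We also need to find the quantity \(t_{n-2,1-\alpha/2}\). This is defined by
\[P(T_{n-2} \geq t_{n-2,1-\alpha/2}) = \alpha/2.\]
In R, this is computed by the function qt.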
We will not need to use these explicit formulae all the time, as R has some built in functions to compute confidence intervals.
                    2.5 %      97.5 %
(Intercept)  0.1530719058  0.64526897
Velocity     0.0008999349  0.00184488

                      5 %        95 %
(Intercept)  0.1954035267  0.60293735
Velocity     0.0009812054  0.00176361
Predicting the mean
Once we have estimated a slope \((\hat{\beta}_1)\) and an intercept \((\hat{\beta}_0)\), we can predict the height of the son born to a father of any particular height by plugging in the height of the new father, \(F_{new}\), into our regression equation:
\[\hat{S}_{new} = \hat{\beta}_0 + \hat{\beta}_1 F_{new}.\]
Confidence interval for the average height of sons born to a father of height \(F_{new}=70\) (or maybe \(65\) ) inches:
        fit       lwr       upr
1  69.87312  69.71333  70.03291
2  67.30265  67.13165  67.47366
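Computing \(SE(\hat{\beta}_0 + 70 \cdot \hat{\beta}_1)\)
We use the previous formula
\[SE(a_0 \hat{\beta}_0 + a_1 \hat{\beta}_1) = \hat{\sigma} \sqrt{\frac{a_0^2}{n} + \frac{(a_0 \overline{X} - a_1)^2}{\sum_{i=1}^n (X_i - \overline{X})^2}}\]
with \((a_0, a_1) = (1, 70)\).
Plugging in
\[SE(\hat{\beta}_0 + 70 \cdot \hat{\beta}_1) = \hat{\sigma} \sqrt{\frac{1}{n} + \frac{(\overline{X} - 70)^2}{\sum_{i=1}^n (X_i - \overline{X})^2}}\]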
As \(n\) grows (taking a larger sample), \(SE(\hat{\beta}_0 + 70 \hat{\beta}_1)\) should shrink to 0. Why?
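One way to see the answer is to evaluate the standard error formula for a linear combination \(a_0 \hat{\beta}_0 + a_1 \hat{\beta}_1\) on larger and larger samples. In this sketch the fathers’ heights are hypothetical, \(\hat{\sigma}\) is held fixed at an assumed value of 2.4, and the sample is enlarged by repeating the same five heights, so only \(n\) changes.

```python
def se_linear_combination(a0, a1, xs, sigma_hat):
    """Standard error of a0 * beta0_hat + a1 * beta1_hat in simple regression."""
    n = len(xs)
    x_bar = sum(xs) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sigma_hat * (a0 ** 2 / n + (a0 * x_bar - a1) ** 2 / sxx) ** 0.5


base = [60, 64, 68, 72, 76]   # hypothetical fathers' heights (inches)
sigma_hat = 2.4               # assumed residual standard deviation

results = {}
for reps in (1, 10, 100):
    xs = base * reps          # same design, repeated to grow n
    results[len(xs)] = se_linear_combination(1, 70, xs, sigma_hat)
    print(f"n = {len(xs):4d}: SE of mean height at 70 inches = {results[len(xs)]:.4f}")

# A single new observation still carries its own noise of size sigma_hat.
prediction_se = (sigma_hat ** 2 + results[500] ** 2) ** 0.5
```

Both terms under the square root shrink like \(1/n\), so the standard error of the estimated mean goes to 0, while the standard error relevant for predicting one individual stays at or above \(\hat{\sigma}\).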
Forecasting / prediction intervals
Can we find an interval that covers the height of a particular son knowing only that his father’s height is 70 inches?
Must cover the variability of the new random variation \(\implies\) it must be at least as wide as \(\sigma\).
With so much data in our heights example, this 90% interval will have width roughly \(2 \cdot 1.645 \cdot \hat{\sigma}\)
 8.01555526222726
 8.18982582776275
Actual width will depend on how accurately we have estimated \((\beta_0, \beta_1)\) as well as \(\hat{\sigma}\) .
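The final interval is
\[\hat{\beta}_0 + \hat{\beta}_1 \cdot 70 \pm t_{n-2, 1-\alpha/2} \cdot \sqrt{\hat{\sigma}^2 + SE(\hat{\beta}_0 + 70 \cdot \hat{\beta}_1)^2}.\]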
Computed in R as follows
        fit      lwr       upr
1  69.87312  65.8587  73.88753
Null and Alternative Hypotheses  Definitions & Examples
Published on 5 October 2022 by Shaun Turney . Revised on 6 December 2022.
The null and alternative hypotheses are two competing claims that researchers weigh evidence for and against using a statistical test :
 Null hypothesis (H 0 ): There’s no effect in the population .
 Alternative hypothesis (H A ): There’s an effect in the population.
The effect is usually the effect of the independent variable on the dependent variable .
The null and alternative hypotheses offer competing answers to your research question . When the research question asks “Does the independent variable affect the dependent variable?”, the null hypothesis (H 0 ) answers “No, there’s no effect in the population.” On the other hand, the alternative hypothesis (H A ) answers “Yes, there is an effect in the population.”
The null and alternative are always claims about the population. That’s because the goal of hypothesis testing is to make inferences about a population based on a sample . Often, we infer whether there’s an effect in the population by looking at differences between groups or relationships between variables in the sample.
You can use a statistical test to decide whether the evidence favors the null or alternative hypothesis. Each type of statistical test comes with a specific way of phrasing the null and alternative hypothesis. However, the hypotheses can also be phrased in a general way that applies to any test.
The null hypothesis is the claim that there’s no effect in the population.
If the sample provides enough evidence against the claim that there’s no effect in the population ( p ≤ α), then we can reject the null hypothesis . Otherwise, we fail to reject the null hypothesis.
Although “fail to reject” may sound awkward, it’s the only wording that statisticians accept. Be careful not to say you “prove” or “accept” the null hypothesis.
Null hypotheses often include phrases such as “no effect”, “no difference”, or “no relationship”. When written in mathematical terms, they always include an equality (usually =, but sometimes ≥ or ≤).
Examples of null hypotheses
The table below gives examples of research questions and null hypotheses. There’s always more than one way to answer a research question, but these null hypotheses can help you get started.
Research question | General null hypothesis (H 0 ) | Test-specific null hypothesis (H 0 )
Does tooth flossing affect the number of cavities? | Tooth flossing has no effect on the number of cavities. | t test: The mean number of cavities per person does not differ between the flossing group (µ1) and the non-flossing group (µ2) in the population; µ1 = µ2.
Does the amount of text highlighted in the textbook affect exam scores? | The amount of text highlighted in the textbook has no effect on exam scores. | Linear regression: There is no relationship between the amount of text highlighted and exam scores in the population; β1 = 0.
Does daily meditation decrease the incidence of depression? | Daily meditation does not decrease the incidence of depression.* | Two-proportions test: The proportion of people with depression in the daily-meditation group (p1) is greater than or equal to the no-meditation group (p2) in the population; p1 ≥ p2.
*Note that some researchers prefer to always write the null hypothesis in terms of “no effect” and “=”. It would be fine to say that daily meditation has no effect on the incidence of depression and p 1 = p 2 .
The alternative hypothesis (H A ) is the other answer to your research question . It claims that there’s an effect in the population.
Often, your alternative hypothesis is the same as your research hypothesis. In other words, it’s the claim that you expect or hope will be true.
The alternative hypothesis is the complement to the null hypothesis. Null and alternative hypotheses are exhaustive, meaning that together they cover every possible outcome. They are also mutually exclusive, meaning that only one can be true at a time.
Alternative hypotheses often include phrases such as “an effect”, “a difference”, or “a relationship”. When alternative hypotheses are written in mathematical terms, they always include an inequality (usually ≠, but sometimes > or <). As with null hypotheses, there are many acceptable ways to phrase an alternative hypothesis.
Examples of alternative hypotheses
The table below gives examples of research questions and alternative hypotheses to help you get started with formulating your own.
Research question | General alternative hypothesis (H A ) | Test-specific alternative hypothesis (H A )
Does tooth flossing affect the number of cavities? | Tooth flossing has an effect on the number of cavities. | t test: The mean number of cavities per person differs between the flossing group (µ1) and the non-flossing group (µ2) in the population; µ1 ≠ µ2.
Does the amount of text highlighted in a textbook affect exam scores? | The amount of text highlighted in the textbook has an effect on exam scores. | Linear regression: There is a relationship between the amount of text highlighted and exam scores in the population; β1 ≠ 0.
Does daily meditation decrease the incidence of depression? | Daily meditation decreases the incidence of depression. | Two-proportions test: The proportion of people with depression in the daily-meditation group (p1) is less than the no-meditation group (p2) in the population; p1 < p2.
Null and alternative hypotheses are similar in some ways:
 They’re both answers to the research question
 They both make claims about the population
 They’re both evaluated by statistical tests.
However, there are important differences between the two types of hypotheses, summarized in the following table.
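Aspect | Null hypothesis (H 0 ) | Alternative hypothesis (H A )
Definition | A claim that there is no effect in the population. | A claim that there is an effect in the population.
Symbols used | Equality symbol (=, ≥, or ≤) | Inequality symbol (≠, <, or >)
When p ≤ α | Rejected | Supported
When p > α | Failed to reject | Not supported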
To help you write your hypotheses, you can use the template sentences below. If you know which statistical test you’re going to use, you can use the test-specific template sentences. Otherwise, you can use the general template sentences.
The only things you need to know to use these general template sentences are your dependent and independent variables. To write your research question, null hypothesis, and alternative hypothesis, fill in the following sentences with your variables:
Does independent variable affect dependent variable?
 Null hypothesis (H 0 ): Independent variable does not affect dependent variable .
 Alternative hypothesis (H A ): Independent variable affects dependent variable .
Test-specific
Once you know the statistical test you’ll be using, you can write your hypotheses in a more precise and mathematical way specific to the test you chose. The table below provides template sentences for common statistical tests.
Statistical test | Null hypothesis (H 0 ) | Alternative hypothesis (H A )
t test with two groups | The mean dependent variable does not differ between group 1 (µ1) and group 2 (µ2) in the population; µ1 = µ2. | The mean dependent variable differs between group 1 (µ1) and group 2 (µ2) in the population; µ1 ≠ µ2.
ANOVA with three groups | The mean dependent variable does not differ between group 1 (µ1), group 2 (µ2), and group 3 (µ3) in the population; µ1 = µ2 = µ3. | The mean dependent variable of group 1 (µ1), group 2 (µ2), and group 3 (µ3) are not all equal in the population.
Correlation test | There is no correlation between independent variable and dependent variable in the population; ρ = 0. | There is a correlation between independent variable and dependent variable in the population; ρ ≠ 0.
Linear regression | There is no relationship between independent variable and dependent variable in the population; β1 = 0. | There is a relationship between independent variable and dependent variable in the population; β1 ≠ 0.
Two-proportions test | The dependent variable expressed as a proportion does not differ between group 1 (p1) and group 2 (p2) in the population; p1 = p2. | The dependent variable expressed as a proportion differs between group 1 (p1) and group 2 (p2) in the population; p1 ≠ p2.
Note: The template sentences above assume that you’re performing two-tailed tests, since each alternative hypothesis uses ≠ rather than < or >. Two-tailed tests are appropriate for most studies.
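As an illustration, the test-specific templates can be mapped onto standard R functions. The sketch below is one possible pairing, not a prescribed workflow. It uses R's built-in `sleep`, `PlantGrowth`, and `cars` datasets as stand-ins, and the counts passed to `prop.test()` are made up for illustration.

```r
alpha <- 0.05

# t test with two groups: H0 is mu1 = mu2
p_t <- t.test(extra ~ group, data = sleep)$p.value

# ANOVA with three groups: H0 is mu1 = mu2 = mu3
p_anova <- summary(aov(weight ~ group, data = PlantGrowth))[[1]][["Pr(>F)"]][1]

# Correlation test: H0 is rho = 0
p_cor <- cor.test(cars$speed, cars$dist)$p.value

# Simple linear regression: H0 is beta1 = 0
p_reg <- summary(lm(dist ~ speed, data = cars))$coefficients["speed", "Pr(>|t|)"]

# Two-proportions test: H0 is p1 = p2 (hypothetical counts)
p_prop <- prop.test(x = c(30, 45), n = c(100, 100))$p.value

p_values <- c(t_test = p_t, anova = p_anova, correlation = p_cor,
              regression = p_reg, two_proportions = p_prop)
# Reject H0 when p <= alpha; otherwise fail to reject H0
decision <- ifelse(p_values <= alpha, "reject H0", "fail to reject H0")
print(data.frame(p_value = signif(p_values, 3), decision = decision))
```

For a single predictor, the correlation test and the regression slope test give the same p-value, which is why the two rows agree for the `cars` data.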
The null hypothesis is often abbreviated as H 0 . When the null hypothesis is written using mathematical symbols, it always includes an equality symbol (usually =, but sometimes ≥ or ≤).
The alternative hypothesis is often abbreviated as H a or H 1 . When the alternative hypothesis is written using mathematical symbols, it always includes an inequality symbol (usually ≠, but sometimes < or >).
A research hypothesis is your proposed answer to your research question. The research hypothesis usually includes an explanation (‘ x affects y because …’).
A statistical hypothesis, on the other hand, is a mathematical statement about a population parameter. Statistical hypotheses always come in pairs: the null and alternative hypotheses. In a well-designed study , the statistical hypotheses correspond logically to the research hypothesis.
Cite this Scribbr article
Turney, S. (2022, December 06). Null and Alternative Hypotheses  Definitions & Examples. Scribbr. Retrieved 29 August 2024, from https://www.scribbr.co.uk/stats/nullandalternativehypothesis/
Linear regression  Hypothesis testing
by Marco Taboga , PhD
This lecture discusses how to perform tests of hypotheses about the coefficients of a linear regression model estimated by ordinary least squares (OLS).
Table of contents
 Normal vs non-normal model
 The linear regression model
 Matrix notation
 Tests of hypothesis in the normal linear regression model
 Test of a restriction on a single coefficient (t test)
 Test of a set of linear restrictions (F test)
 Tests based on maximum likelihood procedures (Wald, Lagrange multiplier, likelihood ratio)
 Tests of hypothesis when the OLS estimator is asymptotically normal
 Test of a restriction on a single coefficient (z test)
 Test of a set of linear restrictions (Chi-square test)
 Learn more about regression analysis
The lecture is divided into two parts:
in the first part, we discuss hypothesis testing in the normal linear regression model , in which the OLS estimator of the coefficients has a normal distribution conditional on the matrix of regressors;
in the second part, we show how to carry out hypothesis tests in linear regression analyses where the hypothesis of normality holds only in large samples (i.e., the OLS estimator can be proved to be asymptotically normal).
We also denote:
We now explain how to derive tests about the coefficients of the normal linear regression model.
It can be proved (see the lecture about the normal linear regression model ) that the assumption of conditional normality implies that:
How the acceptance region is determined depends not only on the desired size of the test , but also on whether the test is:
two-tailed (both things, i.e., smaller and larger, are possible);
one-tailed (only one of the two things, i.e., either smaller or larger, is possible).
For more details on how to determine the acceptance region, see the glossary entry on critical values .
The F test is one-tailed .
A critical value in the right tail of the F distribution is chosen so as to achieve the desired size of the test.
Then, the null hypothesis is rejected if the F statistic is larger than the critical value.
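The critical-value logic described above can be sketched in R. This is only an illustration on the built-in `mtcars` data (an assumption, since the lecture has no dataset); the model `mpg ~ wt + hp` and the 5% size are arbitrary choices.

```r
fit <- lm(mpg ~ wt + hp, data = mtcars)
alpha <- 0.05
df_resid <- df.residual(fit)

# t test of H0: the coefficient on wt equals 0
t_stat <- summary(fit)$coefficients["wt", "t value"]
t_crit_two_tailed <- qt(1 - alpha / 2, df_resid)  # reject if |t| exceeds this
t_crit_one_tailed <- qt(1 - alpha, df_resid)      # right-tailed alternative: reject if t exceeds this
reject_t <- abs(t_stat) > t_crit_two_tailed

# F test of H0: all slope coefficients equal 0 (one-tailed, right tail)
f <- summary(fit)$fstatistic  # named vector: value, numdf, dendf
f_crit <- qf(1 - alpha, df1 = f[["numdf"]], df2 = f[["dendf"]])
reject_f <- f[["value"]] > f_crit

cat("t:", t_stat, " two-tailed critical value:", t_crit_two_tailed, "\n")
cat("F:", f[["value"]], " critical value:", f_crit, "\n")
```

The one-tailed critical value is smaller than the two-tailed one because the whole size of the test sits in a single tail.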
In this section we explain how to perform hypothesis tests about the coefficients of a linear regression model when the OLS estimator is asymptotically normal.
As we have shown in the lecture on the properties of the OLS estimator , in several cases (i.e., under different sets of assumptions) it can be proved that:
These two properties are used to derive the asymptotic distribution of the test statistics used in hypothesis testing.
The test can be either one-tailed or two-tailed . The same comments made for the t-test apply here.
Like the F test, the Chi-square test is usually one-tailed .
The desired size of the test is achieved by appropriately choosing a critical value in the right tail of the Chi-square distribution.
The null is rejected if the Chi-square statistic is larger than the critical value.
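A rough sketch of the two asymptotic tests follows, again on the built-in `mtcars` data. Two things here are assumptions rather than part of the lecture: the covariance matrix from `vcov()` is the classical homoskedastic estimate (a robust estimator could be substituted), and the restriction matrix `R` and vector `r` are illustrative names for a set of linear restrictions of the form R beta = r.

```r
fit <- lm(mpg ~ wt + hp, data = mtcars)
alpha <- 0.05
beta_hat <- coef(fit)
V_hat <- vcov(fit)  # assumption: classical covariance estimate of the OLS estimator

# z test of H0: the coefficient on hp equals 0 (two-tailed)
z_stat <- beta_hat[["hp"]] / sqrt(V_hat["hp", "hp"])
reject_z <- abs(z_stat) > qnorm(1 - alpha / 2)

# Chi-square test of H0: R beta = r, here "both slopes equal 0"
R <- rbind(c(0, 1, 0),
           c(0, 0, 1))
r <- c(0, 0)
d <- R %*% beta_hat - r
chisq_stat <- drop(t(d) %*% solve(R %*% V_hat %*% t(R)) %*% d)
chisq_crit <- qchisq(1 - alpha, df = nrow(R))  # right-tail critical value
reject_chisq <- chisq_stat > chisq_crit

cat("z:", z_stat, " Chi-square:", chisq_stat, " critical value:", chisq_crit, "\n")
```

With this choice of covariance estimate, the Chi-square statistic equals the number of restrictions times the F statistic of the same hypothesis, which gives a quick consistency check.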
Want to learn more about regression analysis? Here are some suggestions:
R squared of a linear regression ;
Gauss-Markov theorem ;
Generalized Least Squares ;
Multicollinearity ;
Dummy variables ;
Selection of linear regression models ;
Partitioned regression ;
Ridge regression .
How to cite
Please cite as:
Taboga, Marco (2021). "Linear regression  Hypothesis testing", Lectures on probability theory and mathematical statistics. Kindle Direct Publishing. Online appendix. https://www.statlect.com/fundamentalsofstatistics/linearregressionhypothesistesting.
Linear regression hypothesis testing: Concepts, Examples
In machine learning , linear regression is a predictive modeling technique that models a continuous response variable as a linear combination of explanatory (predictor) variables. When training a linear regression model, we rely on hypothesis testing to determine whether the relationships between the response and the predictor variables are statistically significant. Two types of hypothesis tests are used with linear regression: t-tests and F-tests . The t-statistics assess the individual coefficients, while the F-statistic assesses the model as a whole. For data scientists , it is important to determine whether linear regression is the right choice of model for a particular problem, and hypothesis tests on the response and predictor variables help with that decision. These concepts are often unclear to practitioners. In this blog post, we discuss linear regression and the hypothesis tests based on t-statistics and F-statistics . We also provide an example to illustrate how these concepts work.
What are linear regression models?
A linear regression model can be defined as the function approximation that represents a continuous response variable as a function of one or more predictor variables. While building a linear regression model, the goal is to identify a linear equation that best predicts or models the relationship between the response or dependent variable and one or more predictor or independent variables.
There are two different kinds of linear regression models. They are as follows:
 Simple or univariate linear regression models : These models describe a linear relationship between one response (dependent) variable and one predictor (independent) variable. The equation of a simple linear regression model is Y = mX + b, where m is the coefficient of the predictor variable and b is the bias. In terms of the regression line, m is the slope and b is the intercept.
 Multiple linear regression models (sometimes loosely called multivariate) : These models describe a linear relationship between one response (dependent) variable and more than one predictor (independent) variable. The equation of a multiple linear regression model is Y = b0 + b1X1 + b2X2 + … + bnXn, where bi is the coefficient of the ith predictor variable. Each predictor variable has its own coefficient, which is used to calculate the predicted value of the response variable.
While training linear regression models, the goal is to determine the coefficients that give the best-fitting regression line. The method used to find these coefficients is known as least squares regression . In the least-squares method, the coefficients are chosen to minimize the sum of squared residuals between the actual and predicted response values. The sum of squared residuals is also called the residual sum of squares (RSS). The outcome of least-squares regression is the set of coefficients that minimizes the linear regression cost function .
The residual e of the ith observation is defined as follows, where [latex]Y_i[/latex] is the observed value of the response variable for the ith observation and [latex]\hat{Y_i}[/latex] is the predicted value for the ith observation.
[latex]e_i = Y_i - \hat{Y_i}[/latex]
The residual sum of squares can be represented as the following:
[latex]RSS = e_1^2 + e_2^2 + e_3^2 + \dots + e_n^2[/latex]
The least-squares method is the algorithm that minimizes the above term, RSS.
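To make the definitions concrete, here is a small sketch that computes the residuals and the RSS by hand for a simple linear regression. It uses R's built-in `cars` data as a stand-in (an assumption), and `rss_for` is a hypothetical helper introduced only for this illustration.

```r
fit <- lm(dist ~ speed, data = cars)

# Residuals: observed response minus predicted response
e <- cars$dist - fitted(fit)
rss <- sum(e^2)

# Hypothetical helper: RSS for an arbitrary intercept b0 and slope b1
rss_for <- function(b0, b1) sum((cars$dist - (b0 + b1 * cars$speed))^2)

b <- coef(fit)
rss_at_estimate <- rss_for(b[["(Intercept)"]], b[["speed"]])
# Moving away from the least-squares coefficients increases the RSS
rss_perturbed <- rss_for(b[["(Intercept)"]] + 1, b[["speed"]] - 0.1)

cat("RSS at least-squares fit:", rss, " RSS after perturbing:", rss_perturbed, "\n")
```

Any other pair of coefficients tried with `rss_for` should give an RSS at least as large as the one at the fitted coefficients.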
Once the coefficients are determined, can we claim that they are the most appropriate ones for the linear regression? The answer is no. The coefficients are only estimates computed from a sample, so each of them has an associated standard error. Recall that the standard error measures the uncertainty in estimating a population parameter from sample data, and it is used to construct a confidence interval for that parameter. For a mean, the standard error is the standard deviation of the sample divided by the square root of the sample size, as in the formula below.
[latex]SE(\mu) = \frac{\sigma}{\sqrt{N}}[/latex]
Thus, we cannot claim that the estimated coefficients are suitable without taking their standard errors into account. This is where hypothesis testing is needed . Before we get into why we need hypothesis testing with the linear regression model, let’s briefly review what hypothesis testing is.
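As a hedged illustration of the standard errors attached to regression coefficients, the sketch below reproduces the "Std. Error" column of `summary()` by hand. It relies on the standard OLS result that the estimated covariance of the coefficients is the residual variance times the inverse of X'X, which is not derived in this post, and it uses the built-in `cars` data as a stand-in.

```r
fit <- lm(dist ~ speed, data = cars)

X <- model.matrix(fit)  # design matrix with an intercept column
sigma2_hat <- sum(residuals(fit)^2) / df.residual(fit)  # residual variance estimate

# Standard errors: square roots of the diagonal of sigma2_hat * (X'X)^-1
se_manual <- sqrt(diag(sigma2_hat * solve(t(X) %*% X)))
se_summary <- summary(fit)$coefficients[, "Std. Error"]

print(cbind(manual = se_manual, summary = se_summary))
```

Dividing each coefficient estimate by its standard error gives the t-statistic used in the tests discussed later.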
Train a Multiple Linear Regression Model using R
Before getting into the hypothesis testing concepts for the linear regression model, let’s train a multiple linear regression model and print the summary output of the model, which will be referred to in the next section.
The data used for creating the multiple linear regression model is BostonHousing, which can be loaded in RStudio by installing the mlbench package. The code is shown below:
install.packages("mlbench")  # only needed once
library(mlbench)
data("BostonHousing")
Once the data is loaded, the code shown below can be used to create the linear regression model.
# Fit the model on the BostonHousing data frame and print its summary
BostonHousing.lm <- lm(log(medv) ~ crim + chas + rad + lstat, data = BostonHousing)
summary(BostonHousing.lm)
Executing the above command will result in the creation of a linear regression model with the response variable as log(medv) and predictor variables as crim, chas, rad, and lstat. The following represents the details related to the response and predictor variables:
 log(medv) : Log of the median value of owneroccupied homes in USD 1000’s
 crim : Per capita crime rate by town
 chas : Charles River dummy variable (= 1 if tract bounds river; 0 otherwise)
 rad : Index of accessibility to radial highways
 lstat : Percentage of the lower status of the population
The following will be the output of the summary command, which prints the details of the model, including hypothesis testing details for the coefficients (t-statistics) and for the model as a whole (F-statistic).
Hypothesis tests & Linear Regression Models
Hypothesis tests are statistical procedures used to test a claim or assumption about the underlying distribution of a population based on sample data. Here are the key steps of doing hypothesis tests with linear regression models:
 Hypothesis formulation for t-tests : In linear regression, the claim is that a relationship exists between the response and a predictor variable, which corresponds to a nonzero coefficient for that predictor in the regression equation. This claim is formulated as the alternative hypothesis. The null hypothesis is that there is no relationship between the response and the predictor variable , that is, the coefficient of that predictor is equal to zero. So, if the linear regression model is Y = a0 + a1x1 + a2x2 + a3x3, then there is one null hypothesis per predictor: a1 = 0, a2 = 0, and a3 = 0. An individual hypothesis test is done for each predictor variable to determine whether the relationship between the response and that predictor is statistically significant based on the sample data used for training the model. Thus, if there are, say, 5 features, there will be five hypothesis tests, each with its own null and alternative hypothesis.
 Hypothesis formulation for the F-test : In addition, a hypothesis test is done on the claim that a linear regression model relates the response variable to all the predictor variables taken together. The null hypothesis is that the linear regression model does not exist , which means that all the coefficients other than the intercept are equal to zero. So, if the linear regression model is Y = a0 + a1x1 + a2x2 + a3x3, then the null hypothesis states that a1 = a2 = a3 = 0.
 F-statistic for testing the hypothesis for the linear regression model : The F-test is used to test the null hypothesis that no linear regression model exists relating the response variable y to the predictor variables x1, x2, x3, x4 and x5. The null hypothesis can also be stated as the coefficients of x1, x2, x3, x4 and x5 all being equal to zero. The F-statistic is calculated as a function of the sum of squared residuals for the restricted regression (the model with only the intercept, where all other coefficients are zero) and the sum of squared residuals for the unrestricted regression (the full linear regression model). In the example output, the F-statistic is 15.66 on 5 and 194 degrees of freedom.
 Evaluate t-statistics against the critical value/region : After calculating the t-statistic for each coefficient, it is time to decide whether to reject or fail to reject the null hypothesis. To make this decision, one needs to set a significance level, also known as the alpha level. A significance level of 0.05 is commonly used. If the t-statistic falls in the critical region, the null hypothesis is rejected. Equivalently, if the p-value is less than 0.05, the null hypothesis is rejected.
 Evaluate the F-statistic against the critical value/region : The F-statistic and its p-value are evaluated to test the null hypothesis that the linear regression model relating the response and predictor variables does not exist. If the F-statistic is larger than the critical value at the 0.05 significance level, the null hypothesis is rejected. This means that the linear model exists with at least one nonzero coefficient.
 Draw conclusions : The final step of hypothesis testing is to draw a conclusion by interpreting the results in terms of the original claim or hypothesis. If the null hypothesis for a predictor variable is rejected, the relationship between the response and that predictor variable is statistically significant based on the sample data used for training the model. Similarly, if the F-statistic lies in the critical region and the p-value is less than the alpha value, usually set as 0.05, one can say that there exists a linear regression model.
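The steps above can be sketched in R by reading the test results out of the model summary. This is an illustrative sketch only: it uses the built-in `mtcars` data rather than BostonHousing so that it runs without extra packages, and the choice of predictors is arbitrary.

```r
fit <- lm(mpg ~ wt + hp + qsec, data = mtcars)
s <- summary(fit)
alpha <- 0.05

# t-tests: one null hypothesis (coefficient = 0) per predictor
slopes <- s$coefficients[-1, , drop = FALSE]  # drop the intercept row
t_p_values <- slopes[, "Pr(>|t|)"]
t_decision <- ifelse(t_p_values < alpha, "reject H0", "fail to reject H0")

# F-test: null hypothesis that all slope coefficients are 0
f <- s$fstatistic  # named vector: value, numdf, dendf
f_p_value <- pf(f[["value"]], f[["numdf"]], f[["dendf"]], lower.tail = FALSE)
f_decision <- if (f_p_value < alpha) "reject H0" else "fail to reject H0"

print(data.frame(t_value = slopes[, "t value"], p_value = t_p_values,
                 decision = t_decision))
cat("F =", f[["value"]], "on", f[["numdf"]], "and", f[["dendf"]],
    "df; p =", f_p_value, ";", f_decision, "\n")
```

The p-values in the coefficient table are two-tailed, so each one equals twice the tail probability of the t distribution beyond the absolute t-statistic.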
Why hypothesis tests for linear regression models?
The reasons why we need hypothesis tests for a linear regression model are the following:
 By creating the model, we are making a claim about the relationship between the response (dependent) variable and one or more predictor (independent) variables. To justify that claim, one or more tests are needed. These tests of the claim are the hypothesis tests.
 One kind of test is required to test the relationship between the response and each of the predictor variables (hence, t-tests).
 Another kind of test is required to test the linear regression model as a whole. This is called the F-test.
While training linear regression models, hypothesis testing is done to determine whether the relationship between the response and each of the predictor variables is statistically significant. First, the coefficient of each predictor variable is estimated. Then, individual hypothesis tests are done to determine whether the relationship between the response and that particular predictor variable is statistically significant based on the sample data used for training the model. If the null hypothesis for a predictor is rejected, it indicates that a statistically significant relationship exists between the response and that particular predictor variable. The t-statistic is used for these tests because the population standard deviation is unknown and must be estimated from the sample. The value of the t-statistic is compared with the critical value from the t-distribution table in order to decide whether to reject or fail to reject the null hypothesis regarding the relationship between the response and the predictor variable. If the value falls in the critical region, the null hypothesis is rejected, which means that there is a statistically significant relationship between the response and that predictor variable. In addition to the t-tests, an F-test is performed to test the null hypothesis that the linear regression model does not exist and that all the coefficients are zero (0). Learn more about linear regression and the t-test in this blog post: Linear regression t-test: formula, example .
5.6 - The General Linear F-Test
The " general linear F-test " involves three basic steps, namely:
 Define a larger full model . (By "larger," we mean one with more parameters.)
 Define a smaller reduced model . (By "smaller," we mean one with fewer parameters.)
 Use an F statistic to decide whether or not to reject the smaller reduced model in favor of the larger full model.
As you can see by the wording of the third step, the null hypothesis always pertains to the reduced model, while the alternative hypothesis always pertains to the full model.
The easiest way to learn about the general linear F-test is to first go back to what we know, namely the simple linear regression model. Once we understand the general linear F-test for the simple case, we then see that it can be easily extended to the multiple case. We take that approach here.
The full model
The " full model ", which is also sometimes referred to as the " unrestricted model ," is the model thought to be most appropriate for the data. For simple linear regression, the full model is:
\[y_i=(\beta_0+\beta_1x_{i1})+\epsilon_i\]
Here's a plot of a hypothesized full model for a set of data that we worked with previously in this course (student heights and grade point averages):
And, here's another plot of a hypothesized full model that we previously encountered (state latitudes and skin cancer mortalities):
In each plot, the solid line represents what the hypothesized population regression line might look like for the full model. The question we have to answer in each case is "does the full model describe the data well?" Here, we might think that the full model does well in summarizing the trend in the second plot but not the first.
The reduced model
The " reduced model ," which is sometimes also referred to as the " restricted model ," is the model described by the null hypothesis H 0 . For simple linear regression, a common null hypothesis is H 0 : β 1 = 0. In this case, the reduced model is obtained by "zeroing out" the slope β 1 that appears in the full model. That is, the reduced model is:
\[y_i=\beta_0+\epsilon_i\]
This reduced model suggests that each response y i is a function only of some overall mean, β 0 , and some error ε i .
Let's take another look at the plot of student grade point average against height, but this time with a line representing what the hypothesized population regression line might look like for the reduced model:
Not bad — there (fortunately?!) doesn't appear to be a relationship between height and grade point average. And, it appears as if the reduced model might be appropriate in describing the lack of a relationship between heights and grade point averages. How does the reduced model do for the skin cancer mortality example?
It doesn't appear as if the reduced model would do a very good job of summarizing the trend in the population.
How do we decide if the reduced model or the full model does a better job of describing the trend in the data when it can't be determined by simply looking at a plot? What we need to do is to quantify how much error remains after fitting each of the two models to our data. That is, we take the general linear F-test approach:
 Using the data, fit the full model:
  Obtain the least squares estimates of β 0 and β 1 .
  Determine the error sum of squares, which we denote " SSE ( F )."
 Using the data, fit the reduced model:
  Obtain the least squares estimate of β 0 .
  Determine the error sum of squares, which we denote " SSE ( R )."
Recall that, in general, the error sum of squares is obtained by summing the squared distances between the observed and fitted (estimated) responses:
\[\sum(\text{observed} - \text{fitted})^2\]
Therefore, since \(y_i\) is the observed response and \(\hat{y}_i\) is the fitted response for the full model :
\[SSE(F)=\sum(y_i-\hat{y}_i)^2\]
And, since \(y_i\) is the observed response and \(\bar{y}\) is the fitted response for the reduced model :
\[SSE(R)=\sum(y_i-\bar{y})^2\]
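The two error sums of squares can be computed directly in R. The sketch below uses the built-in `cars` data purely as a stand-in (an assumption, since the height/GPA and skin cancer datasets are not reproduced here) and confirms that the reduced model's fitted value is the sample mean.

```r
y <- cars$dist
x <- cars$speed

# Full model: y_i = beta0 + beta1 * x_i + error
full <- lm(y ~ x)
sse_full <- sum((y - fitted(full))^2)

# Reduced model: y_i = beta0 + error, whose least squares fit is the mean of y
reduced <- lm(y ~ 1)
sse_reduced <- sum((y - mean(y))^2)

cat("SSE(F) =", sse_full, " SSE(R) =", sse_reduced,
    " reduction =", sse_reduced - sse_full, "\n")
```

SSE(F) can never exceed SSE(R), because the full model contains the reduced model as the special case with a zero slope.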
Let's get a better feel for the general linear Ftest approach by applying it to two different two datasets. First, let's look at the heightgpa data . The following plot of grade point averages against heights contains two estimated regression lines — the solid line is the estimated line for the full model, and the dashed line is the estimated line for the reduced model:
As you can see, the estimated lines are almost identical. Calculating the error sum of squares for each model, we obtain:
\[SSE(F)=\sum(y_i-\hat{y}_i)^2=9.7055\]
\[SSE(R)=\sum(y_i-\bar{y})^2=9.7331\]
The two quantities are almost identical. Adding height to the reduced model to obtain the full model reduces the amount of error by only 0.0276 (from 9.7331 to 9.7055). That is, adding height to the model does very little in reducing the variability in grade point averages. In this case, there appears to be no advantage in using the larger full model over the simpler reduced model.
Look what happens when we fit the full and reduced models to the skin cancer mortality and latitude dataset :
Here, there is quite a big difference in the estimated equation for the reduced model (solid line) and the estimated equation for the full model (dashed line). The error sums of squares quantify the substantial difference in the two estimated equations:
\[SSE(F)=\sum(y_i-\hat{y}_i)^2=17173\]
\[SSE(R)=\sum(y_i-\bar{y})^2=53637\]
Adding latitude to the reduced model to obtain the full model reduces the amount of error by 36464 (from 53637 to 17173). That is, adding latitude to the model substantially reduces the variability in skin cancer mortality. In this case, there appears to be a big advantage in using the larger full model over the simpler reduced model.
Where are we going with this general linear F-test approach? In short:
 The general linear F-test involves a comparison between SSE ( R ) and SSE ( F ).
 If SSE ( F ) is close to SSE ( R ), then the variation around the estimated full model regression function is almost as large as the variation around the estimated reduced model regression function. If that's the case, it makes sense to use the simpler reduced model.
 On the other hand, if SSE ( F ) and SSE ( R ) differ greatly, then the additional parameter(s) in the full model substantially reduce the variation around the estimated regression function. In this case, it makes sense to go with the larger full model.
How different does SSE ( R ) have to be from SSE ( F ) in order to justify using the larger full model? The general linear F statistic:
\[F^*=\left( \frac{SSE(R)-SSE(F)}{df_R-df_F}\right)\div\left( \frac{SSE(F)}{df_F}\right)\]
helps answer this question. The F statistic intuitively makes sense: it is a function of SSE ( R ) − SSE ( F ), the difference in the error between the two models. The degrees of freedom, denoted df R and df F , are those associated with the reduced and full model error sum of squares, respectively.
We use the general linear F statistic to decide whether or not:
 to reject the null hypothesis H 0 : the reduced model,
 in favor of the alternative hypothesis H A : the full model.
In general, we reject H 0 if F * is large — or equivalently if its associated P value is small.
The test applied to the simple linear regression model
For simple linear regression, it turns out that the general linear F test is just the same ANOVA F test that we learned before. As noted earlier for the simple linear regression case, the full model is:
\[y_i=\beta_0+\beta_1x_i+\epsilon_i\]
and the reduced model is:
\[y_i=\beta_0+\epsilon_i\]
Therefore, the appropriate null and alternative hypotheses are specified either as:
 H 0 : y i = β 0 + ε i
 H A : y i = β 0 + β 1 x i + ε i
or as:
 H 0 : β 1 = 0
 H A : β 1 ≠ 0
The degrees of freedom associated with the error sum of squares for the reduced model is n − 1, and:
\[SSE(R)=\sum(y_i-\bar{y})^2=SSTO\]
The degrees of freedom associated with the error sum of squares for the full model is n − 2, and:
\[SSE(F)=\sum(y_i-\hat{y}_i)^2=SSE\]
Now, we can see how the general linear F statistic just reduces algebraically to the ANOVA F test that we know:
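\[F^*=\left( \frac{SSE(R)-SSE(F)}{df_R-df_F}\right)\div\left( \frac{SSE(F)}{df_F}\right)\]
Substituting \(df_R=n-1\), \(df_F=n-2\), \(SSE(R)=SSTO\), and \(SSE(F)=SSE\):
\[F^*=\left( \frac{SSTO-SSE}{(n-1)-(n-2)}\right)\div\left( \frac{SSE}{(n-2)}\right)=\frac{MSR}{MSE}\]
The last step holds because \(SSTO-SSE=SSR\) and \((n-1)-(n-2)=1\), so the first factor is \(SSR/1=MSR\), while the second factor is \(SSE/(n-2)=MSE\).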
That is, the general linear F statistic reduces to the ANOVA F statistic:
\[F^*=\frac{MSR}{MSE}\]
For the student height and grade point average example:
\[F^*=\frac{MSR}{MSE}=\frac{0.0276/1}{9.7055/33}=\frac{0.0276}{0.2941}=0.094\]
For the skin cancer mortality example:
\[F^*=\frac{MSR}{MSE}=\frac{36464/1}{17173/47}=\frac{36464}{365.4}=99.8\]
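As a quick check, the two values above can be reproduced directly from the general linear F statistic rather than from MSR and MSE. The sketch below is illustrative only: the helper name `general_linear_f` is made up, and the reduced-model degrees of freedom (34 and 48) are inferred from the full-model values 33 and 47 using df R = n − 1 and df F = n − 2.

```python
def general_linear_f(sse_reduced, sse_full, df_reduced, df_full):
    """General linear F statistic comparing a reduced model to a full model."""
    numerator = (sse_reduced - sse_full) / (df_reduced - df_full)
    denominator = sse_full / df_full
    return numerator / denominator


# Student height and grade point average example (df_F = 33, so df_R = 34).
f_height = general_linear_f(9.7331, 9.7055, 34, 33)

# Skin cancer mortality example (df_F = 47, so df_R = 48).
f_skin = general_linear_f(53637, 17173, 48, 47)

print(round(f_height, 3))  # 0.094
print(round(f_skin, 1))    # 99.8
```

Written this way, the same function would also apply to comparisons where the full model has more than one extra parameter, since only the sums of squares and degrees of freedom change.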
The P value is calculated as usual. The P value answers the question: "what is the probability that we'd get an F* statistic at least as large as the one we observed, if the null hypothesis were true?" The P value is determined by comparing F * to an F distribution with 1 numerator degree of freedom and n − 2 denominator degrees of freedom. For the student height and grade point average example, the P value is 0.761 (so we fail to reject H 0 and we favor the reduced model), while for the skin cancer mortality example, the P value is 0.000 to three decimal places (so we reject H 0 and we favor the full model).
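One way to see where such a P value comes from, without statistical software, is the following standard-library sketch. It relies on the fact that an F variable with 1 numerator degree of freedom is the square of a t variable with the same denominator degrees of freedom, and it integrates the t density numerically with Simpson's rule. The function names and the step count are choices made for this illustration, not part of the course materials.

```python
import math


def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_const = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                 - 0.5 * math.log(df * math.pi))
    return math.exp(log_const - (df + 1) / 2 * math.log1p(t * t / df))


def f_test_p_value(f_star, df_denom, steps=2000):
    """Upper-tail P value for an F statistic with 1 numerator df.

    Uses P(F > f) = P(|T| > sqrt(f)) and Simpson's rule on [0, sqrt(f)].
    The number of steps must be even.
    """
    upper = math.sqrt(f_star)
    h = upper / steps
    total = t_pdf(0.0, df_denom) + t_pdf(upper, df_denom)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df_denom)
    central = 2 * (h / 3) * total  # P(|T| <= sqrt(f))
    return max(0.0, 1.0 - central)


p_height = f_test_p_value(0.094, 33)  # height and GPA example
p_skin = f_test_p_value(99.8, 47)     # skin cancer mortality example
print(f"{p_height:.3f}")
print(p_skin < 0.001)
```

If SciPy is installed, `scipy.stats.f.sf(f_star, 1, df_denom)` returns the same upper-tail probability and is the more usual choice in practice.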
Does alcoholism have an effect on muscle strength? Some researchers (Urbano-Marquez et al., 1989) who were interested in answering this question collected the following data ( alcoholarm.txt ) on a sample of 50 alcoholic men:
 x = the total lifetime dose of alcohol (kg per kg of body weight) consumed
 y = the strength of the deltoid muscle in the man's non-dominant arm
The full model is the model that would summarize a linear relationship between alcohol consumption and arm strength. The reduced model, on the other hand, is the model that claims there is no relationship between alcohol consumption and arm strength.
Upon fitting the reduced model to the data, we obtain:
\[SSE(R)=\sum(y_i-\bar{y})^2=1224.32\]
Note that the reduced model does not appear to summarize the trend in the data very well.
Upon fitting the full model to the data, we obtain:
\[SSE(F)=\sum(y_i-\hat{y}_i)^2=720.27\]
The full model appears to describe the trend in the data better than the reduced model.
The good news is that in the simple linear regression case, we don't have to bother with calculating the general linear F statistic. Statistical software does it for us in the ANOVA table:
The output reports both SSE ( F ), the amount of error associated with the full model (the error sum of squares, 720.27), and SSE ( R ), the amount of error associated with the reduced model (the total sum of squares, 1224.32). The F statistic is:
\[F^*=\frac{MSR}{MSE}=\frac{504.04/1}{720.27/48}=\frac{504.04}{15.006}=33.59\]
and its associated P value is < 0.001 (so we reject H 0 and we favor the full model). We can conclude that there is a statistically significant linear association between lifetime alcohol consumption and arm strength.
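As a cross-check, the same value follows from the general linear F statistic applied to the two error sums of squares reported above, assuming df R = n − 1 = 49 and df F = n − 2 = 48 for the sample of 50 men:

\[F^*=\left( \frac{1224.32-720.27}{49-48}\right)\div\left( \frac{720.27}{48}\right)=\frac{504.05}{15.006}=33.59\]

The numerator differs from the reported regression sum of squares (504.04) only in the last digit, which reflects rounding in the reported sums of squares.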
Copyright © 2018 The Pennsylvania State University