Understanding the Null Hypothesis for Linear Regression

Linear regression is a technique we can use to understand the relationship between one or more predictor variables and a response variable.

If we only have one predictor variable and one response variable, we can use simple linear regression, which uses the following formula to estimate the relationship between the variables:

ŷ = β₀ + β₁x

  • ŷ: The estimated response value.
  • β₀: The average value of y when x is zero.
  • β₁: The average change in y associated with a one unit increase in x.
  • x: The value of the predictor variable.

Simple linear regression uses the following null and alternative hypotheses:

  • H₀: β₁ = 0
  • H_A: β₁ ≠ 0

The null hypothesis states that the coefficient β₁ is equal to zero. In other words, there is no linear relationship between the predictor variable, x, and the response variable, y, in the population.

The alternative hypothesis states that β₁ is not equal to zero. In other words, there is a linear relationship between x and y in the population. A statistically significant result is the sample evidence that leads us to reject H₀ in favor of H_A.

If we have multiple predictor variables and one response variable, we can use multiple linear regression, which uses the following formula to estimate the relationship between the variables:

ŷ = β₀ + β₁x₁ + β₂x₂ + … + βₖxₖ

  • β₀: The average value of y when all predictor variables are equal to zero.
  • βᵢ: The average change in y associated with a one unit increase in xᵢ, holding the other predictor variables constant.
  • xᵢ: The value of the i-th predictor variable.

Multiple linear regression uses the following null and alternative hypotheses:

  • H₀: β₁ = β₂ = … = βₖ = 0
  • H_A: at least one βᵢ ≠ 0

The null hypothesis states that all slope coefficients in the model are equal to zero. In other words, none of the predictor variables has a linear relationship with the response variable, y, in the population.

The alternative hypothesis states that at least one coefficient is not equal to zero. It does not require every coefficient to be nonzero.

The following examples show how to decide to reject or fail to reject the null hypothesis in both simple linear regression and multiple linear regression models.

Example 1: Simple Linear Regression

Suppose a professor would like to use the number of hours studied to predict the exam score that students will receive in his class. He collects data for 20 students and fits a simple linear regression model.

The following screenshot shows the output of the regression model:

[Figure: Output of simple linear regression in Excel]

The fitted simple linear regression model is:

Exam Score = 67.1617 + 5.2503*(hours studied)

To determine if there is a statistically significant relationship between hours studied and exam score, we need to analyze the overall F value of the model and the corresponding p-value:

  • Overall F-value: 47.9952
  • P-value: 0.000 (rounded; that is, p < 0.001)

Since this p-value is less than .05, we can reject the null hypothesis. In other words, there is a statistically significant relationship between hours studied and exam score received.
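The decision above can be checked from the reported F value alone. The sketch below is a minimal illustration using only the Python standard library: it relies on the fact that, with a single predictor, the overall F statistic with (1, n − 2) degrees of freedom is the square of a t statistic with n − 2 degrees of freedom. The helper names `t_pdf` and `two_sided_t_pvalue` are hypothetical, and the tail probability is approximated by Simpson's rule rather than taken from a statistics library.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)


def two_sided_t_pvalue(t, df, steps=20_000):
    """Approximate P(|T| > |t|) with Simpson's rule on [0, |t|] (steps is even)."""
    b = abs(t)
    h = b / steps
    total = t_pdf(0.0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return max(0.0, 1.0 - 2.0 * total * h / 3.0)


n = 20              # students in Example 1
f_value = 47.9952   # overall F value reported in the output
df_error = n - 2    # one slope and one intercept are estimated

# With one predictor, F(1, n - 2) is the square of a t statistic with n - 2 df.
p_value = two_sided_t_pvalue(math.sqrt(f_value), df_error)
reject_null = p_value < 0.05

print(f"p-value = {p_value:.2e}, reject H0: {reject_null}")
```

The computed p-value is far below .05, which agrees with the rounded value of 0.000 in the output.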

Example 2: Multiple Linear Regression

Suppose a professor would like to use the number of hours studied and the number of prep exams taken to predict the exam score that students will receive in his class. He collects data for 20 students and fits a multiple linear regression model.

[Figure: Multiple linear regression output in Excel]

The fitted multiple linear regression model is:

Exam Score = 67.67 + 5.56*(hours studied) – 0.60*(prep exams taken)

To determine if there is a jointly statistically significant relationship between the two predictor variables and the response variable, we need to analyze the overall F value of the model and the corresponding p-value:

  • Overall F-value: 23.46
  • P-value: 0.00 (rounded; that is, p < 0.01)

Since this p-value is less than .05, we can reject the null hypothesis. In other words, hours studied and prep exams taken have a jointly statistically significant relationship with exam score.

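Note: Although the individual p-value for prep exams taken (p = 0.52) is not significant, the overall F-test shows that prep exams taken and hours studied together have a statistically significant relationship with exam score. The overall test only tells us that at least one coefficient differs from zero, not which one.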
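As a rough check of the p-value in Example 2, the sketch below computes the upper-tail probability of the reported F value. It assumes the usual degrees of freedom for an overall F-test, k = 2 in the numerator and n − k − 1 = 17 in the denominator, and uses the closed form that holds only when the numerator has exactly two degrees of freedom. The function name is hypothetical.

```python
def f_pvalue_two_numerator_df(f_value, df_error):
    """Upper-tail probability of an F(2, df_error) distribution.

    When the numerator has exactly 2 degrees of freedom, the tail
    probability is (1 + 2 * f / df_error) ** (-df_error / 2).
    """
    return (1.0 + 2.0 * f_value / df_error) ** (-df_error / 2.0)


n, k = 20, 2             # students and predictors in Example 2
f_value = 23.46          # overall F value reported in the output
df_error = n - k - 1     # 17 error degrees of freedom

p_value = f_pvalue_two_numerator_df(f_value, df_error)
reject_null = p_value < 0.05

print(f"p-value = {p_value:.2e}, reject H0: {reject_null}")
```

For models with a different number of predictors this shortcut does not apply, and a statistics library would normally be used instead.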


Null & Alternative Hypotheses | Definitions, Templates & Examples

Published on May 6, 2022 by Shaun Turney. Revised on June 22, 2023.

The null and alternative hypotheses are two competing claims that researchers weigh evidence for and against using a statistical test:

  • Null hypothesis (H₀): There’s no effect in the population.
  • Alternative hypothesis (Hₐ or H₁): There’s an effect in the population.


The null and alternative hypotheses offer competing answers to your research question. When the research question asks “Does the independent variable affect the dependent variable?”:

  • The null hypothesis (H₀) answers “No, there’s no effect in the population.”
  • The alternative hypothesis (Hₐ) answers “Yes, there is an effect in the population.”

The null and alternative are always claims about the population. That’s because the goal of hypothesis testing is to make inferences about a population based on a sample. Often, we infer whether there’s an effect in the population by looking at differences between groups or relationships between variables in the sample. It’s critical for your research to write strong hypotheses.

You can use a statistical test to decide whether the evidence favors the null or alternative hypothesis. Each type of statistical test comes with a specific way of phrasing the null and alternative hypothesis. However, the hypotheses can also be phrased in a general way that applies to any test.


The null hypothesis is the claim that there’s no effect in the population.

If the sample provides enough evidence against the claim that there’s no effect in the population (p ≤ α), then we can reject the null hypothesis. Otherwise, we fail to reject the null hypothesis.

Although “fail to reject” may sound awkward, it’s the only wording that statisticians accept. Be careful not to say you “prove” or “accept” the null hypothesis.

Null hypotheses often include phrases such as “no effect,” “no difference,” or “no relationship.” When written in mathematical terms, they always include an equality (usually =, but sometimes ≥ or ≤).

You can never know with complete certainty whether there is an effect in the population. Some percentage of the time, your inference about the population will be incorrect. When you incorrectly reject the null hypothesis, it’s called a type I error. When you incorrectly fail to reject it, it’s a type II error.
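To make the idea of a type I error concrete, the following sketch simulates many studies in which the null hypothesis is true and counts how often a two-sided test at α = 0.05 rejects it anyway. Everything here is an illustrative assumption: the data are standard normal with a known standard deviation of 1, the test is a simple z test of a zero mean, and the function name `simulate_type_i_rate` is hypothetical.

```python
import math
import random


def simulate_type_i_rate(n_sims=2000, n=30, z_crit=1.96, seed=0):
    """Fraction of simulated studies that reject a true null hypothesis."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_sims):
        # The null hypothesis is true: the population mean really is 0.
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        z = (sum(sample) / n) * math.sqrt(n)  # z statistic with known sigma = 1
        if abs(z) > z_crit:
            rejections += 1  # rejecting here is a type I error
    return rejections / n_sims


rate = simulate_type_i_rate()
print(f"Estimated type I error rate: {rate:.3f}")
```

The estimated rate should land close to the chosen significance level, since α is by definition the probability of rejecting a true null hypothesis.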

Examples of null hypotheses

The table below gives examples of research questions and null hypotheses. There’s always more than one way to answer a research question, but these null hypotheses can help you get started.

| Research question | General null hypothesis (H₀) | Test-specific null hypothesis |
| --- | --- | --- |
| Does tooth flossing affect the number of cavities? | Tooth flossing has no effect on the number of cavities. | Two-sample t test: The mean number of cavities per person does not differ between the flossing group (µ₁) and the non-flossing group (µ₂) in the population; µ₁ = µ₂. |
| Does the amount of text highlighted in the textbook affect exam scores? | The amount of text highlighted in the textbook has no effect on exam scores. | Linear regression: There is no relationship between the amount of text highlighted and exam scores in the population; β = 0. |
| Does daily meditation decrease the incidence of depression? | Daily meditation does not decrease the incidence of depression.* | Two-proportions test: The proportion of people with depression in the daily-meditation group (p₁) is greater than or equal to the no-meditation group (p₂) in the population; p₁ ≥ p₂. |

*Note that some researchers prefer to always write the null hypothesis in terms of “no effect” and “=”. It would be fine to say that daily meditation has no effect on the incidence of depression and p₁ = p₂.

The alternative hypothesis (Hₐ) is the other answer to your research question. It claims that there’s an effect in the population.

Often, your alternative hypothesis is the same as your research hypothesis. In other words, it’s the claim that you expect or hope will be true.

The alternative hypothesis is the complement to the null hypothesis. Null and alternative hypotheses are exhaustive, meaning that together they cover every possible outcome. They are also mutually exclusive, meaning that only one can be true at a time.

Alternative hypotheses often include phrases such as “an effect,” “a difference,” or “a relationship.” When alternative hypotheses are written in mathematical terms, they always include an inequality (usually ≠, but sometimes < or >). As with null hypotheses, there are many acceptable ways to phrase an alternative hypothesis.

Examples of alternative hypotheses

The table below gives examples of research questions and alternative hypotheses to help you get started with formulating your own.

| Research question | General alternative hypothesis (Hₐ) | Test-specific alternative hypothesis |
| --- | --- | --- |
| Does tooth flossing affect the number of cavities? | Tooth flossing has an effect on the number of cavities. | Two-sample t test: The mean number of cavities per person differs between the flossing group (µ₁) and the non-flossing group (µ₂) in the population; µ₁ ≠ µ₂. |
| Does the amount of text highlighted in a textbook affect exam scores? | The amount of text highlighted in the textbook has an effect on exam scores. | Linear regression: There is a relationship between the amount of text highlighted and exam scores in the population; β ≠ 0. |
| Does daily meditation decrease the incidence of depression? | Daily meditation decreases the incidence of depression. | Two-proportions test: The proportion of people with depression in the daily-meditation group (p₁) is less than the no-meditation group (p₂) in the population; p₁ < p₂. |

Null and alternative hypotheses are similar in some ways:

  • They’re both answers to the research question.
  • They both make claims about the population.
  • They’re both evaluated by statistical tests.

However, there are important differences between the two types of hypotheses, summarized in the following table.

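| | Null hypothesis (H₀) | Alternative hypothesis (Hₐ) |
| --- | --- | --- |
| Definition | A claim that there is no effect in the population. | A claim that there is an effect in the population. |
| Symbols used | Equality symbol (=, ≥, or ≤) | Inequality symbol (≠, <, or >) |
| When p ≤ α | Rejected | Supported |
| When p > α | Failed to reject | Not supported |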


To help you write your hypotheses, you can use the template sentences below. If you know which statistical test you’re going to use, you can use the test-specific template sentences. Otherwise, you can use the general template sentences.

General template sentences

The only thing you need to know to use these general template sentences are your dependent and independent variables. To write your research question, null hypothesis, and alternative hypothesis, fill in the following sentences with your variables:

Does [independent variable] affect [dependent variable]?

  • Null hypothesis (H₀): [Independent variable] does not affect [dependent variable].
  • Alternative hypothesis (Hₐ): [Independent variable] affects [dependent variable].

Test-specific template sentences

Once you know the statistical test you’ll be using, you can write your hypotheses in a more precise and mathematical way specific to the test you chose. The table below provides template sentences for common statistical tests.

| Statistical test | Null hypothesis (H₀) | Alternative hypothesis (Hₐ) |
| --- | --- | --- |
| Two-sample t test or one-way ANOVA with two groups | The mean dependent variable does not differ between group 1 (µ₁) and group 2 (µ₂) in the population; µ₁ = µ₂. | The mean dependent variable differs between group 1 (µ₁) and group 2 (µ₂) in the population; µ₁ ≠ µ₂. |
| One-way ANOVA with three groups | The mean dependent variable does not differ between group 1 (µ₁), group 2 (µ₂), and group 3 (µ₃) in the population; µ₁ = µ₂ = µ₃. | The mean dependent variable of group 1 (µ₁), group 2 (µ₂), and group 3 (µ₃) are not all equal in the population. |
| Correlation | There is no correlation between independent variable and dependent variable in the population; ρ = 0. | There is a correlation between independent variable and dependent variable in the population; ρ ≠ 0. |
| Simple linear regression | There is no relationship between independent variable and dependent variable in the population; β = 0. | There is a relationship between independent variable and dependent variable in the population; β ≠ 0. |
| Two-proportions test | The dependent variable expressed as a proportion does not differ between group 1 (p₁) and group 2 (p₂) in the population; p₁ = p₂. | The dependent variable expressed as a proportion differs between group 1 (p₁) and group 2 (p₂) in the population; p₁ ≠ p₂. |

Note: The template sentences above use “=” in the null hypothesis and “≠” in the alternative hypothesis, which corresponds to two-tailed tests. For a one-tailed test, replace these with a directional pair such as “≥” and “<”.


Hypothesis testing is a formal procedure for investigating our ideas about the world using statistics. It is used by scientists to test specific predictions, called hypotheses, by calculating how likely it is that a pattern or relationship between variables could have arisen by chance.

Null and alternative hypotheses are used in statistical hypothesis testing. The null hypothesis of a test always predicts no effect or no relationship between variables, while the alternative hypothesis states your research prediction of an effect or relationship.

The null hypothesis is often abbreviated as H₀. When the null hypothesis is written using mathematical symbols, it always includes an equality symbol (usually =, but sometimes ≥ or ≤).

The alternative hypothesis is often abbreviated as Hₐ or H₁. When the alternative hypothesis is written using mathematical symbols, it always includes an inequality symbol (usually ≠, but sometimes < or >).

A research hypothesis is your proposed answer to your research question. The research hypothesis usually includes an explanation (“x affects y because …”).

A statistical hypothesis, on the other hand, is a mathematical statement about a population parameter. Statistical hypotheses always come in pairs: the null and alternative hypotheses. In a well-designed study, the statistical hypotheses correspond logically to the research hypothesis.

Cite this Scribbr article


Turney, S. (2023, June 22). Null & Alternative Hypotheses | Definitions, Templates & Examples. Scribbr. Retrieved August 29, 2024, from https://www.scribbr.com/statistics/null-and-alternative-hypotheses/


9.1 Null and Alternative Hypotheses

The actual test begins by considering two hypotheses. They are called the null hypothesis and the alternative hypothesis. These hypotheses contain opposing viewpoints.

H₀, the null hypothesis: a statement of no difference between sample means or proportions, or no difference between a sample mean or proportion and a population mean or proportion. In other words, the difference equals 0.

Hₐ, the alternative hypothesis: a claim about the population that is contradictory to H₀ and what we conclude when we reject H₀.

Since the null and alternative hypotheses are contradictory, you must examine evidence to decide if you have enough evidence to reject the null hypothesis or not. The evidence is in the form of sample data.

After you have determined which hypothesis the sample supports, you make a decision. There are two options: reject H₀ if the sample information favors the alternative hypothesis, or do not reject H₀ (decline to reject H₀) if the sample information is insufficient to reject the null hypothesis.

Mathematical Symbols Used in H₀ and Hₐ:

| H₀ | Hₐ |
| --- | --- |
| equal (=) | not equal (≠), greater than (>), or less than (<) |
| greater than or equal to (≥) | less than (<) |
| less than or equal to (≤) | more than (>) |

H 0 always has a symbol with an equal in it. H a never has a symbol with an equal in it. The choice of symbol depends on the wording of the hypothesis test. However, be aware that many researchers use = in the null hypothesis, even with > or < as the symbol in the alternative hypothesis. This practice is acceptable because we only make the decision to reject or not reject the null hypothesis.

Example 9.1

  • H₀: No more than 30 percent of the registered voters in Santa Clara County voted in the primary election. p ≤ 0.30
  • Hₐ: More than 30 percent of the registered voters in Santa Clara County voted in the primary election. p > 0.30
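A one-sided pair of hypotheses like the one in Example 9.1 can be tested with a one-proportion z test. The sketch below is only an illustration: the sample size and count are hypothetical numbers that do not come from the text, and the normal approximation is assumed to be adequate for a sample of this size.

```python
import math

p_null = 0.30        # boundary value from H0: p <= 0.30
n_sampled = 500      # hypothetical number of registered voters sampled
n_voted = 170        # hypothetical number of them who voted

p_hat = n_voted / n_sampled
standard_error = math.sqrt(p_null * (1 - p_null) / n_sampled)
z = (p_hat - p_null) / standard_error

# Upper-tail p-value, because Ha is p > 0.30.
p_value = 1.0 - 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
reject_null = p_value < 0.05

print(f"z = {z:.3f}, p-value = {p_value:.4f}, reject H0: {reject_null}")
```

Because the alternative hypothesis uses ">", only large values of the sample proportion count as evidence against H₀, so the p-value is taken from the upper tail only.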

A medical trial is conducted to test whether or not a new medicine reduces cholesterol by 25 percent. State the null and alternative hypotheses.

Example 9.2

We want to test whether the mean GPA of students in American colleges is different from 2.0 (out of 4.0). The null and alternative hypotheses are the following:

  • H₀: μ = 2.0
  • Hₐ: μ ≠ 2.0

We want to test whether the mean height of eighth graders is 66 inches. State the null and alternative hypotheses. Fill in the correct symbol (=, ≠, ≥, <, ≤, >) for the null and alternative hypotheses.

  • H 0 : μ __ 66
  • H a : μ __ 66

Example 9.3

We want to test if college students take fewer than five years to graduate from college, on average. The null and alternative hypotheses are the following:

  • H₀: μ ≥ 5
  • Hₐ: μ < 5

We want to test if it takes fewer than 45 minutes to teach a lesson plan. State the null and alternative hypotheses. Fill in the correct symbol ( =, ≠, ≥, <, ≤, >) for the null and alternative hypotheses.

  • H 0 : μ __ 45
  • H a : μ __ 45

Example 9.4

An article on school standards stated that about half of all students in France, Germany, and Israel take advanced placement exams and a third of the students pass. The same article stated that 6.6 percent of U.S. students take advanced placement exams and 4.4 percent pass. Test if the percentage of U.S. students who take advanced placement exams is more than 6.6 percent. State the null and alternative hypotheses.

  • H₀: p ≤ 0.066
  • Hₐ: p > 0.066

On a state driver’s test, about 40 percent pass the test on the first try. We want to test if more than 40 percent pass on the first try. Fill in the correct symbol (=, ≠, ≥, <, ≤, >) for the null and alternative hypotheses.

  • H 0 : p __ 0.40
  • H a : p __ 0.40

Collaborative Exercise

Bring to class a newspaper, some news magazines, and some internet articles. In groups, find articles from which your group can write null and alternative hypotheses. Discuss your hypotheses with the rest of the class.

This book may not be used in the training of large language models or otherwise be ingested into large language models or generative AI offerings without OpenStax's permission.

Want to cite, share, or modify this book? This book uses the Creative Commons Attribution License and you must attribute Texas Education Agency (TEA). The original material is available at: https://www.texasgateway.org/book/tea-statistics . Changes were made to the original material, including updates to art, structure, and other content updates.

Access for free at https://openstax.org/books/statistics/pages/1-introduction
  • Authors: Barbara Illowsky, Susan Dean
  • Publisher/website: OpenStax
  • Book title: Statistics
  • Publication date: Mar 27, 2020
  • Location: Houston, Texas
  • Book URL: https://openstax.org/books/statistics/pages/1-introduction
  • Section URL: https://openstax.org/books/statistics/pages/9-1-null-and-alternative-hypotheses

© Apr 16, 2024 Texas Education Agency (TEA). The OpenStax name, OpenStax logo, OpenStax book covers, OpenStax CNX name, and OpenStax CNX logo are not subject to the Creative Commons license and may not be reproduced without the prior and express written consent of Rice University.

Simple Linear Regression

Father & son data

Pearson’s height data (first six rows; a data.frame: 6 × 2)

| | fheight (dbl) | sheight (dbl) |
| --- | --- | --- |
| 1 | 65.04851 | 59.77827 |
| 2 | 63.25094 | 63.21404 |
| 3 | 64.95532 | 63.34242 |
| 4 | 65.75250 | 62.79238 |
| 5 | 61.13723 | 64.28113 |
| 6 | 63.02254 | 64.24221 |

[Figure: scatter plot of son's height against father's height]

An example of a simple linear regression model.

Breakdown of terms:

  • regression model: a model for the mean of a response given features
  • linear: model of the mean is linear in parameters of interest
  • simple: only a single feature

Slicewise model

A simple linear regression model fits a line through the above scatter plot by modelling slices.

Conditional means

The average height of sons whose fathers' heights fell within our slice [69.5, 70.5] is about 69.8 inches.

This height varies by slice…

At 65 inches it’s about 67.2 inches.

Multiple samples model (longevity example)

We’ve seen this slicewise model before: the error variance is the same in each slice.

Regression as slicewise model

In the longevity model there is no relation between the means in each slice! We needed to use a parameter for each Diet…

The regression model says that the mean in slice father is

\[\beta_0 + \beta_1 \cdot {\tt father}.\]

This ties together all (father, son) points in the scatterplot.

Chooses \((\beta_0, \beta_1)\) by jointly modeling the mean in each slice…

Height data as slices

What is a “regression” model?

Model of the relationships between some covariates / predictors / features and an outcome.

A regression model is a model of the average outcome given the covariates.

Mathematical formulation

For height data, a mathematical model:

\[{\tt son} = f({\tt father}) + \varepsilon\]

\(f\) describes how the mean of son varies with father.

\(\varepsilon\) is the random variation within the slice.

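Linear regression models

A linear regression model says that the function \(f\) is a sum (linear combination) of functions of father.

Simple linear regression model:

\[f({\tt father}) = \beta_0 + \beta_1 \cdot {\tt father}\]

Parameters of \(f\) are \((\beta_0, \beta_1)\).

Could also be a sum (linear combination) of fixed functions of father, for example a quadratic:

\[f({\tt father}) = \beta_0 + \beta_1 \cdot {\tt father} + \beta_2 \cdot {\tt father}^2\]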

Statistical model

Symbol \(Y\) usually used for outcomes, \(X\) for covariates…

\[Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i\]

where \(\varepsilon_i \sim N(0, \sigma^2)\) are independent.

This specifies a distribution for the \(Y\)’s given the \(X\)’s, i.e. it is a statistical model.

Regression equation

The regression equation is our slicewise model.

Formally, this is a model of the conditional mean function

\[E[Y|X] = \beta_0 + \beta_1 X.\]

Book uses the notation \(\mu\{Y|X\}\).

Fitting the model

We will be using least squares regression. This measures the goodness of fit of a line by the sum of squared errors, \(SSE\).

Least squares regression chooses the line that minimizes

\[SSE(\beta_0, \beta_1) = \sum_{i=1}^n (Y_i - \beta_0 - \beta_1 X_i)^2.\]

In principle, we might measure goodness of fit differently.

For some other loss function \(L\) we might try to minimize

\[\sum_{i=1}^n L(Y_i - \beta_0 - \beta_1 X_i).\]

Why least squares?

With least squares, the minimizers have explicit formulae – not so important with today’s computer power.

Resulting formulae are linear in the outcome \(Y\). This is important for inferential reasons. For only predictive power, this is also not so important.

If assumptions are correct, then this is maximum likelihood estimation.

Statistical theory tells us the maximum likelihood estimators (MLEs) are generally good estimators.

Choice of loss function

Suppose we try to minimize squared error over \(\mu\):

\[\sum_{i=1}^n (Y_i - \mu)^2.\]

We know (by calculus) that the minimizer is the sample mean.

If we minimize absolute error over \(\mu\):

\[\sum_{i=1}^n |Y_i - \mu|.\]

We know (similarly by calculus) that the minimizer(s) is (are) the sample median(s).
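The two claims about the loss functions can be checked numerically. The sketch below searches a fine grid of candidate values of \(\mu\) for the minimizer of each loss and compares the result with the sample mean and median. The five data values are hypothetical and chosen so that the mean and the median differ clearly.

```python
import statistics

data = [2.0, 3.0, 5.0, 9.0, 21.0]  # hypothetical sample: mean 8, median 5


def squared_loss(mu):
    """Sum of squared errors around mu."""
    return sum((y - mu) ** 2 for y in data)


def absolute_loss(mu):
    """Sum of absolute errors around mu."""
    return sum(abs(y - mu) for y in data)


# Candidate values 0.00, 0.01, ..., 25.00.
grid = [i / 100 for i in range(0, 2501)]

best_for_squared = min(grid, key=squared_loss)
best_for_absolute = min(grid, key=absolute_loss)

print("squared-error minimizer: ", best_for_squared, "mean:  ", statistics.mean(data))
print("absolute-error minimizer:", best_for_absolute, "median:", statistics.median(data))
```

The outlying value 21 pulls the squared-error minimizer toward it, while the absolute-error minimizer stays at the middle observation.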

Visualizing the loss function

Let’s take a random scatter plot and view the loss function.

Let’s plot the loss as a function of the parameters. Note that the true intercept is 1.5 while the true slope is 0.1.

Let’s contrast this with the sum of absolute errors.

Geometry of least squares

Some things to note:

Minimizing the sum of squares is the same as finding the point in the plane spanned by \(X\) and \(\pmb{1}\) that is closest to \(Y\).

The total dimension of the space is 1078.

The plane is 2-dimensional.

The axis marked “\(\perp\)” should be thought of as \((n-2)\)-dimensional, or 1076 in this case.

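Least squares

The (squared) lengths of the above vectors are important quantities in what follows.

Important lengths

There are three to note:

\[
\begin{aligned}
SSE &= \sum_{i=1}^n (Y_i - \widehat{Y}_i)^2 = \|Y - \widehat{Y}\|^2 \\
SSR &= \sum_{i=1}^n (\widehat{Y}_i - \overline{Y})^2 = \|\widehat{Y} - \overline{Y} \cdot \pmb{1}\|^2 \\
SST &= \sum_{i=1}^n (Y_i - \overline{Y})^2 = SSE + SSR
\end{aligned}
\]

An important summary of the fit is the ratio

\[R^2 = \frac{SSR}{SST} = 1 - \frac{SSE}{SST}.\]

Measures how much variability in \(Y\) is explained by \(X\).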
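As a small numerical illustration, the sketch below computes the error, regression and total sums of squares (SSE, SSR, SST) and the ratio \(R^2 = SSR/SST\) for a least squares line. It uses only the six rows of the father and son data printed earlier, so the numbers it produces are not those of the full data set; the variable names are assumptions made for this example.

```python
# First six rows of the father/son height data shown earlier (inches).
father = [65.04851, 63.25094, 64.95532, 65.75250, 61.13723, 63.02254]
son = [59.77827, 63.21404, 63.34242, 62.79238, 64.28113, 64.24221]

n = len(father)
x_bar = sum(father) / n
y_bar = sum(son) / n

# Least squares line through the six points.
sxx = sum((x - x_bar) ** 2 for x in father)
slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(father, son)) / sxx
intercept = y_bar - slope * x_bar
fitted = [intercept + slope * x for x in father]

sse = sum((y - f) ** 2 for y, f in zip(son, fitted))  # error sum of squares
ssr = sum((f - y_bar) ** 2 for f in fitted)           # regression sum of squares
sst = sum((y - y_bar) ** 2 for y in son)              # total sum of squares

r_squared = ssr / sst
print(f"SSE = {sse:.4f}, SSR = {ssr:.4f}, SST = {sst:.4f}, R^2 = {r_squared:.4f}")
```

The decomposition SST = SSE + SSR holds for any least squares fit that includes an intercept, which is why the two expressions for the ratio agree.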

Case study A: data suggesting the Big Bang

Let’s fit the linear regression model.

Let’s look at the summary.

Hubble’s model

Hubble’s theory of the Big Bang suggests that the correct slicewise (i.e. regression) model is

\[\mu\{{\tt Distance} | {\tt Velocity}\} = \beta_1 \cdot {\tt Velocity}.\]

This model can be fit without an intercept.

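Least squares estimators

There are explicit formulae for the least squares estimators, i.e. the minimizers of the error sum of squares.

For the slope, \(\hat{\beta}_1\), it can be shown that

\[\hat{\beta}_1 = \frac{\sum_{i=1}^n (X_i - \overline{X})(Y_i - \overline{Y})}{\sum_{i=1}^n (X_i - \overline{X})^2}.\]

Knowing the slope estimate, the intercept estimate can be found easily:

\[\hat{\beta}_0 = \overline{Y} - \hat{\beta}_1 \cdot \overline{X}.\]

Example: big_bang

  • Intercept \(\hat{\beta}_0\): 0.399170439725205
  • Slope \(\hat{\beta}_1\): 0.00137240753172474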
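The explicit formulae for the least squares estimators translate directly into code. The sketch below applies them to the six rows of the father and son data printed earlier (the big_bang data are not reproduced in these notes, so they cannot be used here). The function name `least_squares` is hypothetical. As a sanity check, it also verifies two properties of any least squares fit with an intercept: the residuals sum to zero and are orthogonal to the predictor.

```python
# First six rows of the father/son height data shown earlier (inches).
father = [65.04851, 63.25094, 64.95532, 65.75250, 61.13723, 63.02254]
son = [59.77827, 63.21404, 63.34242, 62.79238, 64.28113, 64.24221]


def least_squares(x, y):
    """Return (intercept, slope) of the least squares line of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


b0, b1 = least_squares(father, son)
residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(father, son)]

print(f"intercept = {b0:.4f}, slope = {b1:.4f}")
print(f"sum of residuals = {sum(residuals):.2e}")
```

With only six rows, the estimates are very noisy and should not be expected to resemble the fit to the full data set.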

Estimate of \(\sigma^2\)

The estimate most commonly used is

\[\hat{\sigma}^2 = \frac{SSE(\hat{\beta}_0, \hat{\beta}_1)}{n-2} = \frac{SSE}{n-2} = MSE.\]

We’ll use the common practice of replacing the quantity \(SSE(\hat{\beta}_0,\hat{\beta}_1)\), i.e. the minimum of this function, with just \(SSE\).

The term MSE above refers to mean squared error: a sum of squares divided by its degrees of freedom. The degrees of freedom of SSE, the error sum of squares, is therefore \(n-2\).

Mathematical aside

We divide by \(n-2\) because some calculations tell us:

\[\frac{SSE}{\sigma^2} \sim \chi^2_{n-2}.\]

Above, the right hand side denotes a chi-squared distribution with \(n-2\) degrees of freedom.

Dividing by \(n-2\) gives an unbiased estimate of \(\sigma^2\) (assuming our modeling assumptions are correct).

Inference for the simple linear regression model

Remember: \(X\) can be fixed or random in our model…

Case study B: predicting pH based on time after slaughter

In this study, researchers fixed \(X\) (Time) before measuring \(Y\) (pH).

Ultimate goal: how long after slaughter is pH around 6?

Inference for \(\beta_0\) or \(\beta_1\)

Recall our model

\[Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i.\]

The errors \(\varepsilon_i\) are independent \(N(0, \sigma^2)\).

In our heights example, we might want to know if there really is a linear association between \({\tt son}=Y\) and \({\tt father}=X\). This can be answered with a hypothesis test of the null hypothesis \(H_0:\beta_1=0\). This assumes the model above is correct, AND \(\beta_1=0\).

Alternatively, we might want to have a range of values that we can be fairly certain \(\beta_1\) lies within. This is a confidence interval for \(\beta_1\).

A mathematical aside

Let \(L\) be the subspace of \(\mathbb{R}^n\) spanned by \(\pmb{1}=(1, \dots, 1)\) and \({X}=(X_1, \dots, X_n)\).

We can decompose \(Y\) as

\[{Y} = P_L{Y} + ({Y} - P_L{Y}) = \widehat{{Y}} + {e}.\]

In our model, \(\mu=\beta_0 \pmb{1} + \beta_1 {X} \in L\) so that

\[\widehat{{Y}} = \mu + P_L{\varepsilon}, \qquad {e} = P_{L^{\perp}}{Y} = P_{L^{\perp}}{\varepsilon}.\]

Our assumption that \(\varepsilon_i\)’s are independent \(N(0,\sigma^2)\) tells us that: \({e}\) and \(\widehat{{Y}}\) are independent; \(\widehat{\sigma}^2 = \|{e}\|^2 / (n-2) \sim \sigma^2 \cdot \chi^2_{n-2} / (n-2)\).

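Setup for inference

All of this implies

\[\hat{\beta}_1 \sim N\left(\beta_1, \frac{\sigma^2}{\sum_{i=1}^n (X_i - \overline{X})^2}\right).\]

The other quantity we need is the standard error or SE of \(\hat{\beta}_1\):

\[SE(\hat{\beta}_1) = \hat{\sigma} \sqrt{\frac{1}{\sum_{i=1}^n (X_i - \overline{X})^2}}.\]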

Testing \(H_0:\beta_1=\beta_1^0\)

Suppose we want to test that \(\beta_1\) is some pre-specified value, \(\beta_1^0\) (this is often 0, i.e. is there a linear association?).

Under \(H_0:\beta_1=\beta_1^0\)

\[T = \frac{\hat{\beta}_1 - \beta_1^0}{SE(\hat{\beta}_1)} \sim t_{n-2}.\]

Reject \(H_0:\beta_1=\beta_1^0\) if \(|T| > t_{n-2, 1-\alpha/2}\).
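The test of the slope can be sketched in a few lines. The example below uses the six rows of the father and son data printed earlier, not the Big Bang data, and the critical value 2.776 is the tabulated \(t_{4,\,0.975}\) quantile for \(n-2=4\) degrees of freedom, hard-coded here as an assumption instead of being computed. It also builds the matching 95% confidence interval to show that the test rejects exactly when the interval excludes \(\beta_1^0\).

```python
import math

# First six rows of the father/son height data shown earlier (inches).
father = [65.04851, 63.25094, 64.95532, 65.75250, 61.13723, 63.02254]
son = [59.77827, 63.21404, 63.34242, 62.79238, 64.28113, 64.24221]

n = len(father)
x_bar = sum(father) / n
y_bar = sum(son) / n
sxx = sum((x - x_bar) ** 2 for x in father)
slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(father, son)) / sxx
intercept = y_bar - slope * x_bar

# Estimate sigma from the residuals, dividing by n - 2.
sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(father, son))
sigma_hat = math.sqrt(sse / (n - 2))
se_slope = sigma_hat / math.sqrt(sxx)

beta1_null = 0.0   # H0: beta_1 = 0
t_stat = (slope - beta1_null) / se_slope
t_crit = 2.776     # tabulated t quantile for 4 df and alpha = 0.05 (two-sided)

reject_null = abs(t_stat) > t_crit
conf_int = (slope - t_crit * se_slope, slope + t_crit * se_slope)

print(f"T = {t_stat:.3f}, reject H0 at 5%: {reject_null}")
print(f"95% CI for the slope: ({conf_int[0]:.3f}, {conf_int[1]:.3f})")
```

Six observations give the test very little power, so this only illustrates the mechanics; the full data set is needed for a meaningful conclusion.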

Let’s perform this test for the Big Bang data.

We see that R performs our \(t\)-test in the second row of the Coefficients table.

It is clear that Distance is correlated with Velocity.

There seems to be some flaw in Hubble’s theory: we reject \(H_0:\beta_0=0\) at level 5%: \(p\)-value is 0.0028!

Why reject for large |T|?

Logic is the same as other \(t\) tests: observing a large \(|T|\) is unlikely if \(\beta_1 = \beta_1^0\) (i.e. if \(H_0\) were true). \(\implies\) it is reasonable to conclude that \(H_0\) is false.

Common to report \(p\)-value:

\[P(|T_{n-2}| > |T_{obs}|) = 2 \cdot P(T_{n-2} > |T_{obs}|).\]

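Confidence interval for regression parameters

Applying the above to the parameter \(\beta_1\) yields a confidence interval of the form

\[\hat{\beta}_1 \pm SE(\hat{\beta}_1) \cdot t_{n-2, 1-\alpha/2}.\]

Earlier, we computed \(SE(\hat{\beta}_1)\) using this formula

\[SE(a_0\hat{\beta}_0 + a_1\hat{\beta}_1) = \hat{\sigma} \sqrt{\frac{a_0^2}{n} + \frac{(a_0\overline{X} - a_1)^2}{\sum_{i=1}^n (X_i - \overline{X})^2}}\]

with \((a_0,a_1) = (0, 1)\).

We also need to find the quantity \(t_{n-2,1-\alpha/2}\). This is defined by

\[P(T_{n-2} \geq t_{n-2,1-\alpha/2}) = \alpha/2.\]

In R, this is computed by the function qt.

We will not need to use these explicit formulae all the time, as R has some built-in functions to compute confidence intervals.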

A matrix: 2 × 2 of type dbl

| | 2.5 % | 97.5 % |
| --- | --- | --- |
| (Intercept) | 0.1530719058 | 0.64526897 |
| Velocity | 0.0008999349 | 0.00184488 |

A matrix: 2 × 2 of type dbl

| | 5 % | 95 % |
| --- | --- | --- |
| (Intercept) | 0.1954035267 | 0.60293735 |
| Velocity | 0.0009812054 | 0.00176361 |

Predicting the mean

Once we have estimated a slope \((\hat{\beta}_1)\) and an intercept \((\hat{\beta}_0)\), we can predict the height of the son born to a father of any particular height by plugging the height of the new father, \(F_{new}\), into our regression equation:

\[\hat{S}_{new} = \hat{\beta}_0 + \hat{\beta}_1 F_{new}.\]

Confidence interval for the average height of sons born to a father of height \(F_{new}=70\) (or maybe \(65\)) inches:

A matrix: 2 × 3 of type dbl

| | fit | lwr | upr |
| --- | --- | --- | --- |
| 1 | 69.87312 | 69.71333 | 70.03291 |
| 2 | 67.30265 | 67.13165 | 67.47366 |

Computing \(SE(\hat{\beta}_0 + 70 \cdot \hat{\beta}_1)\) #

We use the previous formula

with \((a_0, a_1) = (1, 70)\) .

Plugging in

As \(n\) grows (taking a larger sample), \(SE(\hat{\beta}_0 + 70 \hat{\beta}_1)\) should shrink to 0. Why?
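One way to see this, assuming the usual least-squares expression for the standard error of a fitted value (here \(F_i\) denotes the observed fathers' heights and \(\bar{F}\) their sample mean, notation introduced only for this sketch):

\[SE(\hat{\beta}_0 + 70 \cdot \hat{\beta}_1) = \hat{\sigma} \sqrt{\frac{1}{n} + \frac{(70 - \bar{F})^2}{\sum_{i=1}^n (F_i - \bar{F})^2}}\]

The first term under the square root is \(1/n\) , and the sum in the denominator of the second term typically grows in proportion to \(n\) . Both terms therefore shrink toward 0 as \(n\) grows, while \(\hat{\sigma}\) settles near \(\sigma\) .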

Forecasting / prediction intervals #

Can we find an interval that covers the height of a particular son, knowing only that his father’s height is 70 inches?

Must cover the variability of the new random variation \(\implies\) it must be at least as wide as \(\sigma\) .

With so much data in our heights example, this 90% interval will have width roughly \(2 \cdot 1.645 \cdot \hat{\sigma}\)

  • 8.01555526222726
  • 8.18982582776275

Actual width will depend on how accurately we have estimated \((\beta_0, \beta_1)\) as well as \(\hat{\sigma}\) .
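As a rough check on the multiplier, the sketch below (Python, standard library only) computes the normal quantiles behind two-sided 90% and 95% intervals. The value of `sigma_hat` is a hypothetical placeholder, not the estimate from the heights data.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

# A two-sided 90% interval leaves 5% in each tail; a 95% interval leaves 2.5%.
z90 = std_normal.inv_cdf(0.95)
z95 = std_normal.inv_cdf(0.975)

# Hypothetical residual standard deviation, in inches (an assumption).
sigma_hat = 2.4

# Large-sample approximate widths of the prediction interval.
width90 = 2 * z90 * sigma_hat
width95 = 2 * z95 * sigma_hat
print(f"z90={z90:.3f} z95={z95:.3f} width90={width90:.2f} width95={width95:.2f}")
```

The 90% multiplier is about 1.645, while 1.96 is the multiplier for a 95% interval.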

The final interval is

Computed in R as follows

A matrix: 1 × 3 of type dbl

     fit        lwr       upr
1    69.87312   65.8587   73.88753


Null and Alternative Hypotheses | Definitions & Examples

Published on 5 October 2022 by Shaun Turney . Revised on 6 December 2022.

The null and alternative hypotheses are two competing claims that researchers weigh evidence for and against using a statistical test :

  • Null hypothesis (H 0 ): There’s no effect in the population .
  • Alternative hypothesis (H A ): There’s an effect in the population.

The effect is usually the effect of the independent variable on the dependent variable .

Table of contents

  • Answering your research question with hypotheses
  • What is a null hypothesis?
  • What is an alternative hypothesis?
  • Differences between null and alternative hypotheses
  • How to write null and alternative hypotheses
  • Frequently asked questions about null and alternative hypotheses

The null and alternative hypotheses offer competing answers to your research question . When the research question asks “Does the independent variable affect the dependent variable?”, the null hypothesis (H 0 ) answers “No, there’s no effect in the population.” On the other hand, the alternative hypothesis (H A ) answers “Yes, there is an effect in the population.”

The null and alternative are always claims about the population. That’s because the goal of hypothesis testing is to make inferences about a population based on a sample . Often, we infer whether there’s an effect in the population by looking at differences between groups or relationships between variables in the sample.

You can use a statistical test to decide whether the evidence favors the null or alternative hypothesis. Each type of statistical test comes with a specific way of phrasing the null and alternative hypothesis. However, the hypotheses can also be phrased in a general way that applies to any test.

The null hypothesis is the claim that there’s no effect in the population.

If the sample provides enough evidence against the claim that there’s no effect in the population ( p ≤ α), then we can reject the null hypothesis . Otherwise, we fail to reject the null hypothesis.

Although “fail to reject” may sound awkward, it’s the only wording that statisticians accept. Be careful not to say you “prove” or “accept” the null hypothesis.

Null hypotheses often include phrases such as “no effect”, “no difference”, or “no relationship”. When written in mathematical terms, they always include an equality (usually =, but sometimes ≥ or ≤).

Examples of null hypotheses

The table below gives examples of research questions and null hypotheses. There’s always more than one way to answer a research question, but these null hypotheses can help you get started.

Research question | Null hypothesis (H 0 ) | Test-specific null hypothesis
Does tooth flossing affect the number of cavities? | Tooth flossing has no effect on the number of cavities. | t test: The mean number of cavities per person does not differ between the flossing group (µ1) and the non-flossing group (µ2) in the population; µ1 = µ2.
Does the amount of text highlighted in the textbook affect exam scores? | The amount of text highlighted in the textbook has no effect on exam scores. | Linear regression: There is no relationship between the amount of text highlighted and exam scores in the population; β1 = 0.
Does daily meditation decrease the incidence of depression? | Daily meditation does not decrease the incidence of depression.* | Two-proportions test: The proportion of people with depression in the daily-meditation group (p1) is greater than or equal to the no-meditation group (p2) in the population; p1 ≥ p2.

*Note that some researchers prefer to always write the null hypothesis in terms of “no effect” and “=”. It would be fine to say that daily meditation has no effect on the incidence of depression and p1 = p2.

The alternative hypothesis (H A ) is the other answer to your research question . It claims that there’s an effect in the population.

Often, your alternative hypothesis is the same as your research hypothesis. In other words, it’s the claim that you expect or hope will be true.

The alternative hypothesis is the complement to the null hypothesis. Null and alternative hypotheses are exhaustive, meaning that together they cover every possible outcome. They are also mutually exclusive, meaning that only one can be true at a time.

Alternative hypotheses often include phrases such as “an effect”, “a difference”, or “a relationship”. When alternative hypotheses are written in mathematical terms, they always include an inequality (usually ≠, but sometimes > or <). As with null hypotheses, there are many acceptable ways to phrase an alternative hypothesis.

Examples of alternative hypotheses

The table below gives examples of research questions and alternative hypotheses to help you get started with formulating your own.

Research question | Alternative hypothesis (H A ) | Test-specific alternative hypothesis
Does tooth flossing affect the number of cavities? | Tooth flossing has an effect on the number of cavities. | t test: The mean number of cavities per person differs between the flossing group (µ1) and the non-flossing group (µ2) in the population; µ1 ≠ µ2.
Does the amount of text highlighted in a textbook affect exam scores? | The amount of text highlighted in the textbook has an effect on exam scores. | Linear regression: There is a relationship between the amount of text highlighted and exam scores in the population; β1 ≠ 0.
Does daily meditation decrease the incidence of depression? | Daily meditation decreases the incidence of depression. | Two-proportions test: The proportion of people with depression in the daily-meditation group (p1) is less than the no-meditation group (p2) in the population; p1 < p2.

Null and alternative hypotheses are similar in some ways:

  • They’re both answers to the research question
  • They both make claims about the population
  • They’re both evaluated by statistical tests.

However, there are important differences between the two types of hypotheses, summarized in the following table.

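Aspect | Null hypothesis (H 0 ) | Alternative hypothesis (H A )
Definition | A claim that there is no effect in the population. | A claim that there is an effect in the population.
Symbol used | Equality symbol (=, ≥, or ≤) | Inequality symbol (≠, <, or >)
When p ≤ α | Rejected | Supported
When p > α | Failed to reject | Not supported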

To help you write your hypotheses, you can use the template sentences below. If you know which statistical test you’re going to use, you can use the test-specific template sentences. Otherwise, you can use the general template sentences.

The only thing you need to know to use these general template sentences are your dependent and independent variables. To write your research question, null hypothesis, and alternative hypothesis, fill in the following sentences with your variables:

Does independent variable affect dependent variable ?

  • Null hypothesis (H 0 ): Independent variable does not affect dependent variable .
  • Alternative hypothesis (H A ): Independent variable affects dependent variable .

Test-specific

Once you know the statistical test you’ll be using, you can write your hypotheses in a more precise and mathematical way specific to the test you chose. The table below provides template sentences for common statistical tests.

Statistical test | Null hypothesis (H 0 ) | Alternative hypothesis (H A )
t test with two groups | The mean dependent variable does not differ between group 1 (µ1) and group 2 (µ2) in the population; µ1 = µ2. | The mean dependent variable differs between group 1 (µ1) and group 2 (µ2) in the population; µ1 ≠ µ2.
ANOVA with three groups | The mean dependent variable does not differ between group 1 (µ1), group 2 (µ2), and group 3 (µ3) in the population; µ1 = µ2 = µ3. | The mean dependent variable of group 1 (µ1), group 2 (µ2), and group 3 (µ3) are not all equal in the population.
Correlation | There is no correlation between independent variable and dependent variable in the population; ρ = 0. | There is a correlation between independent variable and dependent variable in the population; ρ ≠ 0.
Linear regression | There is no relationship between independent variable and dependent variable in the population; β1 = 0. | There is a relationship between independent variable and dependent variable in the population; β1 ≠ 0.
Two-proportions test | The dependent variable expressed as a proportion does not differ between group 1 (p1) and group 2 (p2) in the population; p1 = p2. | The dependent variable expressed as a proportion differs between group 1 (p1) and group 2 (p2) in the population; p1 ≠ p2.

Note: The template sentences above assume that you’re performing two-tailed tests, because they pair “=” in the null hypothesis with “≠” in the alternative. Two-tailed tests are appropriate for most studies.

The null hypothesis is often abbreviated as H 0 . When the null hypothesis is written using mathematical symbols, it always includes an equality symbol (usually =, but sometimes ≥ or ≤).

The alternative hypothesis is often abbreviated as H a or H 1 . When the alternative hypothesis is written using mathematical symbols, it always includes an inequality symbol (usually ≠, but sometimes < or >).

A research hypothesis is your proposed answer to your research question. The research hypothesis usually includes an explanation (‘ x affects y because …’).

A statistical hypothesis, on the other hand, is a mathematical statement about a population parameter. Statistical hypotheses always come in pairs: the null and alternative hypotheses. In a well-designed study , the statistical hypotheses correspond logically to the research hypothesis.


Turney, S. (2022, December 06). Null and Alternative Hypotheses | Definitions & Examples. Scribbr. Retrieved 29 August 2024, from https://www.scribbr.co.uk/stats/null-and-alternative-hypothesis/


Linear regression - Hypothesis testing

by Marco Taboga , PhD

This lecture discusses how to perform tests of hypotheses about the coefficients of a linear regression model estimated by ordinary least squares (OLS).

Table of contents

Normal vs non-normal model

  • The linear regression model
  • Matrix notation
  • Tests of hypothesis in the normal linear regression model
  • Test of a restriction on a single coefficient (t test)
  • Test of a set of linear restrictions (F test)
  • Tests based on maximum likelihood procedures (Wald, Lagrange multiplier, likelihood ratio)
  • Tests of hypothesis when the OLS estimator is asymptotically normal
  • Test of a restriction on a single coefficient (z test)
  • Test of a set of linear restrictions (Chi-square test)
  • Learn more about regression analysis

The lecture is divided in two parts:

in the first part, we discuss hypothesis testing in the normal linear regression model , in which the OLS estimator of the coefficients has a normal distribution conditional on the matrix of regressors;

in the second part, we show how to carry out hypothesis tests in linear regression analyses where the hypothesis of normality holds only in large samples (i.e., the OLS estimator can be proved to be asymptotically normal).

How to choose which test to carry out after estimating a linear regression model.

We also denote:

We now explain how to derive tests about the coefficients of the normal linear regression model.

It can be proved (see the lecture about the normal linear regression model ) that the assumption of conditional normality implies that:

How the acceptance region is determined depends not only on the desired size of the test , but also on whether the test is:

one-tailed (only one of the two things, i.e., either smaller or larger, is possible).

For more details on how to determine the acceptance region, see the glossary entry on critical values .


The F test is one-tailed .

A critical value in the right tail of the F distribution is chosen so as to achieve the desired size of the test.

Then, the null hypothesis is rejected if the F statistic is larger than the critical value.

In this section we explain how to perform hypothesis tests about the coefficients of a linear regression model when the OLS estimator is asymptotically normal.

As we have shown in the lecture on the properties of the OLS estimator , in several cases (i.e., under different sets of assumptions) it can be proved that:

These two properties are used to derive the asymptotic distribution of the test statistics used in hypothesis testing.

The test can be either one-tailed or two-tailed . The same comments made for the t-test apply here.


Like the F test, the Chi-square test is usually one-tailed .

The desired size of the test is achieved by appropriately choosing a critical value in the right tail of the Chi-square distribution.

The null is rejected if the Chi-square statistic is larger than the critical value.

Want to learn more about regression analysis? Here are some suggestions:

R squared of a linear regression ;

Gauss-Markov theorem ;

Generalized Least Squares ;

Multicollinearity ;

Dummy variables ;

Selection of linear regression models

Partitioned regression ;

Ridge regression .

How to cite

Please cite as:

Taboga, Marco (2021). "Linear regression - Hypothesis testing", Lectures on probability theory and mathematical statistics. Kindle Direct Publishing. Online appendix. https://www.statlect.com/fundamentals-of-statistics/linear-regression-hypothesis-testing.


Linear regression hypothesis testing: Concepts, Examples

Simple linear regression model

In machine learning , linear regression is a predictive modeling technique that models a continuous response variable as a linear combination of explanatory (predictor) variables. When training linear regression models, we rely on hypothesis testing to determine whether the relationship between the response and the predictor variables is statistically significant. Two types of hypothesis tests are used with linear regression models: T-tests and F-tests . The t-statistic assesses the relationship between the response and an individual predictor, and the f-statistic assesses the model as a whole. As data scientists , we need to determine whether linear regression is a suitable model for the problem at hand, and these hypothesis tests help with that decision. These concepts are often unclear to practitioners. In this blog post, we discuss linear regression and the hypothesis tests based on t-statistics and f-statistics . We also provide an example to illustrate how these concepts work.


What are linear regression models?

A linear regression model can be defined as the function approximation that represents a continuous response variable as a function of one or more predictor variables. While building a linear regression model, the goal is to identify a linear equation that best predicts or models the relationship between the response or dependent variable and one or more predictor or independent variables.

There are two different kinds of linear regression models. They are as follows:

  • Simple or Univariate linear regression models : These are linear regression models that describe a linear relationship between one response (dependent) variable and one predictor (independent) variable. A simple linear regression model has the form Y=mX+b, where m is the coefficient of the predictor variable and b is the bias. In terms of the regression line, m is the slope and b is the intercept.
  • Multiple or Multi-variate linear regression models : These are linear regression models that are used to build a linear relationship between one response or dependent variable and more than one predictor or independent variable. The form of the equation that represents a multiple linear regression model is Y=b0+b1X1+ b2X2 + … + bnXn, where bi represents the coefficients of the ith predictor variable. In this type of linear regression model, each predictor variable has its own coefficient that is used to calculate the predicted value of the response variable.

While training linear regression models, the requirement is to determine the coefficients which can result in the best-fitted linear regression line. The learning algorithm used to find the most appropriate coefficients is known as least squares regression . In the least-squares regression method, the coefficients are calculated using the least-squares error function. The main objective of this method is to minimize or reduce the sum of squared residuals between actual and predicted response values. The sum of squared residuals is also called the residual sum of squares (RSS). The outcome of executing the least-squares regression method is coefficients that minimize the linear regression cost function .

The residual e of the ith observation is defined as follows, where [latex]Y_i[/latex] is the observed value of the response variable for the ith observation and [latex]\hat{Y_i}[/latex] is the predicted value for the ith observation.

[latex]e_i = Y_i - \hat{Y_i}[/latex]

The residual sum of squares can be represented as the following:

[latex]RSS = e_1^2 + e_2^2 + e_3^2 + \dots + e_n^2[/latex]

The least-squares method represents the algorithm that minimizes the above term, RSS.
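The idea can be sketched in a few lines of Python. This is a minimal illustration on made-up data, using the closed-form solution for a single predictor; the helper names `least_squares` and `rss` are hypothetical, not part of any library.

```python
def least_squares(xs, ys):
    """Closed-form least-squares intercept and slope for one predictor."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def rss(intercept, slope, xs, ys):
    """Residual sum of squares: sum of squared (observed - predicted)."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))


# Hypothetical toy data.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

b0, b1 = least_squares(xs, ys)
best_rss = rss(b0, b1, xs, ys)

# Nudging the coefficients away from the least-squares solution raises the RSS.
perturbed = [rss(b0 + d0, b1 + d1, xs, ys)
             for d0, d1 in [(0.1, 0.0), (0.0, 0.1), (-0.1, -0.05)]]
print(f"intercept={b0:.3f} slope={b1:.3f} RSS={best_rss:.4f}")
print("RSS after perturbing the coefficients:", [round(v, 4) for v in perturbed])
```

Every perturbed line has a larger RSS than the fitted line, which is what "minimizing the RSS" means in practice.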

Once the coefficients are determined, can we claim that they are the most appropriate ones for the linear regression? The answer is no. The coefficients are only estimates, so each has an associated standard error. Recall that the standard error measures the error made in estimating a population parameter from sample data, and that it is used to calculate the confidence interval for that parameter. For a sample mean, the standard error is the standard deviation of the sample divided by the square root of the sample size. The formula below represents the standard error of a mean.

[latex]SE(\mu) = \frac{\sigma}{\sqrt{N}}[/latex]
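A small numerical sketch of this formula, in Python with a made-up sample (the data are hypothetical; the sample standard deviation stands in for the unknown population value):

```python
import math
import statistics

# Hypothetical sample of measurements.
sample = [4.8, 5.1, 5.0, 4.7, 5.4, 5.2, 4.9, 5.3]
n = len(sample)

# statistics.stdev uses the n - 1 denominator (sample standard deviation).
s = statistics.stdev(sample)
se_mean = s / math.sqrt(n)
print(f"mean={statistics.mean(sample):.3f} s={s:.4f} SE={se_mean:.4f}")
```

Because of the square root of N in the denominator, quadrupling the sample size halves the standard error, all else being equal.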

Thus, without analyzing aspects such as the standard errors of the coefficients, we cannot claim that the estimated linear regression coefficients are suitable. This is where hypothesis testing is needed . Before looking at why hypothesis testing is needed for the linear regression model, let’s briefly review what hypothesis testing is.

Train a Multiple Linear Regression Model using R

Before getting into understanding the hypothesis testing concepts in relation to the linear regression model, let’s train a multi-variate or multiple linear regression model and print the summary output of the model which will be referred to, in the next section. 

The data used for creating the multiple linear regression model is BostonHousing, which can be loaded in RStudio by installing the mlbench package. The code is shown below:

# Install and load mlbench, then load the data set
install.packages("mlbench")
library(mlbench)
data("BostonHousing")

Once the data is loaded, the code shown below can be used to create the linear regression model.

# Fit the model and print its summary
BostonHousing.lm <- lm(log(medv) ~ crim + chas + rad + lstat, data = BostonHousing)
summary(BostonHousing.lm)

Executing the above commands creates a linear regression model with log(medv) as the response variable and crim, chas, rad, and lstat as the predictor variables. The following represents the details related to the response and predictor variables:

  • log(medv) : Log of the median value of owner-occupied homes in USD 1000’s
  • crim : Per capita crime rate by town
  • chas : Charles River dummy variable (= 1 if tract bounds river; 0 otherwise)
  • rad : Index of accessibility to radial highways
  • lstat : Percentage of the lower status of the population

The following will be the output of the summary command that prints the details relating to the model including hypothesis testing details for coefficients (t-statistics) and the model as a whole (f-statistics) 

[Figure: summary output of the linear regression model in R]

Hypothesis tests & Linear Regression Models

Hypothesis tests are statistical procedures used to test a claim or assumption about the underlying distribution of a population based on sample data. Here are the key steps of hypothesis testing with linear regression models:

  • Hypothesis formulation for T-tests: In linear regression, the claim is that a relationship exists between the response and the predictor variables. This claim is represented by nonzero coefficients of the predictor variables in the regression equation, and it is formulated as the alternative hypothesis. The null hypothesis is that there is no relationship between the response and the predictor variable , so the corresponding coefficient equals zero (0). If the linear regression model is Y = a0 + a1x1 + a2x2 + a3x3, then the null hypotheses for the individual tests state that a1 = 0, a2 = 0, and a3 = 0. A separate hypothesis test is run for each predictor variable to determine whether its relationship with the response is statistically significant based on the sample data used for training the model. Thus, if there are, say, 5 features, there will be five hypothesis tests, each with its own null and alternative hypothesis.
  • Hypothesis formulation for F-test : In addition, a hypothesis test is done around the claim that the linear regression model as a whole represents the relationship between the response variable and the predictor variables. The null hypothesis is that no such relationship exists , which means that all the slope coefficients are equal to zero. If the linear regression model is Y = a0 + a1x1 + a2x2 + a3x3, then the null hypothesis states that a1 = a2 = a3 = 0.
  • F-statistics for testing hypothesis for linear regression model : The F-test is used to test the null hypothesis that no linear relationship exists between the response variable y and the predictor variables x1, x2, x3, x4 and x5. Writing ai for the coefficient of xi, the null hypothesis is a1 = a2 = a3 = a4 = a5 = 0. The F-statistic is calculated from the residual sum of squares of the restricted regression (the model with only the intercept, with all slope coefficients set to zero) and the residual sum of squares of the unrestricted regression (the full linear regression model). In the above diagram, note the value of the f-statistic, 15.66, on 5 and 194 degrees of freedom.
  • Evaluate t-statistics against the critical value/region : After calculating the t-statistic for each coefficient, we decide whether to reject or fail to reject the null hypothesis. To make this decision, we set a significance level, also known as the alpha level. A significance level of 0.05 is commonly used. If the t-statistic falls in the critical region, the null hypothesis is rejected. Equivalently, if the p-value is less than 0.05, the null hypothesis is rejected.
  • Evaluate f-statistics against the critical value/region : The F-statistic and its p-value are evaluated to test the null hypothesis that no linear relationship exists between the response and the predictor variables. If the F-statistic is greater than the critical value at the 0.05 level of significance, the null hypothesis is rejected. This means that at least one coefficient in the linear model is nonzero.
  • Draw conclusions : The final step of hypothesis testing is to draw a conclusion by interpreting the results in terms of the original claim or hypothesis. If the null hypothesis for a predictor variable is rejected, the relationship between the response and that predictor variable is statistically significant based on the sample data used for training the model. Similarly, if the F-statistic lies in the critical region and its p-value is less than the alpha level, usually set to 0.05, we can conclude that the linear regression model as a whole is statistically significant.
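The F-statistic described in the steps above can be sketched as follows. This is a minimal Python illustration on made-up data with a single predictor, so the restricted model is the intercept-only model; it is not the BostonHousing model, the variable names are hypothetical, and the critical value is hardcoded from a standard F table.

```python
# Hypothetical toy data with one predictor.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
n = len(xs)
k = 1  # number of predictors in the unrestricted model
q = 1  # number of coefficients set to zero under the null hypothesis

x_bar = sum(xs) / n
y_bar = sum(ys) / n
sxx = sum((x - x_bar) ** 2 for x in xs)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
slope = sxy / sxx
intercept = y_bar - slope * x_bar

# Unrestricted regression: intercept and slope.
rss_unrestricted = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))

# Restricted regression: intercept only, so every fitted value is the mean of y.
rss_restricted = sum((y - y_bar) ** 2 for y in ys)

f_stat = ((rss_restricted - rss_unrestricted) / q) / (rss_unrestricted / (n - k - 1))

# Assumption: upper 5% point of F(1, 3) read from a standard F table.
f_crit = 10.13
reject_h0 = f_stat > f_crit
print(f"RSS restricted={rss_restricted:.3f} unrestricted={rss_unrestricted:.3f} "
      f"F={f_stat:.1f} reject H0: {reject_h0}")
```

With a single predictor, the F-statistic equals the square of the t-statistic for the slope, so the two tests lead to the same decision in that special case.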

Why hypothesis tests for linear regression models?

We need hypothesis tests for a linear regression model for the following reasons:

  • By creating the model, we make new claims about the relationship between the response (dependent) variable and one or more predictor (independent) variables. Justifying these claims requires one or more tests, and testing a claim in this way is what a hypothesis test does.
  • One kind of test is required to test the relationship between response and each of the predictor variables (hence, T-tests)
  • Another kind of test is required to test the linear regression model representation as a whole. This is called F-test.

While training linear regression models, hypothesis testing is done to determine whether the relationship between the response and each of the predictor variables is statistically significant. First, the coefficient of each predictor variable is estimated. Then, an individual hypothesis test is done for each predictor to determine whether its relationship with the response is statistically significant based on the sample data used for training the model. If the null hypothesis for a predictor is rejected, there is evidence of a relationship between the response and that particular predictor variable. The t-statistic is used for these tests because the standard deviation of the sampling distribution is unknown and must be estimated from the data. The value of the t-statistic is compared with the critical value from the t-distribution table in order to decide whether to reject or fail to reject the null hypothesis. If the value falls in the critical region, the null hypothesis is rejected, which means that the relationship between the response and that predictor variable is statistically significant. In addition to T-tests, an F-test is performed to test the null hypothesis that no linear relationship exists and that the value of all the slope coefficients is zero (0). To learn more about linear regression and the t-test, see the blog post "Linear regression t-test: formula, example".


5.6 - The General Linear F-Test

The " general linear F-test " involves three basic steps, namely:

  • Define a larger full model . (By "larger," we mean one with more parameters.)
  • Define a smaller reduced model . (By "smaller," we mean one with fewer parameters.)
  • Use an F- statistic to decide whether or not to reject the smaller reduced model in favor of the larger full model.

As you can see by the wording of the third step, the null hypothesis always pertains to the reduced model, while the alternative hypothesis always pertains to the full model.

The easiest way to learn about the general linear F-test is to first go back to what we know, namely the simple linear regression model. Once we understand the general linear F-test for the simple case, we then see that it can be easily extended to the multiple case. We take that approach here.

The full model

The " full model ", which is also sometimes referred to as the " unrestricted model ," is the model thought to be most appropriate for the data. For simple linear regression, the full model is:

\[y_i=(\beta_0+\beta_1x_{i1})+\epsilon_i\]

Here's a plot of a hypothesized full model for a set of data that we worked with previously in this course (student heights and grade point averages):

[Plot: student heights and grade point averages, with a hypothesized full-model regression line]

And, here's another plot of a hypothesized full model that we previously encountered (state latitudes and skin cancer mortalities):

[Plot: state latitudes and skin cancer mortalities, with a hypothesized full-model regression line]

In each plot, the solid line represents what the hypothesized population regression line might look like for the full model. The question we have to answer in each case is "does the full model describe the data well?" Here, we might think that the full model does well in summarizing the trend in the second plot but not the first.

The reduced model

The "reduced model," which is sometimes also referred to as the "restricted model," is the model described by the null hypothesis \(H_0\). For simple linear regression, a common null hypothesis is \(H_0: \beta_1 = 0\). In this case, the reduced model is obtained by "zeroing-out" the slope \(\beta_1\) that appears in the full model. That is, the reduced model is:

\[y_i=\beta_0+\epsilon_i\]

This reduced model suggests that each response \(y_i\) is a function only of some overall mean, \(\beta_0\), and some error \(\epsilon_i\).

Let's take another look at the plot of student grade point average against height, but this time with a line representing what the hypothesized population regression line might look like for the reduced model:

plot

Not bad — there (fortunately?!) doesn't appear to be a relationship between height and grade point average. And, it appears as if the reduced model might be appropriate in describing the lack of a relationship between heights and grade point averages. How does the reduced model do for the skin cancer mortality example?

plot

It doesn't appear as if the reduced model would do a very good job of summarizing the trend in the population.

How do we decide if the reduced model or the full model does a better job of describing the trend in the data when it can't be determined by simply looking at a plot? What we need to do is to quantify how much error remains after fitting each of the two models to our data. That is, we take the general linear F-test approach:

  • Fit the full model: obtain the least squares estimates of \(\beta_0\) and \(\beta_1\), then determine the error sum of squares, which we denote "SSE(F)."
  • Fit the reduced model: obtain the least squares estimate of \(\beta_0\), then determine the error sum of squares, which we denote "SSE(R)."

Recall that, in general, the error sum of squares is obtained by summing the squared distances between the observed and fitted (estimated) responses:

\[\sum(\text{observed } - \text{ fitted})^2\]

Therefore, since \(y_i\) is the observed response and \(\hat{y}_i\) is the fitted response for the full model:

\[SSE(F)=\sum(y_i-\hat{y}_i)^2\]

And, since \(y_i\) is the observed response and \(\bar{y}\) is the fitted response for the reduced model:

\[SSE(R)=\sum(y_i-\bar{y})^2\]
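To make the two error sums of squares concrete, here is a minimal Python sketch that fits both models by least squares and computes SSE(F) and SSE(R). The five data points are invented purely for illustration and are not taken from either course dataset; the variable names are likewise assumptions.

```python
# Toy data, invented for illustration only (not from the course datasets).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Full model: least squares estimates of the slope and intercept.
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
s_xx = sum((xi - x_bar) ** 2 for xi in x)
b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar

# SSE(F): squared distances between observed and fitted responses.
sse_full = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))

# Reduced model: the least squares estimate of beta_0 is the sample mean,
# so every fitted value equals y_bar.
sse_reduced = sum((yi - y_bar) ** 2 for yi in y)

print(f"b0 = {b0:.3f}, b1 = {b1:.3f}")
print(f"SSE(F) = {sse_full:.3f}, SSE(R) = {sse_reduced:.3f}")
```

Because the reduced model is a special case of the full model, SSE(F) can never exceed SSE(R); the question the F-test answers is whether the drop is large enough to matter.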

Let's get a better feel for the general linear F-test approach by applying it to two different datasets. First, let's look at the heightgpa data. The following plot of grade point averages against heights contains two estimated regression lines: the solid line is the estimated line for the full model, and the dashed line is the estimated line for the reduced model.

plot

As you can see, the estimated lines are almost identical. Calculating the error sum of squares for each model, we obtain:

\[SSE(F)=\sum(y_i-\hat{y}_i)^2=9.7055\]

\[SSE(R)=\sum(y_i-\bar{y})^2=9.7331\]

The two quantities are almost identical. Adding height to the reduced model to obtain the full model reduces the amount of error by only 0.0276 (from 9.7331 to 9.7055). That is, adding height to the model does very little in reducing the variability in grade point averages. In this case, there appears to be no advantage in using the larger full model over the simpler reduced model.

Look what happens when we fit the full and reduced models to the skin cancer mortality and latitude dataset:

plot

Here, there is quite a big difference in the estimated equation for the reduced model (solid line) and the estimated equation for the full model (dashed line). The error sums of squares quantify the substantial difference in the two estimated equations:

\[SSE(F)=\sum(y_i-\hat{y}_i)^2=17173\]

\[SSE(R)=\sum(y_i-\bar{y})^2=53637\]

Adding latitude to the reduced model to obtain the full model reduces the amount of error by 36464 (from 53637 to 17173). That is, adding latitude to the model substantially reduces the variability in skin cancer mortality. In this case, there appears to be a big advantage in using the larger full model over the simpler reduced model.

Where are we going with this general linear F-test approach? In short:

  • The general linear F-test involves a comparison between SSE(R) and SSE(F).
  • If SSE(F) is close to SSE(R), then the variation around the estimated full model regression function is almost as large as the variation around the estimated reduced model regression function. If that's the case, it makes sense to use the simpler reduced model.
  • On the other hand, if SSE(F) and SSE(R) differ greatly, then the additional parameter(s) in the full model substantially reduce the variation around the estimated regression function. In this case, it makes sense to go with the larger full model.

How different does SSE(R) have to be from SSE(F) in order to justify using the larger full model? The general linear F-statistic:

\[F^*=\left( \frac{SSE(R)-SSE(F)}{df_R-df_F}\right)\div\left( \frac{SSE(F)}{df_F}\right)\]

helps answer this question. The F-statistic intuitively makes sense: it is a function of SSE(R) - SSE(F), the difference in the error between the two models. The degrees of freedom, denoted \(df_R\) and \(df_F\), are those associated with the reduced and full model error sum of squares, respectively.

We use the general linear F-statistic to decide whether or not:

  • to reject the null hypothesis \(H_0\): the reduced model,
  • in favor of the alternative hypothesis \(H_A\): the full model.

In general, we reject \(H_0\) if \(F^*\) is large, or equivalently if its associated P-value is small.
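The statistic is simple enough to compute directly. The sketch below is one possible Python helper (the function name `general_linear_f` is hypothetical) applied to the error sums of squares quoted above. The degrees of freedom are taken to be n - 1 and n - 2 with n = 35 for the height and grade point average data and n = 49 for the skin cancer data; those sample sizes are inferred from the worked figures in this section rather than stated explicitly, so treat them as an assumption.

```python
def general_linear_f(sse_r, df_r, sse_f, df_f):
    """Return F* = [(SSE(R) - SSE(F)) / (df_R - df_F)] / [SSE(F) / df_F]."""
    if df_r <= df_f:
        raise ValueError("the reduced model must have more error df than the full model")
    numerator = (sse_r - sse_f) / (df_r - df_f)
    denominator = sse_f / df_f
    return numerator / denominator


# Error sums of squares quoted in the text; df values are an assumption
# (n - 1 and n - 2 with n = 35 and n = 49 respectively).
f_height = general_linear_f(9.7331, 34, 9.7055, 33)
f_skin = general_linear_f(53637, 48, 17173, 47)

print(round(f_height, 3), round(f_skin, 1))
```

A small F* (height and grade point average) says the extra slope parameter buys almost nothing; a large F* (skin cancer mortality and latitude) says it removes a great deal of error per degree of freedom spent.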

The test applied to the simple linear regression model

For simple linear regression, it turns out that the general linear F-test is just the same ANOVA F-test that we learned before. As noted earlier for the simple linear regression case, the full model is:

\[y_i=(\beta_0+\beta_1x_{i1})+\epsilon_i\]

and the reduced model is:

\[y_i=\beta_0+\epsilon_i\]

Therefore, the appropriate null and alternative hypotheses are specified either as:

  • \(H_0\): \(y_i=\beta_0+\epsilon_i\)
  • \(H_A\): \(y_i=\beta_0+\beta_1x_i+\epsilon_i\)

or as:

  • \(H_0\): \(\beta_1=0\)
  • \(H_A\): \(\beta_1\neq 0\)

The degrees of freedom associated with the error sum of squares for the reduced model is \(n-1\), and:

\[SSE(R)=\sum(y_i-\bar{y})^2=SSTO\]

The degrees of freedom associated with the error sum of squares for the full model is \(n-2\), and:

\[SSE(F)=\sum(y_i-\hat{y}_i)^2=SSE\]

Now, we can see how the general linear F-statistic just reduces algebraically to the ANOVA F-test that we know:

\[F^*=\left( \frac{SSE(R)-SSE(F)}{df_R-df_F}\right)\div\left( \frac{SSE(F)}{df_F}\right)\]

Substituting \(SSE(R)=SSTO\), \(SSE(F)=SSE\), \(df_R=n-1\) and \(df_F=n-2\) gives:

\[F^*=\left( \frac{SSTO-SSE}{(n-1)-(n-2)}\right)\div\left( \frac{SSE}{(n-2)}\right)=\frac{MSR}{MSE}\]

That is, the general linear F-statistic reduces to the ANOVA F-statistic:

\[F^*=\frac{MSR}{MSE}\]
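The reduction can also be checked numerically. The following Python sketch, using a small invented dataset (an assumption for illustration, not course data), computes the statistic both ways: once from the general formula with SSE(R) = SSTO and SSE(F) = SSE, and once as MSR/MSE with the regression sum of squares computed directly from the fitted values.

```python
# Toy data, invented for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [1.2, 1.9, 3.2, 3.8, 5.1, 5.8]
n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n

# Least squares fit of the full (simple linear regression) model.
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
s_xx = sum((xi - x_bar) ** 2 for xi in x)
b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar
fitted = [b0 + b1 * xi for xi in x]

sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))  # SSE(F), n - 2 df
ssto = sum((yi - y_bar) ** 2 for yi in y)               # SSE(R), n - 1 df
ssr = sum((fi - y_bar) ** 2 for fi in fitted)           # regression SS, 1 df

# General linear F-statistic versus the ANOVA form MSR / MSE.
f_general = ((ssto - sse) / ((n - 1) - (n - 2))) / (sse / (n - 2))
f_anova = (ssr / 1) / (sse / (n - 2))

print(f_general, f_anova)
```

The two values agree up to floating-point rounding because SSTO - SSE equals the regression sum of squares whenever the model includes an intercept.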

For the student height and grade point average example:

\[F^*=\frac{MSR}{MSE}=\frac{0.0276/1}{9.7055/33}=\frac{0.0276}{0.2941}=0.094\]

For the skin cancer mortality example:

\[F^*=\frac{MSR}{MSE}=\frac{36464/1}{17173/47}=\frac{36464}{365.4}=99.8\]

The P-value is calculated as usual. The P-value answers the question: "what is the probability that we'd get an F* statistic at least as large as the one we observed, if the null hypothesis were true?" The P-value is determined by comparing \(F^*\) to an F distribution with 1 numerator degree of freedom and \(n-2\) denominator degrees of freedom. For the student height and grade point average example, the P-value is 0.761 (so we fail to reject \(H_0\) and we favor the reduced model), while for the skin cancer mortality example, the P-value is reported as 0.000, that is, less than 0.001 (so we reject \(H_0\) and we favor the full model).
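In practice a statistics package reports this P-value, but it can be approximated with the standard library alone. The sketch below relies on the fact that an F variable with 1 numerator degree of freedom is the square of a t variable, so the upper-tail F probability equals one minus twice the area under the t density between 0 and the square root of F*. The function names and the choice of Simpson's rule are assumptions made for this illustration, and the shortcut applies only when the numerator has exactly 1 degree of freedom.

```python
import math


def t_pdf(u, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(u * u / df))


def f_pvalue_one_numerator_df(f_stat, df_error, steps=2000):
    """Approximate P(F >= f_stat) for an F(1, df_error) distribution.

    Uses P(F >= f) = 1 - 2 * (integral of the t density from 0 to sqrt(f)),
    with the integral evaluated by composite Simpson's rule.
    `steps` must be even.
    """
    upper = math.sqrt(f_stat)
    h = upper / steps
    total = t_pdf(0.0, df_error) + t_pdf(upper, df_error)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * t_pdf(k * h, df_error)
    # Clamp at zero: for very large F the true value is below rounding error.
    return max(0.0, 1.0 - 2.0 * total * h / 3.0)


# F* values and error degrees of freedom from the two worked examples.
p_height = f_pvalue_one_numerator_df(0.094, 33)
p_skin = f_pvalue_one_numerator_df(99.8, 47)

print(f"height/GPA P-value: {p_height:.3f}")
print("skin cancer P-value below 0.001:", p_skin < 0.001)
```

For tests with more than one numerator degree of freedom this shortcut no longer holds, and a library routine for the F distribution is the safer choice.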

Does alcoholism have an effect on muscle strength? Some researchers (Urbano-Marquez, et al., 1989) who were interested in answering this question collected the following data (alcoholarm.txt) on a sample of 50 alcoholic men:

  • x = the total lifetime dose of alcohol (kg per kg of body weight) consumed
  • y = the strength of the deltoid muscle in the man's non-dominant arm

The full model is the model that would summarize a linear relationship between alcohol consumption and arm strength. The reduced model, on the other hand, is the model that claims there is no relationship between alcohol consumption and arm strength.

Upon fitting the reduced model to the data, we obtain:

plot

\[SSE(R)=\sum(y_i-\bar{y})^2=1224.32\]

Note that the reduced model does not appear to summarize the trend in the data very well.

Upon fitting the full model to the data, we obtain:

plot

\[SSE(F)=\sum(y_i-\hat{y}_i)^2=720.27\]

The full model appears to describe the trend in the data better than the reduced model.

The good news is that in the simple linear regression case, we don't have to bother with calculating the general linear F-statistic. Statistical software does it for us in the ANOVA table:

Source        DF       SS        MS       F        P
Regression     1    504.04    504.04    33.59   <0.001
Error         48    720.27    15.006
Total         49   1224.32

As you can see, the output reports both SSE(F), the amount of error associated with the full model (the Error row), and SSE(R), the amount of error associated with the reduced model (the Total row). The F-statistic is:

\[F^*=\frac{MSR}{MSE}=\frac{504.04/1}{720.27/48}=\frac{504.04}{15.006}=33.59\]

and its associated P-value is < 0.001 (so we reject \(H_0\) and we favor the full model). We can conclude that there is a statistically significant linear association between lifetime alcohol consumption and arm strength.

Copyright © 2018 The Pennsylvania State University
