Statistics By Jim

Making statistics intuitive

Difference between Descriptive and Inferential Statistics

By Jim Frost 92 Comments

Descriptive and inferential statistics are two broad categories in the field of statistics. In this blog post, I show you how both types of statistics are important for different purposes. Interestingly, some of the statistical measures are similar, but the goals and methodologies are very different.

Descriptive Statistics

Use descriptive statistics to summarize and graph the data for a group that you choose. This process allows you to understand that specific set of observations.

Descriptive statistics describe a sample. That’s pretty straightforward. You simply take a group that you’re interested in, record data about the group members, and then use summary statistics and graphs to present the group properties. With descriptive statistics, there is no uncertainty because you are describing only the people or items that you actually measure. You’re not trying to infer properties about a larger population.

The process involves taking a potentially large number of data points in the sample and reducing them down to a few meaningful summary values and graphs. This procedure allows us to gain more insight and visualize the data more readily than by simply poring over row upon row of raw numbers!

Common tools of descriptive statistics

Descriptive statistics frequently use the following statistical measures to describe groups:

Central tendency: Use the mean or the median to locate the center of the dataset. This measure tells you where most values fall.

Dispersion: How far out from the center do the data extend? You can use the range or standard deviation to measure the dispersion. A low dispersion indicates that the values cluster more tightly around the center. Higher dispersion signifies that data points fall further away from the center. We can also graph the frequency distribution.

Skewness: This measure tells you whether the distribution of values is symmetric or skewed. See: Skewed Distributions

You can present this summary information using both numbers and graphs. These are the standard descriptive statistics, but there are other descriptive analyses you can perform, such as assessing the relationships of paired data using correlation and scatterplots.
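To make these measures concrete, here is a minimal Python sketch that uses only the standard library. The values are made up for illustration, and the skewness formula is one simple moment-based version (the average cubed z-score); statistical packages often apply a small-sample adjustment, so their numbers can differ slightly.

```python
import statistics

# Hypothetical observations, invented purely for illustration.
values = [2.0, 3.0, 3.0, 4.0, 5.0, 5.0, 6.0, 12.0]

# Central tendency
mean = statistics.fmean(values)
median = statistics.median(values)

# Dispersion
value_range = max(values) - min(values)
std_dev = statistics.stdev(values)  # sample standard deviation (n - 1)

# Skewness: average cubed z-score, using the population standard deviation.
# A positive value indicates a longer right tail.
pop_sd = statistics.pstdev(values)
skewness = sum(((v - mean) / pop_sd) ** 3 for v in values) / len(values)

print(f"mean={mean:.2f}  median={median:.2f}  range={value_range:.2f}")
print(f"std dev={std_dev:.2f}  skewness={skewness:.2f}")
```

In this made-up dataset, the single large value pulls the mean above the median and produces positive skewness, which is why it helps to look at several measures together.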

Related posts: Measures of Central Tendency and Measures of Dispersion

Example of descriptive statistics

Suppose we want to describe the test scores in a specific class of 30 students. We record all of the test scores, calculate the summary statistics, and produce graphs. Here is the CSV data file: Descriptive_statistics.

Histogram of test score distribution for the descriptive statistics example.

Mean 79.18
Range 66.21 – 96.53
Proportion >= 70 86.7%

These results indicate that the mean score of this class is 79.18. The scores range from 66.21 to 96.53, and the distribution is symmetrically centered around the mean. A score of at least 70 on the test is acceptable. The data show that 86.7% of the students have acceptable scores.
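The same three summary values can be sketched in a few lines of Python. The scores below are hypothetical stand-ins for a small class of ten students, not the contents of the CSV file, so the output will not match the table above.

```python
import statistics

# Hypothetical scores for a small class; NOT the data from the CSV file.
scores = [66.5, 71.0, 74.5, 78.0, 79.5, 81.0, 83.5, 86.0, 90.0, 95.0]

mean_score = statistics.fmean(scores)
lowest, highest = min(scores), max(scores)

# Share of students at or above the acceptable score of 70.
acceptable = [s for s in scores if s >= 70]
proportion_acceptable = len(acceptable) / len(scores)

print(f"Mean: {mean_score:.2f}")
print(f"Range: {lowest:.2f} - {highest:.2f}")
print(f"Proportion >= 70: {proportion_acceptable:.1%}")
```

Because every student in the group is measured, these numbers are exact descriptions of the group rather than estimates.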

Collectively, this information gives us a pretty good picture of this specific class. There is no uncertainty surrounding these statistics because we gathered the scores for everyone in the class. However, we can’t take these results and extrapolate to a larger population of students.

We’ll do that later.

A good exploratory tool for descriptive statistics is the five-number summary, which presents a set of distributional properties for your sample.
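As a rough sketch, the five-number summary (minimum, first quartile, median, third quartile, maximum) can be computed with Python's standard library. The scores are hypothetical, and the choice of the "inclusive" quartile method is an assumption; other quartile conventions give slightly different values.

```python
import statistics

# Hypothetical scores, invented for illustration.
scores = [66, 70, 72, 75, 79, 81, 84, 88, 93]

# quantiles(..., n=4) returns the three cut points Q1, median, Q3.
# The "inclusive" method treats the data as the complete group of interest.
q1, median, q3 = statistics.quantiles(scores, n=4, method="inclusive")

five_number = (min(scores), q1, median, q3, max(scores))
print("min, Q1, median, Q3, max =", five_number)
```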

Related post: Analyzing Descriptive Statistics in Excel

Inferential Statistics

Inferential statistics takes data from a sample and makes inferences about the larger population from which the sample was drawn. Because the goal of inferential statistics is to draw conclusions from a sample and generalize them to a population, we need to have confidence that our sample accurately reflects the population. This requirement affects our process. At a broad level, we must do the following:

  • Define the population we are studying.
  • Draw a representative sample from that population.
  • Use analyses that incorporate the sampling error.

We don’t get to pick a convenient group. Instead, random sampling allows us to have confidence that the sample represents the population. This process is a primary method for obtaining samples that mirror the population on average. Random sampling produces statistics, such as the mean, that do not tend to be too high or too low. Using a random sample, we can generalize from the sample to the broader population. Unfortunately, gathering a truly random sample can be a complicated process. Learn more about Making Statistical Inferences.

You can use the following methods to collect a representative sample:

  • Simple random sampling
  • Stratified sampling
  • Cluster sampling
  • Systematic sampling
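Three of these methods can be sketched in Python as follows. The sampling frame (1,000 student IDs tagged "urban" or "rural") is entirely hypothetical, and real sampling plans involve practical details that this sketch ignores. Cluster sampling is omitted for brevity; it would randomly select whole groups, such as schools, and then measure the students within them.

```python
import random

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Hypothetical sampling frame: 1,000 student IDs, each tagged with a region.
population = [(student_id, "urban" if student_id % 4 else "rural")
              for student_id in range(1000)]

# Simple random sampling: every student has the same chance of selection.
simple_sample = rng.sample(population, k=100)

# Stratified sampling: sample each region in proportion to its size.
strata = {}
for student in population:
    strata.setdefault(student[1], []).append(student)

stratified_sample = []
for region, members in strata.items():
    k = round(100 * len(members) / len(population))
    stratified_sample.extend(rng.sample(members, k=k))

# Systematic sampling: random start, then every 10th student on the list.
step = len(population) // 100
start = rng.randrange(step)
systematic_sample = population[start::step]

print(len(simple_sample), len(stratified_sample), len(systematic_sample))
```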

In contrast, convenience sampling doesn’t tend to obtain representative samples. These samples are easier to collect but the results are minimally useful.

Related posts: Populations vs Samples, Parameters vs. Statistics, and Populations, Parameters, and Samples in Inferential Statistics

Pros and cons of working with samples

You gain tremendous benefits by working with a random sample drawn from a population. In most cases, it is simply impossible to measure the entire population to understand its properties. The alternative is to gather a random sample and then use the methodologies of inferential statistics to analyze the sample data.

While samples are much more practical and less expensive to work with, there are tradeoffs. Typically, we learn about the population by drawing a relatively small sample from it. We are a very long way off from measuring all people or objects in that population. Consequently, when you estimate the properties of a population from a sample, the sample statistics are unlikely to equal the actual population value exactly.

For instance, your sample mean is unlikely to equal the population mean exactly. The difference between the sample statistic and the population value is the sampling error. Inferential statistics incorporate estimates of this error into the statistical results.
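A small simulation can illustrate sampling error. In the sketch below, the population is simulated (an assumption made so that the true mean is known, which is rarely the case in practice), and a single random sample of 100 is drawn from it.

```python
import random
import statistics

rng = random.Random(1)  # fixed seed so the sketch is repeatable

# Simulated population of 50,000 scores; the parameters 79 and 9 are
# arbitrary choices for illustration.
population = [rng.gauss(79, 9) for _ in range(50_000)]
population_mean = statistics.fmean(population)

# Draw one random sample and compare its mean to the population mean.
sample = rng.sample(population, k=100)
sample_mean = statistics.fmean(sample)
sampling_error = sample_mean - population_mean

print(f"population mean: {population_mean:.2f}")
print(f"sample mean:     {sample_mean:.2f}")
print(f"sampling error:  {sampling_error:+.2f}")
```

Changing the seed draws a different sample and produces a different sampling error, which is exactly the uncertainty that inferential procedures need to account for.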

In contrast, summary values in descriptive statistics are straightforward. The average score in a specific class is a known value because we measured all individuals in that class. There is no uncertainty.

Related post: Sample Statistics Are Always Wrong (to Some Extent)!

Standard analysis tools of inferential statistics

The most common methodologies in inferential statistics are hypothesis tests, confidence intervals, and regression analysis. Interestingly, these inferential methods can produce summary values similar to those in descriptive statistics, such as the mean and standard deviation. However, as I’ll show you, we use them very differently when making inferences.

Hypothesis tests

Hypothesis tests use sample data to answer questions like the following:

  • Is the population mean greater than or less than a particular value?
  • Are the means of two or more populations different from each other?

For example, if we study the effectiveness of a new medication by comparing the outcomes in a treatment and control group, hypothesis tests can tell us whether the drug’s effect that we observe in the sample is likely to exist in the population. After all, we don’t want to use the medication if it is effective only in our specific sample. Instead, we need evidence that it’ll be useful in the entire population of patients. Hypothesis tests allow us to draw these types of conclusions about entire populations.
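To illustrate the idea of comparing a treatment group and a control group, here is a minimal sketch of a permutation test in Python. Note that this is a swapped-in technique: a 2-sample t-test is the more common choice for this kind of comparison, but a permutation test needs only the standard library. The outcome values are invented for illustration.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical outcome scores, invented for illustration.
treatment = [24.1, 26.3, 27.8, 25.5, 28.9, 27.2, 26.7, 29.4]
control = [22.0, 23.5, 24.8, 21.9, 25.1, 23.3, 24.0, 22.7]

observed = statistics.fmean(treatment) - statistics.fmean(control)

# If the treatment has no effect, the group labels are arbitrary. Shuffle
# the labels many times and count how often the shuffled difference is at
# least as large (in absolute value) as the observed difference.
pooled = treatment + control
n_treat = len(treatment)
n_permutations = 10_000
at_least_as_extreme = 0
for _ in range(n_permutations):
    rng.shuffle(pooled)
    diff = (statistics.fmean(pooled[:n_treat])
            - statistics.fmean(pooled[n_treat:]))
    if abs(diff) >= abs(observed):
        at_least_as_extreme += 1

# Add 1 to both counts so the estimated p-value is never exactly zero.
p_value = (at_least_as_extreme + 1) / (n_permutations + 1)

print(f"observed difference in means: {observed:.3f}")
print(f"approximate two-sided p-value: {p_value:.4f}")
```

A small p-value suggests that a difference this large would be unusual if the treatment had no effect in the population.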

Related posts: Statistical Hypothesis Testing Overview and Sample Mean vs. Population Mean

Confidence intervals (CIs)

In inferential statistics, a primary goal is to estimate population parameters. These parameters are the unknown values for the entire population, such as the population mean and standard deviation. These parameter values are not only unknown but almost always unknowable. Typically, it’s impossible to measure an entire population. The sampling error I mentioned earlier produces uncertainty, or a margin of error, around our estimates.

Suppose we define our population as all high school basketball players. Then, we draw a random sample from this population and calculate the mean height of 181 cm. This sample estimate of 181 cm is the best estimate of the mean height of the population. However, it’s virtually guaranteed that our estimate of the population parameter is not exactly correct.

Confidence intervals incorporate the uncertainty and sampling error to create a range of values the actual population value is likely to fall within. For example, a confidence interval of [176 186] indicates that we can be confident that the real population mean falls within this range.
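As a rough sketch, an approximate 95% confidence interval for a mean can be computed in Python as shown below. Two assumptions to flag: the heights are simulated rather than real data, and the interval uses the normal critical value (about 1.96) as a large-sample approximation because the standard library has no t-distribution. For small samples, a t-based interval is the usual choice and will be somewhat wider.

```python
import math
import random
import statistics

rng = random.Random(7)  # fixed seed so the sketch is repeatable

# Simulated heights in cm (an assumption; not data from this post).
heights = [rng.gauss(181, 8) for _ in range(60)]

n = len(heights)
sample_mean = statistics.fmean(heights)

# Standard error of the mean: sample standard deviation / sqrt(n).
standard_error = statistics.stdev(heights) / math.sqrt(n)

# Approximate 95% interval using the normal critical value.
z = statistics.NormalDist().inv_cdf(0.975)
ci_low = sample_mean - z * standard_error
ci_high = sample_mean + z * standard_error

print(f"sample mean: {sample_mean:.1f} cm")
print(f"approximate 95% CI: [{ci_low:.1f} {ci_high:.1f}]")
```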

Related post: Understanding Confidence Intervals

Regression analysis

Regression analysis describes the relationship between a set of independent variables and a dependent variable. This analysis incorporates hypothesis tests that help determine whether the relationships observed in the sample data actually exist in the population.

For example, the fitted line plot below displays the relationship in the regression model between height and weight in adolescent girls. Because the relationship is statistically significant, we have sufficient evidence to conclude that this relationship exists in the population rather than just our sample.
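For a sense of how a fitted line is obtained, here is a minimal sketch of simple linear regression using ordinary least squares in plain Python. The height and weight pairs are hypothetical, not the data behind the plot. The sketch only fits the line; it does not perform the hypothesis test on the slope that regression software reports.

```python
import statistics

# Hypothetical height (m) and weight (kg) pairs, invented for illustration.
heights = [1.40, 1.45, 1.50, 1.55, 1.60, 1.65]
weights = [36.0, 40.0, 43.0, 48.0, 51.0, 55.0]

mean_x = statistics.fmean(heights)
mean_y = statistics.fmean(weights)

# Ordinary least squares with one predictor:
#   slope     = sum((x - mean_x) * (y - mean_y)) / sum((x - mean_x) ** 2)
#   intercept = mean_y - slope * mean_x
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(heights, weights))
sxx = sum((x - mean_x) ** 2 for x in heights)
slope = sxy / sxx
intercept = mean_y - slope * mean_x

print(f"weight = {intercept:.1f} + {slope:.1f} * height")
```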

Related post: When Should I Use Regression Analysis?

Example of inferential statistics

For this example, suppose we conducted our study on test scores for a specific class as I detailed in the descriptive statistics section. Now we want to perform an inferential statistics study for that same test. Let’s assume it is a standardized statewide test. By using the same test, but now with the goal of drawing inferences about a population, I can show you how that changes the way we conduct the study and the results that we present.

In descriptive statistics, we picked the specific class that we wanted to describe and recorded all of the test scores for that class. Nice and simple. For inferential statistics, we need to define the population and then draw a random sample from that population.

Let’s define our population as 8th-grade students in public schools in the State of Pennsylvania in the United States. We need to devise a random sampling plan to help ensure a representative sample. This process can actually be arduous. For the sake of this example, assume that we are provided a list of names for the entire population, draw a random sample of 100 students from it, and obtain their test scores. Note that these students will not be in one class, but from many different classes in different schools across the state.

Inferential statistics results

For inferential statistics, we can calculate the point estimate for the mean, standard deviation, and proportion for our random sample. However, it is staggeringly improbable that any of these point estimates are exactly correct, and there is no way to know for sure anyway. Because we can’t measure all subjects in this population, there is a margin of error around these statistics. Consequently, I’ll report the confidence intervals for the mean, standard deviation, and the proportion of satisfactory scores (>=70). Here is the CSV data file: Inferential_statistics.

Mean 77.4 – 80.9
Standard deviation 7.7 – 10.1
Proportion scores >= 70 77% – 92%

Given the uncertainty associated with these estimates, we can be 95% confident that the population mean is between 77.4 and 80.9. The population standard deviation (a measure of dispersion) is likely to fall between 7.7 and 10.1. And, the population proportion of satisfactory scores is expected to be between 77% and 92%.
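For the proportion, one common approximation is the normal-approximation (Wald) interval, sketched below in Python. The count of 85 satisfactory scores out of 100 is a hypothetical figure, and the post does not state which method produced the interval above, so this sketch is not expected to reproduce it exactly.

```python
import math
import statistics

# Hypothetical counts (an assumption): 85 of 100 sampled students scored >= 70.
n = 100
satisfactory = 85

p_hat = satisfactory / n

# Standard error of a sample proportion.
standard_error = math.sqrt(p_hat * (1 - p_hat) / n)

# Normal critical value for 95% confidence (about 1.96).
z = statistics.NormalDist().inv_cdf(0.975)

ci_low = p_hat - z * standard_error
ci_high = p_hat + z * standard_error

print(f"point estimate: {p_hat:.0%}")
print(f"approximate 95% CI: {ci_low:.1%} to {ci_high:.1%}")
```

The Wald interval is simple but can behave poorly for small samples or proportions near 0 or 1, where other interval methods are generally preferred.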

Another key inferential statistic is the standard error of the mean. To learn more about it, read my post The Standard Error of the Mean.

Differences between Descriptive and Inferential Statistics

As you can see, the difference between descriptive and inferential statistics lies in the process as much as in the statistics that you report.

For descriptive statistics, we choose a group that we want to describe and then measure all subjects in that group. The statistical summary describes this group with complete certainty (outside of measurement error).

For inferential statistics, we need to define the population and then devise a sampling plan that produces a representative sample. The statistical results incorporate the uncertainty that is inherent in using a sample to understand an entire population. The sample size becomes a vital characteristic. The law of large numbers states that as the sample size grows, the sample statistics (e.g., the sample mean) will converge on the population value.
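The effect of sample size can be illustrated with a short simulation. The population below is simulated (an assumption so that the true mean is known), and the sketch draws one random sample at each of several sizes.

```python
import random
import statistics

rng = random.Random(3)  # fixed seed so the sketch is repeatable

# Simulated population; the parameters 79 and 9 are arbitrary choices.
population = [rng.gauss(79, 9) for _ in range(100_000)]
population_mean = statistics.fmean(population)

# Absolute error of the sample mean at increasing sample sizes.
errors = {}
for n in (10, 100, 1_000, 10_000):
    sample = rng.sample(population, k=n)
    errors[n] = abs(statistics.fmean(sample) - population_mean)
    print(f"n={n:>6}: |sample mean - population mean| = {errors[n]:.3f}")
```

Any single run can bounce around, so the error is not guaranteed to shrink at every step. The general tendency toward smaller errors at larger sample sizes is what the law of large numbers describes.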

A study using descriptive statistics is simpler to perform. However, if you need evidence that an effect or relationship between variables exists in an entire population rather than only your sample, you need to use inferential statistics.

If you’re learning about statistics and like the approach I use in my blog, check out my Introduction to Statistics book! It’s available at Amazon and other retailers.

Reader Interactions

June 7, 2024 at 3:02 pm

THANK YOU DEAR JIM FOR THE BRILLIANT GREENLIGHT . MY QUESTION IS BASICALLY IS IT POSSIBLE TO USE BOTH F-TEST AND QUI-SQUARE TEST ON CATEGORICAL DATA

June 7, 2024 at 3:07 pm

Hi Francis,

Please do not use ALL CAPS! It hurts the eyes!

To answer your question, no it’s not possible to use both tests on categorical data. F-tests require continuous data because they involve the ratio of two variances. You can’t calculate variance for categorical data. Consequently, you cannot use an F-test with categorical data.

A chi-square test is designed to handle categorical data.

I hope that helps!

December 8, 2022 at 5:17 pm

It is generally not recommended to use a chi-squared test or cross-tabulation on data from purposive sampling because the sample is not representative of the population. Purposive sampling is a non-random sampling method in which the researcher deliberately selects the sample based on specific criteria, such as individuals who are experts in a particular field or who have a specific characteristic. This means that the sample is not randomly selected from the population and may not be representative of the population as a whole.

December 8, 2022 at 7:17 pm

Thanks for writing! I certainly agree with your comment. In general, all inferential statistics are questionable when you don’t have a representative sample. That includes all hypothesis tests, including chi-square tests.

To collect a representative sample, you need to use a probability sampling method. You mention purposive sampling, which is one of several types of NON-probability sampling. You don’t expect non-probability methods to produce representative samples. For more information, read my post about Sampling Methods.

May 5, 2022 at 3:30 pm

can I use both descriptive and inferential statistics for one research?

August 26, 2021 at 9:20 am

Hi, I have conducted purposive sampling. As I collected data, I came to realize that the population size could be bigger than the figures I have. I am now analysing the data. Can I use a chi-square test and cross tabulation with data from purposive sampling?

March 12, 2021 at 4:16 am

Interesting Explanation!

November 27, 2020 at 3:40 am

I’m pursuing my PhD on child labour laws in India. In my research methodology, I adopted ‘Questionnaire’ (closed-ended questions) and ‘Interview Schedule’ (structured questions) methods. Both of my methods fall under quantitative research. Now the question is, what statistical tool should I use to test my hypothesis? Can I use inferential or descriptive statistics? If it’s inferential statistics, will my hypothesis be tested with a parametric test?

November 7, 2020 at 3:31 pm

You say under Confidence intervals that, “For example, a confidence interval of [176 186] indicates that we can be confident that the real population mean falls within this range.” Do you mean that there is a, e.g., 95% probability that the interval [176 186] covers the population parameter (mean)? Since the population parameter (mean) is an unknown constant and no probability statement concerning its value can be made, wouldn’t the 95% probability relate to the estimation procedure and not a single calculated interval? What I mean is that if the method of deriving a CI from a new sampling of the population data is done a large number of times, the number of CIs containing the population parameter will tend towards 95%. So, for any one CI calculated, the population parameter either will or will not be within the CI. Thank you for sharing your knowledge.

November 4, 2020 at 5:19 am

Well written article with clear explanations and examples. Is the book yet published? I am interested to have a read…

November 4, 2020 at 10:54 pm

Hi Tichafara,

Yes, I have three books published, all are in both ebook and print formats! For more information, and free samples, go to My Webstore!

November 1, 2020 at 10:38 pm

Thank you so much for your plain but informative illustration! The use of examples is the biggest strength that suits novice researchers like me. Currently, I don’t have specific questions but I may request some help from you some day. Thank you in advance!

October 23, 2020 at 7:14 am

Hi Jim, I am doing statistical analysis to see the radiation dose to heart of two methods for breast cancer treatment, 3D versus IMRT. I have two groups of patients, one treated with 3D (6 samples) and another treated with IMRT (4 samples). What is a good test to use knowing that the radiation dose to heart is a critical? thank you

October 15, 2020 at 7:09 pm

If I am conducting research showing that there is an increased likelihood of adolescent loneliness and isolation stemming from the coronavirus pandemic, I would use inferential analysis, correct? If not, what data analysis technique would be best to use? I am trying to show that there is a positive relationship between adolescent loneliness and isolation due to the coronavirus.

September 30, 2020 at 8:01 am

Hi Jim, suppose someone wants to study tuberculosis patients in a district. He collects data from tuberculosis patients in that particular district who were registered from 1st July 2019 to 31st Dec 2019. What sampling method has been used here if he can collect information from 80% of those patients? Is it convenience sampling or something else? Please explain.

September 30, 2020 at 3:55 pm

It’s not entirely certain. It sounds like they’ve defined their population as that specific group. However, how did the researchers obtain the subjects? If they obtain data from all tuberculosis patients in that district during that timeframe, then it’s not a sample at all. It is the population under study. You’d use descriptive statistics to describe that complete group. However, it sounds like the researcher wants to collect 80% of those patients, which makes it a sample. The researcher could draw either a random sample or a convenience sample from that population. He could use either to get up to 80%. Hopefully, it’s a random sample though!

June 15, 2020 at 2:26 am

Hi, very good blog! In medicine to help us to understand degree of abnormality ,ex: caliber of a blood vessel we use z score. Will there be t score as well for that and can we use that? Regards

June 15, 2020 at 3:27 pm

I’m not sure of the context but I suspect that they’re using Z-scores to show how an individual’s blood vessel compares to the average blood vessel. You would not use t-scores for that purpose. However, if you were performing a hypothesis test on the mean differences between blood vessels for two groups, you’d use a t-test, which does use t-scores. Read my post about performing t-tests for more information.

Additionally, stay tuned, as I will be releasing my brand new book about hypothesis testing very soon!

June 15, 2020 at 2:06 am

Hi Jim, this is great stuff! Would you please help me to understand the following question? In medical research, there are broadly 2 types of studies. One is descriptive and the other analytical. In descriptive studies we use the operational verb ‘to estimate’ and in analytical studies ‘to determine’. Analytical studies test hypotheses and descriptive ones generate hypotheses. Do descriptive studies use inferential statistics too? Thanks in advance!

June 15, 2020 at 3:34 pm

Descriptive studies simply describe the group that is measured. The results are not generalized beyond that group. There is no need to use inferential procedures in a descriptive study. There is also no need for estimates because you’re measuring all subjects. For example, if you are simply describing the test results of a class, you know the average score. It is not an estimate of a population value. You don’t test hypotheses in descriptive studies.

Inferential studies will generalize the results beyond the group and draw inferences about a larger population. All scientific studies use inferential statistics because they don’t want to know whether an effect exists just in a small group of subjects. However, scientific studies can include descriptive statistics about the subjects for informational purposes and to verify they are not unusual in some manner. These studies do derive estimates of the population values and test hypotheses about the properties of the population.

This post describes all of this information throughout. I’d read through it again more carefully.

April 22, 2020 at 4:05 pm

Hi, Thanks for the post. I am completing a paper and my instructor asked me to use inferential statistics to lend credence to my results. I am confused because I only have 11 participants. I have collected the following data from these 11 participants : baseline and week five questionnaires, and 5 weeks of weekly adherence reports. Is it possible to use inferential statistics to present my results? Thank you, BM

April 17, 2020 at 12:37 pm

Thank a lot for your answer. You definitely help me a lot with how to manage the results of my experiment.

April 17, 2020 at 11:25 am

Thanks alot,,,,very informative

April 3, 2020 at 11:31 pm

Hi Jim, I performed experiments in which I measured strain for two different materials while varying the external temperature. Is it possible to use descriptive statistics to show the results and then inferential statistics to compare the behavior of both materials, or do I have to choose just one of the two?

April 5, 2020 at 6:58 pm

If you want to apply the results from your sample beyond just the sample, you’ll need to be sure to use a representative sampling method and to use inferential procedures that incorporate estimates of the sampling error. You can include the descriptive statistics, but be sure to mention the results of the inferential procedures, such as statistical significance and confidence intervals. For example, you can say that the difference in mean strength of the two materials is X (descriptive), and that the difference is statistically significant and the CI is [y z] (inferential).

I hope this helps!

February 16, 2020 at 5:24 pm

Hello sir, Your explanations has really helped me to understands most of the concepts. God bless you and I wish you continue this way.

February 4, 2020 at 11:04 am

Very helpful thanks a lot

January 30, 2020 at 12:18 am

Hi Jim, thank you for supporting me on the topic I didn’t know before. God bless. And, I’ve one question if you have time, tell me in detail. All types of inferential statistical tools and describe for which purpose could applied it.

January 31, 2020 at 4:38 pm

That’s an extremely broad question, which I can’t answer in a blog comment. However, in this post, I do mention the key tools of inferential statistics. So, look through this blog post for your answers!

December 18, 2019 at 7:38 am

Thank you.This is very helpful me to understand the difference of these two..

December 11, 2019 at 6:35 pm

thank you sir , i wanna ask you one question is there any relationship between the sampling techniques used to gather data and descriptive/inferential statistics ?

December 13, 2019 at 10:31 am

I hope you’ve read this blog post thoroughly. It should be clear from this post that for descriptive statistics you just pick the group(s) you’re interested in and measure all people/items in them. You don’t collect a sample from those groups but instead measure all members of the group(s). No sampling in descriptive statistics. For inferential statistics, you do take a sample of the larger population and that sample must be representative of that population.

December 6, 2019 at 3:13 am

Amazing explanation! Much better than the textbook given to me. Thanks!

August 14, 2019 at 6:33 pm

You simplify complex processes. Amazing

August 2, 2019 at 11:24 pm

Thanks, I now understand the difference between descriptive stat and inferential stat

July 25, 2019 at 8:18 am

It was very helpful. Thanks Sir

May 24, 2019 at 7:29 am

Thanks so much for the clear explanations and your time. A quick question please: can I use inferential statistics to test the hypothesis that “there is no significant relationship between congestion and the ambient air condition”? Also, if yes, what method is most appropriate: regression analysis, t-test, or ANOVA? Thanks

May 28, 2019 at 9:47 am

Hi, inferential statistics are a collection of procedures that allow you to use random samples drawn from a population to make conclusions about the entire population. Assuming you can define a population for your study area of ambient air condition and draw a random sample from it, you can probably use inferential statistics. The correct analysis to use depends on the goals of your analysis and the type of variables that you use. Because I don’t know that information, I couldn’t tell you whether t-tests or ANOVA are the correct procedure.

March 6, 2019 at 7:31 am

thank you Sir, very easy to understand.

February 11, 2019 at 12:55 pm

Fantastic description, makes the concepts really clear for someone who wants a revision of these topics. Much appreciated, Sir!

February 1, 2019 at 7:51 am

One of the best article i have witnessed on that topic, thanks sir 🙂

February 1, 2019 at 9:53 am

Thank you, Raja! I really appreciate that!

January 15, 2019 at 10:23 pm

I like how you simplify the words just to make the topic much clear.

January 15, 2019 at 11:47 pm

Thanks, Lance!

December 26, 2018 at 5:54 am

Looking fwd fr ur book it will b a great help

December 29, 2018 at 6:50 pm

Thank you, Anum!

October 20, 2018 at 11:57 am

Pretty cleared about this concept now 🙂, you are doing a great job 👍

October 21, 2018 at 1:04 am

Thank you, Zed. I really appreciate the nice comment!

October 18, 2018 at 4:25 am

Thanks a lots for your clear and conscious note posts. I understood the better know-how on the area of descriptive and inferential statistics.

October 18, 2018 at 2:02 pm

You’re very welcome, Amanuel. I appreciated your nice comment!

October 17, 2018 at 8:18 am

A great help for us who are studying statistics. Thank you for making it easier for us to understand this subject. God bless.

October 17, 2018 at 10:44 am

Thank you, Maria!

October 5, 2018 at 2:41 am

Nice article, I had a question….

I have a dataset which is skewed to the right and when I perform “Descriptive Statistics” it provides MEAN as one of the parameter (Mean = sum/Number of data points), but when I fit the same data to a distribution and I found “Weibull” to be a best fit and calculate “Mean” [Mean of Weibull = Scale *Gamma(1+1/Beta)], now the “Descriptive Statistics” Mean and Weibull Mean have same value, how is this possible when the formulas of calculating Mean are different for each approach?

October 5, 2018 at 9:25 am

Just a guess but either beta equals 1, or the descriptive statistics procedure simply uses the general calculation of the mean rather than the Weibull specific calculation.

September 23, 2018 at 2:15 pm

This was so unbelievably helpful! Thanks for making this so easy to understand!

September 24, 2018 at 10:33 am

You’re very welcome, Rosa! I’m glad it was helpful!

September 5, 2018 at 3:15 pm

hy Jim you are inspirational worldwide by helping us thank you so much im now a distinction student in statistics all because of you,you are a blessing to us

September 6, 2018 at 12:46 am

Hi Motlatsi,

Thank you so much for your very kind comment! I really appreciate it. I put a lot of work into my website because I want to make statistics easier to learn for all.

That all said, I’m sure you put in a lot of hard work learning statistics! Congratulations on being such a great student!

I wish you the very best!

July 24, 2018 at 9:53 am

Thank you for the insight! I wish someone told me this earlier. To follow up with another similar question, most example problems also state “assume alpha = 0.05.” Someone told me that in practice, we use alpha from similar research topics found in industry that pertains to your own. Would you agree with that statement?

July 24, 2018 at 2:32 pm

Hey again Nick,

As for significance levels, in the field, the most commonly used alpha by far is 0.05. I almost never see a different value. The most I see is that analysts will adjust the significance level when they’re making many comparisons, such as between the factor levels in an ANOVA.

I do agree with the practice of seeing what others in your industry have used and their rationale. For example, if a Type I error is particularly costly, dangerous, or bad in whatever way, you might change the significance level to 0.01. If a Type II error is particularly bad, you might change alpha to 0.10. Although, I’m always leery of increasing alpha from 0.05 to say 0.10. Simulation studies show that p-values near 0.05 actually reflect very weak evidence of an effect, so decreasing the strength of evidence you require (e.g., by increasing alpha from 0.05 to 0.10) doesn’t seem like a good idea. I cover this a bit at the end of my post about interpreting p-values. But, I can often imagine a need to lower alpha to something like 0.01.

So, I do agree with the principle, but I often don’t see it in practice. Although, I think 0.05 is often a good value to use, so that’s probably part of why it is so ubiquitous. It’s probably a good value to use unless you can identify a specific and important reason to use a different value. And, that information is what you might gain by looking at similar research topics in your industry.

July 23, 2018 at 2:34 pm

Thank You Sir….! It’s really really nice, i have been found very simplistic way to understand the things which you have taken care of very well sir. thank you once again sir

July 22, 2018 at 11:39 pm

i’d been reading several readings but still confused… Thank you so much for the informations you shared… And now everything is clear…

July 21, 2018 at 3:45 pm

Thank you so much for such great content! I use your posts frequently to grasp all the material currently studied in school. I do have one question I can not wrap my head around. Was hoping you could help explain.

I would certainly agree that we can gain value by analyzing random samples because it is sometimes impossible to measure the entire population. With that being said, let us for a moment consider methods described in textbooks: estimating population mean using the Z statistic (when pop. st. dev. is known) or t statistic (when pop. st. dev. is unknown). If we can not measure the entire population and are unable to get a population standard deviation or a population mean as result, how can we use these methods or construct a confidence interval if we actually know nothing about the population? Most problems in textbooks state (assume population mean is xxx or std. dev. is yyy). To me, this does not sound practical… How is this process done in industry?

July 21, 2018 at 10:27 pm

I’m glad that my posts help you out!

You’re entirely correct about when to use t-values versus Z scores. Because you almost never know the population standard deviation, you never really use Z-tests in practice. After all, if you knew the population standard deviation, wouldn’t you probably also know the population mean? I don’t know why some statistics classes and textbooks use that test and assume you know the population standard deviation. I suppose it’s a little simpler case than using the t-distribution which changes depending on your degrees of freedom.

If you need to test hypotheses or find confidence intervals about a population mean and you’re using a sample, you’ll almost always use t-tests and t-values .

July 10, 2018 at 5:38 am

Happened to discover your website recently and have been going through it. Very helpful!

July 10, 2018 at 2:30 pm

Thanks, Karma! I’m glad you have found it to be helpful!

March 29, 2018 at 10:58 am

You’re welcome Jim! Your blog pulls me into statistics every time I read one of your posts. Statistics is nice and beautiful. I am a geographer and I like modelling. I would like to see some of your posts talk about spatial statistics, if you know something that might be useful.

I will be here every time with you teacher.

Thank you again, Jim.

March 29, 2018 at 12:17 am

I am getting addicted to your blog, Jim Frost. I think this is what should be taught at the first statistic class, before going to any math and formulas. I am safe here, at least I know who can help me solving my doubts.

Thank you Jim, God bless you always.

March 29, 2018 at 12:22 am

Hi Patrik, you have no idea how much your kind comments mean to me! Thank you!

March 24, 2018 at 7:48 am

Your blog explains statistics in a very student-friendly manner. Importantly, your explanations to various terminologies is nicely illustrated. Could you write more on multi-variate statistical analysis? Thanks.

March 24, 2018 at 5:52 pm

Hi, thanks so much! I strive to make statistics as easy to understand as possible. Your nice comments mean a lot to me!

I’ll try to write more about multivariate analyses in the future.

March 23, 2018 at 2:10 pm

Please help me on this assignment this is the following questions 1. Define the descriptive statistic and inferential statistics 2. The difference between descriptive statistics and inferential statistics

March 23, 2018 at 2:22 pm

Hi Ndamona, the information you need to answer your questions is in this blog post. You’re in the right place!

March 12, 2018 at 9:53 pm

This was a good introduction and an important help to me. I wish you had gone into a little more detail about standard deviation. I also wish there were a link to print this page. It is the kind I could go back to from time to time to refresh what I have learned. I am John and I am a PhD student in education. Thanks for this help.

March 12, 2018 at 10:42 pm

Hi John, I’m happy to hear that you found this helpful. I’m also adding new content all the time. As for the standard deviation, I write about it in a different post about Measures of Variability . You might find that helpful.

February 21, 2018 at 3:48 pm

Still waiting for your reply

February 21, 2018 at 4:07 pm

Hi Carlo, that’s a very broad question–I could write an entire book about that topic. Is there something more specific you want to know?

February 19, 2018 at 3:17 pm

Just discovered this website today very helpful. Thank you Jim..

February 21, 2018 at 3:08 pm

Hi Evelyn, thank you for your kind words! I’m glad you found it to be helpful!

February 5, 2018 at 10:02 pm

Very good one. Explains the basics well. Thanks

February 5, 2018 at 7:43 am

thank u so much continuously i need such brief explanation about statistics therefore i need another material specially about Bayesian distribution b/c i.m post graduate class a thesis on maternal mortality approach of bayesian model

February 5, 2018 at 6:59 am

I am a data scientist,i enjoy while going through your articles.thank you jim.

February 5, 2018 at 10:26 am

Hi Rama, I’m glad that you find my posts to be helpful!

February 4, 2018 at 9:27 pm

Hello sir, I want to know: what is the need for interval estimation when we already have point estimation?

February 4, 2018 at 10:55 pm

Hi Aayush, that is a great question! I talk about this in the Example of Inferential Statistics section. It is possible to calculate the point estimate for the population. However, it’s virtually guaranteed that this estimate is wrong by some amount. So, the question becomes, how far off is the point estimate likely to be?

Confidence intervals answer this question. The narrower the intervals, the more precise the estimate. With narrow intervals, you can be reasonably sure that the point estimate isn’t too far wrong. However, if the CI is wide, you know that you shouldn’t expect the point estimate to be too near the true value. In that case, don’t place too much confidence in the point estimate! Interval estimation provides additional information about the precision of the point estimate.

I hope this helps clarify things!

February 4, 2018 at 2:28 pm

I have seen definitions of sample standard deviation in social science textbooks using an n denominator for descriptive statistics and an n-1 for inferential statistics. I have never seen a math book using the n denominator for descriptive. Any comment on why the social science world goes off on a different direction here?

February 5, 2018 at 10:36 pm

Hi Jerry, I don’t know why social science takes that route. I can tell you that in statistics the correct formula to use for standard deviation depends on whether the data are the entire group or population or a sample from a larger population.

When the data are the entire group (descriptive statistics), the denominator is n. However, if you are using a sample to estimate the value of a population (inferential), you use n-1. This is because the sample mean is itself estimated from the data, which uses up one degree of freedom. Dividing by n-1 corrects the tendency of the n denominator to underestimate the population variability.
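The two denominators are easy to compare side by side. Here is a minimal sketch using Python's standard `statistics` module, where `pstdev` divides by n and `stdev` divides by n-1. The five values are made up purely for illustration.

```python
import statistics

# Hypothetical data, used only for illustration.
values = [4, 8, 6, 5, 7]

# Treat the values as the entire group: the denominator is n.
population_sd = statistics.pstdev(values)

# Treat the values as a sample from a larger population: the denominator is n - 1.
sample_sd = statistics.stdev(values)

print(f"n denominator:     {population_sd:.3f}")
print(f"n - 1 denominator: {sample_sd:.3f}")
```

The n-1 version is always a little larger, and the gap shrinks as the sample size grows.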

February 4, 2018 at 8:16 am

Thank you Jim for making things simpler and better. I am Ann, PhD Scholar from India

February 4, 2018 at 10:56 pm

Hi Ann, you’re very welcome! I’m so glad that you find my posts to be helpful! I love India! I’ve been there several times!

February 4, 2018 at 7:26 am

Very useful presentation of the topic. What about their use in big data analysis?

February 4, 2018 at 3:23 am

Many thanks for this post. You’re a godsend. Have you authored any books?

February 5, 2018 at 1:24 am

Hi Sol, You’re very welcome! 🙂 And, that’s a timely question. I’m working on my first book at the moment!

Statology

Descriptive vs. Inferential Statistics: What’s the Difference?

There are two main branches in the field of statistics:

  • Descriptive Statistics
  • Inferential Statistics

This tutorial explains the difference between the two branches and why each one is useful in certain situations.

Descriptive  Statistics

In a nutshell,  descriptive statistics  aims to  describe  a chunk of raw data using summary statistics, graphs, and tables.

Descriptive statistics are useful because they allow you to understand a group of data much more quickly and easily compared to just staring at rows and rows of raw data values.

For example, suppose we have a set of raw data that shows the test scores of 1,000 students at a particular school. We might be interested in the average test score along with the distribution of test scores.

Using descriptive statistics, we could find the average score and create a graph that helps us visualize the distribution of scores.

This allows us to understand the test scores of the students much more easily compared to just staring at the raw data.

Common Forms of Descriptive Statistics

There are three common forms of descriptive statistics:

1. Summary statistics.  These are statistics that  summarize  the data using a single number. There are two popular types of summary statistics:

  • Measures of central tendency : these numbers describe where the center of a dataset is located. Examples include the  mean   and the  median .
  • Measures of dispersion : these numbers describe how spread out the values are in the dataset. Examples include the  range ,  interquartile range ,  standard deviation , and  variance .

2. Graphs . Graphs help us visualize data. Common types of graphs used to visualize data include boxplots , histograms , stem-and-leaf plots , and scatterplots .

3. Tables . Tables can help us understand how data is distributed. One common type of table is a  frequency table , which tells us how many data values fall within certain ranges. 

Example of Using Descriptive Statistics

The following example illustrates how we might use descriptive statistics in the real world.

Suppose 1,000 students at a certain school all take the same test. We are interested in understanding the distribution of test scores, so we use the following descriptive statistics:

1. Summary Statistics

Mean: 82.13 . This tells us that the average test score among all 1,000 students is 82.13.

Median: 84.  This tells us that half of all students scored higher than 84 and half scored lower than 84.

Max: 100. Min: 45.  This tells us the maximum score that any student obtained was 100 and the minimum score was 45. The  range – which tells us the difference between the max and the min – is 55.

To visualize the distribution of test scores, we can create a histogram – a type of chart that uses rectangular bars to represent frequencies.

[Figure: histogram of the test scores]

Based on this histogram, we can see that the distribution of test scores is roughly bell-shaped. Most of the students scored between 70 and 90, while very few scored above 95 and fewer still scored below 50.

Another easy way to gain an understanding of the distribution of scores is to create a frequency table. For example, the following frequency table shows what percentage of students scored between various ranges:

[Table: percentage of students scoring within each range of test scores]

We can see that just 4% of the total students scored above a 95. We can also see that (12% + 9% + 4% = ) 25% of all students scored an 85 or higher.

A frequency table is particularly helpful if we want to know what percentage of the data values fall above or below a certain value. For example, suppose the school considers an “acceptable” test score to be any score above a 75.

By looking at the frequency table, we can easily see that (20% + 22% + 12% + 9% + 4% = ) 67% of the students received an acceptable test score.
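A frequency table like the one described above can be sketched in a few lines of Python. The scores below are hypothetical (the original 1,000 scores are not reproduced here), and the helper name `frequency_table` and the 10-point bin width are assumptions made for illustration.

```python
from collections import Counter

# Hypothetical test scores, used only for illustration.
scores = [45, 58, 62, 67, 71, 74, 76, 78, 81, 83, 84, 86, 88, 91, 93, 97]

def frequency_table(values, bin_width=10):
    """Return {(low, high): percentage of values} for equal-width bins."""
    counts = Counter((v // bin_width) * bin_width for v in values)
    total = len(values)
    return {
        (low, low + bin_width - 1): 100 * counts[low] / total
        for low in sorted(counts)
    }

table = frequency_table(scores)
for (low, high), pct in table.items():
    print(f"{low}-{high}: {pct:.1f}%")

# Share of scores above a cutoff, such as the "acceptable" threshold of 75.
share_above_75 = 100 * sum(s > 75 for s in scores) / len(scores)
print(f"Above 75: {share_above_75:.1f}%")
```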

Inferential Statistics

In a nutshell,  inferential statistics  uses a small sample of data to draw  inferences  about the larger population that the sample came from.

For example, we might be interested in understanding the political preferences of millions of people in a country.

However, it would take too long and be too expensive to actually survey every individual in the country. Thus, we would instead take a smaller survey of, say, 1,000 Americans, and use the results of the survey to draw inferences about the population as a whole.

This is the whole premise behind inferential statistics – we want to answer some question about a population, so we obtain data for a small sample of that population and use the data from the sample to draw inferences about the population.

The Importance of a Representative Sample

In order to be confident in our ability to use a sample to draw inferences about a population, we need to make sure that we have a  representative sample   – that is, a sample in which the characteristics of the individuals in the sample closely match the characteristics of the overall population.

Ideally, we want our sample to be like a “mini version” of our population. So, if we want to draw inferences on a population of students composed of 50% girls and 50% boys, our sample would not be representative if it included 90% boys and only 10% girls.

If our sample is not similar to the overall population, then we cannot generalize the findings from the sample to the overall population with any confidence.

How to Obtain a Representative Sample

To maximize the chances that you obtain a representative sample, you need to focus on two things:

1. Make sure you use a random sampling method.

There are several different random sampling methods that you can use that are likely to produce a representative sample, including:

  • A simple random sample
  • A systematic random sample
  • A cluster random sample
  • A stratified random sample

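Random sampling methods tend to produce representative samples because selection is left to chance rather than to the researcher's judgment: every member of the population has a known, nonzero chance of being included in the sample (an equal chance, in the case of a simple random sample).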

2. Make sure your sample size is large enough . 

Along with using an appropriate sampling method, it’s important to ensure that the sample is large enough so that you have enough data to generalize to the larger population.

To determine how large your sample should be, you have to consider the population size you’re studying, the confidence level you’d like to use, and the margin of error you consider to be acceptable.

Fortunately, you can use online calculators to plug in these values and see how large your sample needs to be.
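As a rough idea of what such a calculator does, here is a minimal sketch of one common approach: Cochran's formula for estimating a proportion, with a finite population correction. The function name `required_sample_size` and the default `proportion=0.5` (the most conservative choice) are assumptions for illustration, and a given online calculator may use a different variant.

```python
import math
from statistics import NormalDist

def required_sample_size(population_size, confidence=0.95,
                         margin_of_error=0.05, proportion=0.5):
    """Sample size for estimating a proportion (Cochran's formula)."""
    # z-score matching the two-sided confidence level, e.g. about 1.96 for 95%.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Sample size for an effectively infinite population.
    n0 = z ** 2 * proportion * (1 - proportion) / margin_of_error ** 2
    # Finite population correction shrinks n0 for smaller populations.
    n = n0 / (1 + (n0 - 1) / population_size)
    return math.ceil(n)

n = required_sample_size(10_000)
print(n)
```

Notice that the population size matters very little once the population is large: the required sample levels off rather than growing in proportion to the population.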

Common Forms of Inferential Statistics

There are three common forms of inferential statistics:

1. Hypothesis Tests.

Often we’re interested in answering questions about a population such as:

  • Is the percentage of people in Ohio in support of candidate A higher than 50%?
  • Is the mean height of a certain plant equal to 14 inches?
  • Is there a difference between the mean height of students at School A compared to School B?

To answer these questions we can perform a hypothesis test , which allows us to use data from a sample to draw conclusions about populations.

2. Confidence Intervals . 

Sometimes we’re interested in estimating some value for a population. For example, we might be interested in the mean height of a certain plant species in Australia.

Instead of going around and measuring every single plant in the country, we might collect a small sample of plants and measure each one. Then, we can use the mean height of the plants in the sample to estimate the mean height for the population.

However, our sample is unlikely to provide a perfect estimate for the population. Fortunately, we can account for this uncertainty by creating a confidence interval , which provides a range of values that we’re confident the true population parameter falls in.

For example, we might produce a 95% confidence interval of [13.2, 14.8], which says we’re 95% confident that the true mean height of this plant species is between 13.2 inches and 14.8 inches.
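To make the calculation concrete, here is a minimal sketch of a 95% confidence interval for a mean. The ten plant heights are hypothetical, and the critical value 2.262 is the two-sided 95% value of the t-distribution with 9 degrees of freedom, taken from a standard t-table (Python's standard library has no t-distribution, so it is hard-coded here).

```python
import math
import statistics

# Hypothetical plant heights in inches (n = 10), for illustration only.
heights = [13.1, 14.6, 13.8, 15.2, 14.0, 12.9, 14.4, 13.5, 14.9, 13.6]

# Two-sided 95% critical value of the t-distribution with n - 1 = 9
# degrees of freedom, looked up in a standard t-table.
T_CRIT_95_DF9 = 2.262

n = len(heights)
mean = statistics.mean(heights)
# Standard error of the mean, using the sample (n - 1) standard deviation.
std_err = statistics.stdev(heights) / math.sqrt(n)
margin = T_CRIT_95_DF9 * std_err
ci_low, ci_high = mean - margin, mean + margin

print(f"Sample mean: {mean:.2f}")
print(f"95% CI: [{ci_low:.2f}, {ci_high:.2f}]")
```

A different sample size would need a different critical value, since the t-distribution changes with the degrees of freedom.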

3. Regression .

Sometimes we’re interested in understanding the relationship between two variables in a population.

For example, suppose we want to know if  hours spent studying per week  is related to  test scores . To answer this question, we could perform a technique known as  regression analysis .

So, we may observe the number of hours studied along with the test scores for 100 students and perform a regression analysis to see if there is a significant relationship between the two variables.

If the p-value for the regression coefficient is below our chosen significance level (commonly 0.05), then we can conclude that there is a statistically significant relationship between these two variables in the overall population of students.
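The idea can be sketched with the standard library alone. The data below are hypothetical (10 students rather than 100), and the helper names are assumptions for illustration. Note that this sketch swaps in a permutation test for the usual t-test on the slope: it shuffles the test scores many times and counts how often a slope at least as steep as the observed one appears by chance.

```python
import random
import statistics

# Hypothetical data: weekly study hours and test scores for 10 students.
hours = [2, 4, 5, 7, 8, 10, 11, 13, 14, 16]
scores = [58, 62, 60, 70, 68, 75, 79, 80, 86, 90]

def fit_line(x, y):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    mean_x, mean_y = statistics.mean(x), statistics.mean(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def permutation_p_value(x, y, n_permutations=5000, seed=0):
    """Two-sided permutation p-value for the slope (not the usual t-test)."""
    rng = random.Random(seed)
    observed = abs(fit_line(x, y)[0])
    shuffled = list(y)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)  # break any real link between x and y
        if abs(fit_line(x, shuffled)[0]) >= observed:
            extreme += 1
    # Add one to both counts so the p-value is never exactly zero.
    return (extreme + 1) / (n_permutations + 1)

slope, intercept = fit_line(hours, scores)
p_value = permutation_p_value(hours, scores)
print(f"slope = {slope:.2f}, intercept = {intercept:.2f}, p = {p_value:.4f}")
```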

The Difference Between Descriptive and Inferential Statistics

In summary, the difference between descriptive and inferential statistics can be described as follows:

Descriptive statistics  use summary statistics, graphs, and tables to describe  a data set.

This is useful for helping us gain a quick and easy understanding of a data set without poring over all of the individual data values.

Inferential statistics  use samples to draw  inferences  about larger populations.

Depending on the question you want to answer about a population, you may decide to use one or more of the following methods: hypothesis tests, confidence intervals, and regression analysis.

If you do choose to use one of these methods, keep in mind that your sample needs to be representative of your population , or the conclusions you draw will be unreliable.

Hey there. My name is Zach Bobbitt. I have a Masters of Science degree in Applied Statistics and I’ve worked on machine learning algorithms for professional businesses in both healthcare and retail. I’m passionate about statistics, machine learning, and data visualization and I created Statology to be a resource for both students and teachers alike.  My goal with this site is to help you learn statistics through using simple terms, plenty of real-world examples, and helpful illustrations.

3 Replies to “Descriptive vs. Inferential Statistics: What’s the Difference?”

Wow! Awesome! So easily explained! I finally understood and know now how to create and answer my questions! Thank you!

I just came across this site and all I can say is “I love you Sir”

This site is the real treasure I was lucky to find. Thanks a million, Zach Bobbitt!

Descriptive and Inferential Statistics – Deep Dive into Descriptive and Inferential Statistics

  • September 14, 2023

In statistics understanding the difference between descriptive and inferential statistics is crucial for anyone looking to make sense of data, whether it’s for academic research, business decision-making, or just general curiosity. Let’s dive into these core concepts.

In this Blog post we will learn:

  • What is Descriptive Statistics?
  • What is Inferential Statistics?
  • Difference Between Descriptive and Inferential Statistics: A Quick Glance
  • Types of Descriptive Statistics with Examples
  • Types of Inferential Statistics with Examples

1. What is Descriptive Statistics?

Descriptive statistics offer a way to capture the main features of a dataset in a summarized and comprehensible manner. They don’t make predictions or inferences but instead provide a concise overview of what the data show.

For instance, imagine you’ve conducted a survey in your neighborhood asking how many books people read in a year. Descriptive statistics would provide you with insights like the average number of books read, the range between the highest and lowest figures, or the most common number reported.

2. What is Inferential Statistics?

Inferential statistics , on the other hand, goes a step beyond. Instead of just summarizing or describing data, inferential statistics aims to use the data to make predictions, inferences, or decisions about a broader context than just the sampled data.

Going back to our book-reading survey, inferential statistics might let us predict the average number of books a person in a larger area (say, the entire city) might read in a year, based on the data collected in your neighborhood.

3. Difference Between Descriptive and Inferential Statistics: A Quick Glance

| Descriptive Statistics | Inferential Statistics |
|---|---|
| Summarize and describe data | Make predictions or inferences |
| Specific dataset under study | Sample data to infer about a larger population |
| Qualitative and quantitative | Mostly quantitative |
| Mean, median, mode, standard deviation | Hypothesis testing, regression analysis, ANOVA |
| What is happening in my data? | What could be happening beyond my data? |
| Limited to the dataset in question | Applies to a larger population or different scenarios |

4. Types of Descriptive Statistics with Examples

Measures of Central Tendency : These provide insights into the central point of a dataset.

  • Mean: The arithmetic average. Example: For a dataset of ages (23, 25, 26, 29, 30), the mean age is $ \frac{23+25+26+29+30}{5} = 26.6 $ years.
  • Median: The central value in a sorted dataset. Example: For the ages above, the median age is 26 years.
  • Mode: The most frequent value(s). Example: If the dataset is (23, 25, 25, 26, 29, 30), 25 is the mode.

Measures of Spread : These describe the distribution and dispersion of values in a dataset.

  • Range: The difference between the largest and smallest values. Example: For our ages dataset, the range is 7 years (from 23 to 30).
  • Variance and standard deviation: Example: For our ages dataset, the variance is calculated from the squared differences between each value and the mean. The standard deviation is the square root of this variance.
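These measures can be reproduced with Python's standard `statistics` module. The sketch below uses the ages from the examples above and treats them as the whole group of interest, so the variance divides by n. That choice is an assumption: use `statistics.variance` and `statistics.stdev` instead if the ages are a sample from a larger population.

```python
import statistics

ages = [23, 25, 26, 29, 30]

mean_age = statistics.mean(ages)                       # 26.6
median_age = statistics.median(ages)                   # 26
mode_age = statistics.mode([23, 25, 25, 26, 29, 30])   # 25
age_range = max(ages) - min(ages)                      # 7

# Variance: average squared difference from the mean (n denominator).
variance = statistics.pvariance(ages)
# Standard deviation: square root of the variance.
std_dev = statistics.pstdev(ages)

print(mean_age, median_age, mode_age, age_range)
print(f"variance = {variance:.2f}, standard deviation = {std_dev:.2f}")
```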

Frequency Distributions : These are often represented graphically, such as with histograms, to show how frequently each value appears in the dataset.

  • Example: A histogram might show how many people in our neighborhood read 0-5 books, 6-10 books, 11-15 books, and so on.

5. Types of Inferential Statistics with Examples

1. Hypothesis Testing : A systematic way to test claims or ideas about a group or population.

  • Example : Imagine a company claims that its weight loss pill helps people lose an average of 10 lbs in a month. To test this, a sample of individuals is selected and given the pill. If the sample shows an average weight loss significantly different from 10 lbs, the claim can be challenged.

2. Confidence Intervals : A confidence interval gives a range of values used to estimate the true population parameter. The width of the interval indicates the uncertainty around a sample estimate.

  • Example : Based on a sample, a researcher might estimate that 40% of a city’s residents favor a new park, with a margin of error of 5 percentage points at a 95% confidence level. This means the researcher is 95% confident that between 35% and 45% of all residents favor the new park.

3. p-value : A p-value is used in hypothesis testing to assess the statistical significance of a study's results. It measures the strength of the evidence against a null hypothesis.

  • Example : If testing the effectiveness of a drug, a p-value of 0.03 means that, if the drug truly had no effect, results at least as extreme as those observed would occur only about 3% of the time (p-values less than 0.05 are often considered “statistically significant”). It is not the probability that the results were due to chance.
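To see where a p-value comes from, consider a simplified, hypothetical version of the drug example: 15 of 20 patients improve, and under the null hypothesis each patient has a 50% chance of improving regardless of the drug. The sketch below computes the one-sided p-value exactly from the binomial distribution. The numbers and the function name are assumptions for illustration.

```python
from math import comb

def binomial_p_value_upper(successes, trials, null_prob=0.5):
    """P(X >= successes) when X ~ Binomial(trials, null_prob)."""
    return sum(
        comb(trials, k) * null_prob ** k * (1 - null_prob) ** (trials - k)
        for k in range(successes, trials + 1)
    )

# Probability of seeing 15 or more improvements out of 20 if the drug
# has no effect (i.e., if the null hypothesis is true).
p_value = binomial_p_value_upper(15, 20)
print(f"p = {p_value:.4f}")
```

The result is roughly 0.02, meaning an outcome this extreme would be unusual if the drug did nothing.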

4. Chi-Square Tests : Used to test relationships between categorical variables.

  • Example : Researchers might want to test if there’s a relationship between gender and the likelihood to vote for a particular candidate. The Chi-Square test can help determine if observed voting patterns are due to chance or a genuine relationship between the variables.
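Here is a minimal sketch of the calculation for a 2x2 table, using made-up counts. It compares each observed count with the count expected if the two variables were independent. For a 2x2 table there is 1 degree of freedom, and in that special case the p-value can be computed with `math.erfc`. No continuity correction is applied, and the function name is an assumption for illustration.

```python
import math

def chi_square_independence_2x2(table):
    """Chi-square statistic and p-value for a 2x2 table (1 degree of freedom)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if the variables were independent.
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed - expected) ** 2 / expected
    # With 1 degree of freedom, P(chi-square > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Hypothetical counts: rows are gender, columns are candidate A vs. B.
votes = [[30, 20],
         [20, 30]]
chi2, p_value = chi_square_independence_2x2(votes)
print(f"chi-square = {chi2:.2f}, p = {p_value:.4f}")
```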

5. ANOVA (Analysis of Variance) : Compares the means of three or more groups to understand if they’re statistically different from each other.

  • Example : A psychologist might want to test three different techniques to reduce anxiety. By applying ANOVA, the psychologist can determine whether at least one technique produces a different mean anxiety level than the others. Identifying which technique is superior requires a follow-up (post hoc) comparison.
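The core of one-way ANOVA is the F statistic, which compares the variation between group means with the variation within groups. Below is a minimal sketch with hypothetical anxiety scores. It stops at the F statistic because the standard library has no F-distribution, so a p-value would come from an F-table or from a library such as SciPy (`scipy.stats.f_oneway`).

```python
import statistics

def one_way_anova_f(groups):
    """F statistic for a one-way ANOVA across two or more groups."""
    all_values = [v for group in groups for v in group]
    grand_mean = statistics.mean(all_values)
    k, n = len(groups), len(all_values)
    # Variation of the group means around the grand mean.
    ss_between = sum(
        len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups
    )
    # Variation of individual values around their own group mean.
    ss_within = sum(
        (v - statistics.mean(g)) ** 2 for g in groups for v in g
    )
    return (ss_between / (k - 1)) / (ss_within / (n - k))

# Hypothetical anxiety scores after each of three techniques.
technique_a = [1, 2, 3]
technique_b = [2, 3, 4]
technique_c = [3, 4, 5]
f_stat = one_way_anova_f([technique_a, technique_b, technique_c])
print(f"F = {f_stat:.2f}")
```

A larger F means the group means differ more than the within-group noise would suggest.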

6. Regression Analysis : Examines the relationship between two or more variables. It allows for predictions based on this relationship.

  • Example : An economist might explore the relationship between a country’s GDP and its unemployment rate. If a strong relationship is found, the economist can make predictions about unemployment based on future GDP estimates.

7. T-tests : Compares the means of two groups to understand if they’re statistically different from each other.

  • Example : A researcher might want to test if a new teaching method is better than the traditional method. By using a t-test, the researcher can determine if there’s a significant difference in performance between students taught with the new method versus those taught with the traditional method.

8. Z-tests : Used to compare a sample mean to a population mean when the population variance is known and the sample is large.

  • Example : A large factory might claim that its assembly line produces 500 units per hour on average. An inspector could use a Z-test to see whether the average hourly rate observed during an inspection differs significantly from the claim.
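A one-sample Z-test for the factory example might look like the sketch below. The claimed mean of 500 comes from the example. The observed mean of 492, the known population standard deviation of 20, and the 36 inspected hours are hypothetical values chosen for illustration.

```python
import math
from statistics import NormalDist

def one_sample_z_test(sample_mean, claimed_mean, population_sd, n):
    """Two-sided Z-test of a sample mean against a claimed population mean."""
    std_err = population_sd / math.sqrt(n)
    z = (sample_mean - claimed_mean) / std_err
    # Probability of a result at least this far from the claim, in either direction.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

z, p_value = one_sample_z_test(
    sample_mean=492, claimed_mean=500, population_sd=20, n=36
)
print(f"z = {z:.2f}, p = {p_value:.4f}")
```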

6. Conclusion

While both descriptive and inferential statistics have their unique places in data analysis, understanding when and how to use them is crucial. Descriptive statistics give you the tools to succinctly summarize and describe data, whereas inferential statistics empowers you to draw conclusions and predictions about larger contexts or populations. Both are indispensable tools in the world of data-driven decision-making.

Inferential Statistics | An Easy Introduction & Examples

Published on September 4, 2020 by Pritha Bhandari . Revised on June 22, 2023.

While descriptive statistics summarize the characteristics of a data set, inferential statistics help you come to conclusions and make predictions based on your data.

When you have collected data from a sample , you can use inferential statistics to understand the larger population from which the sample is taken.

Inferential statistics have two main uses:

  • making estimates about populations (for example, the mean SAT score of all 11th graders in the US).
  • testing hypotheses to draw conclusions about populations (for example, the relationship between SAT scores and family income).

Table of contents

  • Descriptive versus inferential statistics
  • Estimating population parameters from sample statistics
  • Hypothesis testing
  • Other interesting articles
  • Frequently asked questions about inferential statistics

Descriptive statistics allow you to describe a data set, while inferential statistics allow you to make inferences based on a data set.

Descriptive statistics

Using descriptive statistics, you can report characteristics of your data:

  • The distribution concerns the frequency of each value.
  • The central tendency concerns the averages of the values.
  • The variability concerns how spread out the values are.

In descriptive statistics, there is no uncertainty – the statistics precisely describe the data that you collected. If you collect data from an entire population, you can directly compare these descriptive statistics to those from other populations.

Inferential statistics

Most of the time, you can only acquire data from samples, because it is too difficult or expensive to collect data from the whole population that you’re interested in.

While descriptive statistics can only summarize a sample’s characteristics, inferential statistics use your sample to make reasonable guesses about the larger population.

With inferential statistics, it’s important to use random and unbiased sampling methods . If your sample isn’t representative of your population, then you can’t make valid statistical inferences or generalize .

Sampling error in inferential statistics

Since the size of a sample is always smaller than the size of the population, some of the population isn’t captured by sample data. This creates sampling error , which is the difference between the true population values (called parameters) and the measured sample values (called statistics).

Sampling error arises any time you use a sample, even if your sample is random and unbiased. For this reason, there is always some uncertainty in inferential statistics. However, using probability sampling methods reduces this uncertainty.

The characteristics of samples and populations are described by numbers called statistics and parameters :

  • A statistic is a measure that describes the sample (e.g., sample mean ).
  • A parameter is a measure that describes the whole population (e.g., population mean).

Sampling error is the difference between a parameter and a corresponding statistic. Since in most cases you don’t know the real population parameter, you can use inferential statistics to estimate these parameters in a way that takes sampling error into account.

There are two important types of estimates you can make about the population: point estimates and interval estimates .

  • A point estimate is a single value estimate of a parameter. For instance, a sample mean is a point estimate of a population mean.
  • An interval estimate gives you a range of values where the parameter is expected to lie. A confidence interval is the most common type of interval estimate.

Both types of estimates are important for gathering a clear idea of where a parameter is likely to lie.

Confidence intervals

A confidence interval uses the variability around a statistic to come up with an interval estimate for a parameter. Confidence intervals are useful for estimating parameters because they take sampling error into account.

While a point estimate gives you a precise value for the parameter you are interested in, a confidence interval tells you the uncertainty of the point estimate. They are best used in combination with each other.

Each confidence interval is associated with a confidence level. The confidence level tells you the percentage of intervals, each calculated in the same way from a new sample, that would contain the true population parameter.

A 95% confidence level means that if you repeat your study with a new sample in exactly the same way 100 times and calculate an interval each time, you can expect about 95 of those intervals to contain the true parameter.

You cannot say for sure that any single interval contains the actual population parameter. That’s because you can’t know the true value of the population parameter without collecting data from the full population.

However, with random sampling and a suitable sample size, you can reasonably expect your confidence interval to contain the parameter a certain percentage of the time.
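A small simulation can make this repeated-sampling idea tangible. The sketch below draws many random samples from a population whose mean is known (which is only possible in a simulation), builds a 95% interval from each sample, and counts how often the interval contains the true mean. To keep it short it assumes the population standard deviation is known, and all of the numbers are hypothetical.

```python
import math
import random
from statistics import NormalDist

def ci_coverage(true_mean=50.0, sd=10.0, n=30, repeats=2000,
                confidence=0.95, seed=1):
    """Fraction of simulated confidence intervals that contain the true mean."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Half-width of the interval when the population sd is known.
    half_width = z * sd / math.sqrt(n)
    hits = 0
    for _ in range(repeats):
        sample = [rng.gauss(true_mean, sd) for _ in range(n)]
        sample_mean = sum(sample) / n
        if sample_mean - half_width <= true_mean <= sample_mean + half_width:
            hits += 1
    return hits / repeats

coverage = ci_coverage()
print(f"Intervals containing the true mean: {coverage:.1%}")
```

The share should land close to 95%, although it varies a little from one run of the simulation to another.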

Your point estimate of the population mean paid vacation days is the sample mean of 19 paid vacation days.

Hypothesis testing is a formal process of statistical analysis using inferential statistics. The goal of hypothesis testing is to compare populations or assess relationships between variables using samples.

Hypotheses , or predictions, are tested using statistical tests . Statistical tests also estimate sampling errors so that valid inferences can be made.

Statistical tests can be parametric or non-parametric. Parametric tests are considered more statistically powerful because they are more likely to detect an effect if one exists.

Parametric tests make assumptions that include the following:

  • the population that the sample comes from follows a normal distribution of scores
  • the sample size is large enough to represent the population
  • the variances (a measure of variability) of the groups being compared are similar

When your data violates any of these assumptions, non-parametric tests are more suitable. Non-parametric tests are called “distribution-free tests” because they don’t assume anything about the distribution of the population data.

Statistical tests come in three forms: tests of comparison, correlation or regression.

Comparison tests

Comparison tests assess whether there are differences in means, medians or rankings of scores of two or more groups.

To decide which test suits your aim, consider whether your data meets the conditions necessary for parametric tests, the number of samples, and the levels of measurement of your variables.

Means can only be found for interval or ratio data, while medians and rankings are more appropriate measures for ordinal data.

Comparison test                     Parametric?   What's being compared?   Samples
t test                              Yes           Means                    2 samples
ANOVA                               Yes           Means                    3+ samples
Mood’s median                       No            Medians                  2+ samples
Wilcoxon signed-rank                No            Distributions            2 samples
Wilcoxon rank-sum (Mann-Whitney U)  No            Sums of rankings         2 samples
Kruskal-Wallis                      No            Mean rankings            3+ samples
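To make "sums of rankings" concrete, here is a minimal sketch of how the rank sums and U statistics behind the Wilcoxon rank-sum (Mann-Whitney) test can be computed by hand. The two groups of scores are hypothetical, the helper names are illustrative, and the sketch stops at the test statistic; it does not compute a p value.

```python
def average_ranks(values):
    """Rank values from 1 to n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2 + 1  # average of the 1-based positions
        for position in range(start, end + 1):
            ranks[order[position]] = shared_rank
        start = end + 1
    return ranks


def rank_sum_statistics(group_a, group_b):
    """Return (rank sum of A, rank sum of B, U for A, U for B)."""
    ranks = average_ranks(list(group_a) + list(group_b))
    n_a, n_b = len(group_a), len(group_b)
    sum_a = sum(ranks[:n_a])
    sum_b = sum(ranks[n_a:])
    u_a = sum_a - n_a * (n_a + 1) / 2
    u_b = sum_b - n_b * (n_b + 1) / 2
    return sum_a, sum_b, u_a, u_b


# Hypothetical scores for two independent groups.
group_a = [12, 15, 11, 18]
group_b = [22, 25, 17, 30]

sum_a, sum_b, u_a, u_b = rank_sum_statistics(group_a, group_b)
print("rank sums:", sum_a, sum_b)
print("U statistics:", u_a, u_b)
```

A very small U for one group means that group's values sit almost entirely below the other group's, which is the kind of pattern the test is designed to detect.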

Correlation tests

Correlation tests determine the extent to which two variables are associated.

Although Pearson’s r is the most statistically powerful test, Spearman’s r is appropriate for interval and ratio variables when the data doesn’t follow a normal distribution.

The chi square test of independence is the only test that can be used with nominal variables.

Correlation test                 Parametric?   Variables
Pearson’s r                      Yes           Interval/ratio variables
Spearman’s r                     No            Ordinal/interval/ratio variables
Chi square test of independence  No            Nominal/ordinal variables
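The difference between Pearson's r and Spearman's r can be seen in a short sketch. It assumes hypothetical paired data with no tied values, and it computes Spearman's coefficient as Pearson's r applied to the ranks, which is the usual definition when there are no ties.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    co_deviation = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sum((a - mean_x) ** 2 for a in x)
    spread_y = sum((b - mean_y) ** 2 for b in y)
    return co_deviation / math.sqrt(spread_x * spread_y)


def ranks(values):
    """1-based ranks; this simple version assumes there are no tied values."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]


def spearman_r(x, y):
    """Spearman correlation: Pearson's r computed on the ranks."""
    return pearson_r(ranks(x), ranks(y))


# Hypothetical data: y always increases with x, but not along a straight line.
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]

print("Pearson:", round(pearson_r(x, y), 3))
print("Spearman:", round(spearman_r(x, y), 3))
```

Because the relationship is perfectly monotonic but curved, the rank-based coefficient reaches 1 while Pearson's r falls slightly short of it.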

Regression tests

Regression tests assess whether changes in predictor variables are associated with changes in an outcome variable. You can decide which regression test to use based on the number and types of variables you have as predictors and outcomes.

Most of the commonly used regression tests are parametric. If your data is not normally distributed, you can perform data transformations.

Data transformations help you make your data normally distributed using mathematical operations, like taking the square root of each value.
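As a rough illustration of a square-root transformation, the sketch below measures skewness before and after transforming a small, hypothetical right-skewed data set. The skewness formula used here (third central moment divided by the 1.5 power of the second) is one common definition; the example shows the skew shrinking, not that the result is exactly normal.

```python
import math


def skewness(values):
    """Moment-based skewness: positive values indicate a long right tail."""
    n = len(values)
    mean = sum(values) / n
    second_moment = sum((v - mean) ** 2 for v in values) / n
    third_moment = sum((v - mean) ** 3 for v in values) / n
    return third_moment / second_moment ** 1.5


# Hypothetical right-skewed data: one value is far larger than the rest.
raw = [1, 4, 9, 16, 100]
transformed = [math.sqrt(v) for v in raw]

skew_before = skewness(raw)
skew_after = skewness(transformed)
print("skewness before:", round(skew_before, 3))
print("skewness after: ", round(skew_after, 3))
```

The transformation pulls the extreme value closer to the others, so the skewness drops, though a single transformation will not always be enough to satisfy a test's assumptions.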

Regression test              Predictor                       Outcome
Simple linear regression     1 interval/ratio variable       1 interval/ratio variable
Multiple linear regression   2+ interval/ratio variable(s)   1 interval/ratio variable
Logistic regression          1+ any variable(s)              1 binary variable
Nominal regression           1+ any variable(s)              1 nominal variable
Ordinal regression           1+ any variable(s)              1 ordinal variable
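For the simplest case in the table, one interval/ratio predictor and one interval/ratio outcome, a least-squares line can be fitted in a few lines. This is a minimal sketch with hypothetical data (hours studied and test scores); it estimates the slope and intercept only and does not perform the significance test itself.

```python
def fit_line(x, y):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: hours studied (predictor) and test score (outcome).
hours = [1, 2, 3, 4, 5]
scores = [52, 55, 61, 64, 68]

slope, intercept = fit_line(hours, scores)
predicted_for_6_hours = intercept + slope * 6
print("slope:", round(slope, 2))
print("intercept:", round(intercept, 2))
print("predicted score after 6 hours:", round(predicted_for_6_hours, 2))
```

The slope describes how the outcome tends to change per unit of the predictor in this sample; whether that pattern holds in the population is what the regression test then evaluates.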


Descriptive statistics summarize the characteristics of a data set. Inferential statistics allow you to test a hypothesis or assess whether your data is generalizable to the broader population.

A statistic refers to measures about the sample, while a parameter refers to measures about the population.

A sampling error is the difference between a population parameter and a sample statistic.

Hypothesis testing is a formal procedure for investigating our ideas about the world using statistics. It is used by scientists to test specific predictions, called hypotheses, by calculating how likely it is that a pattern or relationship between variables could have arisen by chance.

Cite this Scribbr article


Bhandari, P. (2023, June 22). Inferential Statistics | An Easy Introduction & Examples. Scribbr. Retrieved August 21, 2024, from https://www.scribbr.com/statistics/inferential-statistics/



Descriptive and Inferential Statistics

Course: MA121: Introduction to Statistics

Description

Read these sections and complete the questions at the end of each section. Here, we introduce descriptive statistics using examples and discuss the difference between descriptive and inferential statistics. We also talk about samples and populations, explain how you can identify biased samples, and define inferential statistics.

Table of contents

Populations and samples, simple random sampling, sample size matters, more complex sampling, random assignment, stratified sampling.

Descriptive Statistics

Learning objectives

  • Define "descriptive statistics"
  • Distinguish between descriptive statistics and inferential statistics

Descriptive statistics are numbers that are used to summarize and describe data. The word "data" refers to the information that has been collected from an experiment, a survey, a historical record, etc. (By the way, "data" is plural. One piece of information is called a "datum"). If we are analyzing birth certificates, for example, a descriptive statistic might be the percentage of certificates issued in New York State, or the average age of the mother. Any other number we choose to compute also counts as a descriptive statistic for the data from which the statistic is computed. Several descriptive statistics are often used at one time to give a full picture of the data.

Descriptive statistics are just descriptive. They do not involve generalizing beyond the data at hand. Generalizing from our data to another set of cases is the business of inferential statistics, which you'll be studying in another section. Here we focus on (mere) descriptive statistics.

Some descriptive statistics are shown in Table 1. The table shows the average salaries for various occupations in the United States in 1999. 

Table 1. Average salaries for various occupations in 1999.

$112,760 pediatricians
$106,130 dentists
$100,090 podiatrists
$ 76,140 physicists
$ 53,410 architects
$ 49,720 school, clinical, and counseling psychologists
$ 47,910 flight attendants
$ 39,560 elementary school teachers
$ 38,710 police officers
$ 18,980 floral designers

Descriptive statistics like these offer insight into American society. It is interesting to note, for example, that we pay the people who educate our children and who protect our citizens a great deal less than we pay people who take care of our feet or our teeth.
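The figures in Table 1 can themselves be summarized with a few descriptive statistics. The sketch below uses Python's standard `statistics` module on the ten values from the table; note that it treats each occupation's average as a single data point, so the result is an unweighted summary of the listed averages, not of individual workers' salaries.

```python
import statistics

# Average 1999 salaries from Table 1, in US dollars.
salaries = {
    "pediatricians": 112_760,
    "dentists": 106_130,
    "podiatrists": 100_090,
    "physicists": 76_140,
    "architects": 53_410,
    "school, clinical, and counseling psychologists": 49_720,
    "flight attendants": 47_910,
    "elementary school teachers": 39_560,
    "police officers": 38_710,
    "floral designers": 18_980,
}

values = list(salaries.values())

mean_salary = statistics.mean(values)      # center: arithmetic mean
median_salary = statistics.median(values)  # center: middle value
salary_range = max(values) - min(values)   # spread: highest minus lowest

print("mean of the listed averages:", mean_salary)
print("median of the listed averages:", median_salary)
print("range:", salary_range)
```

The mean sits well above the median here because the three highest-paid occupations pull it upward, which is one reason both measures are often reported together.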

For more descriptive statistics, consider Table 2 which shows the number of unmarried men per 100 unmarried women in U.S. Metro Areas in 1990. From this table we see that men outnumber women most in Jacksonville, NC, and women outnumber men most in Sarasota, FL. You can see that descriptive statistics can be useful if we are looking for an opposite-sex partner! (These data come from the Information Please Almanac).

Table 2. Number of unmarried men per 100 unmarried women in U.S. Metro Areas in 1990.

Cities with mostly men               Men per 100 Women   Cities with mostly women   Men per 100 Women
1. Jacksonville, NC                                      1. Sarasota, FL
2. Killeen-Temple, TX                                    2. Bradenton, FL
3. Fayetteville, NC                                      3. Altoona, PA
4. Brazoria, TX                                          4. Springfield, IL
5. Lawton, OK                                            5. Jacksonville, TN
6. State College, PA                                     6. Gadsden, AL
7. Clarksville-Hopkinsville, TN-KY                       7. Wheeling, WV
8. Anchorage, Alaska                                     8. Charleston, WV
9. Salinas-Seaside-Monterey, CA                          9. St. Joseph, MO
10. Bryan-College Station, TX                            10. Lynchburg, VA

NOTE: Unmarried includes never-married, widowed, and divorced persons, 15 years or older.

These descriptive statistics may make us ponder why the numbers are so disparate in these cities. One potential explanation, for instance, as to why there are more women in Florida than men may involve the fact that elderly individuals tend to move down to the Sarasota region and that women tend to outlive men. Thus, more women might live in Sarasota than men. However, in the absence of proper data, this is only speculation.

You probably know that descriptive statistics are central to the world of sports. Every sporting event produces numerous statistics such as the shooting percentage of players on a basketball team. For the Olympic marathon (a foot race of 26.2 miles), we possess data that cover more than a century of competition. (The first modern Olympics took place in 1896). The following table shows the winning times for both men and women (the latter have only been allowed to compete since 1984).

Women
Year Winner Country Time
1984 Joan Benoit USA 2:24:52
1988 Rosa Mota POR 2:25:40
1992 Valentina Yegorova UT 2:32:41
1996 Fatuma Roba ETH 2:26:05
2000 Naoko Takahashi JPN 2:23:14
2004 Mizuki Noguchi JPN 2:26:20
Men
Year Winner Country Time
1896 Spiridon Louis GRE 2:58:50
1900 Michel Theato FRA 2:59:45
1904 Thomas Hicks USA 3:28:53
1906 Billy Sherring CAN 2:51:23
1908 Johnny Hayes USA 2:55:18
1912 Kenneth McArthur S. Afr. 2:36:54
1920 Hannes Kolehmainen FIN 2:32:35
1924 Albin Stenroos FIN 2:41:22
1928 Boughra El Ouafi FRA 2:32:57
1932 Juan Carlos Zabala ARG 2:31:36
1936 Sohn Kee-Chung JPN 2:29:19
1948 Delfo Cabrera ARG 2:34:51
1952 Emil Zátopek CZE 2:23:03
1956 Alain Mimoun FRA 2:25:00
1960 Abebe Bikila ETH 2:15:16
1964 Abebe Bikila ETH 2:12:11
1968 Mamo Wolde ETH 2:20:26
1972 Frank Shorter USA 2:12:19
1976 Waldemar Cierpinski E.Ger 2:09:55
1980 Waldemar Cierpinski E.Ger 2:11:03
1984 Carlos Lopes POR 2:09:21
1988 Gelindo Bordin ITA 2:10:32
1992 Hwang Young-Cho S. Kor 2:13:23
1996 Josia Thugwane S. Afr. 2:12:36
2000 Gezahegne Abera ETH 2:10:10
2004 Stefano Baldini ITA 2:10:55

There are many descriptive statistics that we can compute from the data in the table. To gain insight into the improvement in speed over the years, let us divide the men's times into two pieces, namely, the first 13 races (up to 1952) and the second 13 (starting from 1956). The mean winning time for the first 13 races is 2 hours, 44 minutes, and 22 seconds (written 2:44:22). The mean winning time for the second 13 races is 2:13:18. This is quite a difference (over half an hour). Does this prove that the fastest men are running faster? Or is the difference just due to chance, no more than what often emerges from chance differences in performance from year to year? We can't answer this question with descriptive statistics alone. All we can affirm is that the two means are "suggestive".
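The two means can be recomputed from the men's times in the table. The sketch below is a minimal illustration: the helper names are made up for this example, and the 2000 time is entered as 2:10:10, which is an assumption about how that entry should be read. With the table as transcribed here, the second mean may come out about a second away from the figure quoted above.

```python
def to_seconds(time_text):
    """Convert an 'H:MM:SS' string to a number of seconds."""
    hours, minutes, seconds = (int(part) for part in time_text.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def to_clock(total_seconds):
    """Convert a number of seconds back to an 'H:MM:SS' string."""
    hours, remainder = divmod(round(total_seconds), 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours}:{minutes:02d}:{seconds:02d}"


# Men's winning times, 1896 to 2004, in the order they appear in the table.
mens_times = [
    "2:58:50", "2:59:45", "3:28:53", "2:51:23", "2:55:18", "2:36:54",
    "2:32:35", "2:41:22", "2:32:57", "2:31:36", "2:29:19", "2:34:51",
    "2:23:03",  # 1952, last race of the first group
    "2:25:00", "2:15:16", "2:12:11", "2:20:26", "2:12:19", "2:09:55",
    "2:11:03", "2:09:21", "2:10:32", "2:13:23", "2:12:36",
    "2:10:10",  # 2000, read as 2:10:10 (assumption)
    "2:10:55",
]

first_group = [to_seconds(t) for t in mens_times[:13]]
second_group = [to_seconds(t) for t in mens_times[13:]]

mean_first = sum(first_group) / len(first_group)
mean_second = sum(second_group) / len(second_group)

print("mean of first 13 races: ", to_clock(mean_first))
print("mean of second 13 races:", to_clock(mean_second))
print("difference:             ", to_clock(mean_first - mean_second))
```

These are still purely descriptive numbers: they summarize the 26 races in the table and say nothing yet about whether the gap could be due to chance.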

Examining Table 3 leads to many other questions. We note that Takahashi (the lead female runner in 2000) would have beaten the male runner in 1956 and all male runners in the first 12 marathons. This fact leads us to ask whether the gender gap will close or remain constant. When we look at the times within each gender, we also wonder how much they will decrease (if at all) in the next century of the Olympics. Might we one day witness a sub-2 hour marathon? The study of statistics can help you make reasonable guesses about the answers to these questions.

Public Domain Mark

  https://onlinestatbook.com/movies/introduction/descriptive.mp4  

  • inferential and descriptive.
  • population and sample.
  • sampling and scaling.
  • mean and median.

  • allow random assignment to experimental conditions.
  • use data from a sample to answer questions about a population.
  • summarize and describe data.
  • allow you to generalize beyond the data at hand.

  • The mean age of people in Detroit.
  • The number of people who watched the Super Bowl in the year 2002.
  • A prediction of next month's unemployment rate.
  • The median price of new homes sold in Miami.
  • The height of the tallest woman in the world.
  • The two divisions of statistics are inferential and descriptive. (See the text for more details.)
  • Descriptive statistics summarize and describe data. Inferential statistics use data from a sample to answer questions about a population. Inferential statistics involves generalizing beyond the data at hand.
  • Descriptive statistics are numbers that are used to summarize and describe data. Predicting next month's unemployment rate involves predicting future data, not describing the data at hand.

Inferential Statistics

  • Distinguish between a sample and a population
  • Define inferential statistics
  • Identify biased samples
  • Distinguish between simple random sampling and stratified sampling
  • Distinguish between random sampling and random assignment

In statistics, we often rely on a sample (that is, a small subset of a larger set of data) to draw inferences about the larger set. The larger set is known as the population from which the sample is drawn.

Example #1: You have been hired by the National Election Commission to examine how the American people feel about the fairness of the voting procedures in the U.S. Whom will you ask?

It is not practical to ask every single American how he or she feels about the fairness of the voting procedures. Instead, we query a relatively small number of Americans, and draw inferences about the entire country from their responses. The Americans actually queried constitute our sample of the larger population of all Americans. The mathematical procedures whereby we convert information about the sample into intelligent guesses about the population fall under the rubric of inferential statistics.

A sample is typically a small subset of the population. In the case of voting attitudes, we would sample a few thousand Americans drawn from the hundreds of millions that make up the country. In choosing a sample, it is therefore crucial that it not over-represent one kind of citizen at the expense of others. For example, something would be wrong with our sample if it happened to be made up entirely of Florida residents. If the sample held only Floridians, it could not be used to infer the attitudes of other Americans. The same problem would arise if the sample were comprised only of Republicans. Inferential statistics are based on the assumption that sampling is random. We trust a random sample to represent different segments of society in close to the appropriate proportions (provided the sample is large enough; see below).

Example #2: We are interested in examining how many math classes have been taken on average by current graduating seniors at American colleges and universities during their four years in school. Whereas our population in the last example included all US citizens, now it involves just the graduating seniors throughout the country. This is still a large set since there are thousands of colleges and universities, each enrolling many students. (New York University, for example, enrolls 48,000 students). It would be prohibitively costly to examine the transcript of every college senior. We therefore take a sample of college seniors and then make inferences to the entire population based on what we find. To make the sample, we might first choose some public and private colleges and universities across the United States. Then we might sample 50 students from each of these institutions. Suppose that the average number of math classes taken by the people in our sample were 3.2. Then we might speculate that 3.2 approximates the number we would find if we had the resources to examine every senior in the entire population. But we must be careful about the possibility that our sample is non-representative of the population. Perhaps we chose an overabundance of math majors, or chose too many technical institutions that have heavy math requirements. Such bad sampling makes our sample unrepresentative of the population of all seniors.

To solidify your understanding of sampling bias, consider the following example. Try to identify the population and the sample, and then reflect on whether the sample is likely to yield the information desired.

Example #3: A substitute teacher wants to know how students in the class did on their last test. The teacher asks the 10 students sitting in the front row to state their latest test score. He concludes from their report that the class did extremely well. What is the sample? What is the population? Can you identify any problems with choosing the sample in the way that the teacher did?

In Example #3, the population consists of all students in the class. The sample is made up of just the 10 students sitting in the front row. The sample is not likely to be representative of the population. Those who sit in the front row tend to be more interested in the class and tend to perform higher on tests. Hence, the sample may perform at a higher level than the population.

Example #4: A coach is interested in how many cartwheels the average college freshman at his university can do. Eight volunteers from the freshman class step forward. After observing their performance, the coach concludes that college freshmen can do an average of 16 cartwheels in a row without stopping.

In Example #4, the population is the class of all freshmen at the coach's university. The sample is composed of the 8 volunteers. The sample is poorly chosen because volunteers are more likely to be able to do cartwheels than the average freshman; people who can't do cartwheels probably did not volunteer! In the example, we are also not told of the gender of the volunteers. Were they all women, for example? That might affect the outcome, contributing to the non-representative nature of the sample (if the school is co-ed).

Researchers adopt a variety of sampling strategies. The most straightforward is simple random sampling. Such sampling requires every member of the population to have an equal chance of being selected into the sample. In addition, the selection of one member must be independent of the selection of every other member. That is, picking one member from the population must not increase or decrease the probability of picking any other member (relative to the others). In this sense, we can say that simple random sampling chooses a sample by pure chance. To check your understanding of simple random sampling, consider the following example. What is the population? What is the sample? Was the sample picked by simple random sampling? Is it biased?

Example #5: A research scientist is interested in studying the experiences of twins raised together versus those raised apart. She obtains a list of twins from the  National Twin Registry , and selects two subsets of individuals for her study. First, she chooses all those in the registry whose last name begins with Z. Then she turns to all those whose last name begins with B. Because there are so many names that start with B, however, our researcher decides to incorporate only every other name into her sample. Finally, she mails out a survey and compares characteristics of twins raised apart versus together.

In Example #5, the population consists of all twins recorded in the National Twin Registry. It is important that the researcher only make statistical generalizations to the twins on this list, not to all twins in the nation or world. That is, the National Twin Registry may not be representative of all twins. Even if inferences are limited to the Registry, a number of problems affect the sampling procedure we described. For instance, choosing only twins whose last names begin with Z does not give every individual an equal chance of being selected into the sample. Moreover, such a procedure risks over-representing ethnic groups with many surnames that begin with Z. There are other reasons why choosing just the Z's may bias the sample. Perhaps such people are more patient than average because they often find themselves at the end of the line! The same problem occurs with choosing twins whose last name begins with B. An additional problem for the B's is that the “every-other-one” procedure disallowed adjacent names on the B part of the list from being both selected. Just this defect alone means the sample was not formed through simple random sampling.
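By contrast, a simple random sample gives every name on the list the same chance of selection. A minimal sketch, assuming a hypothetical registry of 1,000 entries and a sample of 50 (both numbers and the entry names are made up for illustration), could use the standard library's `random.sample`:

```python
import random

rng = random.Random(7)  # fixed seed so the example is repeatable

# Hypothetical registry: 1,000 placeholder entries standing in for real names.
registry = [f"twin_{i:04d}" for i in range(1, 1001)]

# Draw 50 entries without replacement; every entry is equally likely to be
# chosen, and every possible group of 50 is equally likely to be the sample.
sample = rng.sample(registry, k=50)

print(sample[:5])
```

Unlike choosing by first letter or taking every other name, this procedure does not depend on any property of the entries, which is what makes it simple random sampling.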

Recall that the definition of a random sample is a sample in which every member of the population has an equal chance of being selected. This means that the sampling procedure rather than the results of the procedure define what it means for a sample to be random. Random samples, especially if the sample size is small, are not necessarily representative of the entire population. For example, if a random sample of 20 subjects were taken from a population with an equal number of males and females, there would be a nontrivial probability (0.06) that 70% or more of the sample would be female. Such a sample would not be representative, although it would be drawn randomly. Only a large sample size makes it likely that our sample is close to representative of the population. For this reason, inferential statistics take into account the sample size when generalizing results from samples to populations. In later chapters, you'll see what kinds of mathematical techniques ensure this sensitivity to sample size. 
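The probability quoted above can be checked with the binomial distribution. In a sample of 20, "70% or more female" means 14 or more females, and with a population that is half female each draw is treated here as an independent 50/50 event (a reasonable approximation when the population is large). The function name below is just illustrative.

```python
from math import comb


def prob_at_least(k_min, n, p):
    """P(X >= k_min) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(k_min, n + 1))


# 70% of a sample of 20 is 14 people.
p_mostly_female = prob_at_least(14, 20, 0.5)
print(round(p_mostly_female, 4))
```

The result rounds to 0.06, matching the figure in the text, and repeating the calculation with a larger sample size shows how quickly such lopsided samples become rare.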

Sometimes it is not feasible to build a sample using simple random sampling. To see the problem, consider the fact that both Dallas and Houston are competing to be hosts of the 2012 Olympics. Imagine that you are hired to assess whether most Texans prefer Houston to Dallas as the host, or the reverse. Given the impracticality of obtaining the opinion of every single Texan, you must construct a sample of the Texas population. But now notice how difficult it would be to proceed by simple random sampling. For example, how will you contact those individuals who don't vote and don't have a phone? Even among people you find in the telephone book, how can you identify those who have just relocated to California (and had no reason to inform you of their move)? What do you do about the fact that since the beginning of the study, an additional 4,212 people took up residence in the state of Texas? As you can see, it is sometimes very difficult to develop a truly random procedure. For this reason, other kinds of sampling techniques have been devised. We now discuss two of them.

In experimental research, populations are often hypothetical. For example, in an experiment comparing the effectiveness of a new anti-depressant drug with a placebo, there is no actual population of individuals taking the drug. In this case, a specified population of people with some degree of depression is defined and a random sample is taken from this population. The sample is then randomly divided into two groups; one group is assigned to the treatment condition (drug) and the other group is assigned to the control condition (placebo). This random division of the sample into two groups is called random assignment. Random assignment is critical for the validity of an experiment. For example, consider the bias that could be introduced if the first 20 subjects to show up at the experiment were assigned to the experimental group and the second 20 subjects were assigned to the control group. It is possible that subjects who show up late tend to be more depressed than those who show up early, thus making the experimental group less depressed than the control group even before the treatment was administered.

In experimental research of this kind, failure to assign subjects randomly to groups is generally more serious than having a non-random sample. Failure to randomize (the former error) invalidates the experimental findings. A non-random sample (the latter error) simply restricts the generalizability of the results.
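Random assignment is straightforward to carry out in code. The sketch below assumes 40 hypothetical subjects listed in arrival order and splits them into two groups of 20 by shuffling, so that arrival time has no influence on which group a subject joins.

```python
import random

rng = random.Random(3)  # fixed seed so the example is repeatable

# Hypothetical subjects, listed in the order they arrived.
subjects = [f"subject_{i:02d}" for i in range(1, 41)]

# Shuffle a copy so the original arrival order is left untouched.
shuffled = subjects[:]
rng.shuffle(shuffled)

# After shuffling, the split point is arbitrary with respect to arrival order.
treatment_group = shuffled[:20]  # receives the drug
control_group = shuffled[20:]    # receives the placebo

print("treatment:", treatment_group[:5], "...")
print("control:  ", control_group[:5], "...")
```

Each subject ends up in exactly one group, and early and late arrivals are mixed across both groups by chance alone.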

Since simple random sampling often does not ensure a representative sample, a sampling method called stratified random sampling is sometimes used to make the sample more representative of the population. This method can be used if the population has a number of distinct "strata" or groups. In stratified sampling, you first identify the members of the population who belong to each group. Then you randomly sample from each of those subgroups in such a way that the sizes of the subgroups in the sample are proportional to their sizes in the population.

Let's take an example: Suppose you were interested in views of capital punishment at an urban university. You have the time and resources to interview 200 students. The student body is diverse with respect to age; many older people work during the day and enroll in night courses (average age is 39), while younger students generally enroll in day classes (average age of 19). It is possible that night students have different views about capital punishment than day students. If 70% of the students were day students, it makes sense to ensure that 70% of the sample consisted of day students. Thus, your sample of 200 students would consist of 140 day students and 60 night students. The proportion of day students in the sample and in the population (the entire university) would be the same. Inferences to the entire population of students at the university would therefore be more secure.
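The capital punishment example can be sketched as proportional stratified sampling. The university's total enrollment is not given in the text, so the 10,000 students below (7,000 day and 3,000 night) are a hypothetical figure chosen only to match the 70/30 split; the student identifiers are placeholders.

```python
import random

rng = random.Random(11)  # fixed seed so the example is repeatable

# Hypothetical student body matching the 70% day / 30% night split.
strata = {
    "day": [f"day_student_{i}" for i in range(7000)],
    "night": [f"night_student_{i}" for i in range(3000)],
}

SAMPLE_SIZE = 200
population_size = sum(len(members) for members in strata.values())

sample = {}
for name, members in strata.items():
    # Each stratum's quota is proportional to its share of the population.
    # With other proportions, rounding may need adjusting so quotas sum to 200.
    quota = round(SAMPLE_SIZE * len(members) / population_size)
    sample[name] = rng.sample(members, quota)

for name, chosen in sample.items():
    print(name, len(chosen))
```

Within each stratum the selection is still random; stratifying only fixes how many students come from each group.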

  https://onlinestatbook.com/movies/introduction/inferential.mp4  

Question 1 out of 8. Our data come from _______, but we really care most about ______.

  • theories; mathematical models
  • samples; populations
  • populations; samples
  • subjective methods; objective methods

Question 2 out of 8. A random sample

  • is more likely to be representative of the population than any other kind of sample.
  • is always representative of the population.
  • allows you to directly calculate the parameters of the population.
  • all of the above are true.
  • all of the above are false.

Question 3 out of 8. When participants who arrive for a research study are put into treatment groups on the basis of chance,

  • random sampling has occurred.
  • random assignment has occurred.
  • the statistical conclusions will also be absolutely correct.
  • the research findings will be compromised because you should never randomly assign to groups.

Question 4 out of 8. Uncertainty regarding conclusions about a population can be eliminated if you

  • a. use a large random sample.
  • b. obtain data from all members of the population.
  • c. depend upon the t-distribution.
  • d. both a and b.

Question 5 out of 8. Which of the following is (are) true? Using a random sample

  • is to accept some uncertainty about the conclusions.
  • enables you to calculate statistics.
  • is to risk drawing the wrong conclusions about the population.
  • biases your results.

Question 6 out of 8. A random sample is one

  • that is haphazard.
  • that is unplanned.
  • in which every sample of a particular size has an equal probability of being selected.
  • that ensures that there will be no uncertainty in the conclusions.

Question 7 out of 8. Which of the following is a random sample of a college student body?

  • Every fifth person coming out of the Campus Center between 8:30am and 10:00am.
  • Lisa Meyer, Todd Jones, and Maria Rivera, whose ID numbers were picked from a table of random numbers.
  • Every 20th person in the student directory.
  • All are examples of random samples.

Question 8 out of 8. A biased sample is one that

  • is too small.
  • will always lead to a wrong conclusion.
  • will likely have certain groups from the population over-represented or under-represented due only to chance factors.
  • will likely have groups from the population over-represented or under-represented due to systematic sampling factors.
  • is always a good and useful sample.

  • Samples; populations. We study a sample to allow us to draw inferences about the population.
  • All of the above are false. Stratified sampling is more likely to be representative of the population than random sampling.
  • Random assignment has occurred. Random assignment has occurred because the decision as to which subject goes into which group is random.
  • The only way to eliminate uncertainty is to obtain data from the whole population. You can reduce uncertainty with a large sample.
  • All of the above except "biases your results". Random sampling does not produce bias, which means systematic rather than random error.
  • A random sample is defined as one in which every sample of a particular size has an equal probability of being selected.
  • The correct choice is: Lisa Meyer, Todd Jones, and Maria Rivera, whose ID numbers were picked from a table of random numbers.
  • Will likely have groups from the population over-represented or under-represented due to systematic sampling factors. Only when the sampling is systematically favoring one group or another is the sample biased. Random samples, although they can be different from the population, are not biased. Bias is defined by the procedure for drawing the sample, not by the result.

The Difference Between Descriptive and Inferential Statistics

  • Ph.D., Mathematics, Purdue University
  • M.S., Mathematics, Purdue University
  • B.A., Mathematics, Physics, and Chemistry, Anderson University

The field of statistics is divided into two major divisions: descriptive and inferential. Each of these segments is important, offering different techniques that accomplish different objectives. Descriptive statistics describe what is going on in a population or data set. Inferential statistics, by contrast, allow scientists to take findings from a sample group and generalize them to a larger population. The two types of statistics have some important differences.

Descriptive statistics is the type of statistics that probably springs to most people’s minds when they hear the word “statistics.” In this branch of statistics, the goal is to describe. Numerical measures are used to tell about features of a set of data. There are a number of items that belong in this portion of statistics, such as:

  • The average, or measure of the center of a data set, consisting of the mean, median, mode, or midrange
  • The spread of a data set, which can be measured with the range or standard deviation
  • Overall descriptions of data such as the five number summary
  • Measurements such as skewness and kurtosis
  • The exploration of relationships and correlation between paired data
  • The presentation of statistical results in graphical form

These measures are important and useful because they allow scientists to see patterns among data, and thus to make sense of that data. Descriptive statistics can only be used to describe the population or data set under study: The results cannot be generalized to any other group or population.

Types of Descriptive Statistics

There are two kinds of descriptive statistics that social scientists use:

Measures of central tendency  capture general trends within the data and are calculated and expressed as the mean, median, and mode. A mean tells scientists the mathematical average of all of a data set, such as the average age at first marriage; the median represents the middle of the data distribution, like the age that sits in the middle of the range of ages at which people first marry; and, the mode might be the most common age at which people first marry.

Measures of spread describe how the data are distributed and relate to each other, including:

  • The range, the entire range of values present in a data set
  • The frequency distribution, which defines how many times a particular value occurs within a data set
  • Quartiles, the three values that split an ordered data set into four groups with equal numbers of observations
  • Mean absolute deviation, the average distance of each value from the mean
  • Variance, the average squared deviation of the values from the mean
  • Standard deviation, the square root of the variance, which expresses the spread in the same units as the data

Measures of spread are often visually represented in tables, pie and bar charts, and histograms to aid in the understanding of the trends within the data.
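As a minimal sketch of the measures listed above, the following Python snippet uses only the standard library's statistics module. The ages are invented for illustration (hypothetical ages at first marriage) and do not come from any study; the variable names are likewise just illustrative.

```python
import statistics

# Hypothetical ages at first marriage (illustrative values only).
ages = [22, 24, 25, 25, 27, 28, 30, 31, 35, 43]

mean_age = statistics.mean(ages)        # arithmetic average
median_age = statistics.median(ages)    # middle value of the sorted data
mode_age = statistics.mode(ages)        # most frequent value

value_range = max(ages) - min(ages)     # largest minus smallest value
quartiles = statistics.quantiles(ages, n=4)  # three cut points: Q1, Q2, Q3

# Mean absolute deviation: average distance of each value from the mean.
mad = sum(abs(a - mean_age) for a in ages) / len(ages)

# Population variance and standard deviation, treating the list as the
# whole group being described rather than a sample from something larger.
variance = statistics.pvariance(ages)
std_dev = statistics.pstdev(ages)

print(mean_age, median_age, mode_age, value_range)
print(quartiles, mad, variance, std_dev)
```

Every number printed here describes only these ten values; nothing is being claimed about anyone outside the list.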

Inferential statistics are produced through complex mathematical calculations that allow scientists to infer trends about a larger population based on a study of a sample taken from it. Scientists use inferential statistics to examine the relationships between variables within a sample and then make generalizations or predictions about how those variables will relate to a larger population.

It is usually impossible to examine each member of the population individually. So scientists choose a representative subset of the population, called a statistical sample, and from this analysis, they are able to say something about the population from which the sample came. There are two major divisions of inferential statistics:

  • A confidence interval gives a range of values for an unknown parameter of the population by measuring a statistical sample. This is expressed in terms of an interval and the degree of confidence that the parameter is within the interval.
  • Tests of significance, or hypothesis testing, in which scientists test a claim about the population by analyzing a statistical sample. By design, there is some uncertainty in this process. This can be expressed in terms of a level of significance.
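To make the idea of a confidence interval concrete, here is a rough sketch in Python. The sample values are hypothetical, and the interval uses a normal (z) approximation from the standard library; for a sample this small a t-based critical value would give a somewhat wider interval, so treat this as an illustration of the idea rather than a recommended procedure.

```python
import math
import statistics
from statistics import NormalDist

# Hypothetical sample of 12 measurements (illustrative values only).
sample = [4.8, 5.1, 5.4, 4.9, 5.0, 5.6, 5.2, 4.7, 5.3, 5.1, 4.9, 5.2]

n = len(sample)
sample_mean = statistics.mean(sample)
sample_sd = statistics.stdev(sample)          # sample standard deviation (n - 1)
standard_error = sample_sd / math.sqrt(n)

confidence = 0.95
# Two-sided critical value from the standard normal distribution.
z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)

lower = sample_mean - z * standard_error
upper = sample_mean + z * standard_error
print(f"{confidence:.0%} CI for the mean: ({lower:.3f}, {upper:.3f})")
```

The output is an interval plus a stated degree of confidence, which is exactly the form of statement described in the first bullet above.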

Techniques that social scientists use to examine the relationships between variables, and thereby to create inferential statistics, include linear regression analyses, logistic regression analyses, ANOVA, correlation analyses, structural equation modeling, and survival analysis. When conducting research using inferential statistics, scientists conduct a test of significance to determine whether they can generalize their results to a larger population. Common tests of significance include the chi-square and t-test. These tell scientists how likely results like theirs would be if there were no real effect or relationship in the population, which helps them judge whether the sample findings can be generalized.

Descriptive vs. Inferential Statistics

Although descriptive statistics is helpful in learning things such as the spread and center of the data, nothing in descriptive statistics can be used to make any generalizations. In descriptive statistics, measurements such as the mean and standard deviation are stated as exact numbers.

Even though inferential statistics uses some similar calculations, such as the mean and standard deviation, the focus is different. Inferential statistics starts with a sample and then generalizes to a population. This information about a population is not stated as a single exact number. Instead, scientists express these parameters as a range of potential numbers, along with a degree of confidence.

How to Evaluate Descriptive and Inferential Statistics
:

.” .” .” .” .” and the for each type of statistics. ” (a transcript of the video is available  ). essay (using examples to support) the thesis that “Descriptive statistics are useful.” Remember that descriptive statistics can be graphs and figures, as well as means and modes. And remember to use examples of descriptive statistics, not reasons to use descriptive statistics. has a and a . sentence that . . . and ends with something (mildly) witty or profound. sentence that . of your paragraphs — your Introduction Paragraph, each of your three Examples Paragraphs, and your Conclusion Paragraph — to make sure each of your five paragraphs has (a Topic Sentence, three Supporting Sentences, and a Conclusion Sentence). (i.e., your readers need to be able to trace the source of any words or any ideas — even if you have summarized those words or ideas — if they are not your own words or ideas). . essay (using examples to support), this time to support the thesis “Inferential statistics are useful.” Remember to use examples of inferential statistics, not reasons to use inferential statistics. has a and a . sentence that . . . sentence that . of your paragraphs — your Introduction Paragraph, each of your three Examples Paragraphs, and your Conclusion Paragraph — to make sure each of your five paragraphs has (a Topic Sentence, three Supporting Sentences, and a Conclusion Sentence). (i.e., your readers need to be able to trace the source of any words or any ideas — even if you have summarized those words or ideas — if they are not your own words or ideas). , who is widely considered the father of U.S. Psychology! and make a new Discussion Board post.

:

by Darrell Huff. favorite deceptions. For example, you might choose as one of your favorite deceptions the hypothetical real estate agent’s deceptive use of a neighborhood’s “average” income in Chapter 2. favorite deceptions to other people. ). . . and make a new Discussion Board post.

:

.” .” Seven Sins of Statistical Misinterpretation” to scientific articles you have read. that you found and read in Unit #5 and that you synthesized in Unit #6. .” . _PSY-225_Gernsbacher_StatsCheck_Fillable.pdf. In other words, add your last name to the beginning of the filename. you open the unfilled PDF from your computer. and attach your filled-in PDF.

:

.” .” .” of the following five topics for which Gallup has conducted a public opinion poll. Then, within each of the three topics you’ve chosen, read of the listed reports. ” ” ” ” ” ” ” ” ” ” ” ” ” ” ” ” ” ” ” ” and make a new post of at least in which you provide the following information for you chose to read (three topics x one report per topic). It will be easiest if you write a separate paragraph for .

:

and read all the other students’ posts. other students; each of your three responses should be . that you also wrote about. . (besides the two students you responded to in 1. and 2. above).

:

. If you are in a Chat Group with two other students, that means you will read four essays; if you are in a Chat Group with only one other student, that means you will read two essays. . Note that you will again be answering 12 questions about each member’s essays. text-based Chat. . ” images. More than one Chat Group member can indicate the same image if that’s how they are feeling, and please refer to each image by its number , , that summarizes your Group Chat in . , save the Chat transcript, as described in the (under the topic, “How To Save and Attach a Chat Transcript”), and attach the Chat transcript, in PDF, to a post on the . , that states the name of the assignment (Unit 7: Assignment #6), the full name of your Chat Group, the first and last names of each Chat Group member who participated in the Group Chat, the day (e.g., Sunday) and date of this Group Chat (e.g., June 13), the start and stop time of this Group Chat (e.g., 1pm to 2pm). ” images. More than one Chat Group member can indicate the same image if that’s how they are feeling, and please refer to each image by its number. : The “How Are You Feeling at the of Today’s Group Chat” grid of images differs from the “How Are You Feeling at the of Today’s Group Chat” grid of images.

Congratulations, you have finished Unit 7! Onward to !

Creative Commons License


Descriptive vs. Inferential Statistics: Key Differences


  • jaro education
  • 24, June 2024

Professionals across diverse fields such as science, mathematics, marketing, and technology rely on statistics to extract meaningful insights and draw conclusions from extensive datasets. The study of statistics is divided into two primary branches: descriptive and inferential statistics. These branches play distinct roles in uncovering various facets of data, enabling professionals to make informed decisions based on analytical findings.

Understanding different types of statistics, such as descriptive and inferential statistics, is crucial for developing a robust understanding of data management and choosing the most suitable analytical approaches. While certain measurement techniques may share similarities, the fundamental objectives of descriptive and inferential statistics diverge significantly. This blog explores the contrasting features of descriptive and inferential statistics, highlighting their unique roles and impacts in data analysis. By distinguishing between these statistical methods, professionals can effectively leverage them to extract valuable insights from complex datasets.


Understanding Descriptive Statistics

Descriptive statistics  is a field of statistics focused on summarizing and explaining the key aspects of a dataset. It involves techniques for organizing, visualizing, and presenting data in a meaningful and straightforward manner. Descriptive statistics characterize the properties of the dataset being analyzed without making broader assumptions beyond the data itself.

Types of Descriptive Statistics

Here are some common types of descriptive statistics:

1. Central Tendency: Central tendency summarizes data by identifying a central value around which the other data points cluster. This central value can be represented by the mean or the median. In a normal distribution, the mean and the median coincide.

2. Dispersion:  Dispersion refers to how spread out the data points are within a dataset. Statisticians commonly use metrics such as variance and standard deviation to quantify the extent of spread among the dataset’s points.

3. Skewness:  Skewness in statistics characterizes the distribution shape of a dataset when plotted. Normally distributed datasets exhibit little to no skew, with data points evenly distributed around the central point. In contrast, other datasets may show strong skewness either to the left or right of the center, depending on where the majority of data points are concentrated.

Understanding Inferential Statistics

In contrast, inferential statistics involves drawing conclusions, making predictions, or forming generalizations about a larger population based on data collected from a representative sample of that population. It extends the insights gained from a sample to the entire population from which the sample was taken. Inferential statistics enable researchers to reach conclusions, test hypotheses, and predict outcomes for populations, even when it is impractical or impossible to study the entire population directly.

Types of Inferential Statistics Methods

Below are important inferential methods commonly used in statistical analysis:

1. Hypothesis Testing: Hypothesis testing involves a statistician formulating a hypothesis about a population and then collecting data from a sample of that population to test it.

2. Regression Analysis: Regression analysis uses a dataset of observed data points to estimate the relationship between two variables, such as height and weight. This analysis enables statisticians and other professionals to make predictions for values that were not measured directly, although predictions far outside the observed range should be treated with caution.

3. Confidence Intervals:  In research, statisticians often establish confidence intervals, which represent ranges of certain key values associated with specific probabilities. These intervals provide insights into the precision and reliability of statistical estimates.

Examples for Descriptive and Inferential Statistics

Descriptive Statistics Example

Imagine a company wants to analyze the performance of its sales team over the past year. They gather data on monthly sales figures for each salesperson. To understand the distribution of sales and identify key insights, they use descriptive statistics. Here are the steps the company follows for the analysis:

1. Measures of Central Tendency:  The company calculates the mean (average) monthly sales for the entire team. They find that the mean sales per month is $50,000, indicating the typical performance level.

2. Measures of Dispersion:  To understand how sales vary among team members, they calculate the standard deviation of monthly sales. They discover that the standard deviation is $15,000, suggesting a moderate level of variation in sales performance across the team.

3. Visualization:  Using histograms or box plots, they visually represent the distribution of monthly sales. This visualization highlights whether sales data are symmetrically distributed or skewed towards certain values.

4. Percentiles:  They calculate the 25th, 50th (median), and 75th percentiles of monthly sales. This helps identify sales levels at which specific percentages of salespeople perform, such as the median sales or the top quartile of performers.

5. Correlation Analysis:  They explore correlations between sales performance and other factors like tenure, training hours, or geographic region. This analysis reveals if any factors are associated with higher or lower sales figures.
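The steps above can be sketched in a few lines of Python. The sales figures and training hours below are invented for illustration (they are not the company's data from the example), and pearson_r is a small hypothetical helper written out so the calculation is visible.

```python
import statistics

# Hypothetical monthly sales (in dollars) and training hours for eight
# salespeople. All numbers are invented for illustration.
monthly_sales = [32_000, 41_000, 45_000, 48_000, 52_000, 55_000, 61_000, 66_000]
training_hours = [5, 8, 10, 9, 12, 14, 15, 18]

mean_sales = statistics.mean(monthly_sales)
sd_sales = statistics.stdev(monthly_sales)

# 25th, 50th, and 75th percentiles (quartile cut points).
p25, p50, p75 = statistics.quantiles(monthly_sales, n=4)


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


r = pearson_r(training_hours, monthly_sales)
print(mean_sales, round(sd_sales), (p25, p50, p75), round(r, 3))
```

A correlation close to 1 here only says that, within this team, more training hours go together with higher sales. It does not show that training causes the higher sales, and it says nothing about other teams.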

Inferential Statistics Example

Suppose a company is interested in understanding whether coffee consumption affects employee productivity. They decide to conduct a study where they gather data on coffee consumption and corresponding productivity levels among employees. Here is how the step-by-step analysis would look:

  • Hypothesis: The company hypothesizes that increased coffee consumption is associated with higher productivity levels among employees.
  • Data Collection: They collect data on the number of cups of coffee consumed per day by each employee and their daily productivity scores (measured by completed tasks or output).
  • Hypothesis Testing: Using inferential statistics, the company conducts a hypothesis test. They set up their null hypothesis (H0) that there is no relationship between coffee consumption and productivity (i.e., β = 0), versus the alternative hypothesis (H1) that there is a positive relationship (i.e., β > 0).
  • Regression Analysis: The collected data is analyzed using regression analysis. They fit a regression model where productivity (dependent variable) is regressed against coffee consumption (independent variable). The model estimates how productivity changes with each additional cup of coffee; controlling for other factors would require adding them to the model as further independent variables.
  • Interpretation: After running the regression, the company examines the coefficient of coffee consumption. If the coefficient is statistically significant and positive, it supports the alternative hypothesis that increased coffee consumption is associated with higher productivity.
  • Conclusion: Based on the analysis and hypothesis testing results, the company concludes whether there is indeed a statistically significant relationship between coffee consumption and productivity among employees.
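The workflow above can be sketched as follows. The data are hypothetical, and instead of the usual t-test on the regression coefficient this sketch swaps in a permutation test, which needs only the Python standard library; the function name ols_slope and the choice of 5,000 permutations are assumptions made for the example.

```python
import random
import statistics

# Hypothetical data: cups of coffee per day and a daily productivity score.
coffee = [0, 1, 1, 2, 2, 3, 3, 4, 4, 5]
productivity = [50, 52, 55, 54, 58, 60, 59, 63, 65, 66]


def ols_slope(xs, ys):
    """Least-squares slope (beta) for the simple regression of ys on xs."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


observed_slope = ols_slope(coffee, productivity)

# One-sided permutation test of H0: beta = 0 against H1: beta > 0.
# Shuffling productivity breaks any link with coffee, so the shuffled
# slopes show what "no relationship" looks like for these values.
rng = random.Random(0)          # fixed seed so the run is repeatable
n_permutations = 5000
shuffled = productivity[:]
at_least_as_large = 0
for _ in range(n_permutations):
    rng.shuffle(shuffled)
    if ols_slope(coffee, shuffled) >= observed_slope:
        at_least_as_large += 1

# Add-one correction keeps the estimated p-value strictly above zero.
p_value = (at_least_as_large + 1) / (n_permutations + 1)
print(f"slope = {observed_slope:.2f}, one-sided p = {p_value:.4f}")
```

A small p-value here would support H1 for these invented numbers. Because the data are observational, even a significant positive slope indicates an association between coffee and productivity, not proof that coffee causes the difference.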

Differences Between Descriptive and Inferential Statistics

Purpose of Descriptive and Inferential Statistics

Descriptive statistics serve the purpose of summarizing and describing the characteristics of a dataset that is directly available to the statistician. This includes providing measures like central tendency (e.g., mean, median), dispersion (e.g., variance, standard deviation), and graphical representations (e.g., histograms, box plots) that depict the distribution of the data. For instance, in the scenario of collecting birth weight data from a hospital, descriptive statistics would be used to calculate the average birth weight of the children born within that specific hospital during a year.

On the other hand, inferential statistics are employed to make inferences or generalizations about a larger population based on a sample of data. The purpose of inferential statistics is to draw conclusions that extend beyond the specific dataset being analyzed. Using inferential methods, a statistician can estimate parameters, test hypotheses, and make predictions about the population from which the sample was drawn. In the given example, inferential statistics would be applied to estimate or predict the average birth weight of children nationwide, based on the sample data collected from the hospital. This allows for broader insights and conclusions to be drawn beyond the immediate dataset.

Approach of Descriptive and Inferential Statistics

Descriptive statistics involves analyzing data from a complete set, where every data point is collected and summarized. For instance, when determining the average height of students in a class, each student’s height is measured, providing a comprehensive view of that specific group. 

The methodology of inferential statistics focuses on making broader conclusions or predictions about a larger population based on a representative sample. In this methodology, the statistician selects a sample that is representative of the entire population of interest. For example, if the teacher aims to infer the average height of all students in the school, the class data alone may not suffice as a representative sample. To utilize inferential statistics effectively, the teacher would need to construct a sample that includes students from different year levels across the school to ensure accurate representation of the overall student body.

Calculation Accuracy in Descriptive and Inferential Statistics

The precision of descriptive statistics relies on the availability of complete data for the analysis. When using descriptive statistical methods, such as calculating the mean test score of a class, statisticians can achieve high accuracy and precision because they have access to all the data points within the defined group or sample. For instance, a teacher calculating the average test score for a class of 20 students obtains an exact value because every individual score is at their disposal. Descriptive statistics excel in precision when the entire dataset is known and can be directly analyzed, ensuring accurate measurements and insights into the specific group or sample.

The precision of inferential statistics is inherently influenced by the sampling process and the potential for variability and error when making predictions about a larger population based on a sample subset. Inferential statistics involve drawing conclusions or making estimations about a broader population from a representative sample. For example, using the average test score of a class to estimate the average test score of all classes taking the same test introduces uncertainty due to potential differences between the sample and the entire population. While inferential statistics provide valuable insights beyond the observed data, they carry the risk of error and require careful consideration of sampling methods and assumptions to ensure the reliability and accuracy of the inferences made.
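The contrast between the two paragraphs above can be illustrated with a small simulation. Everything here is hypothetical: the school of 500 students, the score distribution (mean 70, standard deviation 10), and the class size of 20 are assumptions chosen only to show how sample means vary around a population mean.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so the run is repeatable

# Hypothetical population: test scores for 500 students across a school.
population = [rng.gauss(70, 10) for _ in range(500)]
population_mean = statistics.mean(population)  # known only because we simulated it

# Descriptive: one class of 20 students, summarized exactly.
one_class = rng.sample(population, 20)
class_mean = statistics.mean(one_class)

# Inferential: how far do class-sized sample means stray from the population mean?
sample_means = [statistics.mean(rng.sample(population, 20)) for _ in range(1000)]
spread_of_means = statistics.stdev(sample_means)

print(f"population mean: {population_mean:.1f}")
print(f"one class mean:  {class_mean:.1f}")
print(f"typical sampling error (SD of sample means): {spread_of_means:.1f}")
```

The class mean is exact as a description of that class, but as an estimate of the school-wide mean it carries a sampling error of roughly the size shown on the last line.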

Key Similarities Between Descriptive and Inferential Statistics

Descriptive and inferential statistics, although distinct in their purposes and approaches, exhibit some similarities:

1. Data Utilization:  Both descriptive and inferential statistics utilize the same dataset. Descriptive statistics summarize this data, whereas inferential statistics use it to draw broader conclusions about a larger population.

2. Statistical Measures : They commonly utilize similar statistical measures like mean and standard deviation to describe datasets or infer information about populations based on samples.

3. Graphical Representations:  Both types of statistics can employ graphical representations like histograms, box plots, and scatterplots to visually represent data trends and patterns.

4. Summary Statistics:  Summary statistics are essential in both descriptive and inferential statistics to provide a concise overview of the data, including measures of central tendency and dispersion.

5. Analysis Foundation:  Descriptive statistics form the basis for inferential statistics. Properly summarizing and understanding sample data is crucial before making accurate inferences about the broader population.

Descriptive and inferential statistics are important categories in statistics. Descriptive statistics focus on summarizing data to reveal its characteristics and patterns. Meanwhile, inferential statistics use sample data to make predictions and draw conclusions about a larger population.

Both types of statistics are essential for data analysis, complementing each other to provide a complete understanding of datasets. This blog has explained these concepts clearly, highlighting their differences with practical examples. Understanding descriptive and inferential statistics helps analysts and researchers make informed decisions in their work across different fields.

Copyright © 2022 Jaro Education. All rights reserved.


2.3 Descriptive and Inferential Statistics

Learning objectives

  • Describe descriptive statistics and know how to produce them.
  • Describe inferential statistics and why they are used.

Descriptive statistics

In the previous section, we looked at some of the research designs psychologists use. In this section, we will provide an overview of some of the statistical approaches researchers take to understanding the results that are obtained in research. Descriptive statistics are the first step in understanding how to interpret the data you have collected. They are called descriptive because they organize and summarize some important properties of the data set. Keep in mind that researchers are often collecting data from hundreds of participants; descriptive statistics allow them to make some basic interpretations about the results without having to eyeball each result individually.

Let’s work through a hypothetical example to show how descriptive statistics help researchers to understand their data. Let’s assume that we have asked 40 people to report how many hours of moderate-to-vigorous physical activity they get each week. Let’s begin by constructing a frequency distribution of our hypothetical data that will show quickly and graphically what scores we have obtained.

Table 2.3. Distribution of exercise frequency

Number of people    Hours of exercise
1                   7
2                   6
6                   5
8                   4
7                   3
8                   2
5                   1
3                   0

We can now construct a histogram that will show the same thing on a graph (see Figure 2.5). Note how easy it is to see the shape of the frequency distribution of scores.
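As a rough sketch, the frequency distribution in Table 2.3 can be turned into summary values and a text histogram in Python. This reads the table as number of people against hours of exercise, which is an assumption based on the first column adding up to the 40 participants mentioned above; the variable names are illustrative.

```python
import statistics

# Frequency distribution from Table 2.3: hours of exercise -> number of people.
# (Column roles are inferred: the counts sum to the 40 participants.)
frequency = {7: 1, 6: 2, 5: 6, 4: 8, 3: 7, 2: 8, 1: 5, 0: 3}

# Expand the table back into one score per participant.
scores = [hours for hours, count in frequency.items() for _ in range(count)]

n = len(scores)
mean_hours = statistics.mean(scores)
median_hours = statistics.median(scores)
modes = sorted(statistics.multimode(scores))

# A rough text histogram: one '#' per participant.
for hours in sorted(frequency):
    print(f"{hours} h | {'#' * frequency[hours]}")
```

The text histogram is only a stand-in for Figure 2.5, but it shows the same shape: most participants report between 2 and 5 hours, with few at either extreme.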

Many variables that psychologists are interested in have distributions where most of the scores are located near the centre of the distribution, the distribution is symmetrical, and it is bell-shaped (see Figure 2.6). A data distribution that is shaped like a bell is known as a normal distribution. Normal distributions are common in human traits because of the tendency for variability; traits like intelligence, shoe size, and so on, are distributed such that relatively few people are either extremely high or low scorers, and most people fall somewhere near the middle.

A distribution can be described in terms of its central tendency (that is, the point in the distribution around which the data are centred) and its dispersion or spread. The arithmetic average, or arithmetic mean, symbolized by the letter M, is the most commonly used measure of central tendency. It is computed by calculating the sum of all the scores of the variable and dividing this sum by the number of participants in the distribution, denoted by the letter N. In the data presented in Figure 2.6, the mean height of the students is 67.12 inches (170.48 cm).

In some cases, however, the data distribution is not symmetrical. This occurs when there are one or more extreme scores, known as outliers, at one end of the distribution. Consider, for instance, the variable of family income (see Figure 2.7), which includes an outlier at a value of $3,800,000. In this case, the mean is not a good measure of central tendency. Although it appears from Figure 2.7 that the central tendency of the family income variable should be around $70,000, the mean family income is actually $223,960. The single very extreme income has a disproportionate impact on the mean, resulting in a value that does not well represent the central tendency.

The median is used as an alternative measure of central tendency when distributions are not symmetrical. The median is the score in the centre of the distribution, meaning that 50% of the scores are greater than the median and 50% of the scores are less than the median. In our case, the median household income of $73,000 is a much better indication of central tendency than is the mean household income of $223,960.

A final measure of central tendency, known as the mode, represents the value that occurs most frequently in the distribution. You can see from Figure 2.7 that the mode for the family income variable is $93,000; it occurs four times.
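A small Python sketch shows how a single outlier pulls the mean while leaving the median and mode untouched. The incomes below are hypothetical and are not the values behind Figure 2.7.

```python
import statistics

# Hypothetical family incomes in dollars; the last value is an extreme outlier.
incomes = [48_000, 57_000, 62_000, 70_000, 70_000, 75_000, 81_000, 93_000, 2_500_000]

mean_income = statistics.mean(incomes)
median_income = statistics.median(incomes)
mode_income = statistics.mode(incomes)

# Dropping the outlier shows how much it alone pulls the mean upward.
mean_without_outlier = statistics.mean(incomes[:-1])

print(f"mean:   {mean_income:,.0f}")
print(f"median: {median_income:,.0f}")
print(f"mode:   {mode_income:,.0f}")
print(f"mean without outlier: {mean_without_outlier:,.0f}")
```

With the outlier included, the mean lands far above every income except the outlier itself, whereas the median stays in the middle of the typical incomes.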

In addition to summarizing the central tendency of a distribution, descriptive statistics convey information about how the scores of the variable are spread around the central tendency. Dispersion refers to the extent to which the scores are all tightly clustered around the central tendency (see Figure 2.8). Here, there are many scores close to the middle of the distribution.

In other instances, they may be more spread out away from it (see Figure 2.9). Here, the scores are further away from the middle of the distribution.

One simple measure of dispersion is to find the largest (i.e., the maximum) and the smallest (i.e., the minimum) observed values of the variable and to compute the range of the variable as the maximum observed score minus the minimum observed score. You can check that the range of the height variable shown in Figure 2.6 above is 72 – 62 = 10 inches.

The standard deviation, symbolized as s, is the most commonly used measure of variability around the mean. Distributions with a larger standard deviation have more spread. Those with small standard deviations have scores that do not stray very far from the average score. Thus, standard deviation is a good measure of the typical deviation from the mean in a set of scores. In the examples above, the standard deviation of height is s = 2.74 inches, and the standard deviation of family income is s = $745,337. These standard deviations would be more informative if we had others to compare them to. For example, suppose we obtained a different sample of adult heights and compared it to those shown in Figure 2.6 above. If the standard deviation was very different, that would tell us something important about the variability in the second sample as compared to the first. A more relatable example might be student grades: a professor could keep track of student grades over many semesters. If the standard deviations were relatively similar from semester to semester, this would indicate that the amount of variability in student performance is fairly constant. If the standard deviation suddenly went up, that would indicate that there are more students with very low scores, very high scores, or both. It's useful to see how standard deviation is calculated: a good demonstration can be found at Khan Academy.
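For readers who want to see the calculation spelled out, here is a minimal step-by-step sketch in Python. The heights are hypothetical (they are not the data behind Figure 2.6), and the sketch assumes the sample formula that divides by n - 1; dividing by n instead would give the population version.

```python
import math
import statistics

# Hypothetical heights in inches (illustrative values only).
heights = [62, 64, 66, 67, 67, 68, 69, 70, 72]

n = len(heights)
mean_height = sum(heights) / n

# Step 1: deviation of each score from the mean.
deviations = [h - mean_height for h in heights]
# Step 2: square the deviations and add them up.
sum_of_squares = sum(d ** 2 for d in deviations)
# Step 3: divide by n - 1 (sample variance), then take the square root.
s = math.sqrt(sum_of_squares / (n - 1))

# Cross-check against the standard library's sample standard deviation.
assert math.isclose(s, statistics.stdev(heights))

value_range = max(heights) - min(heights)
print(f"mean = {mean_height:.2f}, s = {s:.2f}, range = {value_range}")
```

Comparing s with the range shows why both are reported: the range depends only on the two most extreme scores, while s reflects how far every score sits from the mean.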

The standard deviation in the normal distribution has some interesting properties (see Figure 2.10 ). Approximately 68% of the data fall within 1 standard deviation above or below the mean score: 34% fall between the mean and 1 standard deviation above it, and 34% fall between the mean and 1 standard deviation below it. In other words, about 2/3 of the population are within 1 standard deviation of the mean. Therefore, if some variable is normally distributed (e.g., height, IQ, etc.), you can quickly work out where approximately 2/3 of the population fall by knowing the mean and standard deviation.
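The 68% figure can be checked numerically. The following is a minimal sketch using only Python's standard library; the helper name `normal_prob_within` is an illustrative assumption, and the calculation relies on the standard identity that a normal variable lies within k standard deviations of its mean with probability erf(k / √2).

```python
import math


def normal_prob_within(k: float) -> float:
    """Probability that a normally distributed value lies within
    k standard deviations of the mean (either side)."""
    return math.erf(k / math.sqrt(2))


# Share of a normal distribution within 1 standard deviation of the mean.
within_one_sd = normal_prob_within(1)

# By symmetry, half of that share lies above the mean and half below.
one_side = within_one_sd / 2

print(f"Within 1 SD of the mean: {within_one_sd:.1%}")
print(f"Between the mean and 1 SD above it: {one_side:.1%}")
```

The same helper can be called with 2 or 3 to see how much of the distribution lies within 2 or 3 standard deviations.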

Inferential statistics

We have seen that descriptive statistics are useful in providing an initial way to describe, summarize, and interpret a set of data. They are limited in usefulness because they tell us nothing about how meaningful the data are. The second step in analyzing data requires inferential statistics . Inferential statistics provide researchers with the tools to make inferences about the meaning of the results. Specifically, they allow researchers to generalize from the sample they used in their research to the greater population, which the sample represents. Keep in mind that psychologists, like other scientists, rely on relatively small samples to try to understand populations.

This is not a textbook about statistics, so we will limit the discussion of inferential statistics. However, all students of psychology should become familiar with one very important inferential statistic: the significance test. In the simplest, non-mathematical terms, the significance test estimates how likely it is that results at least as strong as those observed would occur by chance alone if there were no real effect in the population. Significance testing is not the same thing as estimating how meaningful or large the results are. For example, with a large enough sample, you might find a very small difference between two experimental conditions that is statistically significant.

Typically, most researchers use the convention that if significance testing shows that results this extreme would occur less than 5% of the time by chance alone, the result is treated as statistically significant and is taken to generalize to the population. If that probability is greater than 5%, the result is considered non-significant; such a result could plausibly be a chance finding and, therefore, should not be generalized to the population. Significance tests are reported as p values ; for example, p < .05 means that, if there were no real effect, results at least this extreme would be expected less than 5% of the time. P values are reported by all statistical programs, so students no longer need to calculate them by hand. Most often, p values are used to decide whether or not effects detected in the research are present. So, if p < .05, researchers conclude that an effect is present and that the difference between the two groups is unlikely to be due to chance alone.

Thus, p values provide information about the presence of an effect. However, for information about how meaningful or large an effect is, significance tests are of little value. For that, we need some measure of effect size. Effect size is a measure of magnitude; for example, if there is a difference between two experimental groups, how large is the difference? There are a few different statistics for calculating effect sizes.
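To make the distinction between a significance test and an effect size concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the two sets of scores are made up, the significance test shown is an exact permutation test (one of several possible tests), and the effect size shown is Cohen's d (one of several possible effect size statistics).

```python
from itertools import combinations
from statistics import mean, variance

# Hypothetical scores for two experimental conditions (illustrative only).
group_a = [12, 15, 14, 10, 13]
group_b = [16, 18, 17, 15, 19]


def permutation_p_value(a, b):
    """Exact two-sided permutation test on the difference in means.

    Re-divides the pooled scores into two groups in every possible way and
    counts how often the difference in means is at least as large as the
    one actually observed.
    """
    pooled = a + b
    observed = abs(mean(b) - mean(a))
    extreme = 0
    total = 0
    for idx in combinations(range(len(pooled)), len(a)):
        chosen = set(idx)
        first = [pooled[i] for i in idx]
        second = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        total += 1
        # Small tolerance guards against floating-point rounding.
        if abs(mean(second) - mean(first)) >= observed - 1e-12:
            extreme += 1
    return extreme / total


def cohens_d(a, b):
    """Difference in means divided by the pooled standard deviation."""
    n_a, n_b = len(a), len(b)
    pooled_var = ((n_a - 1) * variance(a) + (n_b - 1) * variance(b)) / (n_a + n_b - 2)
    return (mean(b) - mean(a)) / pooled_var ** 0.5


p_value = permutation_p_value(group_a, group_b)
effect_size = cohens_d(group_a, group_b)

print(f"p value: {p_value:.4f}")                   # is an effect present?
print(f"Cohen's d: {effect_size:.2f}")             # how large is the effect?
```

The p value addresses whether a difference this large is plausible under chance alone, while Cohen's d expresses the size of the difference in standard deviation units.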

In summary, statistics are an important tool in helping researchers understand the data that they have collected. Once the statistics have been calculated, the researchers interpret their results. Thus, while statistics are heavily used in the analysis of data, the interpretation of the results requires a researcher’s knowledge, analysis, and expertise.

Key Takeaways

  • Descriptive statistics organize and summarize some important properties of the data set. Frequency distributions and histograms are effective tools for visualizing the data set. Measures of central tendency and dispersion are descriptive statistics.
  • Many human characteristics are normally distributed.
  • Measures of central tendency describe the central point around which the scores are distributed. There are three different measures of central tendency.
  • The range and standard deviation show the dispersion of scores as well as the shape of the distribution of the scores. The standard deviation of the normal distribution has some special properties.
  • Inferential statistics provide researchers with the tools to make inferences about the meaning of the results, specifically about generalizing from the sample they used in their research to the greater population, which the sample represents.
  • Significance tests are commonly used to assess the probability that observed results were due to chance. Effect sizes are commonly used to estimate how large an effect has been obtained.

Exercises and Critical Thinking

  • Keep track of something you do over a week, such as your daily amount of exercise, sleep, cups of coffee, or social media time. Record your scores for each day. At the end of the week, construct a frequency distribution of your results, and draw a histogram that represents them. Calculate all three measures of central tendency, and decide which one best represents your data and why. Invite a friend or family member to participate, and do the same for their data. Compare your data sets. Whose shows the greatest dispersion around the mean, and how do you know?
  • The data for one person cannot generalize to the population. Consider why people might have different scores than yours.

Image Attribution

Figure 2.5. Used under a CC BY-NC-SA 4.0 license.

Figure 2.6. Used under a CC BY-NC-SA 4.0 license.

Figure 2.7. Used under a CC BY-NC-SA 4.0 license.

Figure 2.8. Used under a CC BY-NC-SA 4.0 license.

Figure 2.9. Used under a CC BY-NC-SA 4.0 license.

Figure 2.10. Empirical Rule by Dan Kernler is used under a CC BY-SA 4.0 license.

Long Descriptions

Figure 2.7. Of the 25 families, 24 families have an income between $44,000 and $111,000, and only one family has an income of $3,800,000. The mean income is $223,960, while the median income is $73,000.

Psychology - 1st Canadian Edition Copyright © 2020 by Sally Walters is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License , except where otherwise noted.

Descriptive and Inferential Statistics

Descriptive and inferential statistics are two fields of statistics. Descriptive statistics is used to describe data and inferential statistics is used to make predictions. Descriptive and inferential statistics have different tools that can be used to draw conclusions about the data.

Descriptive statistics uses tools such as measures of central tendency and dispersion, while inferential statistics makes use of hypothesis testing, regression analysis, and confidence intervals. In this article, we will learn more about descriptive and inferential statistics, their differences, associated formulas, and examples.

What is Descriptive and Inferential Statistics?

The purpose of descriptive and inferential statistics is to analyze different types of data using different tools. Descriptive statistics helps to describe and organize known data using charts, bar graphs, etc., while inferential statistics aims at making inferences and generalizations about the population data.

Descriptive and Inferential Statistics Types

Descriptive Statistics

Descriptive statistics are a part of statistics that can be used to describe data. It is used to summarize the attributes of a sample in such a way that a pattern can be drawn from the group. It enables researchers to present data in a more meaningful way such that easy interpretations can be made. Descriptive statistics uses two tools to organize and describe data. These are given as follows:

  • Measures of Central Tendency - These help to describe the central position of the data by using measures such as mean , median , and mode .
  • Measures of Dispersion - These measures help to see how spread out the data is in a distribution with respect to a central point. Range , standard deviation, variance , quartiles, and absolute deviation are the measures of dispersion.

Inferential Statistics

Inferential statistics is a branch of statistics that is used to make inferences about the population by analyzing a sample. When the population data is very large it becomes difficult to use it. In such cases, certain samples are taken that are representative of the entire population. Inferential statistics draws conclusions regarding the population using these samples. Sampling strategies such as simple random sampling, cluster sampling, stratified sampling, and systematic sampling, need to be used in order to choose correct samples from the population. Some methodologies used in inferential statistics are as follows:

  • Hypothesis Testing - This technique involves the use of hypothesis tests such as the z test , f test , t test, etc. to make inferences about the population data . It requires setting up the null hypothesis, alternative hypothesis, and testing the decision criteria.
  • Regression Analysis - Such a technique is used to check the relationship between dependent and independent variables. The most commonly used type of regression is linear regression.

Difference Between Descriptive and Inferential Statistics

Both descriptive and inferential statistics are equally important to analyze data. Descriptive statistics are used to order data and describe the sample using the mean, standard deviation, charts, etc. Inferential statistics uses this sample data to predict the trend of the population data. The differences between descriptive and inferential statistics have been outlined in the table given below:

Basis | Descriptive Statistics | Inferential Statistics
Definition | Descriptive statistics is used to describe the characteristics of a sample or data set. | Inferential statistics uses various analytical tools to draw inferences about the population using samples.
Tools | Measures of central tendency and measures of dispersion. | Hypothesis testing and regression analysis.
Use | Organizes, describes and presents data in a meaningful way with the help of charts and graphs. | Tests, predicts, and compares data obtained from various samples.
Relevance | It is used to summarize known data in a way that can be used for further predictions and analysis. | It tries to use the summarized samples to draw conclusions about the population.

Descriptive and Inferential Statistics Formulas

There are many statistical formulas that fall under descriptive and inferential statistics. These are given as follows:

Descriptive Statistics:

  • Mean = \(\frac{\sum x_{i}}{n}\)
  • Mode = Most frequently occurring observation
  • Median (n is odd) = [(n + 1) / 2]th term of the ordered data
  • Median (n is even) = [(n / 2)th term + ((n / 2) + 1)th term] / 2 of the ordered data
  • Sample Variance = \(\frac{\sum \left ( X_{i}-\overline{X} \right )^{2}}{n - 1}\)
  • Sample Standard Deviation = \(\sqrt{\frac{\sum \left ( X_{i}-\overline{X} \right )^{2}}{n - 1}}\)
  • Range = Highest observation - Lowest observation

Inferential Statistics:

  • Z score = \(\frac{x-\mu}{\sigma}\)
  • F test statistic = \(\frac{\sigma_{1}^{2}}{\sigma_{2}^{2}}\)

Examples of Descriptive and Inferential Statistics

Descriptive and inferential statistics need to be used hand in hand so as to analyze the data in the best possible way. Some examples of descriptive and inferential statistics are given below:

  • Suppose the scores of 100 students belonging to a specific country are available. The performance of these students needs to be examined. The raw scores by themselves will not yield any valuable results. However, by using descriptive statistics, the center and spread of the marks can be obtained, giving a clear idea of how these 100 students performed.
  • Now suppose the scores of the students of an entire country need to be examined. Using a sample of, say 100 students, inferential statistics is used to make generalizations about the population.
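The two scenarios above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the population is simulated (its mean of 65 and standard deviation of 12 are made up), the seed and variable names are arbitrary, and the inferential step shown is an approximate 95% confidence interval for the mean, which is one of several inferential tools.

```python
import random
from statistics import NormalDist, mean, stdev

random.seed(0)  # fixed seed so the sketch is reproducible

# Hypothetical population: exam scores for 50,000 students in a country.
population = [random.gauss(65, 12) for _ in range(50_000)]

# Descriptive step: summarize the 100 students actually sampled.
sample = random.sample(population, 100)
sample_mean = mean(sample)
sample_sd = stdev(sample)

# Inferential step: approximate 95% confidence interval for the
# population mean, based only on the sample.
z = NormalDist().inv_cdf(0.975)             # roughly 1.96
margin = z * sample_sd / len(sample) ** 0.5
ci_low, ci_high = sample_mean - margin, sample_mean + margin

print(f"Sample mean: {sample_mean:.1f}, sample SD: {sample_sd:.1f}")
print(f"95% CI for the population mean: {ci_low:.1f} to {ci_high:.1f}")
```

The sample mean and standard deviation describe only the 100 sampled students; the interval is a statement about the whole population and carries uncertainty.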

Related Articles:

  • Mean Median Mode
  • Probability and Statistics
  • Data Handling
  • Summary Statistics
  • Regression Coefficients

Important Notes on Descriptive and Inferential Statistics

  • Descriptive and inferential statistics are used to analyze data, obtain samples and make inferences about the population.
  • The tools used in descriptive statistics are measures of central tendency and dispersion.
  • The tools used in inferential statistics are hypothesis testing and regression analysis.

Examples on Descriptive and Inferential Statistics

Example 1: The scores of 2 groups of students belonging to different classes are noted. Using descriptive and inferential statistics see which group exhibits a higher variability in performance.

Group A: 56, 58, 60, 62, 64

Group B: 40, 50, 60, 70, 80

Solution: To describe the variability in performance the variance is used. Thus, descriptive statistics is used to analyze this data.

Group A mean = (56 + 58 + 60 + 62 + 64) / 5 = 60

Group A variance = [(56 - 60)\(^{2}\) + (58 - 60)\(^{2}\) + (60 - 60)\(^{2}\) + (62 - 60)\(^{2}\) + (64 - 60)\(^{2}\)] / (5 - 1) = 10

Group B mean = (40 + 50 + 60 + 70 + 80) / 5 = 60

Group B variance = [(40 - 60)\(^{2}\) + (50 - 60)\(^{2}\) + (60 - 60)\(^{2}\) + (70 - 60)\(^{2}\) + (80 - 60)\(^{2}\)] / (5 - 1) = 250

By looking at the variance it is clear that group B displays higher variance than group A.

Answer: Group B is more variable.
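The arithmetic in Example 1 can be reproduced with a short Python sketch. The function name `sample_variance` is an illustrative choice; it applies the sample variance formula (dividing by n - 1) and is compared with the standard library's `statistics.variance`, which uses the same definition.

```python
from statistics import mean, variance

group_a = [56, 58, 60, 62, 64]
group_b = [40, 50, 60, 70, 80]


def sample_variance(scores):
    """Sum of squared deviations from the mean, divided by n - 1."""
    m = mean(scores)
    return sum((x - m) ** 2 for x in scores) / (len(scores) - 1)


var_a = sample_variance(group_a)
var_b = sample_variance(group_b)

print(f"Group A: mean = {mean(group_a)}, variance = {var_a}")
print(f"Group B: mean = {mean(group_b)}, variance = {var_b}")
print("More variable group:", "B" if var_b > var_a else "A")
```

Both groups have the same mean, so the difference between them only becomes visible through a measure of dispersion.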

Example 2: Find the mode of the following data using descriptive statistics. 5, 6, 2, 7, 6, 5,1, 9, 5, 8, 5, 4, 3, 12, 11, 17, 5, 5

Solution: Mode is the most frequently occurring observation. Thus, the mode is 5

Answer: Mode = 5
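For longer data sets, counting by eye becomes error-prone. A minimal sketch of the same calculation in Python, using `collections.Counter` from the standard library (the variable names are illustrative):

```python
from collections import Counter

data = [5, 6, 2, 7, 6, 5, 1, 9, 5, 8, 5, 4, 3, 12, 11, 17, 5, 5]

# Count how many times each value occurs, then take the most frequent one.
counts = Counter(data)
mode_value, mode_count = counts.most_common(1)[0]

print(f"Mode = {mode_value} (occurs {mode_count} times)")
```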

Example 3: Find the z score using descriptive and inferential statistics for the given data. Population mean 100, sample mean 120, population variance 49 and size 10.

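Solution: Inferential statistics is used to find the z score of the data. Because the value being standardized is the mean of a sample of size n = 10, the population standard deviation is divided by \(\sqrt{n}\):

z = \(\frac{\overline{x}-\mu}{\sigma / \sqrt{n}}\)

Standard deviation = \(\sqrt{49}\) = 7

z = (120 - 100) / (7 / \(\sqrt{10}\))

= 20 / 2.2136 = 9.04

Answer: Z score = 9.04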

FAQs on Descriptive and Inferential Statistics

What is the Meaning of Descriptive and Inferential Statistics?

Descriptive and inferential statistics are two branches of statistics that are used to describe data and make important inferences about the population using samples.

What are the Tools Used in Descriptive and Inferential Statistics?

The tools used in descriptive and inferential statistics are measures of central tendency, measures of dispersion , hypothesis testing, and regression analysis.

What are the Important Formulas in Descriptive and Inferential Statistics?

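The important formulas used in descriptive and inferential statistics include those for the mean, median, mode, sample variance, sample standard deviation, and range, along with the z score and the F test statistic.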

When Do You Use Descriptive and Inferential Statistics?

Descriptive and inferential statistics are used to analyze data. Descriptive statistics is used to describe and organize data while inferential statistics draw conclusions about the population from samples by using analytical tools.

Is Hypothesis Testing a Part of Descriptive and Inferential Statistics?

Yes, hypothesis tests such as z test, f test, ANOVA test, and t-test are a part of descriptive and inferential statistics. Hypothesis testing along with regression analysis specifically fall under inferential statistics.

What Is the Similarity Between Descriptive and Inferential Statistics?

The similarity between descriptive and inferential statistics is that they both rely on the same data set. Descriptive statistics describes this data set while inferential statistics uses this data set to make generalizations about a population

What is the Difference Between Descriptive and Inferential Statistics?

The main difference between descriptive and inferential statistics is that the former is used to describe the characteristics of a data set while the latter focuses on making predictions and generalizations about the data.

Descriptive Statistics | Definitions, Types, Examples

Published on 4 November 2022 by Pritha Bhandari . Revised on 9 January 2023.

Descriptive statistics summarise and organise characteristics of a data set. A data set is a collection of responses or observations from a sample or entire population .

In quantitative research , after collecting data, the first step of statistical analysis is to describe characteristics of the responses, such as the average of one variable (e.g., age), or the relation between two variables (e.g., age and creativity).

The next step is inferential statistics , which help you decide whether your data confirms or refutes your hypothesis and whether it is generalisable to a larger population.

Table of contents

  • Types of descriptive statistics
  • Frequency distribution
  • Measures of central tendency
  • Measures of variability
  • Univariate descriptive statistics
  • Bivariate descriptive statistics
  • Frequently asked questions

There are 3 main types of descriptive statistics:

  • The distribution concerns the frequency of each value.
  • The central tendency concerns the averages of the values.
  • The variability or dispersion concerns how spread out the values are.

Types of descriptive statistics

You can apply these to assess only one variable at a time, in univariate analysis, or to compare two or more, in bivariate and multivariate analysis.

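For example, a survey might ask participants how many times in the past year they did each of the following:

  • Go to a library
  • Watch a movie at a theater
  • Visit a national park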

A data set is made up of a distribution of values, or scores. In tables or graphs, you can summarise the frequency of every possible value of a variable in numbers or percentages.

Simple frequency distribution table

Gender | Number
Male | 182
Female | 235
Other | 27

From this table, you can see that more women than men or people with another gender identity took part in the study. In a grouped frequency distribution, you can group numerical response values and add up the number of responses for each group. You can also convert each of these numbers to percentages.

Grouped frequency distribution table

Library visits in the past year | Percent
0–4 | 6%
5–8 | 20%
9–12 | 42%
13–16 | 24%
17+ | 8%

Measures of central tendency estimate the center, or average, of a data set. The mean , median and mode are 3 ways of finding the average.

Here we will demonstrate how to calculate the mean, median, and mode using the first 6 responses of our survey.

The mean , or M , is the most commonly used method for finding the average.

To find the mean, simply add up all response values and divide the sum by the total number of responses. The total number of responses or observations is called N .

Mean number of library visits
Data set | 15, 3, 12, 0, 24, 3
Sum of all values | 15 + 3 + 12 + 0 + 24 + 3 = 57
Total number of responses | N = 6
Mean | Divide the sum of values by N to find M: 57/6 = 9.5

The median is the value that’s exactly in the middle of a data set.

To find the median, order each response value from the smallest to the biggest. Then, the median is the number in the middle. If there are two numbers in the middle, find their mean.

Median number of library visits
Ordered data set | 0, 3, 3, 12, 15, 24
Middle numbers | 3, 12
Median | Find the mean of the two middle numbers: (3 + 12)/2 = 7.5

The mode is simply the most popular or most frequent response value. A data set can have no mode, one mode, or more than one mode.

To find the mode, order your data set from lowest to highest and find the response that occurs most frequently.

Mode number of library visits
Ordered data set | 0, 3, 3, 12, 15, 24
Mode | Find the most frequently occurring response: 3

Measures of variability give you a sense of how spread out the response values are. The range, standard deviation and variance each reflect different aspects of spread.

The range gives you an idea of how far apart the most extreme response scores are. To find the range , simply subtract the lowest value from the highest value. For example, if the lowest response is 0 and the highest is 24, the range is 24 – 0 = 24.

Standard deviation

The standard deviation ( s ) is the average amount of variability in your dataset. It tells you, on average, how far each score lies from the mean. The larger the standard deviation, the more variable the data set is.

There are six steps for finding the standard deviation:

  1. List each score and find their mean.
  2. Subtract the mean from each score to get the deviation from the mean.
  3. Square each of these deviations.
  4. Add up all of the squared deviations.
  5. Divide the sum of the squared deviations by N – 1.
  6. Find the square root of the number you found.

Raw data | Deviation from mean | Squared deviation
15 | 15 – 9.5 = 5.5 | 30.25
3 | 3 – 9.5 = -6.5 | 42.25
12 | 12 – 9.5 = 2.5 | 6.25
0 | 0 – 9.5 = -9.5 | 90.25
24 | 24 – 9.5 = 14.5 | 210.25
3 | 3 – 9.5 = -6.5 | 42.25
M = 9.5 | Sum = 0 | Sum of squares = 421.5

Step 5: 421.5/5 = 84.3

Step 6: √84.3 = 9.18
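The six steps can also be followed line by line in code. Below is a minimal Python sketch using the same six library-visit responses; the variable names are illustrative, and the comments map each line to the step it carries out.

```python
import math

scores = [15, 3, 12, 0, 24, 3]

# Step 1: list each score and find their mean.
n = len(scores)
mean_score = sum(scores) / n

# Step 2: subtract the mean from each score to get the deviations.
deviations = [x - mean_score for x in scores]

# Step 3: square each deviation.
squared_deviations = [d ** 2 for d in deviations]

# Step 4: add up all of the squared deviations.
sum_of_squares = sum(squared_deviations)

# Step 5: divide the sum of squares by N - 1 (this is the variance).
variance = sum_of_squares / (n - 1)

# Step 6: take the square root to get the standard deviation.
standard_deviation = math.sqrt(variance)

print(f"Mean = {mean_score}")
print(f"Sum of squares = {sum_of_squares}")
print(f"Variance = {variance:.1f}")
print(f"Standard deviation = {standard_deviation:.2f}")
```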

The variance is the average of squared deviations from the mean. Variance reflects the degree of spread in the data set. The more spread the data, the larger the variance is in relation to the mean.

To find the variance, simply square the standard deviation. The symbol for variance is s² .

Univariate descriptive statistics focus on only one variable at a time. It’s important to examine data from each variable separately using multiple measures of distribution, central tendency and spread. Programs like SPSS and Excel can be used to easily calculate these.

Statistic | Visits to the library
N | 6
Mean | 9.5
Median | 7.5
Mode | 3
Standard deviation | 9.18
Variance | 84.3
Range | 24
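Besides SPSS and Excel, the same univariate summary can be produced in a few lines of code. This is a minimal sketch using Python's built-in `statistics` module on the six library-visit responses; the dictionary layout and rounding choices are assumptions made for display.

```python
import statistics

visits = [15, 3, 12, 0, 24, 3]

summary = {
    "N": len(visits),
    "Mean": statistics.mean(visits),
    "Median": statistics.median(visits),
    "Mode": statistics.mode(visits),
    # stdev and variance use the sample formulas (dividing by N - 1).
    "Standard deviation": round(statistics.stdev(visits), 2),
    "Variance": round(statistics.variance(visits), 1),
    "Range": max(visits) - min(visits),
}

for name, value in summary.items():
    print(f"{name}: {value}")
```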

If you were to only consider the mean as a measure of central tendency, your impression of the ‘middle’ of the data set can be skewed by outliers, unlike the median or mode.

Likewise, while the range is sensitive to extreme values, you should also consider the standard deviation and variance to get easily comparable measures of spread.

If you’ve collected data on more than one variable, you can use bivariate or multivariate descriptive statistics to explore whether there are relationships between them.

In bivariate analysis, you simultaneously study the frequency and variability of two variables to see if they vary together. You can also compare the central tendency of the two variables before performing further statistical tests .

Multivariate analysis is the same as bivariate analysis but with more than two variables.

Contingency table

In a contingency table, each cell represents the intersection of two variables. Usually, an independent variable (e.g., gender) appears along the vertical axis and a dependent one appears along the horizontal axis (e.g., activities). You read ‘across’ the table to see how the independent and dependent variables relate to each other.

Number of visits to the library in the past year
Group | 0–4 | 5–8 | 9–12 | 13–16 | 17+
Children | 32 | 68 | 37 | 23 | 22
Adults | 36 | 48 | 43 | 83 | 25

Interpreting a contingency table is easier when the raw data is converted to percentages. Percentages make each row comparable to the other by making it seem as if each group had only 100 observations or participants. When creating a percentage-based contingency table, you add the N for each independent variable on the end.

Visits to the library in the past year (Percentages)
Group | 0–4 | 5–8 | 9–12 | 13–16 | 17+ | N
Children | 18% | 37% | 20% | 13% | 12% | 182
Adults | 15% | 20% | 18% | 35% | 11% | 235
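Converting the raw counts to row percentages is a mechanical step that is easy to script. The following minimal Python sketch uses the counts from the contingency table above; the function name `row_percentages` and the dictionary layout are illustrative choices.

```python
bins = ["0–4", "5–8", "9–12", "13–16", "17+"]
counts = {
    "Children": [32, 68, 37, 23, 22],
    "Adults": [36, 48, 43, 83, 25],
}


def row_percentages(row):
    """Return each count as a whole-number percentage of the row total,
    together with the row total N."""
    total = sum(row)
    return [round(100 * count / total) for count in row], total


for group, row in counts.items():
    percentages, n = row_percentages(row)
    cells = ", ".join(f"{b}: {p}%" for b, p in zip(bins, percentages))
    print(f"{group} (N = {n}): {cells}")
```

Because each percentage is rounded separately, a row will not always sum to exactly 100%.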

From this table, it is clearer that similar proportions of children and adults go to the library 17 or more times a year. Additionally, children most commonly went to the library between 5 and 8 times, while for adults, this number was between 13 and 16.

Scatter plots

A scatter plot is a chart that shows you the relationship between two or three variables. It’s a visual representation of the strength of a relationship.

In a scatter plot, you plot one variable along the x-axis and another one along the y-axis. Each data point is represented by a point in the chart.

From your scatter plot, you see that as the number of movies seen at movie theaters increases, the number of visits to the library decreases. Based on your visual assessment of a possible linear relationship, you perform further tests of correlation and regression.

Descriptive statistics: Scatter plot
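A visual impression of a linear relationship is usually followed up with a correlation coefficient. Here is a minimal sketch of Pearson's r in Python; the paired values for movie and library visits are made up purely for illustration, and `pearson_r` is an illustrative helper written out by hand rather than taken from any particular library.

```python
import math

# Hypothetical paired observations for six survey participants.
movie_visits = [0, 2, 4, 6, 8, 10]
library_visits = [20, 17, 15, 10, 8, 3]


def pearson_r(x, y):
    """Pearson correlation: co-variation of x and y divided by the
    product of their individual variations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / math.sqrt(ss_x * ss_y)


r = pearson_r(movie_visits, library_visits)
print(f"Pearson r = {r:.2f}")
```

A value of r close to -1 matches a scatter plot in which library visits fall steadily as movie visits rise.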

Descriptive statistics summarise the characteristics of a data set. Inferential statistics allow you to test a hypothesis or assess whether your data is generalisable to the broader population.

The 3 main types of descriptive statistics concern the frequency distribution, central tendency, and variability of a dataset.

  • Distribution refers to the frequencies of different responses.
  • Measures of central tendency give you the average for each response.
  • Measures of variability show you the spread or dispersion of your dataset.
  • Univariate statistics summarise only one variable  at a time.
  • Bivariate statistics compare two variables .
  • Multivariate statistics compare more than two variables .

Cite this Scribbr article

Bhandari, P. (2023, January 09). Descriptive Statistics | Definitions, Types, Examples. Scribbr. Retrieved 21 August 2024, from https://www.scribbr.co.uk/stats/descriptive-statistics-explained/

Duke University

Inferential Statistics

This course is part of Data Analysis with R Specialization

Skills you'll gain

  • Statistical Inference
  • Statistical Hypothesis Testing
  • R Programming

There are 5 modules in this course

This course covers commonly used statistical inference methods for numerical and categorical data. You will learn how to set up and perform hypothesis tests, interpret p-values, and report the results of your analysis in a way that is interpretable for clients or the public. Using numerous data examples, you will learn to report estimates of quantities in a way that expresses the uncertainty of the quantity of interest. You will be guided through installing and using R and RStudio (free statistical software), and will use this software for lab exercises and a final project. The course introduces practical tools for performing data analysis and explores the fundamental concepts necessary to interpret and report results for both categorical and numerical data

About the Specialization and the Course

This short module introduces the basics of Coursera specializations and courses in general, this specialization (Statistics with R), and this course (Inferential Statistics). Please take several minutes to browse through them. Thanks for joining us in this course!

What's included

2 readings • total 20 minutes.

  • About Statistics with R Specialization • 10 minutes
  • More about Inferential Statistics • 10 minutes

Central Limit Theorem and Confidence Interval

Welcome to Inferential Statistics! In this course we will discuss Foundations for Inference. Check out the learning objectives, start watching the videos, and finally work on the quiz and the labs of this week. In addition to videos that introduce new concepts, you will also see a few videos that walk you through application examples related to the week's topics. In the first week we will introduce Central Limit Theorem (CLT) and confidence interval.

7 videos 6 readings 3 quizzes

7 videos • Total 65 minutes

  • Introduction • 4 minutes • Preview module
  • Sampling Variability and CLT • 20 minutes
  • CLT (for the mean) examples • 10 minutes
  • Confidence Interval (for a mean) • 11 minutes
  • Accuracy vs. Precision • 7 minutes
  • Required Sample Size for ME • 4 minutes
  • CI (for the mean) examples • 5 minutes

6 readings • Total 60 minutes

  • Lesson Learning Objectives • 10 minutes
  • Week 1 Suggested Readings and Practice Exercises • 10 minutes
  • About Lab Choices • 10 minutes
  • Week 1 Lab Instructions (RStudio) • 10 minutes
  • Week 1 Lab Instructions (RStudio Cloud) • 10 minutes

3 quizzes • Total 90 minutes

  • Week 1 Practice Quiz • 30 minutes
  • Week 1 Quiz • 30 minutes
  • Week 1 Lab • 30 minutes

Inference and Significance

Welcome to Week Two! This week we will discuss formal hypothesis testing and relate testing procedures back to estimation via confidence intervals. These topics will be introduced within the context of working with a population mean, however we will also give you a brief peek at what's to come in the next two weeks by discussing how the methods we're learning can be extended to other estimators. We will also discuss crucial considerations like decision errors and statistical vs. practical significance. The labs for this week will illustrate concepts of sampling distributions and confidence levels.

7 videos 5 readings 3 quizzes

7 videos • Total 59 minutes

  • Another Introduction to Inference • 4 minutes • Preview module
  • Hypothesis Testing (for a mean) • 14 minutes
  • HT (for the mean) examples • 9 minutes
  • Inference for Other Estimators • 10 minutes
  • Decision Errors • 8 minutes
  • Significance vs. Confidence Level • 6 minutes
  • Statistical vs. Practical Significance • 7 minutes

5 readings • Total 50 minutes

  • Week 2 Suggested Readings and Practice Exercises • 10 minutes
  • Week 2 Lab Instructions (RStudio) • 10 minutes
  • Week 2 Lab Instructions (RStudio Cloud) • 10 minutes

3 quizzes • Total 76 minutes

  • Week 2 Practice Quiz • 30 minutes
  • Week 2 Quiz • 16 minutes
  • Week 2 Lab • 30 minutes

Inference for Comparing Means

Welcome to Week Three of the course! This week we will introduce the t-distribution and comparing means as well as a simulation based method for creating a confidence interval: bootstrapping. If you have questions or discussions, please use this week's forum to ask/discuss with peers.

11 videos 5 readings 3 quizzes

11 videos • Total 83 minutes

  • t-distribution • 7 minutes
  • Inference for a mean • 9 minutes
  • Inference for comparing two independent means • 8 minutes
  • Inference for comparing two paired means • 9 minutes
  • Power • 11 minutes
  • Comparing more than two means • 6 minutes
  • ANOVA • 9 minutes
  • Conditions for ANOVA • 2 minutes
  • Multiple comparisons • 6 minutes
  • Bootstrapping • 8 minutes

5 readings

  • Week 3 Suggested Readings and Practice Exercises • 10 minutes
  • Week 3 Lab Instructions (RStudio) • 10 minutes
  • Week 3 Lab Instructions (RStudio Cloud) • 10 minutes

3 quizzes

  • Week 3 Practice Quiz • 30 minutes
  • Week 3 Quiz • 30 minutes
  • Week 3 Lab • 30 minutes

Inference for Proportions

Welcome to Week Four of our course! In this unit, we’ll discuss inference for categorical data. We use methods introduced this week to answer questions like “What proportion of the American public approves of the job the Supreme Court is doing?” Also this week, you will use the data set provided to complete and report on a data analysis question. Please read the project instructions to complete this self-assessment.

11 videos 6 readings 3 quizzes

11 videos • Total 117 minutes

  • Introduction • 3 minutes • Preview module
  • Sampling Variability and CLT for Proportions • 15 minutes
  • Confidence Interval for a Proportion • 9 minutes
  • Hypothesis Test for a Proportion • 9 minutes
  • Estimating the Difference Between Two Proportions • 17 minutes
  • Hypothesis Test for Comparing Two Proportions • 13 minutes
  • Small Sample Proportions • 10 minutes
  • Examples • 4 minutes
  • Comparing Two Small Sample Proportions • 5 minutes
  • Chi-Square GOF Test • 14 minutes
  • The Chi-Square Independence Test • 11 minutes

6 readings • Total 110 minutes

  • Week 4 Suggested Readings and Practice Exercises • 10 minutes
  • Week 4 Lab Instructions (RStudio) • 10 minutes
  • Week 4 Lab Instructions (RStudio Cloud) • 10 minutes
  • Project Instructions, Data Files, and Checklist • 60 minutes

3 quizzes • Total 90 minutes

  • Week 4 Practice Quiz • 30 minutes
  • Week 4 Quiz • 30 minutes
  • Week 4 Lab • 30 minutes

Instructor ratings

We asked all learners to give feedback on our instructors based on the quality of their teaching style.

Mine Çetinkaya-Rundel

Duke University has about 13,000 undergraduate and graduate students and a world-class faculty helping to expand the frontiers of knowledge. The university has a strong commitment to applying knowledge in service to society, both near its North Carolina campus and around the world.

Learner reviews

2,666 reviews

Reviewed on Feb 28, 2017

Great course. If you put in a little effort, you will come out with a lot of new knowledge. I recommend using the book after you have seen the movies. It gives a deeper picture of how it works. Great!

Reviewed on May 25, 2024

This course equips students with the knowledge and skills needed to collect, analyze, and interpret data effectively, making it a valuable tool in many fields of study and professions.

Reviewed on Jul 5, 2020

Very nicely designed course and it also progresses very well. If higher mathematics would be involved in it, the course has the ability to replace many college's statistical inference's classes.

Frequently asked questions

Cost of the course.

If you want to complete the course and earn a Course Certificate by submitting assignments for a grade, you can upgrade your experience by subscribing to the course for $49/month. You can also apply for financial aid if you can't afford the course fee.

When you enroll in a course that is part of a Specialization (which this course is), you will automatically be enrolled in the entire Specialization. You can unenroll from the Specialization if you’re not interested in the other courses or cancel your subscription once you complete the single course.

Can I just enroll in a single course? I'm not interested in the entire Specialization.

To enroll in an individual course, search for the course title in the catalog.

To get full access to a course, including the option to earn grades and a Course Certificate, you'll need to subscribe. New subscribers will start with a full access subscription, which includes full access to every course in the Coursera catalog. Existing Specialization subscribers will be given the option to update to a full access subscription when enrolling in a new Specialization or course.

When you enroll in a course that is part of a Specialization, you will automatically be enrolled in the entire Specialization. You can unenroll from the Specialization if you’re not interested in the other courses.

Will I receive a transcript from Duke University for completing this course?

No. Completion of a Coursera course does not earn you academic credit from Duke; therefore, Duke is not able to provide you with a university transcript. However, your electronic Certificate will be added to your Accomplishments page - from there, you can print your Certificate or add it to your LinkedIn profile.

When will I have access to the lectures and assignments?

Access to lectures and assignments depends on your type of enrollment. If you take a course in audit mode, you will be able to see most course materials for free. To access graded assignments and to earn a Certificate, you will need to purchase the Certificate experience, during or after your audit. If you don't see the audit option:

The course may not offer an audit option. You can try a Free Trial instead, or apply for Financial Aid.

The course may offer 'Full Course, No Certificate' instead. This option lets you see all course materials, submit required assessments, and get a final grade. This also means that you will not be able to purchase a Certificate experience.

What will I get if I subscribe to this Specialization?

When you enroll in the course, you get access to all of the courses in the Specialization, and you earn a certificate when you complete the work. Your electronic Certificate will be added to your Accomplishments page - from there, you can print your Certificate or add it to your LinkedIn profile. If you only want to read and view the course content, you can audit the course for free.

What is the refund policy?

If you subscribed, you get a 7-day free trial during which you can cancel at no penalty. After that, we don’t give refunds, but you can cancel your subscription at any time. See our full refund policy.

Is financial aid available?

Yes. In select learning programs, you can apply for financial aid or a scholarship if you can’t afford the enrollment fee. If financial aid or a scholarship is available for your learning program selection, you’ll find a link to apply on the description page.

Unit 07: How to Evaluate Descriptive and Inferential Statistics

Unit 7: Assignment #1 (due before 11:59 pm Central on Wednesday September 29) :

  • Watch Lynda.com’s (2010) video, “ Understanding Descriptive and Inferential Statistics .”
  • Read Laerd Statistics’ (no date) article, “ Descriptive and Inferential Statistics ” ( web link ).
  • Read a section of Wikipedia’s (2020) entry, “ Descriptive Statistics ” ( web link ).
  • Read Statistics HowTo’s (2014) article, “ Inferential Statistics: Definition, Uses ” ( web link ).
  • At this point, you should be clear on the difference between descriptive and inferential statistics and the common uses for both types of statistics.
  • If you’re not clear, you might want to re-read the above articles and re-watch the videos.
  • You might also want to review how to write a Five-Paragraph Examples-Style Essay, by watching the latter part of Professor Gernsbacher’s lecture video, “ The Five-Paragraph Model ” (a transcript of the video is available in PDF and Word ).
  • Check your essay to make sure your Introduction Paragraph has a hook and a Thesis Statement .
  • Check your Thesis Statement to make sure that it is ONE sentence that captures all three of your three examples.
  • Check your essay to make sure it has three Examples Paragraphs .
  • Check your essay to make sure it has a Conclusion Paragraph .
  • Check your Conclusion Paragraph to make sure it has a Re-stated Thesis Statement and ends with something (mildly) witty or profound.
  • Check your Re-stated Thesis Statement to make sure that it is ONE sentence that summarizes all three of your three examples .
  • Check ALL FIVE of your paragraphs — your Introduction Paragraph, each of your three Examples Paragraphs, and your Conclusion Paragraph — to make sure each of your five paragraphs has FIVE sentences (a Topic Sentence, three Supporting Sentences, and a Conclusion Sentence).
  • Save your essay as a PDF and name the file YourLastname_DescriptiveEssay.pdf .
  • Check your Thesis Statement to make sure that it is ONE sentence that incorporates all three of your three examples .
  • Check your Conclusion Paragraph to make sure it has a Re-stated Thesis Statement and ends with something (mildly) witty or profound.
  • Check ALL FIVE of your paragraphs — your Introduction Paragraph, each of your three Examples Paragraphs, and your Conclusion Paragraph — to make sure each of your five paragraphs has FIVE sentences (a Topic Sentence, three Supporting Sentences, and a Conclusion Sentence).
  • Save your second essay as a PDF and name the file YourLastname_InferentialEssay.pdf.
  • If you ever wonder why we repeatedly practice skills, such as writing five-paragraph essays, in different contexts throughout this course, consider the words of William James ( Word ), who is widely considered the father of U.S. Psychology!
  • First, attach your Descriptive Statistics Essay, saved as a PDF.
  • Remember to “Attach” your Descriptive Statistics Essay PDF and not use the “File” tool.
  • Because the Discussion Board will allow only one file to be attached to each post, make a reply post to your own post.
  • Use your reply post to attach your Inferential Statistics Essay, saved as a PDF.
  • You should write both essays, and then make your Discussion Board post, because if you turn in only one essay (or turn in one essay a while before you turn in the other), only one essay is what we’ll be alerted to grade.

Unit 7: Assignment #2 (due before 11:59 pm Central on Thursday September 30) :

  • NOTE: This book was published in 1954; therefore, the examples are from the 1940s and early 1950s. However, it’s still a beloved book (e.g., it’s recommended reading in a college Physics class), despite its age.
  • Chapter 2 explains the deception caused by indiscriminately referring to the mean, median, and mode (i.e., three central-tendency descriptive statistics) as “the average.”
  • Chapter 3 explains the deception caused by random variation and the solutions provided by inferential statistics.
  • Chapter 4 explains the deception caused by differences that aren’t meaningful.
  • Chapters 5 and 6 explain deception by graphs and figures.
  • When reading these chapters, jot down your three favorite deceptions. For example, you might choose as one of your favorite deceptions the hypothetical real estate agent’s deceptive use of a neighborhood’s “average” income in Chapter 2.
  • You need to choose an audience for your teaching document. Your choices are (1) other college students; (2) middle-school students (age 12 to 14); or (3) older adults (over age 60).
  • You need to choose a medium for your teaching document. Your choices are (1) a PPT; (2) an Infographic; or (3) a comic strip (e.g., The Nib’s ).
  • You need to save your teaching document as a PDF, named YourLastname_StatsDeception.pdf .
  • attach your teaching document PDF, and
  • tell us the intended audience of your teaching document and why you chose that intended audience.

Unit 7: Assignment #3 (due before 11:59 pm Central on Friday October 1) :

  • Read Sullivan and Feinn’s (2012) article, “ Using Effect Size—or Why the P Value Is Not Enough ” ( web link ).
  • Sullivan and Feinn’s (2012) article might be harder to read than other articles you’ve read in this course. But try to understand it at least at a superficial level. Feel free to Google terms that you don’t know.
  • Read Chapman and Louis’s (2017) article, “ The Seven Sins of Statistical Misinterpretation ” ( web link ).
  • In contrast to gaining a working, but superficial understanding of the computations and the like that Sullivan and Feinn (2012) provide in their article, make sure you understand well the seven “sins” that Chapman and Louis provide in their article.
  • Choose three of the 9 articles that you found and read in Unit #5 and that you synthesized in Unit #6.
  • Choose the three articles (of your 9 articles) that will be the easiest (and most logical) to evaluate according to Chapman and Louis’s (2017) “ Seven Sins of Statistical Misinterpretation ” ( web link ).
  • First, download the unfilled PDF and save it on your own computer.
  • Second, rename the unfilled PDF to be YourLastName_PSY-311_StatsCheck_Fillable.pdf. In other words, add your last name to the beginning of the filename.
  • Third, on your computer, open a PDF writer app, such as Preview, Adobe Reader, or the like. Be sure to open your PDF writer app before you open the unfilled PDF from your computer.
  • Fourth, from within your PDF writer app, open the unfilled PDF, which you have already saved onto your computer and re-named.
  • Fifth, using the PDF writer app, fill in the PDF.
  • Sixth, save your now-filled-in PDF on your computer.
  • There are three pages in the fillable PDF; use a different page for each of your three articles. Make sure that your citation in the citation text box at the top of the page is in APA style (see Unit 5). It’s okay, for this assignment, if you can’t italicize parts of the citation (in the citation text box of the fillable PDF).
  • Go to the Unit 7: Assignment #3 Discussion Board and attach your filled-in PDF.
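
To make the effect-size reading above more concrete, here is a minimal sketch of Cohen's d, one common standardized effect size, computed with Python's standard library. The two groups and their scores are made up for illustration, and the function name `cohens_d` is hypothetical rather than taken from either article.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Standardized mean difference using the pooled sample standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    # Pool the two sample variances, weighting each by its degrees of freedom.
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


# Hypothetical scores for two small groups (illustration only).
treatment = [6, 7, 8, 9, 10]
control = [4, 5, 6, 7, 8]

d = cohens_d(treatment, control)
print(f"Cohen's d = {d:.2f}")
```

A p value addresses whether a difference is distinguishable from sampling noise; d expresses how large the difference is relative to the spread of the data, which is why the two are usually reported together.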

Unit 7: Assignment #4 (due before 11:59 pm Central on Sunday October 3) :

  • Make sure you understand the article’s answer to the concern that “[the pollsters] never call me.”
  • Make sure you understand the article’s answer to the concern that “nobody I know says that.”
  • Read Rumsey’s (no date) article, “ How to Interpret the Margin of Error in Statistics ” ( web link ).
  • Make sure you understand the difference between sampling a population and surveying (or polling) an entire population.
  • Make sure you understand what a margin of error means in a public opinion poll or survey.
  • Read Hunter’s (no date) article, “ Margin of Error and Confidence Levels Made Simple ” ( web link ).
  • Make sure you understand what it means to calculate a margin of error at a 95% confidence level.
  • Make sure you understand the relation between sample size and margin of error.
  • Harter and Adkins (2015): “ Engaged Employees Less Likely to Have Health Problems ” ( web link ).
  • Newport (2017a) “ Email Outside of Working Hours Not a Burden to US Workers ” ( web link ).
  • Newport and Dugan (2017): “ Americans Still See Manufacturing as Key to Job Creation ” ( web link ).
  • Newport (2018a): “ Average American Predicts Retirement Age of 66 ” ( web link ).
  • Swift (2017a): “ Most U.S. Employed Adults Plan to Work Past Retirement Age ” ( web link ).
  • Newport (2017b): “ Young, Old in US Plan on Relying More on Social Security ” ( web link ).
  • Jones (2017a): “ Worry About Hunger, Homelessness Up for Lower-Income in US ” ( web link ).
  • Norman (2017): “ Financially Stressed in US Now Prefer Saving to Spending ” ( web link ).
  • Jones (2017b): “ Half of Non-Homeowners Expect to Buy Homes in Five Years ” ( web link ).
  • Newport (2018b): “ Americans’ Views of Their Spending and Saving ” ( web link ).
  • Rigoni and Nelson (2016): “ Millennials Want Jobs That Promote Their Well-Being ” ( web link ).
  • Witters (2017a): “ Hawaii Leads US States in Well-Being for Record Sixth Time ” ( web link ).
  • Witters (2017b): “ Naples, Florida, Remains Top US Metro for Well-Being ” ( web link ).
  • McCarthy (2017a): “ U.S. Support for Gay Marriage Edges to New High ” ( web link ).
  • McCarthy (2017b): “ Americans More Positive about Effects of Immigration ” ( web link ).
  • Swift (2017b): “ More Americans Say Immigrants Help Rather Than Hurt Economy ” ( web link ).
  • Reinhart and Ray (2018): “ Record Unhappiness with Women’s Position in U.S. ” ( web link ).
  • Auter (2018): “ Half of College Students Say Their Major Leads to a Good Job ” ( web link ).
  • Maturo (2017): “ One in Three Veterans Consult Coworkers About College Major ” ( web link ).
  • Auter (2017): “ Second Thoughts on College Major Linked to Source of Advice ” ( web link ).
  • What was the topic of the public opinion poll?
  • Why did you choose this topic (and read this report)?
  • What three findings from this public opinion poll do you think are the most interesting – and why do you think those three findings are interesting?
  • What was the total sample size?
  • What was the poll’s margin of error?
  • Was the margin of error calculated at the 95% confidence level?
  • What does it mean that the margin of error was calculated at the 95% confidence level?
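
The relation between sample size and margin of error mentioned above can be sketched with the usual textbook approximation for a proportion, z × sqrt(p(1 − p) / n). This is a simplified illustration that assumes a simple random sample and the worst case p = 0.5; real polling organizations typically adjust for weighting and survey design, so their reported margins can differ. The function name `margin_of_error` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def margin_of_error(n, p=0.5, level=0.95):
    """Approximate margin of error for a sample proportion (simple random sample)."""
    # Two-sided critical value, about 1.96 for a 95% confidence level.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return z * sqrt(p * (1 - p) / n)


# Larger samples shrink the margin, but with diminishing returns.
for n in (100, 400, 1000, 1600):
    moe = margin_of_error(n)
    print(f"n = {n:>4}: margin of error is about {moe * 100:.1f} percentage points")
```

Because n sits under a square root, quadrupling the sample size only halves the margin of error.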

Unit 7: Assignment #5 (due before 11:59 pm Central on Monday October 4) :

  • Go to the Unit 7: Assignment #4 and #5 Discussion Board and read all the other students’ posts.
  • One of your responses must be to a student who wrote about at least one of the topics that you also wrote about.
  • One of your responses must be to a student who wrote about at least one topic that you did NOT write about.
  • Your third response can be to any other student (besides the two students you responded to in 1. and 2. above).
  • If no other student wrote about one of the topics that you also wrote about, you can respond to three students who all wrote about different topics than you wrote about.

Unit 7: Assignment #6 (due before 11:59 pm Central on Tuesday October 5) :

  • Read both essays that each of the other members of your small Chat Group posted on the Unit 7: Assignment #1 Discussion Board . If you are in a Chat Group with two other students, that means you will read four essays; if you are in a Chat Group with only one other student, that means you will read two essays.
  • Review how to provide peer review on your Chat Group members’ essays by reading the Peer Review Guidelines ( Word ). Note that you will again be answering 12 questions about each member’s essays.
  • Prior to your Chat Group meeting online, all members of your Chat Group must have completed steps a. and b. of this Assignment .
  • Then, spend the remainder of your hour-long Chat with each Chat Group member providing peer review of the other Chat Group members’ essays.
  • Nominate one member of your Chat Group (who participated in the Chat) to make a post on the Unit 7: Assignment #6 Discussion Board that summarizes your Group Chat in at least 200 words.
  • Nominate another member of your Chat Group (who also participated in the Chat) to save the Chat transcript as a PDF, as described in the Course How To (under the topic, “How to Save and Attach a Group Text Chat Transcript”), and attach the Chat transcript to a post on the Unit 7: Assignment #6 Discussion Board .
  • Nominate another member of your Chat Group (who also participated in the Chat) to make another post on the Unit 7: Assignment #6 Discussion Board that states the name of your Chat Group, the names of the Chat Group members who participated in the Chat, the date of your Chat, and the start and stop time of your Chat.
  • If only two persons participated in the Chat, then one of those two persons needs to do two of the above three tasks.
  • Before ending the Group Chat, bid goodbye to each other. In the next Unit you will be forming new Chat Groups!
  • All members of the Chat Group must record a typical Unit entry in your own Course Journal for Unit 7.

Congratulations, you have finished Unit 7! Onward to Unit 8 !

Open-Access Active-Learning Research Methods Course by Morton Ann Gernsbacher, PhD is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License . The materials have been modified to add various ADA-compliant accessibility features, in some cases including alternative text-only versions. Permissions beyond the scope of this license may be available at http://www.gernsbacherlab.org .

Table of Contents

  • What Is Descriptive Statistics?
  • What Is Inferential Statistics?
  • Key Differences Between Descriptive and Inferential Statistics
  • Common Similarities Between Descriptive and Inferential Statistics
  • 3 Major Types of Descriptive Statistics
  • 3 Major Types of Inferential Statistics
  • Descriptive and Inferential Statistics Tools
  • Choose the Right Program
  • Comprehensive Guide to Descriptive vs Inferential Statistics

Descriptive vs. Inferential Statistics: Key Differences

Statistics forms the core of data analytics, serving as the fundamental tool for identifying trends and patterns within vast numerical datasets. This mathematical discipline encompasses two main categories: Descriptive Statistics and Inferential Statistics. Here, we delve into the contrasting aspects of descriptive vs inferential statistics and their respective impacts on data analytics. While certain measurement techniques may overlap, their underlying objectives diverge significantly. Therefore, it is crucial to discern the major disparities between the two.

What Is Descriptive Statistics?

Descriptive statistics is a branch of statistics that deals with summarizing and describing the main features of a dataset. It provides methods for organizing, visualizing, and presenting data meaningfully and informatively. Descriptive statistics describe the characteristics of the data set under study without generalizing beyond the analyzed data.

Common measures and techniques in descriptive statistics include measures of central tendency (such as mean, median, and mode), measures of dispersion (such as range, variance, and standard deviation), frequency distributions (histograms, frequency tables), and graphical representations (box plots, bar charts, pie charts, etc.). These methods help to provide a clear and concise summary of the data, facilitating easier interpretation and understanding.

What Is Inferential Statistics?

Inferential statistics, on the other hand, involves making inferences, predictions, or generalizations about a larger population based on data collected from a sample of that population. It extends the findings from a sample to the population from which the sample was drawn. Inferential statistics allow researchers to draw conclusions, test hypotheses, and make predictions about populations, even when it is impractical or impossible to study the entire population directly.

Key methods in inferential statistics include hypothesis testing, where researchers test hypotheses about population parameters using sample data; regression analysis, where relationships between variables are examined and used to make predictions; and confidence intervals, which provide estimates of population parameters and their uncertainty levels.

Key Differences Between Descriptive and Inferential Statistics

This table summarizes the main differences between descriptive and inferential statistics, highlighting their respective purposes, scopes, objectives, examples, and statistical techniques.

| Aspect | Descriptive Statistics | Inferential Statistics |
|---|---|---|
| Purpose | Summarizes and describes features of a dataset | Makes inferences, predictions, or generalizations about a population based on sample data |
| Scope | Focuses on specific sample data | Extends findings to a larger population |
| Objective | Describes characteristics of the data without generalizing | Generalizes findings from sample to population |
| Examples | Measures of central tendency, dispersion, frequency distributions, graphical representations | Hypothesis testing, regression analysis, confidence intervals |
| Data Analysis | Provides a summary and visualization of data | Draws conclusions, tests hypotheses, and makes predictions |
| Population Representation | Represents features within the sample only | Represents features of the larger population |
| Statistical Techniques | Mean, median, mode, range, variance, standard deviation, histograms, box plots, etc. | Hypothesis testing, regression analysis, confidence intervals |
| Goal | To provide insights into the characteristics of a dataset | To make predictions or draw conclusions about a population |

Common Similarities Between Descriptive and Inferential Statistics

  • Data Analysis: Both descriptive and inferential statistics involve analyzing data to extract meaningful information. While descriptive statistics focus on summarizing and describing the characteristics of a dataset, inferential statistics use sample data to make inferences or predictions about a larger population.
  • Statistical Techniques: Although the specific techniques may vary, both branches of statistics rely on various statistical methods and tools to analyze data. Descriptive statistics commonly involve measures of central tendency, dispersion, and graphical representations, while inferential statistics often include hypothesis testing, regression analysis, and confidence intervals.
  • Population Consideration: While descriptive statistics primarily deal with the characteristics of a sample dataset, they are often used as a foundation for inferential statistics. Inferential statistics utilize sample data to make inferences about the larger population from which the sample was drawn.
  • Inference: Both branches ultimately aim to draw conclusions from data. Descriptive statistics provide insights into the features of the observed data, while inferential statistics extend these findings to make predictions or draw conclusions about a broader population.
  • Application: Descriptive and inferential statistics are widely applied across various fields, including science, business, economics, social sciences, and healthcare. In these domains, they play essential roles in decision-making, research, analysis, and problem-solving.
  • Mathematical Foundations: Both branches of statistics are grounded in mathematical principles and concepts. They rely on probability theory, mathematical formulas, and statistical models to analyze and interpret data accurately.

3 Major Types of Descriptive Statistics

Measures of Central Tendency

Measures of central tendency represent the center or typical value of a dataset. They provide insight into where the bulk of the data points lie. The three main measures of central tendency are:

  • Mean: The arithmetic average of all the values in the dataset.
  • Median: The middle value of the dataset when arranged in ascending or descending order.
  • Mode: The value that occurs most frequently in the dataset.
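
The three measures above can be computed with Python's built-in `statistics` module. The sketch below uses a small, made-up list of incomes (in thousands) with one extreme value, chosen only to show how the mean and median can disagree.

```python
from statistics import mean, median, multimode

# Hypothetical incomes in thousands; the last value is an outlier.
incomes = [32, 35, 35, 38, 41, 44, 250]

avg = mean(incomes)         # arithmetic average, pulled upward by the outlier
mid = median(incomes)       # middle value of the sorted data
modes = multimode(incomes)  # most frequent value(s), returned as a list

print(f"mean:   {avg:.2f}")
print(f"median: {mid}")
print(f"mode:   {modes}")
```

With skewed data like this, the median is often the better description of a "typical" value, because a single extreme observation moves the mean but not the median.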

Measures of Dispersion

Measures of dispersion quantify the spread or variability of data points around the central tendency. They indicate how much the individual data points deviate from the average. Common measures of dispersion include:

  • Range: The difference between the maximum and minimum values in the dataset.
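  • Variance: The average of the squared differences between each data point and the mean (divided by n for a full population, or by n - 1 when estimating from a sample).
  • Standard Deviation: The square root of the variance, expressing the typical distance of data points from the mean in the original units of the data.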
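
As a rough illustration of these dispersion measures, the following sketch uses Python's `statistics` module on a made-up list of scores. Note that the module offers two versions of each measure: `pvariance`/`pstdev` treat the data as a whole population (dividing by n), while `variance`/`stdev` treat it as a sample (dividing by n - 1).

```python
from statistics import pstdev, pvariance, stdev, variance

# Hypothetical scores (illustration only).
scores = [2, 4, 4, 4, 5, 5, 7, 9]

data_range = max(scores) - min(scores)  # maximum minus minimum
pop_var = pvariance(scores)             # population variance: divides by n
samp_var = variance(scores)             # sample variance: divides by n - 1
pop_sd = pstdev(scores)                 # square root of the population variance
samp_sd = stdev(scores)                 # square root of the sample variance

print(f"range: {data_range}")
print(f"population variance: {pop_var}, sample variance: {samp_var:.3f}")
print(f"population SD: {pop_sd}, sample SD: {samp_sd:.3f}")
```

For purely descriptive work on a complete group, the population versions apply; the sample versions matter once the data are used to estimate the spread of a larger population.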

Frequency Distributions and Graphical Representations:

Frequency distributions display the frequency of occurrence of different values or ranges in a dataset. They help to visualize the distribution of data across various categories. Common graphical representations used in descriptive statistics include:

  • Histograms: Bar charts that display the frequency of data points within predefined intervals or bins.
  • Box Plots (Box-and-Whisker Plots): Graphical representations that display a dataset's median, quartiles, and outliers.
  • Pie Charts: Circular charts representing the proportions of different categories within a dataset.
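
A frequency distribution can be built without any plotting library. The sketch below groups a made-up list of ages into 10-year bins and prints a simple text histogram; the bin width and the data are arbitrary choices for illustration.

```python
from collections import Counter

# Hypothetical ages (illustration only).
ages = [23, 27, 31, 35, 36, 38, 41, 44, 47, 52, 58, 63]
bin_width = 10

# Map each value to the lower edge of its bin, e.g. 23 -> 20 and 35 -> 30.
bins = Counter((age // bin_width) * bin_width for age in ages)

for lower in sorted(bins):
    count = bins[lower]
    print(f"{lower}-{lower + bin_width - 1}: {'#' * count} ({count})")
```

The same counts are what a histogram in a tool such as Matplotlib or Excel would draw as bar heights.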

3 Major Types of Inferential Statistics

Hypothesis Testing

Hypothesis testing is a fundamental technique in inferential statistics used to make decisions or draw conclusions about a population parameter based on sample data. It involves formulating a null hypothesis (H0) and an alternative hypothesis (Ha), collecting sample data, and using statistical tests to determine whether there is enough evidence to reject the null hypothesis in favor of the alternative hypothesis. Common statistical tests for hypothesis testing include t-tests, chi-square tests, ANOVA (Analysis of Variance), and z-tests.
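
The workflow described above can be sketched with a one-sample z-test, which needs only the standard library. This is a simplified example: it assumes the population standard deviation is known (here an invented value of 5), which is rarely true in practice, where a t-test is the more usual choice. The sample values and the helper name `one_sample_z_test` are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean


def one_sample_z_test(sample, mu0, sigma):
    """Two-sided z-test of H0: population mean == mu0, with known sigma."""
    n = len(sample)
    # How many standard errors the sample mean lies from the hypothesized mean.
    z = (mean(sample) - mu0) / (sigma / sqrt(n))
    # Probability of a result at least this extreme if H0 were true.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical measurements; H0 says the population mean is 50.
sample = [52, 55, 49, 58, 54, 51, 57, 53, 56]
z, p = one_sample_z_test(sample, mu0=50, sigma=5)

alpha = 0.05
decision = "reject H0" if p < alpha else "fail to reject H0"
print(f"z = {z:.3f}, p = {p:.4f}: {decision} at alpha = {alpha}")
```

Failing to reject H0 does not show that H0 is true; it only means the sample does not provide enough evidence against it at the chosen significance level.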

Regression Analysis

Regression analysis is a statistical technique used to examine the relationship between one or more independent variables (predictors) and a dependent variable (outcome) and to make predictions based on this relationship. It helps to identify and quantify the strength and direction of the association between variables and to predict the dependent variable's value for given independent variable values. Common types of regression analysis include linear, logistic, polynomial, and multiple regression.
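
As a minimal sketch of the simplest case, the code below fits a straight line by ordinary least squares using the closed-form formulas for one predictor. The data (hours studied and test scores) are invented, and the function name `fit_line` is hypothetical. The sketch covers only the fitting and prediction steps; a full inferential analysis would also report standard errors or confidence intervals for the slope.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for one predictor: returns (slope, intercept)."""
    x_bar, y_bar = mean(xs), mean(ys)
    # Sum of cross-deviations and sum of squared deviations of x.
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


# Hypothetical data: hours studied and test score.
hours = [1, 2, 3, 4, 5]
scores = [52, 55, 61, 64, 68]

slope, intercept = fit_line(hours, scores)
predicted = intercept + slope * 6  # predicted score for 6 hours of study

print(f"score = {intercept:.1f} + {slope:.1f} * hours")
print(f"predicted score at 6 hours: {predicted:.1f}")
```

Predicting at 6 hours extrapolates slightly beyond the observed range of 1 to 5 hours, so such predictions deserve extra caution.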

Confidence Intervals

Confidence intervals provide a range of plausible values for the true population parameter, computed from sample data at a stated level of confidence. They quantify the uncertainty associated with estimating population parameters from sample data. Confidence intervals are calculated using point estimates, such as sample means or proportions, and their standard errors. The confidence level describes the procedure rather than any single interval: if the sampling were repeated many times, that percentage of the resulting intervals would contain the true population parameter. Commonly used confidence levels include 90%, 95%, and 99%.
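
One way to see what a confidence level means is to simulate it. The sketch below repeatedly draws samples from a population whose mean is known, builds a 95% interval from each sample, and counts how often the interval captures the true mean. It assumes a normal population with a known standard deviation (a z-interval), and the population values, trial count, and function names are all made up for illustration.

```python
import random
from math import sqrt
from statistics import NormalDist, mean


def z_interval(sample, sigma, level=0.95):
    """Confidence interval for a mean when the population sigma is known."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    half_width = z * sigma / sqrt(len(sample))
    center = mean(sample)
    return center - half_width, center + half_width


def coverage(trials=2000, n=30, mu=100.0, sigma=15.0, level=0.95, seed=1):
    """Share of simulated intervals that contain the true mean mu."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        low, high = z_interval(sample, sigma, level)
        hits += low <= mu <= high
    return hits / trials


rate = coverage()
print(f"share of intervals containing the true mean: {rate:.3f}")
```

The printed share should land close to 0.95. Each individual interval either contains the true mean or does not; the 95% figure describes how the method performs across many samples.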

Descriptive and Inferential Statistics Tools

Descriptive Statistics Tools

  • Microsoft Excel: Excel is widely used for basic statistical analysis, including calculating central tendency and dispersion measures and creating graphical representations such as histograms and scatter plots.
  • SPSS (Statistical Package for the Social Sciences): SPSS is a comprehensive statistical software package for data management, analysis, and reporting. It offers various descriptive statistical analyses, including frequency distributions, cross-tabulations, and descriptive charts.
  • R: R is a programming language and software environment specifically designed for statistical computing and graphics. It provides numerous packages and functions for descriptive statistics, data visualization, and exploratory data analysis.
  • Python: Python, with libraries such as NumPy, Pandas, and Matplotlib, is increasingly popular for statistical analysis and data visualization. These libraries offer powerful tools for calculating descriptive statistics and creating visualizations.
  • GraphPad Prism: GraphPad Prism is a scientific graphing and statistical software widely used in life sciences research. It provides tools for descriptive statistics, graphing, and curve fitting.

Inferential Statistics Tools

  • R: R offers various packages for conducting inferential statistical analyses, including hypothesis testing, regression analysis, and confidence interval estimation. Packages such as stats, lmtest, and MASS are commonly used for inferential statistics in R.
  • SPSS: Besides descriptive statistics, SPSS provides tools for conducting inferential statistical tests, including t-tests, ANOVA, chi-square tests, and regression analysis.
  • Python: Python libraries such as SciPy, StatsModels, and scikit-learn offer tools for conducting various inferential statistical analyses, including hypothesis testing, regression analysis, and machine learning algorithms.
  • SAS (Statistical Analysis System): SAS is a comprehensive statistical software suite for data management, analysis, and reporting. It provides various procedures and modules for conducting inferential statistical analyses.
  • MATLAB: MATLAB offers statistical and machine learning tools for conducting hypothesis tests, fitting models, and analyzing data. It includes built-in functions for conducting various inferential statistical analyses.


1. What's the difference between descriptive and inferential statistics?

Descriptive statistics summarize and describe the main features of a dataset through measures like the mean, median, and standard deviation, providing a quick overview of the sample data. Inferential statistics, on the other hand, use sample data to make estimates, predictions, or other generalizations about a larger population. They rely on probability theory to infer characteristics of the population from which the sample was drawn.

2. What is an example of an inferential statistic?

An example of an inferential statistic is the calculation of a confidence interval. For instance, after sampling test scores from a group of students, a confidence interval might be used to estimate the range within which the average test score of all students in the population likely falls.

3. What is an example of a descriptive statistic?

An example of a descriptive statistic is the mean (average) score of students on a test. If you have test scores for 30 students in a class, calculating the mean score provides a summary of the performance of the class on that test.


Descriptive Statistics: Definition & Charts and Graphs


  • Difference Between Descriptive and Inferential
  • Excel Instructions
  • Graphs, Charts and Plots See Also: Basic Statistics Terms

1. Definition of Descriptive Statistics

Descriptive statistics are one of the fundamental “must knows” for any set of data. They give you a general idea of trends in your data, including:

  • The mean, mode, median and range .
  • Variance and standard deviation .
  • Count, maximum and minimum.

Descriptive statistics is useful because it allows you to take a large amount of data and summarize it. For example, let’s say you had data on the incomes of one million people. No one is going to want to read a million pieces of data; if they did, they wouldn’t be able to glean any useful information from it. On the other hand, if you summarize it, it becomes useful: an average wage, or a median income, is much easier to understand than reams of data.

Descriptive statistics can be further broken down into several sub-areas, like:

  • Measures of central tendency.
  • measures of dispersion .
  • Charts & graphs.
  • Shapes of Distributions.

The charts, graphs and plots site index is below . For definitions and information on how to find measures of spread and central tendency, see: Basic statistics (which covers the basic terms you’ll find in descriptive statistics like interquartile range , outliers and standard deviation ).

2. Difference Between Descriptive and Inferential Statistics

Statistics can be broken down into two areas:

  • Descriptive statistics: describes and summarizes data. You are just describing what the data shows: a trend, a specific feature, or a certain statistic (like a mean or median).
  • Inferential statistics : uses sample data to make predictions or generalizations about a larger population.

Descriptive statistics just describes data. For example, descriptive statistics about a college could include: the average SAT score for incoming freshmen; the median income of parents; racial makeup of the student body. It says nothing about why the data might exist, or what trends you might be able to see from the data. When you take your data and start to make predictions about future behavior or trends, that’s inferential statistics. Inferential statistics also allows you to take sample data (e.g. from one university) and apply it to a larger population (e.g. all universities in the country).

3. Excel Descriptive Statistics


How to Calculate Excel Descriptive Statistics: Steps

Step 1: Type your data into Excel, in a single column. For example, if you have ten items in your data set, type them into cells A1 through A10.

Step 2: Click the “Data” tab and then click “Data Analysis” in the Analysis group.

Step 3: Highlight “Descriptive Statistics” in the pop-up Data Analysis window.

Step 4: Type an input range into the “Input Range” text box. For this example, type “A1:A10” into the box.

Step 5: Check the “Labels in first row” check box if you have titled the column in row 1, otherwise leave the box unchecked.

Step 6: Type a cell location into the “Output Range” box. For example, type “C1.” Make sure that two adjacent columns do not have data in them.

Step 7: Click the “Summary Statistics” check box and then click “OK” to display Excel descriptive statistics. A list of descriptive statistics will be returned in the column you selected as the Output Range.
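For readers who prefer code to menus, a similar summary can be sketched in Python. This is only an illustration using the standard library `statistics` module; the ten values are made up, and `summarize` is a hypothetical helper name, not part of Excel or of any library.

```python
from statistics import mean, median, multimode, stdev, variance


def summarize(values):
    """Return a dictionary of common descriptive statistics for a list of numbers."""
    return {
        "count": len(values),
        "sum": sum(values),
        "mean": mean(values),
        "median": median(values),
        "mode": multimode(values),  # a list, since there can be several modes
        "sample standard deviation": stdev(values),
        "sample variance": variance(values),
        "minimum": min(values),
        "maximum": max(values),
        "range": max(values) - min(values),
    }


# Hypothetical data, standing in for the ten values typed into cells A1 through A10.
column_a = [12, 15, 15, 18, 20, 22, 25, 25, 25, 30]

for name, value in summarize(column_a).items():
    print(f"{name}: {value}")
```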

4. Descriptive Statistics: Charts, Graphs and Plots

There are dozens of charts and graphs you can make from data. Which one you choose depends upon what kind of data you have and what you want to display. For example, if you wanted to display relationships between data in categories, you could make a bar graph.

Grouped bar graph. Image: CDC.

A pie chart would show you how categories in your data relate to the whole set.

Pie chart showing water consumption. Image courtesy of EPA.

Scatter plots are a good way to display data points.

Less common, but useful in some cases, are dot plots and box and whisker charts:

dot plot example

How To Articles for Descriptive Statistics

  • Causal Graph
  • Absolute Frequency: Definition, Examples
  • Make a Histogram.
  • Make a Relative Frequency Histogram.
  • How to Make a Frequency Chart and Determine Frequency.
  • House Graph
  • Choose Bin Sizes.
  • How to Make an Ogive Graph.
  • Read a Box Plot.
  • Find a Box Plot Interquartile Range.
  • Draw a Frequency Distribution Table.
  • Make a Cumulative Frequency Distribution Table.
  • Find a Quadratic Mean.
  • Make a Stemplot.
  • U Chart: Definition, Example
  • Venn Diagram Templates .

Microsoft Excel : Descriptive Statistics

  • How to Create a Bar Graph in Excel.
  • Create a Histogram in Excel.
  • How to Make a Scatter Plot in Microsoft Excel.
  • Create a Frequency Distribution Table in Excel.
  • How to Make a Pie Chart in Excel.
  • Grubb’s Test to Find Outliers .

Minitab for Descriptive Statistics

  • How to Make a Scatterplot in Minitab.
  • Make a Boxplot in Minitab.
  • How to Make a Histogram in Minitab.
  • How to Create a Bar Graph in Minitab.
  • How to Make an SPSS Frequency Table.
  • How to Make an SPSS Histogram.
  • Make a Bar Chart in SPSS.
  • How to Make an SPSS Boxplot.
  • How to Make an SPSS Scatterplot.
  • Make a Pie Chart in SPSS.

Definitions

  • 68-95-99.7 Rule.
  • The Area Principle.
  • Attribute Control Chart
  • Back-to-Back Stemplot.
  • Bimodal Distribution.
  • Bland-Altman Plot.
  • Collider Variable
  • Cumulative Frequency Distribution?
  • Directed Acyclic Graph?
  • What is a Forest Plot or Blobbogram?
  • Frequency Distribution Table.
  • What is a Funnel Plot?
  • Grouped Data.
  • What are upper hinges and lower hinges?
  • Interquartile Mean.
  • Measures of Position
  • Measures of Spread.
  • Measures of Variation .
  • What is an NP Chart?
  • What is a P-Chart?
  • What is a Probability Tree?
  • What is a Pyramid Graph?
  • Ribbon Diagram
  • Scatter Plot.
  • Radar Charts.
  • What is a Seven Number Summary?
  • What is a Skewed Distribution?
  • Finding Skewness .
  • Scales of Measurement.
  • What is a Stemplot?
  • Symmetric Distribution.
  • What is a Timeplot?
  • Uniform Distribution.
  • What is a Unimodal Distribution?
  • Upper and Lower Fences.
  • Variogram .
  • Waterfall plot
  • X-MR (X-Moving Range) Chart
  • Youden Plot
  • Misleading Graphs in Real Life.
  • Types of Graphs .


Descriptive Statistics: Definitions, Types, Examples

Introduction.

The first step of any data-related process is the collection of data. Once we have collected the data, what do we do with it? Data can be sorted, analyzed, and used in various methods and formats, depending on the project’s needs. While analyzing a dataset, we use statistical methods to arrive at a conclusion. Data-driven decision-making also depends on how efficiently we use these methods. Two types of statistical methods are widely used in data analysis: descriptive and inferential. This article focuses on descriptive statistics: its types, calculations, examples, and more.

This article was published as a part of the  Data Science Blogathon .

Table of contents

  • What is descriptive statistics?
  • Types of statistics
  • What is inferential statistics?
  • Types of descriptive statistics
  • Descriptive statistics based on the central tendency of data
  • Descriptive statistics based on the dispersion of data
  • Descriptive statistics based on the shape of the data
  • Univariate data vs. bivariate data in descriptive statistics
  • What are the 10 commonly used descriptive statistics?
  • Can descriptive statistics be used to make inferences or predictions?
  • Frequently asked questions

Descriptive statistics serves as the initial step in understanding and summarizing data . It involves organizing, visualizing, and summarizing raw data to create a coherent picture. The primary goal of descriptive statistics is to provide a clear and concise overview of the data’s main features. This helps us identify patterns, trends, and characteristics within the data set without making broader inferences.

Key Aspects of Descriptive Statistics

  • Measures of Central Tendency: Descriptive statistics include calculating the mean, median, and mode, which offer insights into the center of the data distribution.
  • Measures of Dispersion: Variance, standard deviation, and range help us understand the spread or variability of the data.
  • Visualizations: Creating graphs, histograms, bar charts, and pie charts visually represent the data’s distribution and characteristics

When you delve into the world of statistics, you’ll encounter two fundamental branches: descriptive statistics and inferential statistics. These two distinct approaches help us make sense of data and draw conclusions. Let’s look at the differences between these two branches to shed light on their roles in statistical analysis.

Aspect | Descriptive Statistics | Inferential Statistics
Purpose | Summarize and describe data | Draw conclusions or predictions
Data Sample | Analyzes the entire dataset | Analyzes a sample of the data
Examples | Mean, Median, Range, Variance | Hypothesis testing, Regression
Scope | Focuses on data characteristics | Makes inferences about populations
Goal | Provides insights and simplifies data | Generalizes findings to a larger population
Assumptions | No assumptions about populations | Requires assumptions about populations
Common Use Cases | Data visualization, data exploration | Scientific research, hypothesis testing

Inferential statistics takes data analysis to the next level by drawing conclusions about populations based on a sample. It involves making predictions, generalizations, and hypotheses about a larger group using a smaller subset of data. Inferential statistics bridges the gap between our data and the conclusions we want to reach. This is particularly useful when obtaining data from an entire population is impractical or impossible.

Key Aspects of Inferential Statistics

  • Sampling Techniques: Inferential statistics relies on carefully selecting representative samples from a population to make valid inferences.
  • Hypothesis Testing: This process involves setting up hypotheses about population characteristics and using sample data to decide whether there is enough evidence to reject the null hypothesis.
  • Confidence Intervals: These provide a range of values within which we’re confident a population parameter lies based on sample data.
  • Regression Analysis: Inferential statistics also encompass techniques like regression analysis to model relationships between variables and predict outcomes.
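To make the hypothesis-testing idea in the list above concrete, here is a minimal Python sketch of a two-sided test of a sample mean against a hypothesized population mean. It assumes a normal approximation, which is a simplification (small samples would usually call for a t-test); the scores are invented, and `one_sample_z_test` is a hypothetical helper name.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def one_sample_z_test(sample, hypothesized_mean):
    """Return (z, p_value) for H0: population mean == hypothesized_mean.

    Assumption: the standardized sample mean is approximately standard normal.
    """
    n = len(sample)
    standard_error = stdev(sample) / sqrt(n)
    z = (mean(sample) - hypothesized_mean) / standard_error
    # Two-sided p-value: probability of a result at least this extreme under H0.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical exam scores; H0 says the population mean is 70.
scores = [72, 75, 68, 80, 77, 74, 71, 79, 76, 73]
z, p = one_sample_z_test(scores, hypothesized_mean=70)
print(f"z = {z:.2f}, p-value = {p:.4f}")
```

A small p-value suggests the sample would be unusual if the null hypothesis were true, which is the usual basis for rejecting it.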

Now we will look at descriptive statistics in detail.

There are various dimensions in which this data can be described. The three main dimensions used for describing data are the central tendency, dispersion, and the shape of the data. Now, let’s look at them in detail, one by one.

The central tendency of data is the center of the distribution of data. It describes the location of data and concentrates on where the data is located. The three most widely used measures of the “center” of the data are Mean, Median, and Mode.


The “Mean” is the average of the data. The average can be identified by summing up all the numbers and then dividing them by the number of observations.

Mean = \( (X_1 + X_2 + X_3 + \dots + X_n) / n \)

Data: 10, 20, 30, 40, 50
Number of observations = 5
Mean = (10 + 20 + 30 + 40 + 50) / 5
Mean = 30

The mean may be strongly influenced by outliers. You may now ask, ‘What are outliers?’ An outlier is a data point that differs significantly from the other observations. Outliers can distort an analysis if they are not handled carefully.

Data: 10, 20, 30, 40, 200
Mean = (10 + 20 + 30 + 40 + 200) / 5
Mean = 60

Here a single extreme value doubles the mean from 30 to 60. Dealing with outliers: if an outlier is a recording or measurement error, removing it before taking the average gives a more representative result. If it is a genuine observation, consider reporting a measure that resists outliers, such as the median.

The median is the 50th percentile of the data. In other words, it is the center point of the ordered data. To find it, sort the data, split it into two equal halves, and take the value in the middle. It is often the better measure of center when the data are skewed or contain outliers.

Note that the median is not affected by outliers in the way the mean is.

Odd number of data points: 10, 20, 30, 40, 50
Median is 30.

Even number of data points: 10, 20, 30, 40, 50, 60
Find the middle two values and take their mean. Here, 30 and 40 are the middle values.
(30 + 40) / 2 = 35
Median is 35.

The mode of the data is the most frequently occurring data or elements in a dataset. If an element occurs the highest number of times, it is the mode of that data. If no number in the data is repeated, then that data has no mode. There can be more than one mode in a dataset if two values have the same frequency, which is also the highest frequency.

Outliers don’t influence the mode. The mode can be calculated for both quantitative and qualitative data.

Data: 1, 3, 4, 6, 7, 3, 3, 5, 10, 3
Mode is 3, because 3 has the highest frequency (4 times).
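The worked examples for the mean, median, and mode can be reproduced with Python's standard `statistics` module. This is a small sketch rather than a prescribed method; the variable names are illustrative.

```python
from statistics import mean, median, multimode

data = [10, 20, 30, 40, 50]
data_with_outlier = [10, 20, 30, 40, 200]

# The outlier pulls the mean from 30 up to 60, while the median stays at 30.
print("mean:", mean(data), "->", mean(data_with_outlier))
print("median:", median(data), "->", median(data_with_outlier))

# With an even number of values, the median is the mean of the two middle values.
print("median (even count):", median([10, 20, 30, 40, 50, 60]))

# multimode returns every value that shares the highest frequency.
print("mode(s):", multimode([1, 3, 4, 6, 7, 3, 3, 5, 10, 3]))
```

`multimode` is used instead of `mode` because it returns all tied values, which matches the point that a dataset can have more than one mode.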

The dispersion is the “spread of the data”. It measures how far the data is spread. In some datasets, the values cluster closely around the mean. In others, the values are spread widely around the mean. Dispersion can be measured by the Inter Quartile Range (IQR), range, standard deviation, and variance of the data.

Let us see these measures in detail.

Inter Quartile Range (IQR)

Quartiles are special percentiles.
1st Quartile (Q1) is the same as the 25th percentile.
2nd Quartile (Q2) is the same as the 50th percentile.
3rd Quartile (Q3) is the same as the 75th percentile.

Steps to find quartile and percentile

  • The data should be sorted from the smallest to the largest.
  • For Quartiles, ordered data is divided into 4 equal parts.
  • For Percentiles, ordered data is divided into 100 equal parts.

The Inter Quartile Range is the difference between the third quartile (Q3) and the first quartile (Q1).

IQR = Q3 – Q1

The Inter Quartile Range is the spread of the middle half (50%) of the data.

The range is the difference between the largest and the smallest value in the data.
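The quartiles, IQR, and range can be sketched in Python as follows. The data values are hypothetical, `spread_summary` is an illustrative helper name, and the "inclusive" quartile method is one of several conventions, so other tools may report slightly different quartiles for the same data.

```python
from statistics import quantiles


def spread_summary(values):
    """Return the quartiles, IQR, and range of a list of numbers."""
    ordered = sorted(values)  # step 1: order from smallest to largest
    # n=4 splits the ordered data into four parts, giving Q1, Q2, and Q3.
    q1, q2, q3 = quantiles(ordered, n=4, method="inclusive")
    return {
        "Q1": q1,
        "Q2": q2,
        "Q3": q3,
        "IQR": q3 - q1,                     # spread of the middle half
        "range": ordered[-1] - ordered[0],  # largest minus smallest
    }


# Hypothetical, unsorted observations.
observations = [90, 10, 50, 30, 70, 20, 80, 40, 60]
print(spread_summary(observations))
```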

Standard Deviation

The most common measure of spread is the standard deviation. The standard deviation measures how far the data deviates from the mean value. The standard deviation formula differs for a population and for a sample. Both formulas are similar but not the same.

Symbol used for Sample Standard Deviation – “s” (lowercase)
Symbol used for Population Standard Deviation – “σ” (sigma, lowercase)

Steps to find the Standard Deviation

If x is a number, then the difference “x – mean” is its deviation. The deviations are used to calculate the standard deviation.

Sample Standard Deviation, s = Square root of sample variance
\( s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}} \), where \( \bar{x} \) is the sample mean and n is the number of observations in the sample.

Population Standard Deviation, σ = Square root of population variance
\( \sigma = \sqrt{\frac{\sum (x - \mu)^2}{N}} \), where μ is the population mean and N is the number of observations in the population.

The standard deviation is always positive or zero. It will be large when the data values are spread out from the mean.

The variance is a measure of variability. For a population, it is the average squared deviation from the mean; for a sample, the sum of squared deviations is divided by n − 1 instead of n. The symbol σ² represents the population variance, and s² represents the sample variance.
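To see how the sample and population formulas differ in practice, the sketch below computes both for the same five values. It uses Python's standard `statistics` module alongside a hand-written version of the sample formula; `sample_std` is a hypothetical helper included only to mirror the formula.

```python
from math import sqrt
from statistics import pstdev, pvariance, stdev, variance

values = [10, 20, 30, 40, 50]


def sample_std(xs):
    """Sample standard deviation written out from the formula (divide by n - 1)."""
    n = len(xs)
    x_bar = sum(xs) / n
    squared_deviations = sum((x - x_bar) ** 2 for x in xs)
    return sqrt(squared_deviations / (n - 1))


# Treating the values as a sample: divide the squared deviations by n - 1.
print("sample variance:", variance(values))
print("sample std dev:", stdev(values), "(by hand:", sample_std(values), ")")

# Treating the values as the whole population: divide by N.
print("population variance:", pvariance(values))
print("population std dev:", pstdev(values))
```

For these values the squared deviations sum to 1000, so the sample variance is 1000 / 4 = 250 and the population variance is 1000 / 5 = 200.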

The shape of the data is important because the shape of a distribution determines which probability models and statistical methods are appropriate for it. The shape is what you see when you graph the distribution.

The shape of the data can be described by three properties: symmetry, skewness, and kurtosis.

In a symmetric distribution, the data is distributed the same way on both sides of the center. In symmetric data, the mean and median are located close together. The normal (bell-shaped) curve is the best-known example of a symmetric distribution.

Skewness is the measure of the asymmetry of the distribution of data. The data is not symmetrical (i.e.) it is skewed towards one side. Skewness is classified into two types: positive skew and negative skew.

  • Positively skewed : In a Positively skewed distribution, the data values are clustered around the left side of the distribution, and the right side is longer. The mean and median will be greater than the mode in the positive skew.
  • Negatively skewed : In a Negatively skewed distribution, the data values are clustered around the right side of the distribution, and the left side is longer. The mean and median will be less than the mode.


Kurtosis describes how heavy the tails of a distribution are compared with a normal distribution. Distributions fall into three categories: platykurtic, mesokurtic, and leptokurtic.

  • Platykurtic : A platykurtic distribution has thinner tails than a normal distribution. Extreme outliers are rare, and the peak is usually flatter.

  • Mesokurtic : A mesokurtic distribution has tails similar to those of a normal distribution. The normal distribution itself is mesokurtic.

  • Leptokurtic : A leptokurtic distribution has heavier tails than a normal distribution. Extreme outliers are more likely, and the peak is usually sharper.
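Skewness and kurtosis can both be computed from the central moments of the data. The sketch below uses the simple moment-based (population) definitions, which is an assumption on my part; statistical libraries often apply small-sample corrections, so their results can differ slightly. The helper names are illustrative.

```python
def central_moment(values, k):
    """Average of (x - mean) ** k over all values."""
    n = len(values)
    mean_value = sum(values) / n
    return sum((x - mean_value) ** k for x in values) / n


def skewness(values):
    """Positive for a longer right tail, negative for a longer left tail."""
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5


def excess_kurtosis(values):
    """Kurtosis minus 3, so that a normal distribution scores about 0.

    Negative suggests platykurtic, near zero mesokurtic, positive leptokurtic.
    """
    return central_moment(values, 4) / central_moment(values, 2) ** 2 - 3


symmetric = [1, 2, 3, 4, 5]
right_skewed = [10, 20, 30, 40, 200]

print("skewness (symmetric):", skewness(symmetric))
print("skewness (right-skewed):", skewness(right_skewed))
print("excess kurtosis (symmetric):", excess_kurtosis(symmetric))
```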

When it comes to delving into the world of data analysis, two key terms you’re likely to encounter are “ Univariate ” and “ Bivariate .” These terms are crucial in descriptive statistics, as they help us categorize and understand the data types we’re working with. Whether you’re deciphering the properties of individual data points or unraveling the intricate dance between two variables, the concepts of univariate and bivariate data provide the foundation for insightful data analysis.

The key difference between univariate and bivariate data lies in the focus of analysis. Univariate analysis centers on understanding the characteristics of a single variable, while bivariate analysis explores connections and interactions between two variables. Let’s break down the differences between univariate and bivariate data to better grasp their significance.

Univariate Data

Univariate data focuses on a single variable, essentially spotlighting one aspect of your data. In this scenario, you’re interested in studying the distribution, central tendency, and dispersion of a single set of values. For instance, if you’re analyzing the heights of a group of individuals, you’re dealing with univariate data. Here, the variable of interest is height, and you aim to uncover insights about that specific characteristic.

In univariate analysis, you’re often looking at measures like:

  • Measures of Central Tendency: Mean, median, and mode provide insights into where the center of the data lies.
  • Measures of Dispersion: Range, variance, and standard deviation help you understand how spread out the data is.
  • Frequency Distribution: Creating histograms, bar charts, and pie charts allows you to visualize the data’s distribution.

Bivariate Data

Bivariate data, on the other hand, adds an extra layer of complexity to your analysis by involving two variables. Here, you’re not just interested in understanding individual characteristics; you’re also keen on uncovering relationships and patterns between two different variables. For example, if you’re examining the relationship between hours of study and exam scores, you’re working with bivariate data. The goal is to determine whether the two variables are related, for instance whether exam scores tend to rise as study hours increase. Such an association alone does not show that one variable causes the other.

Bivariate analysis often involves techniques such as:

  • Scatter Plots: These visualizations showcase the relationship between two variables, with each data point plotted on the graph.
  • Correlation: Calculating correlation coefficients helps you quantify the strength and direction of the relationship between variables.
  • Regression Analysis: This technique allows you to model the relationship between variables, predicting the outcome of one based on the other.
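The correlation and regression techniques in the list above can be sketched with a few lines of plain Python. The study-hours data below is invented for illustration, and `pearson_r_and_line` is a hypothetical helper that returns the Pearson correlation coefficient together with a least-squares line.

```python
from math import sqrt


def pearson_r_and_line(xs, ys):
    """Return (r, slope, intercept) for paired observations xs and ys."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations around the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    r = sxy / sqrt(sxx * syy)              # strength and direction of the relationship
    slope = sxy / sxx                      # change in y per one-unit change in x
    intercept = mean_y - slope * mean_x
    return r, slope, intercept


# Hypothetical data: hours of study and exam scores for five students.
hours = [1, 2, 3, 4, 5]
scores = [52, 58, 65, 70, 78]

r, slope, intercept = pearson_r_and_line(hours, scores)
print(f"r = {r:.3f}")
print(f"predicted score = {intercept:.1f} + {slope:.1f} * hours")
```

A value of r close to +1 or −1 indicates a strong linear relationship, while a value near 0 indicates a weak one.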

There are many useful descriptive statistics, but here are 10 of the most commonly used statistics and related tools:

  • Mean : This is the average of all the values in a data set. It’s a good indicator of the overall center of the data, but can be sensitive to outliers, especially in multivariate data with extreme values.
  • Median : This is the ‘middle’ value when the data is ordered from least to greatest. It’s less affected by outliers than the mean, making it a robust measure for box plot analyses.
  • Mode : This is the most frequent value in a data set. There can be one mode, or even multiple modes in some cases, especially when dealing with categorical variables.
  • Standard Deviation : This tells you how spread out the data is from the mean. A larger standard deviation indicates a wider spread of data points. It’s crucial in understanding the dispersion in multivariate data.
  • Range : This is the difference between the highest and lowest values in the data set. It’s a simple way to gauge how much variation there is but doesn’t tell you anything about the distribution within that range. It’s often represented in graphical representations like box plots.
  • Categorical Variables : These are variables that represent distinct groups or categories. Analysis often involves graphical representations and contingency tables to understand the relationships between categories.
  • Contingency Tables : These tables are used to display the frequency distribution of categorical variables. They help in analyzing the relationship between different categorical variables in multivariate data.
  • Box Plot : A graphical representation that shows the distribution of a dataset through its quartiles. It highlights the median, quartiles, and extreme values, providing a clear picture of the data’s spread and potential outliers.
  • Graphical Representation : This involves using visual tools like box plots, histograms, and scatter plots to summarize and analyze data, making it easier to identify patterns, trends, and extreme values in both univariate and multivariate datasets.
  • Extreme Values : These are the data points that are significantly higher or lower than the majority of the data. They can heavily influence the mean and standard deviation and are often highlighted in box plots and other graphical representations.
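As an example of one item from the list above, a contingency table can be built by counting how often each pair of categories occurs together. The survey records below are made up, and `contingency_table` is a hypothetical helper name.

```python
from collections import Counter


def contingency_table(pairs):
    """Count how often each (row category, column category) pair occurs."""
    counts = Counter(pairs)
    row_labels = sorted({row for row, _ in pairs})
    column_labels = sorted({column for _, column in pairs})
    # Counter returns 0 for combinations that never occur.
    return {
        row: {column: counts[(row, column)] for column in column_labels}
        for row in row_labels
    }


# Hypothetical survey records: (department, prefers remote work).
records = [
    ("Sales", "yes"), ("Sales", "no"), ("Sales", "yes"),
    ("Support", "no"), ("Support", "no"), ("Support", "yes"),
    ("Engineering", "yes"), ("Engineering", "yes"),
]

for department, answers in contingency_table(records).items():
    print(department, answers)
```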

Descriptive statistics themselves are not used for predictions, but they can lay the groundwork for them. Here’s the key difference:

Descriptive statistics summarize the data you have. They use measures like mean, median, and standard deviation to give you a general idea of what the data looks like. This process often involves exploratory data analysis, where open exploration of the data can reveal patterns and insights. For instance, calculating mean scores is a common part of this analysis.

Inferential statistics use the data you have to draw conclusions about a larger population. This allows you to make predictions about things you haven’t observed yet. Here, you would identify the dependent variable and independent variable in your study, which are crucial for making these inferences.

Think of it like this: Descriptive statistics describe your apartment, while inferential statistics use the features of your apartment to guess about the entire apartment building.

So, while descriptive statistics can’t directly predict the future, they can help you understand the data and prepare it for inferential statistics, which can then be used for predictions. Summary statistics from your exploratory data analysis can provide the foundation for these predictive models.

In a world flooded with data, understanding, interpreting, and communicating information is paramount. Descriptive statistics doesn’t just crunch numbers; it crafts narratives, constructs visualizations, and empowers us to make informed decisions. Hope this article has given you a brief introduction to descriptive statistics. In this article, we have seen how the various measures of descriptive statistics, such as central tendency, dispersion, and shape of the data curve, help decipher the numbers. We have also bridged the gap between individual characteristics and the dance between variables by learning about univariate and bivariate data.


Ans. The methods used to summarize and describe the main features of a dataset are called descriptive statistics. Measures of central tendencies, measures of variability, etc., which give information about the typical values in a dataset, are all examples of descriptive statistics.

Ans. The 5 descriptive statistics include standard deviation, minimum and maximum values, variance, kurtosis, and skewness.

Ans. The frequency distribution, central tendency, and variability of a dataset are the 3 main types of descriptive statistics.

Ans. Descriptive statistics are of 3 types: frequency distribution, central tendency, and variability.

The media shown in this article are not owned by Analytics Vidhya and are used at the Author’s discretion.


Statistics LibreTexts

2.E: Descriptive Statistics (Exercises)


These are homework exercises to accompany the Textmap created for "Introductory Statistics" by Shafer and Zhang.

2.1: Three popular data displays

  • Describe one difference between a frequency histogram and a relative frequency histogram.
  • Describe one advantage of a stem and leaf diagram over a frequency histogram.
  • Construct a stem and leaf diagram, a frequency histogram, and a relative frequency histogram for the following data set. For the histograms use classes \(51-60\), \(61-70\), and so on. \[\begin{array}{ccccc}69 & 92 & 68 & 77 & 80 \\ 70 & 85 & 88 & 85 & 96 \\ 93 & 75 & 76 & 82 & 100 \\ 53 & 70 & 70 & 82 & 85\end{array}\]
  • Construct a stem and leaf diagram, a frequency histogram, and a relative frequency histogram for the following data set. For the histograms use classes \(6.0-6.9\), \(7.0-7.9\), and so on. \[\begin{array}{ccccc}8.5 & 8.2 & 7.0 & 7.0 & 4.9 \\ 6.5 & 8.2 & 7.6 & 1.5 & 9.3 \\ 9.6 & 8.5 & 8.8 & 8.5 & 8.7 \\ 8.0 & 7.7 & 2.9 & 9.2 & 6.9\end{array}\]
  • A data set contains \(n = 10\) observations. The values \(x\) and their frequencies \(f\) are summarized in the following data frequency table. \[\begin{array}{c|cccc}x & -1 & 0 & 1 & 2 \\ \hline f & 3 & 4 & 2 & 1\end{array}\]Construct a frequency histogram and a relative frequency histogram for the data set.
  • A data set contains \(n=20\) observations. The values \(x\) and their frequencies \(f\) are summarized in the following data frequency table. \[\begin{array}{c|cccc}x & -1 & 0 & 1 & 2 \\ \hline f & 3 & a & 2 & 1\end{array}\]The frequency of the value \(0\) is missing. Find \(a\) and then sketch a frequency histogram and a relative frequency histogram for the data set.
  • A data set has the following frequency distribution table: \[\begin{array}{c|cccc}x & 1 & 2 & 3 & 4 \\ \hline f & 3 & a & 2 & 1\end{array}\]The number \(a\) is unknown. Can you construct a frequency histogram? If so, construct it. If not, say why not.
  • A table of some of the relative frequencies computed from a data set is \[\begin{array}{c|cccc}x & 1 & 2 & 3 & 4 \\ \hline f ∕ n & 0.3 & p & 0.2 & 0.1\end{array}\]The number \(p\) is yet to be computed. Finish the table and construct the relative frequency histogram for the data set.
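
The construction exercises above can be checked mechanically. The following is a minimal Python sketch, not part of the original exercise set, that builds the stem and leaf rows and the class frequencies for the first data set. The helper names `stem_and_leaf` and `class_frequencies` are hypothetical and chosen only for illustration.

```python
from collections import Counter, defaultdict

# Data set from the first construction exercise above.
data = [69, 92, 68, 77, 80, 70, 85, 88, 85, 96,
        93, 75, 76, 82, 100, 53, 70, 70, 82, 85]

def stem_and_leaf(values):
    """Map each stem (tens and higher digits) to its sorted leaves (units digit)."""
    stems = defaultdict(list)
    for v in sorted(values):
        stems[v // 10].append(v % 10)
    return dict(stems)

def class_frequencies(values, first_lower=51, width=10):
    """Count values in the classes 51-60, 61-70, and so on."""
    counts = Counter()
    for v in values:
        k = (v - first_lower) // width          # index of the class containing v
        lower = first_lower + k * width
        counts[(lower, lower + width - 1)] += 1
    return dict(sorted(counts.items()))

leaves = stem_and_leaf(data)
freq = class_frequencies(data)
rel_freq = {cls: f / len(data) for cls, f in freq.items()}

for stem, lv in leaves.items():
    print(f"{stem:>2} | {' '.join(map(str, lv))}")
for (lo, hi), f in freq.items():
    print(f"{lo}-{hi}: f = {f}, f/n = {rel_freq[(lo, hi)]:.2f}")
```

The bar heights of the frequency histogram are the values of `freq`, and those of the relative frequency histogram are the values of `rel_freq`, which always sum to 1.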

Applications

  • The IQ scores of ten students randomly selected from an elementary school are given. \[\begin{array}{ccccc}108 & 100 & 99 & 125 & 87 \\ 105 & 107 & 105 & 119 & 118\end{array}\]Grouping the measures in the \(80s\), the \(90s\), and so on, construct a stem and leaf diagram, a frequency histogram, and a relative frequency histogram.
  • The IQ scores of ten students randomly selected from an elementary school for academically gifted students are given. \[\begin{array}{ccccc}133 & 140 & 152 & 142 & 137 \\ 145 & 160 & 138 & 139 & 138\end{array}\]Grouping the measures by their common hundreds and tens digits, construct a stem and leaf diagram, a frequency histogram, and a relative frequency histogram.
  • During a one-day blood drive \(300\) people donated blood at a mobile donation center. The blood types of these \(300\) donors are summarized in the table. \[\begin{array}{c|cccc}Blood\: Type\hspace{0.167em} & O & A & B & AB \\ \hline Frequency & 136 & 120 & 32 & 12\end{array}\]Construct a relative frequency histogram for the data set.
  • In a particular kitchen appliance store an electric automatic rice cooker is a popular item. The weekly sales for the last \(20\) weeks are shown. \[\begin{array}{ccccc}20 & 15 & 14 & 14 & 18 \\ 15 & 17 & 16 & 16 & 18 \\ 15 & 19 & 12 & 13 & 9 \\ 19 & 15 & 15 & 16 & 15\end{array}\]Construct a relative frequency histogram with classes \(6-10\), \(11-15\), and \(16-20\).
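
For the rice cooker exercise, the relative frequency of each class is the class count divided by \(n = 20\). A short Python sketch of that computation follows; the function name `relative_frequencies` is an illustrative assumption, not something defined in the text.

```python
# Weekly rice cooker sales from the exercise above.
sales = [20, 15, 14, 14, 18, 15, 17, 16, 16, 18,
         15, 19, 12, 13, 9, 19, 15, 15, 16, 15]

# Classes requested in the exercise, as inclusive (lower, upper) bounds.
classes = [(6, 10), (11, 15), (16, 20)]

def relative_frequencies(values, classes):
    """Return {class: count / n} for inclusive integer classes."""
    n = len(values)
    return {(lo, hi): sum(lo <= v <= hi for v in values) / n
            for lo, hi in classes}

rel = relative_frequencies(sales, classes)
for (lo, hi), r in rel.items():
    print(f"{lo}-{hi}: {r:.2f}")
```

Each printed value is the height of one bar in the relative frequency histogram.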

Additional Exercises

  • Random samples, each of size \(n = 10\), were taken of the lengths in centimeters of three kinds of commercial fish, with the following results: \[\begin{array}{lcccccccccc} Sample \hspace{0.167em}1 : & 108 & 100 & 99 & 125 & 87 & 105 & 107 & 105 & 119 & 118 \\ Sample \hspace{0.167em} 2 : & 133 & 140 & 152 & 142 & 137 & 145 & 160 & 138 & 139 & 138 \\ Sample \hspace{0.167em} 3 : & 82 & 60 & 83 & 82 & 82 & 74 & 79 & 82 & 80 & 80\end{array}\]Grouping the measures by their common hundreds and tens digits, construct a stem and leaf diagram, a frequency histogram, and a relative frequency histogram for each of the samples. Compare the histograms and describe any patterns they exhibit.
  • During a one-day blood drive \(300\) people donated blood at a mobile donation center. The blood types of these \(300\) donors are summarized below. \[\begin{array}{c|cccc}Blood\: Type\hspace{0.167em} & O & A & B & AB \\ \hline Frequency & 136 & 120 & 32 & 12\end{array}\]Identify the blood type that has the highest relative frequency for these \(300\) people. Can you conclude that the blood type you identified is also most common for all people in the population at large? Explain.

  • In retail sales, too large an inventory ties up capital, while too small an inventory costs lost sales and customer satisfaction. Using the relative frequency histogram for these data, find approximately how many rice cookers must be in stock at the beginning of each week if the store is not to run out of stock by the end of a week for more than \(15\%\) of the weeks; and the store is not to run out of stock by the end of a week for more than \(5\%\) of the weeks.

Answers

  • The vertical scale on one is the frequencies and on the other is the relative frequencies.
  • \[\begin{array}{r|ccccccc}5 & 3 & & & & & & \\ 6 & 8 & 9 & & & & & \\ 7 & 0 & 0 & 0 & 5 & 6 & 7 & \\ 8 & 0 & 2 & 2 & 5 & 5 & 5 & 8 \\ 9 & 2 & 3 & 6 & & & & \\ 10 & 0 & & & & & &\end{array}\]
  • Noting that \(n = 10\) the relative frequency table is: \[\begin{array}{c|cccc}x & -1 & 0 & 1 & 2 \\ \hline f ∕ n & 0.3 & 0.4 & 0.2 & 0.1\end{array}\]
  • Since \(n\) is unknown, \(a\) is unknown, so the histogram cannot be constructed.
  • \[\begin{array}{r|ccccc}8 & 7 & & & & \\ 9 & 9 & & & & \\ 10 & 0 & 5 & 5 & 7 & 8 \\ 11 & 8 & 9 & & & \\ 12 & 5 & & & &\end{array}\] Frequency and relative frequency histograms are similarly generated.
  • Noting \(n = 300\), the relative frequency table is therefore: \[\begin{array}{c|cccc}Blood\hspace{0.167em}Type & O & A & B & AB \\ \hline f ∕ n & 0.4533 & 0.4 & 0.1067 & 0.04\end{array}\] A relative frequency histogram is then generated.
  • The stem and leaf diagrams listed for Samples \(1,\, 2,\; \text{and}\; 3\) in that order: \[\begin{array}{c|ccccc}6 & & & & & \\ 7 & & & & & \\ 8 & 7 & & & & \\ 9 & 9 & & & & \\ 10 & 0 & 5 & 5 & 7 & 8 \\ 11 & 8 & 9 & & & \\ 12 & 5 & & & & \\ 13 & & & & & \\ 14 & & & & & \\ 15 & & & & & \\ 16 & & & & &\end{array}\]

\[\begin{array}{c|ccccc}6 & & & & & \\ 7 & & & & & \\ 8 & & & & & \\ 9 & & & & & \\ 10 & & & & & \\ 11 & & & & & \\ 12 & & & & & \\ 13 & 3 & 7 & 8 & 8 & 9 \\ 14 & 0 & 2 & 5 & & \\ 15 & 2 & & & & \\ 16 & 0 & & & &\end{array}\]

\[\begin{array}{c|ccccccc}6 & 0 & & & & \\ 7 & 4 & 9 & & & \\ 8 & 0 & 0 & 2 & 2 & 2 & 2 & 3 \\ 9 & & & & & \\ 10 & & & & & \\ 11 & & & & & \\ 12 & & & & & \\ 13 & & & & & \\ 14 & & & & & \\ 15 & & & & & \\ 16 & & & & &\end{array}\]

The frequency tables are given below in the same order:

\[\begin{array}{c|ccccc}Length\hspace{0.167em} & 80 \sim 89 & 90 \sim 99 & 100 \sim 109 & 110 \sim 119 & 120 \sim 129 \\ \hline f & 1 & 1 & 5 & 2 & 1\end{array}\]

\[\begin{array}{c|cccc}Length\hspace{0.167em} & 130 \sim 139 & 140 \sim 149 & 150 \sim 159 & 160 \sim 169 \\ \hline f & 5 & 3 & 1 & 1\end{array}\]

\[\begin{array}{c|ccc}Length\hspace{0.167em} & 60 \sim 69 & 70 \sim 79 & 80 \sim 89 \\ \hline f & 1 & 2 & 7\end{array}\]

The relative frequency tables are also given below in the same order:

2.2: Measures of Central Location

  • \(\sum x^2\)
  • \(\sum (x-3)\)
  • \(\sum (x-3)^2\)
  • \(\sum (x-1)\)
  • \(\sum (x-1)^2\)
  • Find the mean, the median, and the mode for the sample \[1\; 2\; 3\; 4\]
  • Find the mean, the median, and the mode for the sample \[3\; 3\; 4\; 4\]
  • Find the mean, the median, and the mode for the sample \[2\; 1\; 2\; 7\]
  • Find the mean, the median, and the mode for the sample \[-1\; 0\; 1\; 4\; 1\; 1\]
  • Find the mean, the median, and the mode for the sample data represented by the table \[\begin{array}{c|c c c}x & 1 & 2 & 7 \\ \hline f & 1 & 2 & 1\\ \end{array}\]
  • Find the mean, the median, and the mode for the sample data represented by the table \[\begin{array}{c|c c c c}x & -1 & 0 & 1 & 4 \\ \hline f & 1 & 1 & 3 & 1\\ \end{array}\]
  • Create a sample data set of size \(n=3\) for which the mean \(\bar{x}\) is greater than the median \(\tilde{x}\).
  • Create a sample data set of size \(n=3\) for which the mean \(\bar{x}\) is less than the median \(\tilde{x}\).
  • Create a sample data set of size \(n=4\) for which the mean \(\bar{x}\), the median \(\tilde{x}\), and the mode are all identical.
  • Create a sample data set of size \(n=4\) for which the median \(\tilde{x}\) and the mode are identical but the mean \(\bar{x}\) is different.
  • Find the mean and the median for the LDL cholesterol level in a sample of ten heart patients. \[\begin{matrix} 132 & 162 & 133 & 145 & 148\\ 139 & 147 & 160 & 150 & 153 \end{matrix}\]
  • Find the mean and the median, for the LDL cholesterol level in a sample of ten heart patients on a special diet. \[\begin{matrix} 127 & 152 & 138 & 110 & 152\\ 113 & 131 & 148 & 135 & 158 \end{matrix}\]
  • Find the mean, the median, and the mode for the number of vehicles owned in a survey of \(52\) households. \[\begin{array}{c|c c c c c c c c} x & 0 & 1 & 2 & 3 & 4 & 5 & 6 & 7\\ \hline f &2 &12 &15 &11 &6 &3 &1 &2\\ \end{array}\]
  • The number of passengers in each of \(120\) randomly observed vehicles during morning rush hour was recorded, with the following results. \[\begin{array}{c|c c c c c } x & 1 & 2 & 3 & 4 & 5\\ \hline f &84 &29 &3 &3 &1\\ \end{array}\]Find the mean, the median, and the mode of this data set.
  • Twenty-five \(1-lb\) boxes of \(16d\) nails were randomly selected and the number of nails in each box was counted, with the following results. \[\begin{array}{c|c c c c c } x & 47 & 48 & 49 & 50 & 51\\ \hline f &1 &3 &18 &2 &1\\ \end{array}\]Find the mean, the median, and the mode of this data set.
  • Can you find the sample mean for the data set? If so, find it. If not, why not?
  • Can you find the sample median for the data set? If so, find it. If not, why not?
  • Can you find the sample mean for the data set? If so, find it. If not, explain why not.
  • Can you find the sample median for the data set? If so, find it. If not, explain why not.
  • A player keeps track of all the rolls of a pair of dice when playing a board game and obtains the following data. \[\begin{array}{c|c c c c c c } x & 2 & 3 & 4 & 5 & 6 & 7\\ \hline f &10 &29 &40 &56 &68 &77 \\ \end{array}\] \[\begin{array}{c|c c c c c } x & 8 & 9 & 10 & 11 & 12 \\ \hline f &67 &55 &39 &28 &11 \\ \end{array}\]Find the mean, the median, and the mode.
  • Based on the frequencies, do you expect the mean and the median to be about the same or markedly different, and why?
  • Compute the mean, the median, and the mode.
  • Based on the shape of the display, do you expect the mean and the median to be about the same or markedly different, and why?
  • Find the mean of the data.
  • Find the median of the data.
  • Construct a data set consisting of ten numbers, all but one of which is above average, where the average is the mean.
  • Is it possible to construct a data set as in part (a) when the average is the median? Explain.
  • Show that no matter what kind of average is used (mean, median, or mode) it is impossible for all members of a data set to be above average.
  • Twenty sacks of grain weigh a total of \(1,003\; lb\). What is the mean weight per sack?
  • Can the median weight per sack be calculated based on the information given? If not, construct two data sets with the same total but different medians.
  • Compute the mean, median, and mode.
  • Form a new data set, \(\text{Data Set II}\), by adding \(3\) to each number in \(\text{Data Set I}\). Calculate the mean, median, and mode of \(\text{Data Set II}\).
  • Form a new data set, \(\text{Data Set III}\), by subtracting \(6\) from each number in \(\text{Data Set I}\). Calculate the mean, median, and mode of \(\text{Data Set III}\).
  • Comparing the answers to parts (a), (b), and (c), can you guess the pattern? State the general principle that you expect to be true.
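
Several exercises above give the sample as a data frequency table rather than as a raw list. One way to handle such a table, sketched here in Python with the standard `statistics` module, is to expand it into the full list of observations first. The helper name `expand` is a hypothetical choice; the data are the vehicle counts from the survey of \(52\) households.

```python
import statistics

def expand(table):
    """Turn a {value: frequency} table into the full list of observations."""
    return [x for x, f in table.items() for _ in range(f)]

# Number of vehicles owned (x) and number of households (f).
vehicles = {0: 2, 1: 12, 2: 15, 3: 11, 4: 6, 5: 3, 6: 1, 7: 2}

obs = expand(vehicles)
mean = statistics.mean(obs)         # sum of x*f divided by n
median = statistics.median(obs)     # average of the two middle values when n is even
modes = statistics.multimode(obs)   # list of all most frequent values

print(len(obs), round(mean, 1), median, modes)
```

`multimode` returns a list so that a sample in which several values tie for the highest frequency is reported with all of its modes.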

Large Data Set Exercises

Note: For Large Data Set Exercises below, all of the data sets associated with these questions are missing, but the questions themselves are included here for reference.

  • Compute the mean and median of the \(1,000\) SAT scores.
  • Compute the mean and median of the \(1,000\) GPAs.
  • Regard the data as arising from a census of all students at a high school, in which the SAT score of every student was measured. Compute the population mean \(\mu\).
  • Regard the first \(25\) observations as a random sample drawn from this population. Compute the sample mean \(\bar{x}\) and compare it to \(\mu\).
  • Regard the next \(25\) observations as a random sample drawn from this population. Compute the sample mean \(\bar{x}\) and compare it to \(\mu\).
  • Regard the data as arising from a census of all freshman at a small college at the end of their first academic year of college study, in which the GPA of every such person was measured. Compute the population mean \(\mu\).
  • Compute the mean and median survival time for all mice, without regard to gender.
  • Compute the mean and median survival time for the \(65\) male mice (separately recorded in Large \(\text{Data Set 7A}\)).
  • Compute the mean and median survival time for the \(75\) female mice (separately recorded in Large \(\text{Data Set 7B}\)).

Answers

  • \(\bar x= 2.5,\; \tilde{x} = 2.5,\; \text{mode} = \{1,2,3,4\}\)
  • \(\bar x= 3,\; \tilde{x} = 2,\; \text{mode} = 2\)
  • \(\{0, 0, 3\}\)
  • \(\{0, 1, 1, 2\}\)
  • \(\bar x = 146.9,\; \tilde x = 147.5\)
  • \(\bar x=2.6 ,\; \tilde{x} = 2,\; \text{mode} = 2\)
  • \(\bar x= 48.96,\; \tilde{x} = 49,\; \text{mode} = 49\)
  • No, the survival times of the fourth and fifth mice are unknown.
  • Yes, \(\tilde{x}=421\).
  • \(\bar x= 28.55,\; \tilde{x} = 28,\; \text{mode} = 28\)
  • \(\bar x= 2.05,\; \tilde{x} = 2,\; \text{mode} = 1\)
  • Mean : \(nx_{min}\leq \sum x\) so dividing by \(n\) yields \(x_{min}\leq \bar{x}\), so the minimum value is not above average. Median : the middle measurement, or average of the two middle measurements, \(\tilde{x}\), is at least as large as \(x_{min}\), so the minimum value is not above average. Mode : the mode is one of the measurements, and is not greater than itself
  • \(\bar x= 3.18,\; \tilde{x} = 3,\; \text{mode} = 5\)
  • \(\bar x= 6.18,\; \tilde{x} = 6,\; \text{mode} = 8\)
  • \(\bar x= -2.81,\; \tilde{x} = -3,\; \text{mode} = -1\)
  • If a number is added to every measurement in a data set, then the mean, median, and mode all change by that number.
  • \(\mu = 1528.74\)
  • \(\bar{x}=1502.8\)
  • \(\bar{x}=1532.2\)
  • \(\bar x= 553.4286,\; \tilde{x} = 552.5\)
  • \(\bar x= 665.9692,\; \tilde{x} = 667\)
  • \(\bar x= 455.8933,\; \tilde{x} = 448\)

2.3 Measures of Variability

\[1\; 2\; 3\; 4\]

\[2\; -3\; 6\; 0\; 3\; 1\]

\[2\; 1\; 2\; 7\]

\[-1\; 0\; 1\; 4\; 1\; 1\]

\[\begin{array}{c|c c c} x & 1 & 2 & 7 \\ \hline f &1 &2 &1\\ \end{array}\]

\[\begin{array}{c|c c c c} x & -1 & 0 & 1 & 4 \\ \hline f &1 &1 &3 &1\\ \end{array}\]

\[\begin{matrix} 132 & 162 & 133 & 145 & 148\\ 139 & 147 & 160 & 150 & 153 \end{matrix}\]

\[\begin{matrix} 142 & 152 & 138 & 145 & 148\\ 139 & 147 & 155 & 150 & 153 \end{matrix}\]

Consider the data set represented by the table \[\begin{array}{c|c c c c c c c} x & 26 & 27 & 28 & 29 & 30 & 31 & 32 \\ \hline f &3 &4 &16 &12 &6 &2 &1\\ \end{array}\]

  • Use the frequency table to find that \(\sum x=1256\) and \(\sum x^2=35,926\).
  • Use the information in part (a) to compute the sample mean and the sample standard deviation.
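
Part (b) relies on the shortcut formula for the sample variance, \(s^2 = \frac{\sum x^2 - \frac{1}{n}\left(\sum x\right)^2}{n-1}\). The Python sketch below, added here as an illustration rather than as the textbook's own solution, computes the two sums from the frequency table and then applies that formula.

```python
import math

# Frequency table from the exercise: value x -> frequency f.
table = {26: 3, 27: 4, 28: 16, 29: 12, 30: 6, 31: 2, 32: 1}

n = sum(table.values())                              # number of observations
sum_x = sum(x * f for x, f in table.items())         # each x counted f times
sum_x2 = sum(x * x * f for x, f in table.items())    # each x^2 counted f times

mean = sum_x / n
variance = (sum_x2 - sum_x ** 2 / n) / (n - 1)       # shortcut formula
s = math.sqrt(variance)

print(n, sum_x, sum_x2, round(mean, 2), round(s, 1))
```

The printed sums can be compared with the values quoted in part (a).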

\[\begin{array}{c|c c c c c} x & 1 & 2 & 3 & 4 & 5 \\ \hline f &384 &208 &98 &56 &28 \\ \end{array}\]

\[\begin{array}{c|c c c c c} x & 6 & 7 & 8 & 9 & 10 \\ \hline f &12 &8 &2 &3 &1 \\ \end{array}\]

A random sample of \(49\) invoices for repairs at an automotive body shop is taken. The data are arrayed in the stem and leaf diagram shown. (Stems are thousands of dollars, leaves are hundreds, so that for example the largest observation is \(3,800\).)

\[\begin{array}{c|c c c c c c c c c c c} 3 & 5 & 6 & 8 \\ 3 &0 &0 &1 &1 &2 &4 \\ 2 &5 &6 &6 &7 &7 &8 &8 &9 &9 \\ 2 &0 &0 &0 &0 &1 &2 &2 &4 \\ 1 &5 &5 &5 &6 &6 &7 &7 &7 &8 &8 &9 \\ 1 &0 &0 &1 &3 &4 &4 &4 \\ 0 &5 &6 &8 &8 \\ 0 &4 \end{array}\]

For these data, \(\sum x=101,100\), \(\sum x^2=244,830,000\).

  • Compute the range.
  • Compute the sample standard deviation.
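
As a cross-check, the invoice amounts can be rebuilt from the stem and leaf diagram and the two quantities computed directly. This is a sketch under the stated reading of the diagram (stems are thousands of dollars, leaves are hundreds); the `rows` list is a hand transcription of the diagram above.

```python
import statistics

# (stem, leaves) rows copied from the diagram, top to bottom.
rows = [
    (3, [5, 6, 8]),
    (3, [0, 0, 1, 1, 2, 4]),
    (2, [5, 6, 6, 7, 7, 8, 8, 9, 9]),
    (2, [0, 0, 0, 0, 1, 2, 2, 4]),
    (1, [5, 5, 5, 6, 6, 7, 7, 7, 8, 8, 9]),
    (1, [0, 0, 1, 3, 4, 4, 4]),
    (0, [5, 6, 8, 8]),
    (0, [4]),
]

# Rebuild each invoice amount in dollars.
invoices = [1000 * stem + 100 * leaf for stem, leaves in rows for leaf in leaves]

data_range = max(invoices) - min(invoices)   # largest minus smallest observation
s = statistics.stdev(invoices)               # sample standard deviation (n - 1 divisor)

print(len(invoices), data_range, round(s))
```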

What must be true of a data set if its standard deviation is \(0\)?

A data set consisting of \(25\) measurements has standard deviation \(0\). One of the measurements has value \(17\). What are the other \(24\) measurements?

Create a sample data set of size \(n=3\) for which the range is \(0\) and the sample mean is \(2\).

Create a sample data set of size \(n=3\) for which the sample variance is \(0\) and the sample mean is \(1\).

The sample \(\{-1,0,1\}\) has mean \(\bar{x}=0\) and standard deviation \(s=1\). Create a sample data set of size \(n=3\) for which \(\bar{x}=0\) and \(s\) is greater than \(1\).

The sample \(\{-1,0,1\}\) has mean \(\bar{x}=0\) and standard deviation \(s=1\). Create a sample data set of size \(n=3\) for which \(\bar{x}=0\) and the standard deviation \(s\) is less than \(1\).

\[5\; -2\; 6\; 1\; 4\; -3\; 0\; 1\; 4\; 3\; 2\; 5\]

  • Compute the sample standard deviation of \(\text{Data Set I}\).
  • Form a new data set, \(\text{Data Set II}\), by adding \(3\) to each number in \(\text{Data Set I}\). Calculate the sample standard deviation of \(\text{Data Set II}\).
  • Form a new data set, \(\text{Data Set III}\), by subtracting \(6\) from each number in \(\text{Data Set I}\). Calculate the sample standard deviation of \(\text{Data Set III}\).
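
The three parts of this exercise can be explored numerically before stating the general principle. The Python sketch below is an illustration, with variable names chosen here, that shifts Data Set I by \(+3\) and by \(-6\) and compares the sample standard deviations.

```python
import math
import statistics

# Data Set I as listed above.
data_set_1 = [5, -2, 6, 1, 4, -3, 0, 1, 4, 3, 2, 5]

# Shifted copies: add 3 to every value, subtract 6 from every value.
data_set_2 = [x + 3 for x in data_set_1]
data_set_3 = [x - 6 for x in data_set_1]

s1 = statistics.stdev(data_set_1)
s2 = statistics.stdev(data_set_2)
s3 = statistics.stdev(data_set_3)

# Every deviation x - mean is unchanged by a shift, so the three values agree.
print(math.isclose(s1, s2), math.isclose(s1, s3))
```

The means of the three data sets differ by the amount of the shift, but the spread around each mean is the same.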

\(\text{Large Data Set 1}\) lists the SAT scores and GPAs of \(1,000\) students.

  • Compute the range and sample standard deviation of the \(1,000\) SAT scores.
  • Compute the range and sample standard deviation of the \(1,000\) GPAs.

\(\text{Large Data Set 1}\) lists the SAT scores of \(1,000\) students.

  • Regard the data as arising from a census of all students at a high school, in which the SAT score of every student was measured. Compute the population range and population standard deviation \(\sigma\).
  • Regard the first \(25\) observations as a random sample drawn from this population. Compute the sample range and sample standard deviation \(s\) and compare them to the population range and \(\sigma\).
  • Regard the next \(25\) observations as a random sample drawn from this population. Compute the sample range and sample standard deviation \(s\) and compare them to the population range and \(\sigma\).
  • Regard the data as arising from a census of all freshman at a small college at the end of their first academic year of college study, in which the GPA of every such person was measured. Compute the population range and population standard deviation \(\sigma\).

\(\text{Large Data Set 7, 7A, and 7B }\) list the survival times in days of \(140\) laboratory mice with thymic leukemia from onset to death.

  • Compute the range and sample standard deviation of survival time for all mice, without regard to gender.
  • Compute the range and sample standard deviation of survival time for the \(65\) male mice (separately recorded in \(\text{Large Data Set 7A}\)).
  • Compute the range and sample standard deviation of survival time for the \(75\) female mice (separately recorded in \(\text{Large Data Set 7B}\)). Do you see a difference in the results for male and female mice? Does it appear to be significant?

Answers

  • \(R = 3,\; s^2 = 1.7,\; s = 1.3\).
  • \(R = 6,\; s^2=7.\bar{3},\; s = 2.7\).
  • \(R = 6,\; s^2=7.3,\; s = 2.7\).
  • \(R = 30,\; s^2 = 103.2,\; s = 10.2\).
  • \(\bar{x}=28.55,\; s = 1.3\).
  • \(\bar{x}=2063,\; \tilde{x} =2000,\; \text{mode}=2000\).
  • \(R = 3400\).
  • \(s = 869\).
  • All are \(17\).
  • \(\{1,1,1\}\)
  • One example is \(\{-.5,0,.5\}\).
  • \(R = 1350\) and \(s = 212.5455\)
  • \(R = 4.00\) and \(s = 0.7407\)
  • \(R = 4.00\) and \(\sigma = 0.740375\)
  • \(R = 3.04\) and \(s = 0.808045\)
  • \(R = 2.49\) and \(s = 0.657843\)

2.4 Relative Position of Data

  • Find the percentile rank of \(82\).
  • Find the percentile rank of \(68\).
  • Find the percentile rank of \(6.5\).
  • Find the percentile rank of \(7.7\).
  • Find the percentile rank of the grade \(75\).
  • Find the percentile rank of the grade \(57\).
  • Is the \(90^{th}\) percentile of a data set always equal to \(90\%\)? Why or why not?
  • Approximately what percentage of the observations are less than \(5\)?
  • Approximately what percentage of the observations are greater than \(5\)?
  • Approximately what percentage of the observations are less than \(98.6\)?
  • Approximately what percentage of the observations are greater than \(98.6\)?
  • In a large data set the \(29^{th}\) percentile is \(5\) and the \(79^{th}\) percentile is \(10\). Approximately what percentage of observations lie between \(5\) and \(10\)?
  • In a large data set the \(40^{th}\) percentile is \(125\) and the \(82^{nd}\) percentile is \(158\). Approximately what percentage of observations lie between \(125\) and \(158\)?
  • Find the five-number summary and the IQR and sketch the box plot for the sample represented by the stem and leaf diagram in Figure 2.1.2 "Ordered Stem and Leaf Diagram".
  • Find the five-number summary and the IQR and sketch the box plot for the sample explicitly displayed in "Example 2.2.7" in Section 2.2.
  • Find the five-number summary and the IQR and sketch the box plot for the sample represented by the data frequency table \[\begin{array}{c|c c c c c} x & 1 & 2 & 5 & 8 & 9 \\ \hline f &5 &2 &3 &6 &4\\ \end{array}\]
  • Find the five-number summary and the IQR and sketch the box plot for the sample represented by the data frequency table \[\begin{array}{c|c c c c c c c c c} x & -5 & -3 & -2 & -1 & 0 & 1 & 3 & 4 & 5 \\ \hline f &2 &1 &3 &2 &4 &1 &1 &2 &1\\ \end{array}\]
  • Find the \(z\)-score of each measurement in the following sample data set. \[-5\; \; 6\; \; 2\; \; -1\; \; 0\]
  • Find the \(z\)-score of each measurement in the following sample data set. \[1.6\; \; 5.2\; \; 2.8\; \; 3.7\; \; 4.0\]
  • The sample with data frequency table \[\begin{array}{c|c c c} x & 1 & 2 & 7 \\ \hline f &1 &2 &1\\ \end{array}\] has mean \(\bar{x}=3\) and standard deviation \(s\approx 2.71\). Find the \(z\)-score for every value in the sample.
  • The sample with data frequency table \[\begin{array}{c|c c c c} x & -1 & 0 & 1 & 4 \\ \hline f &1 &1 &3 &1\\ \end{array}\] has mean \(\bar{x}=1\) and standard deviation \(s\approx 1.67\). Find the \(z\)-score for every value in the sample.
  • The population mean \(\mu\).
  • The population variance \(\sigma ^2\).
  • The population standard deviation \(\sigma \).
  • The \(z\)-score for every value in the population data set.
  • A measurement \(x\) in a sample with mean \(\bar{x}=10\) and standard deviation \(s=3\) has \(z\)-score \(z=2\). Find \(x\).
  • A measurement \(x\) in a sample with mean \(\bar{x}=10\) and standard deviation \(s=3\) has \(z\)-score \(z=-1\). Find \(x\).
  • A measurement \(x\) in a population with mean \(\mu =2.3\) and standard deviation \(\sigma =1.3\) has \(z\)-score \(z=2\). Find \(x\).
  • A measurement \(x\) in a population with mean \(\mu =2.3\) and standard deviation \(\sigma =1.3\) has \(z\)-score \(z=-1.2\). Find \(x\).
  • Find the percentile rank of \(15\).
  • If the sample accurately reflects the population, then what percentage of weeks would an inventory of \(15\) rice cookers be adequate?
  • Find the percentile rank of \(2\).
  • If the sample accurately reflects the population, then what percentage of households have at most two vehicles?
  • Find the percentile rank of \(30\), the time she has to get to work.
  • Assuming that the sample accurately reflects the population of all of Cordelia’s commute times, use your answer to part (a) to predict the proportion of the work days she is late for work.
  • Is Dromio’s score above average or below average?
  • What was Dromio’s actual score on the exam?
  • Find the \(z\)-score of the repair that cost \(\$1,100\).
  • Find the \(z\)-score of the repairs that cost \(\$2,700\).
  • Find the quartiles.
  • Give the five-number summary of the data.
  • Find the range and the IQR.
  • Find the three quartiles.
  • Find the percentile rank of \(800\).
  • Find the percentile rank of \(3,200\).
  • Find the five-number summary for the following sample data. \[\begin{array}{c|c c c c c c c} x &26 &27 &28 &29 &30 &31 &32 \\ \hline f &3 &4 &16 &12 &6 &2 &1\\ \end{array}\]
  • Find the five-number summary for the following sample data. \[\begin{array}{c|c c c c c c c c c c} x &1 &2 &3 &4 &5 &6 &7 &8 &9 &10 \\ \hline f &384 &208 &98 &56 &28 &12 &8 &2 &3 &1\\ \end{array}\]
  • Find the IQR.
  • Determine whether the following statement is true. “In any data set, if an observation \(x_1\) is greater than another observation \(x_2\), then the \(z\)-score of \(x_1\) is greater than the \(z\)-score of \(x_2\).”
  • Emilia and Ferdinand took the same freshman chemistry course, Emilia in the fall, Ferdinand in the spring. Emilia made an \(83\) on the common final exam that she took, on which the mean was \(76\) and the standard deviation \(8\). Ferdinand made a \(79\) on the common final exam that he took, which was more difficult, since the mean was \(65\) and the standard deviation \(12\). The one who has a higher \(z\)-score did relatively better. Was it Emilia or Ferdinand?
  • Refer to the previous exercise. On the final exam in the same course the following semester, the mean is \(68\) and the standard deviation is \(9\). What grade on the exam matches Emilia’s performance? Ferdinand’s?
  • Rosencrantz and Guildenstern are on a weight-reducing diet. Rosencrantz, who weighs \(178\; lb\), belongs to an age and body-type group for which the mean weight is \(145\; lb\) and the standard deviation is \(15\; lb\). Guildenstern, who weighs \(204\; lb\), belongs to an age and body-type group for which the mean weight is \(165\; lb\) and the standard deviation is \(20\; lb\). Assuming \(z\)-scores are good measures for comparison in this context, who is more overweight for his age and body type?
  • Compute the three quartiles and the interquartile range of the \(1,000\) SAT scores.
  • Compute the three quartiles and the interquartile range of the \(1,000\) GPAs.
  • Compute the five-number summary of the data.
  • Describe in words the performance of the class on the exam in the light of the result in part (a).
  • Compute the five-number summary of the heights, without regard to gender.
  • Compute the five-number summary of the heights of the men in the sample.
  • Compute the five-number summary of the heights of the women in the sample.
  • Compute the three quartiles and the interquartile range of the survival times for all mice, without regard to gender.
  • Compute the three quartiles and the interquartile range of the survival times for the \(65\) male mice (separately recorded in \(\text{Data Set 7A}\)).
  • Compute the three quartiles and the interquartile range of the survival times for the \(75\) female mice (separately recorded in \(\text{Data Set 7B}\)).

Answers

  • \(x_{min}=25,\; \; Q_1=70,\; \; Q_2=77.5,\; \; Q_3=90,\; \; x_{max}=100, \; \; IQR=20\)
  • \(x_{min}=1,\; \; Q_1=1.5,\; \; Q_2=6.5,\; \; Q_3=8,\; \; x_{max}=9, \; \; IQR=6.5\)
  • \(-1.3,\; 1.39,\; 0.4,\; -0.35,\; -0.11\)
  • \(z=-0.74\; \text{for}\; x = 1,\; z=-0.37\; \text{for}\; x = 2,\; z = 1.48\; \text{for}\; x = 7\)
  • \(z=-1\; \text{for}\; x = 0,\; z=1\; \text{for}\; x = 2\)
  • \(Q_1=59,\; Q_2=70,\; Q_3=81\)
  • \(x_{min}=39,\; Q_1=59,\; Q_2=70,\; Q_3=81,\; x_{max}=100\)
  • \(R = 61,\; IQR=22\)
  • \(x_{min}=26,\; Q_1=28,\; Q_2=28,\; Q_3=29,\; x_{max}=32\)
  • \(Q_1=1450,\; Q_2=2000,\; Q_3=2800\)
  • \(IQR=1350\)
  • \(x_{min}=400,\; Q_1=1450,\; Q_2=2000,\; Q_3=2800,\; x_{max}=3800\)
  • Emilia: \(z=0.875\), Ferdinand: \(z=1.1\bar{6}\)
  • Rosencrantz: \(z=2.2\), Guildenstern: \(z=1.95\). Rosencrantz is more overweight for his age and body type.
  • \(x_{min}=15,\; Q_1=51,\; Q_2=67,\; Q_3=82,\; x_{max}=97\)
  • The data set appears to be skewed to the left.
  • \(Q_1=440,\; Q_2=552.5,\; Q_3=661\; \; \text{and}\; \; IQR=221\)
  • \(Q_1=641,\; Q_2=667,\; Q_3=700\; \; \text{and}\; \; IQR=59\)
  • \(Q_1=407,\; Q_2=448,\; Q_3=504\; \; \text{and}\; \; IQR=97\)
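The z-score comparison and the five-number summaries in the exercises above can be checked with a short sketch. The function names are hypothetical, and the quartile convention used here (medians of the lower and upper halves, with the middle value left out when the count is odd) is an assumption; other conventions give slightly different quartiles.

```python
from statistics import median


def z_score(x, mean, sd):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / sd


def five_number_summary(data):
    """Return (min, Q1, Q2, Q3, max) for at least two observations.

    Assumed convention: Q1 and Q3 are the medians of the lower and upper
    halves of the sorted data; the middle value is excluded when n is odd.
    """
    xs = sorted(data)
    n = len(xs)
    half = n // 2
    lower = xs[:half]       # smallest half of the observations
    upper = xs[n - half:]   # largest half of the observations
    return xs[0], median(lower), median(xs), median(upper), xs[-1]


# Weights, group means and standard deviations from the diet exercise.
rosencrantz = z_score(178, mean=145, sd=15)
guildenstern = z_score(204, mean=165, sd=20)
print(rosencrantz, guildenstern)

# Hypothetical data set, used only to illustrate the summary function.
print(five_number_summary([1, 2, 3, 4, 5, 6, 7, 8]))
```

The larger z-score belongs to Rosencrantz, which agrees with the answer listed above.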

2.5 The Empirical Rule and Chebyshev's Theorem

  • State the Empirical Rule.
  • Describe the conditions under which the Empirical Rule may be applied.
  • State Chebyshev’s Theorem.
  • Describe the conditions under which Chebyshev’s Theorem may be applied.
  • between \(4\) and \(8\);
  • between \(2\) and \(10\);
  • between \(0\) and \(12\).
  • above \(2\);
  • above \(3.1\);
  • between \(2\) and \(3.1\).
  • below \(-0.2\);
  • below \(3.1\);
  • between \(-1.3\) and \(0.9\).
  • between \(0\) and \(12\);
  • between \(4\) and \(8\).
  • between \(-0.2\) and \(4.2\);
  • between \(-1.3\) and \(5.3\).
  • between \(3\) and \(7.4\);
  • between \(1.9\) and \(8.5\).
  • between \(-2\) and \(6\) (including \(-2\) and \(6\));
  • between \(-4\) and \(8\) (including \(-4\) and \(8\)).
  • What is the maximum proportion of observations in the data set that can lie outside the interval \((2,10)\)?
  • What can be said about the proportion of observations in the data set that are below \(2\)?
  • What can be said about the proportion of observations in the data set that are above \(10\)?
  • What can be said about the number of observations in the data set that are above \(10\)?
  • What is the maximum proportion of observations in the data set that can lie outside the interval \((-1.3,5.3)\)?
  • What can be said about the proportion of observations in the data set that are below \(-1.3\)?
  • What can be said about the proportion of observations in the data set that are above \(5.3\)?
  • What is the median score on the exam?
  • About how many students scored between \(63\) and \(81\)?
  • About how many students scored between \(72\) and \(90\)?
  • About how many students scored below \(54\)?
  • About what proportion of all fish caught are between \(20\) inches and \(26\) inches long?
  • About what proportion of all fish caught are between \(20\) inches and \(23\) inches long?
  • About how long is the longest fish caught (only a small fraction of a percent are longer)?
  • Hockey pucks used in professional hockey games must weigh between \(5.5\) and \(6\) ounces. If the weight of pucks manufactured by a particular process is bell-shaped, has mean \(5.75\) ounces and standard deviation \(0.125\) ounce, what proportion of the pucks will be usable in professional games?
  • Hockey pucks used in professional hockey games must weigh between \(5.5\) and \(6\) ounces. If the weight of pucks manufactured by a particular process is bell-shaped and has mean \(5.75\) ounces, how large can the standard deviation be if \(99.7\%\) of the pucks are to be usable in professional games?
  • If the speed limit is \(55\; mph\), about what proportion of vehicles are speeding?
  • What is the median speed for vehicles on this highway?
  • What is the percentile rank of the speed \(65\; mph\)?
  • What speed corresponds to the \(16^{th}\) percentile?
  • If the speed limit is \(55\; mph\), at least what proportion of vehicles must be speeding?
  • What can be said about the proportion of vehicles going \(65\; mph\) or faster?
  • What is the median score?
  • Approximately what proportion of students in the class scored between \(70\) and \(80\)?
  • Approximately what proportion of students in the class scored above \(85\)?
  • What is the percentile rank of the score \(85\)?
  • The GPAs of all currently registered students at a large university have a bell-shaped distribution with mean \(2.7\) and standard deviation \(0.6\). Students with a GPA below \(1.5\) are placed on academic probation. Approximately what percentage of currently registered students at the university are on academic probation?
  • Thirty-six students took an exam on which the average was \(80\) and the standard deviation was \(6\). A rumor says that five students had scores \(61\) or below. Can the rumor be true? Why or why not?
  • Compute the mean and the standard deviation.
  • About how many of the measurements does the Empirical Rule predict will be in the interval \(\left (\bar{x}-s,\bar{x}+s \right )\), the interval \(\left (\bar{x}-2s,\bar{x}+2s \right )\), and the interval \(\left (\bar{x}-3s,\bar{x}+3s \right )\)?
  • Compute the number of measurements that are actually in each of the intervals listed in part (a), and compare to the predicted numbers.
  • What can be said about the number of observations that lie in the interval \((126,152)\)?
  • What can be said about the number of observations that lie in the interval \((113,165)\)?
  • What can be said about the number of observations that exceed \(165\)?
  • What can be said about the number of observations that either exceed \(165\) or are less than \(113\)?
  • Compute the sample mean and the sample standard deviation.
  • Considering the shape of the data set, do you expect the Empirical Rule to apply? Count the number of measurements within one standard deviation of the mean and compare it to the number predicted by the Empirical Rule.
  • What does Chebyshev’s Theorem say about the number of measurements within one standard deviation of the mean?
  • Count the number of measurements within two standard deviations of the mean and compare it to the minimum number guaranteed by Chebyshev’s Theorem to lie in that interval.
  • See the displayed statement in the text.
  • At most \(0.25\).
  • At most \(7\).
  • By Chebyshev’s Theorem at most \(1/9\) of the scores can be below \(62\), so the rumor is impossible.
  • It is at least \(60\).
  • It is at most \(20\).
  • \(\bar{x}=48.96\), \(s = 0.7348\).
  • Roughly bell-shaped, the Empirical Rule should apply. True count: \(18\), Predicted: \(17\).
  • True count: \(23\), Guaranteed: at least \(18.75\), hence at least \(19\).
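The two rules in this section reduce to simple arithmetic, sketched below under stated assumptions. The function names are hypothetical; the Empirical Rule percentages are the usual approximations for bell-shaped data, while Chebyshev's bound holds for any data set when \(k > 1\).

```python
import math

# Approximate proportion of bell-shaped data within k standard deviations.
EMPIRICAL_RULE = {1: 0.68, 2: 0.95, 3: 0.997}


def empirical_tail(k):
    """Approximate proportion lying more than k SDs below (or above) the mean."""
    return (1 - EMPIRICAL_RULE[k]) / 2


def chebyshev_min_within(k):
    """Minimum proportion of any data set within k SDs of the mean (k > 1)."""
    if k <= 1:
        raise ValueError("Chebyshev's Theorem is informative only for k > 1")
    return 1 - 1 / k**2


def chebyshev_max_outside_count(n, k):
    """Largest whole number of n observations that can lie k or more SDs away."""
    return math.floor(n / k**2)


# GPA exercise: 1.5 is two standard deviations (2 * 0.6) below the mean 2.7.
on_probation = empirical_tail(2)

# Rumor exercise: 62 is three standard deviations (3 * 6) below the mean 80.
max_far_scores = chebyshev_max_outside_count(36, 3)
rumor_possible = 5 <= max_far_scores

print(on_probation, max_far_scores, rumor_possible)
```

Because at most four of the thirty-six scores can lie three or more standard deviations from the mean, five scores of \(61\) or below cannot occur, matching the answer given above.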


Difference between Descriptive and Inferential statistics

Statistics is a vital discipline that empowers us to make sense of data by providing tools for collection, analysis, interpretation, and presentation. In every field, from engineering to social sciences, understanding data is crucial for making informed decisions and drawing accurate conclusions. This understanding is facilitated by two key branches of statistics: descriptive and inferential.

Table of Contents

  • What is Statistics?
  • Types of Statistics
  • Descriptive Statistics
  • Use Cases of Descriptive Statistics
  • Measures of Central Tendency
  • Graphical Representation
  • Measures of Dispersion
  • Applications of Descriptive Statistics
  • Inferential Statistics
  • Use Cases of Inferential Statistics
  • Hypothesis Testing
  • Regression Analysis
  • Applications of Inferential Statistics

Statistics is a branch of mathematics dealing with the collection, organization, analysis, interpretation, and presentation of masses of numerical data. It works primarily with quantitative data.

Statistics is divided into two main branches: descriptive statistics and inferential statistics. These two branches serve different purposes and are used in various fields, including engineering, social sciences, business, and healthcare. This article explores the definitions, characteristics, and applications of both descriptive and inferential statistics.


Descriptive Statistics | Inferential Statistics
It describes the data in some manner. | It uses data drawn from the population.
It is used to describe a situation. | …
It explains already known data and is limited to a sample or population having a small size. | It attempts to reach a conclusion about the population.

Descriptive statistics is the analysis of data that helps to describe, show and summarize data in a meaningful way. It presents raw data in an effective, meaningful form using numerical calculations, graphs or tables, and it is applied to data that is already known.

Descriptive statistics involves summarizing and organizing data to describe the main features of a dataset. It provides simple summaries about the sample and the measures. It is primarily concerned with presenting data in a meaningful way, through graphical representation and numerical analysis.

  • Mean : The average of all data points.
  • Mode : The most frequently occurring value in a dataset.
  • Median : The middle value that separates the higher half from the lower half of the data.
  • Histograms : Bar graphs representing the frequency distribution of a dataset.
  • Pie Charts : Circular charts divided into sectors representing relative frequencies.
  • Box Plots : Graphical depiction of data through their quartiles.
  • Range : The difference between the maximum and minimum values.
  • Variance : The average of the squared deviations of the data points from the mean.
  • Standard Deviation : The square root of the variance, indicating the typical distance of data points from the mean.
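The numerical measures listed above can be computed with Python's standard `statistics` module. This is a minimal sketch; the data values are hypothetical and chosen only so the results are easy to verify by hand.

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical observations

summary = {
    "mean": statistics.mean(data),
    "median": statistics.median(data),
    "mode": statistics.mode(data),
    "range": max(data) - min(data),
    # Population forms divide by n; sample forms divide by n - 1.
    "population_variance": statistics.pvariance(data),
    "population_std_dev": statistics.pstdev(data),
    "sample_variance": statistics.variance(data),
    "sample_std_dev": statistics.stdev(data),
}

for name, value in summary.items():
    print(f"{name}: {value}")
```

Which variance to report depends on whether the data are treated as the whole group of interest (population form) or as a sample from a larger one (sample form).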


  • Business Analysis : Summarizing sales data to identify trends and make informed business decisions.
  • Healthcare : Analyzing patient data to understand the distribution of health outcomes.
  • Engineering : Monitoring manufacturing processes through quality control charts to ensure consistency.

Inferential statistics uses a random sample of data taken from a population to describe and make inferences about that population. A population is any group of data that includes all the data you are interested in. Inferential statistics allows you to make predictions from a small sample instead of working with the whole population.


  • Point Estimation : Provides a single value estimate of a population parameter (e.g., sample mean as an estimate of population mean).
  • Interval Estimation : Provides a range of values within which the population parameter is expected to lie (e.g., confidence intervals).
  • Null Hypothesis (H0) : A statement of no effect or no difference , which researchers aim to test against.
  • Alternative Hypothesis (H1) : A statement indicating the presence of an effect or difference.
  • p-value : The probability of observing results at least as extreme as the test results, assuming the null hypothesis is true.
  • Significance Level (α) : The threshold for rejecting the null hypothesis , commonly set at 0.05.
  • Simple Linear Regression : Analyzing the relationship between two continuous variables.
  • Multiple Regression : Examining the relationship between one dependent variable and multiple independent variables.
  • Market Research : Making predictions about consumer behavior based on survey samples.
  • Clinical Trials : Drawing conclusions about the effectiveness of new treatments from sample data.
  • Engineering : Predicting product performance and reliability through sample testing and analysis.
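Point estimation, interval estimation and a hypothesis test can be sketched together using only the standard library. The sample values, the hypothesised mean `mu0` and the variable names are hypothetical. The sketch uses a normal approximation for simplicity; with a sample this small, a t distribution would normally be preferred and would give a somewhat wider interval.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

sample = [5.1, 4.9, 5.6, 5.8, 6.0, 5.3, 5.5, 5.7, 5.2, 5.9]  # hypothetical data
mu0 = 5.0     # population mean claimed by the null hypothesis H0
alpha = 0.05  # significance level

n = len(sample)
x_bar = mean(sample)          # point estimate of the population mean
se = stdev(sample) / sqrt(n)  # estimated standard error of the mean

# Interval estimate: (1 - alpha) confidence interval, normal approximation.
z_crit = NormalDist().inv_cdf(1 - alpha / 2)
ci = (x_bar - z_crit * se, x_bar + z_crit * se)

# Two-sided test of H0: mean == mu0 against H1: mean != mu0.
z = (x_bar - mu0) / se
p_value = 2 * (1 - NormalDist().cdf(abs(z)))
reject_h0 = p_value < alpha

print(f"point estimate: {x_bar:.3f}")
print(f"95% CI: ({ci[0]:.3f}, {ci[1]:.3f})")
print(f"z = {z:.2f}, p-value = {p_value:.5f}, reject H0: {reject_h0}")
```

The null hypothesis is rejected when the p-value falls below the significance level, which corresponds to `mu0` lying outside the confidence interval.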

Descriptive and inferential statistics are essential tools in the field of statistics, each serving distinct but complementary purposes. Descriptive statistics focuses on summarizing and presenting data to highlight its main features, while inferential statistics aims to make predictions and generalizations about a population based on sample data. Understanding and applying these two branches of statistics enables researchers, analysts, and engineers to make informed decisions, draw meaningful conclusions, and advance knowledge in their respective fields.

Descriptive and Inferential statistics- FAQs

What is statistics used for?

Statistics is used to analyze data, make informed decisions, predict outcomes, and ensure quality and consistency in various fields such as business, healthcare, and scientific research.

What are the two types of inferential statistics?

Hypothesis testing and regression analysis are two main types of inferential statistics.

What are the types of descriptive statistics?

Measures of central tendency, graphical representation, and measures of dispersion are some types of descriptive statistics.

Who is the father of Statistics?

Sir Ronald Aylmer Fisher, a British statistician, is widely considered the father of modern statistics.


A Comparative Analysis Of Descriptive And Inferential Statistics


Introduction

Statistics is a branch of mathematics that focuses on the collection, management, examination, interpretation and presentation of data. Statistical analysis is of two types, descriptive and inferential, and both play a vital role in any statistical analysis. Briefly speaking, descriptive statistics summarizes huge volumes of data using charts and tables. The entire data set at hand is analyzed in this process, rather than just a sample. Inferential statistics, on the other hand, takes data from available samples to generate a hypothesis or to test an existing one. A researcher studies the samples in order to reach a conclusion about a specific population.


This blog on descriptive and inferential statistics will help you understand both concepts in detail and covers the following points:

  • Definition of descriptive statistics
  • Definition of inferential statistics
  • The distinction between descriptive and inferential statistics
  • The need for statistical software in analyzing the data

So, let’s study these points in detail.

Definition of descriptive statistics

Descriptive statistical analysis produces reports and graphs, often with the help of data visualization software, so that companies can understand a particular event that occurred in the past. As the name signifies, the analysis describes the data; it does not by itself lead to a conclusion about anything beyond that data. Descriptive statistics consist of brief summary values that outline a data set. The data set to be analyzed may be a complete population or a sample of it. Descriptive statistics has two divisions: measures of central tendency and measures of variability or spread. The below figure demonstrates the raw data details and the descriptive statistics generated in the form of charts and tables.


The measures of central tendency are the mean, median and mode. A measure of central tendency locates the centre of a frequency distribution. As per the experts of descriptive and inferential statistics, no single measure suits every condition; the choice depends on the type of data and the situation.

Mean: It is the arithmetic average and is used for continuous data.

Median: It is the middle value that splits the ordered data into two halves, with the values in one half no larger than those in the other. It can be used for continuous or ordinal data.

Mode: It is the most frequently occurring value and is mainly used for categorical data.

Now, let’s move on to the discussion of groups that are used in the descriptive statistics.


Dispersion: Dispersion measures how far the data extend from the centre. It can be measured with the standard deviation and the variance.

Central tendency: The mean, mode and median can determine the centre of the data. By using this measure on the data set, a descriptive summary with a single value can be gained.

Skewness: When there is a distortion in the bell curve, or the distribution is uneven, it is referred to as skewness. As per the experts of descriptive and inferential statistics, by looking at the distribution of the values, it can be said whether it is symmetrical or skewed.
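One common way to put a number on skewness is the moment coefficient, the average cubed z-score. The sketch below uses that definition in its population form, which is an assumption since the text does not name a formula, and both data sets are hypothetical.

```python
from statistics import mean, median, pstdev


def skewness(data):
    """Moment coefficient of skewness: the mean of the cubed z-scores.

    Roughly zero for symmetric data, positive for a long right tail and
    negative for a long left tail.
    """
    m = mean(data)
    s = pstdev(data)
    return sum(((x - m) / s) ** 3 for x in data) / len(data)


symmetric = [1, 2, 3, 4, 5]
right_skewed = [1, 1, 2, 2, 3, 10]  # one large value stretches the right tail

print(skewness(symmetric))
print(skewness(right_skewed))

# In right-skewed data the mean is pulled above the median.
print(mean(right_skewed), median(right_skewed))
```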

Definition of inferential statistics

Inferential statistics help in drawing conclusions about a larger population based on a sample. They also help in testing different hypotheses related to the given data. To define inferential statistics, we first need to understand how the term population is used in statistics. It does not necessarily mean a human population; rather, it refers to the complete set of data to be analyzed. In some situations the complete data cannot be obtained, and the analyst must work with sample data instead. For example, if you wanted to study cancer survivors below the age of 16 years residing around the globe, you would not be able to collect data on all of them. In such cases, where the total population cannot be fully measured, you need to consider sample data.


When an analysis is based on a sample, inferential techniques apply inductive reasoning to the sample data so that the analyst can reach a generalized conclusion about the population. A careful sampling process helps the result come close to the true population value. Inferential statistics is widely used in data science.

However, sampling cannot guarantee an exact result, because the sample may contain errors or discrepancies that lead to inaccurate and inconsistent interpretations. For this reason, inferential statistics relies on probability theory to quantify the uncertainty.


Methods usually applied in inferential statistics are as follows:

Parameter estimation: The descriptive measures of an entire population are known as parameters. The corresponding measures computed from a random sample are termed sample statistics. Through this method, the analyst estimates a population parameter from a sample statistic, but the estimate generated through this process may not be precise.

Statistical hypothesis testing: The aim of this method is to use samples to detect differences between populations or to verify a connection between variables. With this method’s help, conclusions can be drawn for the entire population based on a sample.

Regression analysis: It explains the connection between a given set of independent and dependent variables. With the help of hypothesis testing, the analysis determines whether the relationship seen in the sample data exists in the population.
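For the simplest case of regression, one independent and one dependent variable, the fitted line can be computed directly from the least-squares formulas. This is a minimal sketch with hypothetical data (`hours` studied and exam `scores`) and a hypothetical helper name; it fits the line only and does not carry out the accompanying hypothesis test.

```python
from statistics import mean


def fit_line(xs, ys):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


hours = [1, 2, 3, 4, 5]         # independent variable (hypothetical)
scores = [52, 55, 61, 64, 68]   # dependent variable (hypothetical)

slope, intercept = fit_line(hours, scores)
predicted = intercept + slope * 6  # prediction for a new value of x

print(f"score = {intercept:.1f} + {slope:.1f} * hours")
print(f"predicted score for 6 hours: {predicted:.1f}")
```

The fitted line describes the sample; whether the same relationship holds in the population is the inferential question that a test on the slope would address.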

The distinction between descriptive and inferential statistics

Although both descriptive and inferential statistics are used to analyze a given set of data, there is a huge difference between them in terms of process and interpretation. The key distinctions are as follows:

When using descriptive statistics, the analyst has access to all of the data being described, whereas inferential statistics are used when the analyst has only part of the data because the population is too large to be collected or compiled in a single attempt.

Descriptive statistics are used when there is no sampling process requirement, whereas inferential statistics are completely based upon the sampling process and sample statistics.

In descriptive statistics, properties of the complete population such as the mean, median and mode are termed parameters, whereas in inferential statistics the same properties computed from a sample are termed statistics and serve as estimates of the parameters.

Descriptive statistics are limited to the data that is actually measured, whereas inferential statistics use sample data to represent a large population and so are not limited in this way.

It is claimed that descriptive statistics can be 100 per cent accurate, as they are based upon the complete raw data without any assumptions, whereas analysis based on inferential statistics uses sample data and the results carry uncertainty. There is no guarantee of 100 per cent accuracy in this method, as conclusions are drawn from a sample of the population.

Descriptive statistics help present data meaningfully, whereas inferential statistics help compare data and make hypotheses and predictions.

Descriptive statistics are used in describing a situation, whereas inferential statistics are used to explain the chance of occurrence of an event.

The descriptive statistics can be explained with the help of graphs, charts and tables, whereas inferential statistics can be explained through probability.
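The point about accuracy can be illustrated with a small simulation. This is a sketch under stated assumptions: the population is synthetic (hypothetical heights in centimetres drawn from a normal distribution), and the sample size and number of repetitions are arbitrary choices.

```python
import random
from statistics import mean

random.seed(42)  # fixed seed so the sketch is repeatable

# Hypothetical population of 10,000 heights in centimetres.
population = [random.gauss(170, 10) for _ in range(10_000)]

# Descriptive: every value is measured, so the population mean is exact.
population_mean = mean(population)

# Inferential: each sample of 50 gives a slightly different estimate.
sample_means = [mean(random.sample(population, 50)) for _ in range(200)]
spread = max(sample_means) - min(sample_means)

print(f"population mean: {population_mean:.2f}")
print(f"average of the sample means: {mean(sample_means):.2f}")
print(f"spread of the sample means: {spread:.2f}")
```

The sample means cluster around the population mean without matching it exactly, which is the uncertainty that inferential methods quantify with probability.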

The need for statistical software in analyzing the data

Conducting research requires the analysis of huge data sets, and this can be done easily with statistical software tools. Such software smooths the process of descriptive and inferential analysis, brings accuracy and precision to the results, and saves time compared with analysis conducted manually. The points below explain the need for statistical software:

Reduction in error during the sampling process: The success of research depends upon the type of analysis performed on a given set of data. If there are errors in collecting or processing the data, the entire analysis can be rendered useless. Sampling error arises when the sample data deviate from the actual population data. Statistical software makes it possible to work with larger databases, which helps reduce such errors and supports customized analysis. It also reduces manual intervention, thereby reducing the workload on an individual or a group performing descriptive and inferential statistics. Most packages have automated features that remove the need to revise the data again and again.

Accurate result and easy solution: With the help of statistical software, complex questions or problems can be solved easily. A limited data set can be analyzed by hand, but with large data sets manual simplification becomes problematic and may lead to inaccurate analysis. Statistical software provides easy solutions and accurate results in such cases. Features such as multivariate analysis, statistical process control and regression analysis make the work easier, and these are only a few of the features available. The software also safeguards the data and presents results that are easy to understand.

Helping business houses: Businesses can use statistical software to evaluate different areas to their advantage, such as judging employee performance, finding changing behavioural patterns among consumers, driving audits, and understanding sales performance in different locations. With the help of statistical software, a business can make future predictions and maximize its profits.

Topmost preferable statistical software

If you search for statistical software to conduct descriptive and inferential analysis, you are sure to come across a variety of packages. The best choice depends upon the type of population data and the results you want to see. The packages are designed to let users feed in large amounts of data and reach conclusions that would not be practical manually. They are used by mathematicians, data scientists and industries. Each tool has unique features that set it apart from the rest, and you may choose the one that suits you best. Some of the preferred statistical packages for descriptive and inferential analysis are discussed below:

IBM SPSS Statistics: Industries use the software to solve business-specific issues and reach sound decisions. It offers customization when generating graphs and reports. The software can help generate probabilities, make predictions for future events, plan activities for the benefit of the community, and fulfil objectives and goals.

RStudio: The tool has been created for statistical and data science computing. Individuals and teams can use it at the same time, and resources can be shared to generate results for the decision makers in an organization. With the help of this software, work written in the R language can be represented graphically. Data can be imported, and navigation within a source file is easy. The plots and commands generated are efficiently managed through this software.

Stata: It is a tool for managing all types of data, used for descriptive and inferential analysis, high-resolution graphics, and more. The software has a simple interface, and the help section of the manual explains the commands in detail, with support from a wide community. Navigation is easy, the graphs generated can be used in presentations, and the output is user friendly. To use the software appropriately, one needs to understand graphical interpretation, regression and standard errors.

JMP: The tool uses robust statistics and dynamic graphics to present an analysis of the data loaded into memory or stored on the computer. The software’s interactive and visual features provide insights that could not be gained from static graphs or tables of raw numbers. A wide range of statistical problems can be solved with this software, and its easy interface means the user has little trouble handling it during descriptive and inferential analysis.

Minitab 18: It is software with a range of tools that users can apply to analyze data and find solutions to business problems. Data interpretation and presentation are straightforward. It has a well-designed user interface in which tools are grouped by category and are easy to locate. When you are stuck at any point, you can use the help feature with a right-click to get step-by-step guidance.

KNIME Analytics Platform: It is an overall solution for data-driven discovery, helping users uncover the potential hidden in their data, generate new insights and make future predictions. It consists of many modules, examples and other tools that can help with descriptive and inferential statistics. It manages workflows, provides mathematical and statistical functions, and supports machine learning algorithms for prediction. The platform can process large amounts of data with the help of its modules, and it does not require any programming to perform graphical jobs.

Origin Pro: It is easy-to-learn, user-friendly software that helps in data analysis and produces customized, publication-quality graphs for engineers and scientists. Users can carry out importing, analysis and graphing from the graphical user interface. When the data change, the graphs, analysis and reports are updated automatically. A frequently praised feature is its customer service, which provides quick solutions. The graphics generated through the software are professional and visually appealing.

NumXL: The software stands out for its time series features and its Excel add-in. Together, these turn Microsoft Excel into an econometrics and time series tool. It supports descriptive and inferential statistics and provides shortcuts that guide you through the entire process. Data can be easily adjusted with the help of the add-in. Customer support is one of its strengths in case you are stuck at any point.

SAS/STAT: The software adapts to many types of data. Some of its techniques are suited to smaller data sets, while its statistical modelling tools handle larger ones. It can also analyze data with missing values by applying modern methods. It is updated regularly with new statistical methods and procedures. It also helps in managing code and macros when one does not have enough time to write the code out in detail.

Base SAS: It is programming language software that provides a web-based interface. Ready-made programs can be used for data manipulation, data storage and retrieval, descriptive and inferential statistics, and reporting. The tool offers cross-platform and multi-platform support. It is streamlined, with no frills: you insert your data or write your code, and the tool runs it and provides the result. You can either do the analysis yourself or pass the result to another program for further interpretation.


Thomas Smith




COMMENTS

  1. Difference between Descriptive and Inferential Statistics

    A good exploratory tool for descriptive statistics is the five-number summary, which presents a set of distributional properties for your sample. Related post: Analyzing Descriptive Statistics in Excel. Inferential Statistics. Inferential statistics takes data from a sample and makes inferences about the larger population from which the sample was drawn.

  2. Descriptive vs. Inferential Statistics: What's the Difference?

    Descriptive statistics use summary statistics, graphs, and tables to describe a data set. This is useful for helping us gain a quick and easy understanding of a data set without poring over all of the individual data values. Inferential statistics use samples to draw inferences about larger populations.

  3. Descriptive and Inferential Statistics

    4. Types of Descriptive Statistics with Examples. Measures of Central Tendency: These provide insights into the central point of a dataset. Mean (Average): The sum of all values divided by the number of values. Example: For a dataset of ages (23, 25, 26, 29, 30), the mean age is $\frac{23+25+26+29+30}{5} = 26.6$ years. Median: The middle value in an ordered dataset.

  4. Inferential Statistics

    Inferential Statistics | An Easy Introduction & Examples. Published on September 4, 2020 by Pritha Bhandari. Revised on June 22, 2023. While descriptive statistics summarize the characteristics of a data set, inferential statistics help you come to conclusions and make predictions based on your data. When you have collected data from a sample, you can use inferential statistics to understand ...

  5. 7.2.2: Descriptive versus Inferential Statistics

    For more descriptive statistics, consider Table 7.2.2.2. It shows the number of unmarried men per 100 unmarried women in U.S. Metro Areas in 1990. From this table we see that men outnumber women most in Jacksonville, NC, and women outnumber men most in Sarasota, FL.

  6. Descriptive and Inferential Statistics: Random Assignment

    Here, we introduce descriptive statistics using examples and discuss the difference between descriptive and inferential statistics. We also talk about samples and populations, explain how you can identify biased samples, and define differential statistics. ... Random assignment is critical for the validity of an experiment. For example ...

  7. PDF Introduction to Statistics

    NOTE: Descriptive statistics summarize data to make sense or meaning of a list of numeric values. Transition from descriptive to inferential statistics (Chapters 6-7) Inferential Statistics (Chapters 8-18) Statistics Descriptive Statistics (Chapters 2-5) FIGURE 1.1 A general overview of this book. This book begins with an introduction to ...

  8. 1.1: Descriptive and Inferential Statistics

    Organizing and summarizing data is called descriptive statistics. Two ways to summarize data are by graphing and by using numbers (for example, finding an average). After you have studied probability and probability distributions, you will use formal methods for drawing conclusions from "good" data.

  9. Descriptive and Inferential Statistics

    Inferential statistics use data from a sample to answer questions about a population. Inferential statistics involves generalizing beyond the data at hand. Descriptive statistics are numbers that are used to summarize and describe data. Predicting next month's unemployment rate involves predicting future data, not describing the data at hand.

  10. Descriptive vs. Inferential Statistics

    Each of these segments is important, offering different techniques that accomplish different objectives. Descriptive statistics describe what is going on in a population or data set. Inferential statistics, by contrast, allow scientists to take findings from a sample group and generalize them to a larger population.

  11. Unit 07

    Unit 7: Assignment #2 (due before 11:59 pm Central on MON JUL 8): To become familiar with some of the ways that descriptive and inferential statistics can be used to deceive people, read Chapters 2 through 6 of (a slender!) book titled How to Lie with Statistics by Darrell Huff. NOTE: This book was published in 1954; therefore, the examples are ...

  12. Descriptive vs. Inferential Statistics: Key Differences

    Descriptive and inferential statistics, although distinct in their purposes and approaches, exhibit some similarities: 1. Data Utilization: Both descriptive and inferential statistics utilize the same dataset. Descriptive statistics summarize this data, whereas inferential statistics use it to draw broader conclusions about a larger population. 2.

  13. 2.3 Descriptive and Inferential Statistics

    Inferential statistics. We have seen that descriptive statistics are useful in providing an initial way to describe, summarize, and interpret a set of data. They are limited in usefulness because they tell us nothing about how meaningful the data are. The second step in analyzing data requires inferential statistics.

  14. Descriptive and Inferential Statistics

    Example 3: Find the z score using descriptive and inferential statistics for the given data. Population mean 100, sample mean 120, population variance 49 and size 10. Solution: Inferential statistics is used to find the z score of the data. The formula is given as follows: $z = \frac{x - \mu}{\sigma}$. Standard deviation = $\sqrt{49} = 7$.

  15. Descriptive Statistics

    Types of descriptive statistics. There are 3 main types of descriptive statistics: The distribution concerns the frequency of each value. The central tendency concerns the averages of the values. The variability or dispersion concerns how spread out the values are. You can apply these to assess only one variable at a time, in univariate ...

  16. Inferential Statistics

    There are 5 modules in this course. This course covers commonly used statistical inference methods for numerical and categorical data. You will learn how to set up and perform hypothesis tests, interpret p-values, and report the results of your analysis in a way that is interpretable for clients or the public.

  17. Unit 07: How to Evaluate Descriptive and Inferential Statistics

    Unit 7: How to Evaluate Descriptive and Inferential Statistics. Unit 7: Assignment #1 (due before 11:59 pm Central on Wednesday September 29): To review what descriptive and inferential statistics are, why they are important to learn, and examples of how they are used: Watch Lynda.com's (2010) video, " Understanding Descriptive and ...

  18. Descriptive vs Inferential Statistics: A Comprehensive Guide

    Descriptive statistics provide insights into the features of the observed data, while inferential statistics extend these findings to make predictions or draw conclusions about a broader population. Application: Descriptive and inferential statistics are widely applied across various fields, including science, business, economics, social ...

  19. Descriptive Statistics: Definition & Charts and Graphs

    For example, if you have ten items in your data set, type them into cells A1 through A10. Step 2: Click the "Data" tab and then click "Data Analysis" in the Analysis group. Step 3: Highlight "Descriptive Statistics" in the pop-up Data Analysis window. Step 4: Type an input range into the "Input Range" text box.

  20. Descriptive Statistics: Definitions, Types, Examples

    It involves organizing, visualizing, and summarizing raw data to create a coherent picture. The primary goal of descriptive statistics is to provide a clear and concise overview of the data's main features. This helps us identify patterns, trends, and characteristics within the data set without making broader inferences.

  21. 2.E: Descriptive Statistics (Exercises)

    This page titled 2.E: Descriptive Statistics (Exercises) is shared under a license and was authored, remixed, and/or curated by via that was edited to the style and standards of the LibreTexts platform. These are homework exercises to accompany the Textmap created for "Introductory Statistics" by Shafer and Zhang.

  22. Difference between Descriptive and Inferential statistics

    Descriptive Statistics | Inferential Statistics
    It gives information about raw data which describes the data in some manner. | It makes inferences about the population using data drawn from the population.
    It helps in organizing, analyzing, and to present data in a meaningful manner. | It allows us to compare data, and make hypotheses and predictions.
    It is used to describe a situation.

  23. A Comparative Analysis Of Descriptive And Inferential Statistics

    Statistics is a branch of mathematics. The subject focuses on collection, management, examination, interpretation and demonstration of the data. Statistical analysis consists of two types, descriptive and inferential statistics. The two concepts play a vital role during any statistical analysis. Briefly speaking, through descriptive statistics ...
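
The central-tendency example quoted in item 3 above (ages 23, 25, 26, 29, 30) can be checked with a few lines of Python. This is only a minimal sketch using the standard library's `statistics` module. The variable names `ages` and `summary` are illustrative assumptions, and the range is added here as a simple dispersion measure that the quoted snippet does not compute.

```python
from statistics import mean, median

# Ages taken from the worked example quoted in item 3 of the list above.
ages = [23, 25, 26, 29, 30]

# Descriptive statistics summarize only the values we actually measured.
summary = {
    "mean": mean(ages),               # sum of values divided by their count
    "median": median(ages),           # middle value of the ordered data
    "range": max(ages) - min(ages),   # simple measure of dispersion (assumed addition)
}

for name, value in summary.items():
    print(f"{name}: {value}")
```

The mean reproduces the 26.6 years stated in the snippet. Nothing here generalizes beyond these five values, which is what keeps it descriptive rather than inferential.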