If significant, clearly report which group mean is higher, along with the effect size.
Before conducting analysis, we need to ensure an adequate sample size to detect an effect. Sample size relates to the concept of power: detecting a small effect requires a larger sample, and larger samples can therefore detect smaller effects. Sample size is determined through a power analysis. It is never a simple percentage of the population, but a calculated number based on the planned statistical tests, significance level and effect size. 8 I recommend G*Power for basic power calculations, although many other options are available. In the exemplar study, the authors did not report a power analysis conducted before the study, but they gave a post-hoc power analysis of the actual power based on their sample size and the effect size detected. 6
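These calculations can also be sketched in code. The snippet below uses a normal-approximation formula for the per-group sample size of a two-sided independent t test; exact t-based software such as G*Power will return a value one or two higher, and the effect sizes shown are illustrative assumptions, not values from the exemplar study.

```python
import math
from scipy.stats import norm

def n_per_group(d, alpha=0.05, power=0.80):
    """Approximate per-group n for a two-sided, two-sample t test.

    Normal approximation: n = 2 * (z_{1-alpha/2} + z_{power})^2 / d^2.
    Exact t-based software (eg, G*Power) gives a slightly larger n.
    """
    z_alpha = norm.ppf(1 - alpha / 2)   # critical value for two-sided alpha
    z_beta = norm.ppf(power)            # quantile corresponding to desired power
    return math.ceil(2 * (z_alpha + z_beta) ** 2 / d ** 2)

# A small effect (Cohen's d = 0.2) needs far more participants than a large one (d = 0.8).
print(n_per_group(0.2))  # 393 per group
print(n_per_group(0.8))  # 25 per group
```

The comparison between the two calls makes the text's point concrete: shrinking the detectable effect from 0.8 to 0.2 multiplies the required sample roughly sixteenfold.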
Data often need cleaning and other preparation before analysis. Problems requiring cleaning include values outside an acceptable range and missing values. Any particular value could be wrong because of a data entry error or a data collection problem. Visually inspecting data can reveal anomalies: an age of 200 is clearly an error, as is a value of 9 on a 1–5 Likert-type scale. An easy way to start inspecting data is to sort each variable by ascending and then descending values to look for atypical values, and then to correct the problem by determining what the value should be. Missing values are a more complicated problem because the concern is why the value is missing. A few values missing at random are not necessarily a concern, but a pattern of missing values (eg, individuals from a specific ethnic group tending to skip a certain question) suggests systematic missingness that could point to a problem with the data collection instrument. Descriptive statistics are an additional way to check for errors and ensure data are ready for analysis. While not discussed in the communication assessment exemplar, the authors did prepare data for analysis and report missing values in their descriptive statistics.
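As a sketch of this inspection step (the variables and values are hypothetical), the pandas code below sorts a variable, flags out-of-range values and counts missingness per column:

```python
import pandas as pd
import numpy as np

# Hypothetical survey data: age in years and a 1-5 Likert-type item.
df = pd.DataFrame({
    "age": [34, 51, 200, 28, np.nan, 45],      # 200 is an obvious entry error
    "satisfaction": [4, 9, 3, np.nan, 5, 2],   # 9 is outside the 1-5 scale
})

# Sorting each variable makes atypical values easy to spot at either end.
print(df["age"].sort_values().tolist())

# Flag values outside the acceptable range rather than silently dropping them.
bad_age = df[(df["age"] < 0) | (df["age"] > 120)]
bad_likert = df[(df["satisfaction"] < 1) | (df["satisfaction"] > 5)]
print(len(bad_age), len(bad_likert))

# Count missing values per column to look for patterns of missingness.
print(df.isna().sum())
```

Flagging rather than deleting suspect rows preserves the chance to determine what the value should have been, as the text recommends.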
Before running inferential statistics, it is critical to first describe the data. Obtaining descriptive statistics is a way to check whether data are ready for further analysis. Descriptive statistics give a general sense of trends, and reviewing frequencies, minimums and maximums can reveal values outside the accepted range. Descriptive statistics are also an important step in checking whether we meet assumptions for statistical tests. In a quantitative study, descriptive statistics also inform the first table of the results that reports information about the sample, as seen in table 2 of the exemplar study. 6
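As a minimal illustration (the values are invented), Python's standard library can produce the basic descriptive statistics that both populate a sample-characteristics table and double as a range check:

```python
import statistics

ages = [34, 51, 28, 45, 39, 62]  # hypothetical adult sample

summary = {
    "n": len(ages),
    "mean": statistics.mean(ages),
    "median": statistics.median(ages),
    "sd": statistics.stdev(ages),
    "min": min(ages),
    "max": max(ages),
}
print(summary)

# The minimum and maximum double as a range check: an adult sample
# should not contain ages below 18 or above, say, 120.
assert 18 <= summary["min"] and summary["max"] <= 120
```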
All statistical tests rely on foundational assumptions. Although some tests are more robust to violations, checking assumptions indicates whether the test is likely to be valid for a particular data set. Foundational parametric statistics (eg, t tests, ANOVA, correlation, regression) assume independent observations and normally distributed data. In the exemplar study, the authors noted ‘Data from both groups met normality assumptions, based on the Shapiro–Wilk test’ (p508), and gave the statistics in addition to noting specific assumptions for the independent t tests around equality of variances. 6
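A sketch of these two checks with scipy (the data are invented): the Shapiro–Wilk test for normality, and Levene's test for the equality-of-variances assumption of the independent t test.

```python
from scipy.stats import shapiro, levene

# Hypothetical scores for two groups.
group_a = [12.1, 14.3, 13.8, 15.2, 12.9, 14.7, 13.1, 15.0, 13.6, 14.2]
group_b = [16.4, 17.9, 15.8, 18.1, 16.9, 17.2, 16.1, 17.6, 16.7, 18.0]

# Shapiro-Wilk: a non-significant p (> 0.05) is consistent with normality.
w_stat, w_p = shapiro(group_a)
print(f"Shapiro-Wilk: W = {w_stat:.3f}, p = {w_p:.3f}")

# Levene's test: a non-significant p supports equality of variances,
# which determines whether a standard or a Welch t test is appropriate.
l_stat, l_p = levene(group_a, group_b)
print(f"Levene: W = {l_stat:.3f}, p = {l_p:.3f}")
```

Note that these tests screen for violations; they do not prove the assumptions hold, especially in small samples where the tests have little power.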
Conducting the analysis involves running whatever tests were planned. Statistics may be calculated manually or using software like SPSS, Stata, SAS or R. Statistical software provides an output with key test statistics, p values that indicate whether a result is likely systematic or random, and indicators of fit. In the exemplar study, the authors noted they used SPSS V.22. 6
The first step involves examining whether the statistical model was significant or a good fit. For t tests, ANOVAs, correlation and regression, first examine an overall test of significance. For a t test, if the t statistic is not statistically significant (eg, p>0.05 or a CI crossing 0), we can conclude no significant difference between groups. The communication assessment exemplar reports significance of the t tests along with measures such as equality of variance.
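This decision rule can be sketched with scipy (the data are invented): run the t test, then read the p value to decide whether a group difference is supported.

```python
from scipy.stats import ttest_ind

# Hypothetical scores for intervention and control groups.
intervention = [1.0, 2.0, 3.0, 4.0, 5.0]
control = [11.0, 12.0, 13.0, 14.0, 15.0]

# equal_var=False requests Welch's t test, a safer default when
# equality of variances has not been established.
t_stat, p_value = ttest_ind(intervention, control, equal_var=False)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")

if p_value > 0.05:
    print("No significant difference between groups.")
else:
    print("Significant difference; report which mean is higher and the effect size.")
```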
For an ANOVA, if the F statistic is not statistically significant (eg, p>0.05 or a CI crossing 0), we can conclude no significant difference between groups and stop because there is no point in further examining what groups may be different. If the F statistic is significant in an ANOVA, we can then use contrasts or post-hoc tests to examine what is different. For a correlation test, if the r value is not statistically significant (eg, p>0.05 or a CI crossing 0), we can stop because there is no point in looking at the magnitude or direction of the coefficient. If it is significant, we can proceed to interpret the r. Finally, for a regression, we can examine the F statistic as an omnibus test and its significance. If it is not significant, we can stop. If it is significant, then examine the p value of each independent variable and residuals.
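The ANOVA branch of this logic can be sketched with scipy (the three groups are invented). Pairwise t tests with a Bonferroni correction stand in here for the contrasts or post-hoc tests a full statistics package would provide:

```python
from itertools import combinations
from scipy.stats import f_oneway, ttest_ind

groups = {
    "a": [1.0, 2.0, 3.0, 4.0, 5.0],
    "b": [2.0, 3.0, 4.0, 5.0, 6.0],
    "c": [8.0, 9.0, 10.0, 11.0, 12.0],
}

# Omnibus F test: only proceed to pairwise comparisons if it is significant.
f_stat, f_p = f_oneway(*groups.values())
print(f"ANOVA: F = {f_stat:.2f}, p = {f_p:.4f}")

if f_p < 0.05:
    # Bonferroni correction: divide alpha by the number of comparisons.
    pairs = list(combinations(groups, 2))
    alpha = 0.05 / len(pairs)
    for name1, name2 in pairs:
        _, p = ttest_ind(groups[name1], groups[name2])
        verdict = "differ" if p < alpha else "do not differ"
        print(f"{name1} vs {name2}: p = {p:.4f} -> {verdict}")
```

With these data the omnibus test is significant, and the pairwise step shows that group c drives the difference while a and b do not differ, mirroring the stop-or-continue logic described above.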
When writing statistical results, always start with descriptive statistics and note whether assumptions for tests were met. When reporting inferential statistical tests, give the statistic itself (eg, an F statistic), the measure of significance (p value or CI), the effect size and a brief written interpretation of the statistical test. The interpretation could note, for instance, that an intervention was not significantly different from the control or that it was associated with a statistically significant improvement. For example, the exemplar study gives the pre–post means along with standard errors, the t statistic, the p value and an interpretation that postseminar means were lower, along with a reminder to the reader that lower is better. 6
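As an illustration of assembling such a report line (the data are invented), the snippet below computes Cohen's d from the pooled standard deviation and prints the elements the text says to report together:

```python
import statistics
from scipy.stats import ttest_ind

pre = [1.0, 2.0, 3.0, 4.0, 5.0]
post = [3.0, 4.0, 5.0, 6.0, 7.0]

t_stat, p_value = ttest_ind(pre, post)

# Cohen's d using the pooled standard deviation (groups are equal-sized here).
n1, n2 = len(pre), len(post)
s1, s2 = statistics.stdev(pre), statistics.stdev(post)
pooled_sd = (((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)) ** 0.5
d = (statistics.mean(post) - statistics.mean(pre)) / pooled_sd

# Statistic, significance, effect size: the written interpretation is added by the author.
print(
    f"M_pre = {statistics.mean(pre):.2f}, M_post = {statistics.mean(post):.2f}, "
    f"t = {t_stat:.2f}, p = {p_value:.3f}, d = {d:.2f}"
)
```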
When writing for a journal, follow the journal’s style. Many styles italicise non-Greek statistics (eg, the p value), but follow the particular instructions given. Remember a p value can never be 0 even though some statistical programs round the p to 0. In that case, most styles prefer to report as p<0.001.
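A small helper (hypothetical, not part of any journal's tooling) captures this convention: never print a rounded p of 0, report p<0.001 instead.

```python
def format_p(p, floor=0.001):
    """Format a p value for reporting; values below the floor become 'p<floor'."""
    if p < floor:
        return f"p<{floor}"
    return f"p={p:.3f}"

print(format_p(0.0000004))  # p<0.001
print(format_p(0.032))      # p=0.032
```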
Shadish et al 9 provide nine threats to statistical conclusion validity in drawing inferences about the relationship between two variables; the threats can broadly apply to many statistical analyses. Although it helps to consider and anticipate these threats when designing a research study, some only arise after data collection and analysis. Threats to statistical conclusion validity appear in table 4 . 9 Pertinent threats can be dealt with to the extent possible (eg, if assumptions were not met, select another test) and should be discussed as limitations in the research report. For example, in the exemplar study, the authors noted the sample size as a limitation but reported that a post-hoc power analysis found adequate power. 6
Threats to statistical conclusion validity
| Threat | Description |
| --- | --- |
| Low statistical power (see step 3) | The sample size is not adequate to detect an effect. |
| Violated assumptions of statistical tests (see step 6) | The data violate assumptions needed for the test, such as normality. |
| Fishing and error rates | Repeated tests of the same data (eg, multiple comparisons) increase the chance of erroneous conclusions. |
| Unreliability of measures | Measurement or instrument error can artificially inflate or deflate apparent relationships among variables. |
| Restricted range | Statistics can be biased by limited outcome values (eg, high/low only) or by floor or ceiling effects, in which participants' scores cluster around high or low values. |
| Unreliability of treatment implementation | In experiments, unstandardised or inconsistent implementation affects conclusions about the treatment's effect. |
| Extraneous variance in an experiment | The setting of a study can introduce error. |
| Heterogeneity of units | As participants differ within conditions, standard deviations increase, introducing error and making effects harder to detect. |
| Inaccurate effect size estimation | Outliers or incorrect effect size calculations (eg, using a continuous measure for a dichotomous dependent variable) can skew measures of effect. |
Key resources to learn more about statistics include Field 4 and Salkind 10 for foundational information. For advanced statistics, Hair et al 11 and Tabachnick and Fidell 12 provide detailed information on multivariate statistics. Finally, the University of California Los Angeles Institute for Digital Research and Education (stats.idre.ucla.edu/other/annotatedoutput/) provides annotated output from SPSS, SAS, Stata and Mplus for many statistical tests to help researchers read the output and understand what it means.
Researchers in family medicine and community health often conduct statistical analyses to address research questions. Following specific steps ensures a systematic and rigorous analysis. Knowledge of these essential statistical procedures will equip family medicine and community health researchers to interpret and review the literature and to conduct appropriate statistical analyses of their quantitative data.
Nevertheless, I gently remind you that the steps are interrelated, and statistics is not only a consideration at the end of data collection. When designing a quantitative study, investigators should remember that statistics is based on distributions, meaning statistics works with aggregated numerical data and relies on variance within that data to test statistical hypotheses about group differences, relationships or trends. Statistics provides a broad view, based on these distributions, which brings implications at the early design phase. In designing a quantitative study, the nature of statistics generally suggests a larger number of participants in the research (ie, a larger n) to have adequate power to detect statistical significance and draw valid conclusions. Therefore, it will likely be helpful for researchers to include a biostatistician as early as possible in the research team when designing a study.
Contributors: The sole author, TCG, is responsible for the conceptualisation, writing and preparation of this manuscript.
Funding: This study was funded by the National Institutes of Health (10.13039/100000002) and grant number 1K01LM012739.
Competing interests: None declared.
Patient consent for publication: Not required.
Provenance and peer review: Not commissioned; internally peer reviewed.