
Full Factorial Design: Understanding the Impact of Independent Variables on Outputs

Updated: December 17, 2023 by Ken Feldman


DOE, or Design of Experiments, is a method of designed experimentation where you manipulate the controllable factors (independent variables or inputs) in your process at different levels to see their effect on some response variable (dependent variable or output).

This article will explore the different approaches to DOE with a specific focus on the full factorial design. We will discuss the benefits of using the full factorial design and offer some best practices for a successful experiment.

Overview: What is a full factorial DOE? 

As stated above, a full factorial DOE design is one of several approaches to designing and carrying out an experiment to determine the effect that various levels of your inputs will have on your outputs. The purpose of the DOE is to determine at what levels of the inputs you will optimize your outputs. For example, if your output is the thickness of a coating to be applied to a metal sheet, and your primary process variables are speed, temperature, and viscosity of the coating, then what combination of speed, temperature, and viscosity should you use to get an optimal and consistent thickness on the metal sheet?

With three variables, speed, temperature, and viscosity, how many different unique combinations would we have to try to fully examine all the possibilities? Which combination of speed, temperature, and viscosity will give us the best coating thickness? Experimentation using all possible factor combinations is called a full factorial design, and the minimum number of experiments you would have to do is called the number of runs.

We can calculate the total number of unique factor combinations with the simple formula # Runs = 2^k, where k is the number of variables and 2 is the number of levels, such as high/low or 400 rpm/800 rpm. In our coating example, we would call this design a 2-level, 3-factor full factorial DOE. The number of runs would then be calculated as 2^3, or 2 x 2 x 2, which equals 8 total runs.
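To make that counting concrete, here is a small R sketch that enumerates the eight runs. Only the 400/800 rpm speeds come from the example above; the temperature and viscosity settings are placeholders, not values from the article.

```r
# Illustrative sketch: enumerate the 2^3 = 8 runs of a 2-level, 3-factor
# full factorial. Only the speeds are taken from the coating example;
# the other levels are placeholders.
runs <- expand.grid(speed_rpm   = c(400, 800),
                    temperature = c("low", "high"),
                    viscosity   = c("low", "high"))
nrow(runs)  # 8 unique factor combinations
runs
```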

There are other designs that you can use such as a fractional factorial , which uses only a fraction of the total runs. That fraction can be one-half, one-quarter, one-eighth, and so forth depending on the number of factors or variables.

3 benefits of doing a full factorial DOE 

Doing a full factorial as opposed to a fractional factorial or other screening design has a number of benefits.  

1. You can determine main effects

Main effects describe the impact of each individual factor on the output or response variable. In our example, one of the main effects would be the impact or change in the coating thickness that would be attributable to speed alone if we changed from a run speed of 400 rpm to a speed of 800 rpm.

2. You can determine the effects of interactions on the response variable

An interaction occurs when the effect of one factor on the response depends on the setting of another factor. For our example, if we ran at a speed of 800 rpm, what temperature should we run at to optimize our coating thickness? On the other hand, what temperature should we run at if the speed is 400 rpm?

3. The optimal settings for the independent variables can be estimated

The final full factorial analysis will tell us what setting or levels of our speed, temperature, and viscosity we should use to optimize our coating thickness. 

Our example answer might look like this: Run the machine at 400 rpm, a temperature of 350 degrees using a coating viscosity of 6,000 cps.

Why is a full factorial DOE important to understand? 

Using intuition to set the optimal settings of your process variables or factors is insufficient if the goal is to understand the impact of the factors on your output. Using trial and error may miss the important combinations, or optimal combination, and you might end up with a less-than-optimized process or product.

You need to know the full effect of your variables on the process

Your process variables have different impacts on your output. You need the whole picture.

The world is not just made up of main effects

As explained earlier, main effects are the individual impacts of each factor on the output. But the world is more complex than that, and most outcomes are a function of interactions, not just main effects.  

State your conclusions with statistical certainty 

You can determine main effects, interactions, and other outcomes of a full factorial DOE using statistics, so decisions are based on statistical significance rather than hunches or “seat of the pants” conclusions. 

An industry example of a full factorial DOE 

A beverage manufacturer wanted to reduce the amount of overfilled bottles on its manufacturing line. Company leadership felt that the major factors in the process were the run speed of the machine, size of bottle, type of product, and degree of carbonation. A 2-level, 4-factor full factorial experiment was selected. This would require 16 runs. 

The company’s Master Black Belt designed the DOE and ran it on a preselected machine. The experiment was restricted to a single machine to block out any impact that might be attributable to machine differences.

Each run consisted of 100 bottles, and they took the average of the fill level of those 100 to use in the calculations. Running a single bottle would have been impractical. They determined that, after doing the calculations and analysis, all four factors were statistically significant and that there was an interaction between speed and carbonation. The optimal settings suggested by the experiment were used in a confirmatory run to see if the changes actually improved fill level.

Fill level did improve, and the company replicated the new settings on the rest of the machines.

3 best practices when thinking about a full factorial DOE

Running a DOE needs planning and discipline. The results of your experiment can become contaminated and untrustworthy if you don’t take care. 

1. Clearly define your factors and desired outcomes

Know what factors are likely to be the most important and how you are going to measure them. 

2. Minimize the “noise”

Since you're only interested in the impact of your chosen factors on the response, remove any other factors that might contaminate your experiment. Control and minimize everything else around you so that it doesn't inadvertently affect your results.

3. Use screening experiments if appropriate

If you have a large number of possible factors, you will be doing a large number of runs that can get costly and time-consuming. To help screen out the factors that are not really important, use appropriate screening experiments such as a fractional factorial.

Frequently Asked Questions (FAQ) about full factorial DOE

How do you calculate the number of runs or experiments to do for a full factorial DOE?

Use the simple formula # Runs = X^k, where X is the number of levels or settings and k is the number of variables or factors.

What are the main effects of a DOE?

They are the specific impact of a single factor on the response variable. They are determined by calculating the difference in response when a factor is run at different levels or settings.

Are Interactions in a Full Factorial DOE important? 

Yes, they are very important. If interactions exist between the factors or variables in your process, you’ll want to understand them so you can optimize your settings based on the crossed impact of your factors.

In summary: Full factorial DOE

Designed experiments in general, and a full factorial DOE in particular, are powerful statistical tools for understanding your process and optimizing your output. You must take care to do the DOE with planning and discipline so that the results are meaningful.

In a full factorial DOE, you will identify the appropriate output that you want to improve and the factors or variables that you believe impact that output. Once you’ve identified the factors, determine the levels or settings you’d like to explore and the number of unique combinations of the factors and levels.

After running your experiment, you’ll usually use a statistical software package to analyze your results. From there, you will be able to statistically determine the main effects of your factors, the interactions between the factors, and the optimal levels or settings.

About the Author


Ken Feldman


What is a Factorial Design of an Experiment?

The factorial design of experiment is described with examples in Video 1.

Video 1. Introduction to Factorial Design of Experiment DOE and the Main Effect Calculation Explained Example .

In a Factorial Design of Experiment, all possible combinations of the levels of one factor are studied against all possible levels of the other factors. For this reason, the factorial design of experiments is also called a crossed factor design of experiments. When the treatment combinations are assigned to experimental units completely at random, the design is a completely randomized design (CRD); the proper name for such an experiment is therefore a completely randomized factorial design of experiments.

In an easy to understand study of human comfort, two levels of the temperature factor (or independent variable), 0 °F and 75 °F, and two levels of the humidity factor, 0% and 35%, were studied in all possible combinations (Figure 1). The four (2 × 2) possible treatment combinations and their associated responses from human subjects (experimental units) are provided in Table 1.

Table 1. Data Structure/Layout of a Factorial Design of Experiment


Coding Systems for the Factor Levels in the Factorial Design of Experiment

As the factorial design is primarily used for screening variables, only two levels are enough. Often, coding the levels as (1) low/high, (2) -/+, (3) -1/+1, or (4) 0/1 is more convenient and meaningful than the actual levels of the factors, especially for the design and analysis of factorial experiments. These coding systems are particularly useful in developing the methods for factorial and fractional factorial designs of experiments. Moreover, general formulas and methods can only be developed using a coding system. Coding systems are also useful in response surface methodology. Often, coded levels produce smooth, meaningful, and easy to understand contour plots and response surfaces. Moreover, especially in complex designs, coded levels such as the low and high level of a factor are easier to understand.
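As a brief illustration of this coding, the sketch below maps the actual temperature and humidity levels of the comfort study onto the -1/+1 system; the column names x1 and x2 are arbitrary labels introduced here, not part of the original example.

```r
# Sketch: code the comfort-study levels (0/75 degrees F, 0%/35% humidity)
# as -1/+1 for use in design and analysis.
comfort <- expand.grid(temperature = c(0, 75), humidity = c(0, 35))
comfort$x1 <- ifelse(comfort$temperature == 75, 1, -1)  # coded temperature
comfort$x2 <- ifelse(comfort$humidity == 35, 1, -1)     # coded humidity
comfort  # four treatment combinations, actual and coded levels
```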

How to graphically represent the design?

An example graphical representation of a factorial design of experiment is provided in Figure 1 .


Figure 1. Factorial Design of Experiments with two levels for each factor (independent variable, x). The response (dependent variable, y) is shown using the solid black circle with the associated response values.


9.1 Setting Up a Factorial Experiment

Learning Objectives

  • Explain why researchers often include multiple independent variables in their studies.
  • Define factorial design, and use a factorial design table to represent and interpret simple factorial designs.

Just as it is common for studies in psychology to include multiple levels of a single independent variable (placebo, new drug, old drug), it is also common for them to include multiple independent variables. Schnall and her colleagues studied the effect of both disgust and private body consciousness in the same study. Researchers’ inclusion of multiple independent variables in one experiment is further illustrated by the following actual titles from various professional journals:

  • The Effects of Temporal Delay and Orientation on Haptic Object Recognition
  • Opening Closed Minds: The Combined Effects of Intergroup Contact and Need for Closure on Prejudice
  • Effects of Expectancies and Coping on Pain-Induced Intentions to Smoke
  • The Effect of Age and Divided Attention on Spontaneous Recognition
  • The Effects of Reduced Food Size and Package Size on the Consumption Behavior of Restrained and Unrestrained Eaters

Just as including multiple levels of a single independent variable allows one to answer more sophisticated research questions, so too does including multiple independent variables in the same experiment. For example, instead of conducting one study on the effect of disgust on moral judgment and another on the effect of private body consciousness on moral judgment, Schnall and colleagues were able to conduct one study that addressed both questions. But including multiple independent variables also allows the researcher to answer questions about whether the effect of one independent variable depends on the level of another. This is referred to as an interaction between the independent variables. Schnall and her colleagues, for example, observed an interaction between disgust and private body consciousness because the effect of disgust depended on whether participants were high or low in private body consciousness. As we will see, interactions are often among the most interesting results in psychological research.

Factorial Designs

By far the most common approach to including multiple independent variables (which are often called factors) in an experiment is the factorial design. In a  factorial design , each level of one independent variable is combined with each level of the others to produce all possible combinations. Each combination, then, becomes a condition in the experiment. Imagine, for example, an experiment on the effect of cell phone use (yes vs. no) and time of day (day vs. night) on driving ability. This is shown in the  factorial design table  in Figure 9.1. The columns of the table represent cell phone use, and the rows represent time of day. The four cells of the table represent the four possible combinations or conditions: using a cell phone during the day, not using a cell phone during the day, using a cell phone at night, and not using a cell phone at night. This particular design is referred to as a 2 × 2 (read “two-by-two”) factorial design because it combines two variables, each of which has two levels.

If one of the independent variables had a third level (e.g., using a handheld cell phone, using a hands-free cell phone, and not using a cell phone), then it would be a 3 × 2 factorial design, and there would be six distinct conditions. Notice that the number of possible conditions is the product of the numbers of levels. A 2 × 2 factorial design has four conditions, a 3 × 2 factorial design has six conditions, a 4 × 5 factorial design would have 20 conditions, and so on. Also notice that each number in the notation represents one factor, one independent variable. So by looking at how many numbers are in the notation, you can determine how many independent variables there are in the experiment. 2 × 2, 3 × 3, and 2 × 3 designs all have two numbers in the notation and therefore all have two independent variables. The numerical value of each of the numbers represents the number of levels of each independent variable. A 2 means that the independent variable has two levels, a 3 means that the independent variable has three levels, a 4 means it has four levels, etc. To illustrate, a 3 × 3 design has two independent variables, each with three levels, while a 2 × 2 × 2 design has three independent variables, each with two levels.


Figure 9.1 Factorial Design Table Representing a 2 × 2 Factorial Design

In principle, factorial designs can include any number of independent variables with any number of levels. For example, an experiment could include the type of psychotherapy (cognitive vs. behavioral), the length of the psychotherapy (2 weeks vs. 2 months), and the sex of the psychotherapist (female vs. male). This would be a 2 × 2 × 2 factorial design and would have eight conditions. Figure 9.2 shows one way to represent this design. In practice, it is unusual for there to be more than three independent variables with more than two or three levels each. This is for at least two reasons: For one, the number of conditions can quickly become unmanageable. For example, adding a fourth independent variable with three levels (e.g., therapist experience: low vs. medium vs. high) to the current example would make it a 2 × 2 × 2 × 3 factorial design with 24 distinct conditions. Second, the number of participants required to populate all of these conditions (while maintaining a reasonable ability to detect a real underlying effect) can render the design unfeasible (for more information, see the discussion about the importance of adequate statistical power in Chapter 13). As a result, in the remainder of this section, we will focus on designs with two independent variables. The general principles discussed here extend in a straightforward way to more complex factorial designs.


Figure 9.2 Factorial Design Table Representing a 2 × 2 × 2 Factorial Design

Assigning Participants to Conditions

Recall that in a simple between-subjects design, each participant is tested in only one condition. In a simple within-subjects design, each participant is tested in all conditions. In a factorial experiment, the decision to take the between-subjects or within-subjects approach must be made separately for each independent variable. In a  between-subjects factorial design , all of the independent variables are manipulated between subjects. For example, all participants could be tested either while using a cell phone  or  while not using a cell phone and either during the day  or  during the night. This would mean that each participant would be tested in one and only one condition. In a within-subjects factorial design, all of the independent variables are manipulated within subjects. All participants could be tested both while using a cell phone and  while not using a cell phone and both during the day  and  during the night. This would mean that each participant would need to be tested in all four conditions. The advantages and disadvantages of these two approaches are the same as those discussed in Chapter 5. The between-subjects design is conceptually simpler, avoids order/carryover effects, and minimizes the time and effort of each participant. The within-subjects design is more efficient for the researcher and controls extraneous participant variables.

Since factorial designs have more than one independent variable, it is also possible to manipulate one independent variable between subjects and another within subjects. This is called a  mixed factorial design . For example, a researcher might choose to treat cell phone use as a within-subjects factor by testing the same participants both while using a cell phone and while not using a cell phone (while counterbalancing the order of these two conditions). But he or she might choose to treat time of day as a between-subjects factor by testing each participant either during the day or during the night (perhaps because this only requires them to come in for testing once). Thus each participant in this mixed design would be tested in two of the four conditions.

Regardless of whether the design is between subjects, within subjects, or mixed, the actual assignment of participants to conditions or orders of conditions is typically done randomly.

Non-Manipulated Independent Variables

In many factorial designs, one of the independent variables is a non-manipulated independent variable . The researcher measures it but does not manipulate it. The study by Schnall and colleagues is a good example. One independent variable was disgust, which the researchers manipulated by testing participants in a clean room or a messy room. The other was private body consciousness, a participant variable which the researchers simply measured. Another example is a study by Halle Brown and colleagues in which participants were exposed to several words that they were later asked to recall (Brown, Kosslyn, Delamater, Fama, & Barsky, 1999) [1] . The manipulated independent variable was the type of word. Some were negative health-related words (e.g.,  tumor, coronary ), and others were not health related (e.g.,  election, geometry ). The non-manipulated independent variable was whether participants were high or low in hypochondriasis (excessive concern with ordinary bodily symptoms). The result of this study was that the participants high in hypochondriasis were better than those low in hypochondriasis at recalling the health-related words, but they were no better at recalling the non-health-related words.

Such studies are extremely common, and there are several points worth making about them. First, non-manipulated independent variables are usually participant variables (private body consciousness, hypochondriasis, self-esteem, gender, and so on), and as such, they are by definition between-subjects factors. For example, people are either low in hypochondriasis or high in hypochondriasis; they cannot be tested in both of these conditions. Second, such studies are generally considered to be experiments as long as at least one independent variable is manipulated, regardless of how many non-manipulated independent variables are included. Third, it is important to remember that causal conclusions can only be drawn about the manipulated independent variable. For example, Schnall and her colleagues were justified in concluding that disgust affected the harshness of their participants’ moral judgments because they manipulated that variable and randomly assigned participants to the clean or messy room. But they would not have been justified in concluding that participants’ private body consciousness affected the harshness of their participants’ moral judgments because they did not manipulate that variable. It could be, for example, that having a strict moral code and a heightened awareness of one’s body are both caused by some third variable (e.g., neuroticism). Thus it is important to be aware of which variables in a study are manipulated and which are not.

Non-Experimental Studies With Factorial Designs

Thus far we have seen that factorial experiments can include manipulated independent variables or a combination of manipulated and non-manipulated independent variables. But factorial designs can also include  only non-manipulated independent variables, in which case they are no longer experiments but are instead non-experimental (cross-sectional) in nature. Consider a hypothetical study in which a researcher simply measures both the moods and the self-esteem of several participants—categorizing them as having either a positive or negative mood and as being either high or low in self-esteem—along with their willingness to have unprotected sexual intercourse. This can be conceptualized as a 2 × 2 factorial design with mood (positive vs. negative) and self-esteem (high vs. low) as non-manipulated between-subjects factors. Willingness to have unprotected sex is the dependent variable.

Again, because neither independent variable in this example was manipulated, it is a cross-sectional study rather than an experiment. (The similar study by MacDonald and Martineau [2002] [2]  was an experiment because they manipulated their participants’ moods.) This is important because, as always, one must be cautious about inferring causality from non-experimental studies because of the directionality and third-variable problems. For example, an effect of participants’ moods on their willingness to have unprotected sex might be caused by any other variable that happens to be correlated with their moods.

Key Takeaways

  • Researchers often include multiple independent variables in their experiments. The most common approach is the factorial design, in which each level of one independent variable is combined with each level of the others to create all possible conditions.
  • Each independent variable can be manipulated between-subjects or within-subjects.
  • Non-manipulated independent variables (e.g., gender) can be included in factorial designs; however, they limit the causal conclusions that can be made about the effects of the non-manipulated variable on the dependent variable.
  • Practice: Return to the five article titles presented at the beginning of this section. For each one, identify the independent variables and the dependent variable.
  • Practice: Create a factorial design table for an experiment on the effects of room temperature and noise level on performance on the MCAT. Be sure to indicate whether each independent variable will be manipulated between-subjects or within-subjects and explain why.
  • Brown, H. D., Kosslyn, S. M., Delamater, B., Fama, A., & Barsky, A. J. (1999). Perceptual and memory biases for health-related information in hypochondriacal individuals. Journal of Psychosomatic Research, 47 , 67–78. ↵
  • MacDonald, T. K., & Martineau, A. M. (2002). Self-esteem, mood, and intentions to use condoms: When does low self-esteem lead to risky health behaviors? Journal of Experimental Social Psychology, 38 , 299–306. ↵


MATH3014-6027 Design (and Analysis) of Experiments

Chapter 4 Factorial experiments

In Chapters 2 and 3, we assumed the objective of the experiment was to investigate \(t\) unstructured treatments, defined only as a collection of distinct entities (drugs, advertisements, recipes, etc.). That is, there was not necessarily any explicit relationship between the treatments (although we could clearly choose which particular comparisons between treatments were of interest via choice of contrast).

In many experiments, particularly in industry, engineering and the physical sciences, the treatments are actually defined via the choice of a level relating to each of a set of factors . We will focus on the commonly occurring case of factors at two levels . For example, consider the below experiment from the pharmaceutical industry.

Example 4.1 Desilylation experiment ( Owen et al. , 2001 )

In this experiment, performed at GlaxoSmithKline, the aim was to optimise the desilylation of an ether into an alcohol, which was a key step in the synthesis of a particular antibiotic. There were \(t=16\) treatments, defined via the settings of four different factors, as given in Table 4.1.

Table 4.1: Desilylation experiment: 16 treatments defined by settings of four factors, with response (yield).
Temp (degrees C) Time (hours) Solvent (vol.) Reagent (equiv.) Yield (%)
Trt 1 10 19 5 1 82.93
Trt 2 20 19 5 1 94.04
Trt 3 10 25 5 1 88.07
Trt 4 20 25 5 1 93.97
Trt 5 10 19 7 1 77.21
Trt 6 20 19 7 1 92.99
Trt 7 10 25 7 1 83.60
Trt 8 20 25 7 1 94.38
Trt 9 10 19 5 1.33 88.68
Trt 10 20 19 5 1.33 94.30
Trt 11 10 25 5 1.33 93.00
Trt 12 20 25 5 1.33 93.42
Trt 13 10 19 7 1.33 84.86
Trt 14 20 19 7 1.33 94.26
Trt 15 10 25 7 1.33 88.71
Trt 16 20 25 7 1.33 94.66

Each treatment is defined by the choice of one of two levels for each of the four factors. In R, the function FrF2 (from the package of the same name) can be used to generate all \(t = 2^4 = 16\) combinations of the two levels of these four factors; a sketch is given below. We come back to this function later in the chapter.
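The following is a sketch of such a call, not necessarily the notes' original code. The factor names and low/high settings are taken from Table 4.1, and randomize = FALSE is used only so that the rows appear in the standard order of the table.

```r
# Sketch: generate the 2^4 full factorial for the desilylation experiment.
library(FrF2)
plan <- FrF2(nruns = 16, nfactors = 4,
             factor.names = list(temp    = c(10, 20),     # degrees C
                                 time    = c(19, 25),     # hours
                                 solvent = c(5, 7),       # vol.
                                 reagent = c(1, 1.33)),   # equiv.
             randomize = FALSE)  # keep the standard order of Table 4.1
plan
```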

This factorial treatment structure lends itself to certain treatment contrasts being of natural interest.

4.1 Factorial contrasts

Throughout this chapter, we will assume there are no blocks or other restrictions on randomisation, and so we will assume a completely randomised design can be used. We start by assuming the same unit-treatment model as Chapter 2 :

\[\begin{equation} y_{ij} = \mu + \tau_i + \varepsilon_{ij}\,, \quad i = 1, \ldots, t; j = 1, \ldots, n_i\,, \tag{4.1} \end{equation}\]

where \(y_{ij}\) is the response from the \(j\) th application of treatment \(i\) , \(\mu\) is a constant parameter, \(\tau_i\) is the effect of the \(i\) th treatment, and \(\varepsilon_{ij}\) is the random individual effect from each experimental unit with \(\varepsilon_{ij} \sim N(0, \sigma^2)\) independent of other errors.

Now, the number of treatments \(t = 2^f\) , where \(f\) equals the number of factors in the experiment.

For Example 4.1, we have \(t = 2^4 = 16\) and \(n_i = 1\) for all \(i=1,\ldots, 16\); that is, each of the 16 treatments is applied exactly once. In general, we shall assume common treatment replication \(n_i = r \ge 1\).

If we fit model (4.1) and compute the ANOVA table, we notice a particular issue with this design.
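A minimal sketch of this fit is given below, with the responses of Table 4.1 entered directly and the treatment labels 1 to 16 used as a factor; the object names are illustrative rather than taken from the notes.

```r
# Sketch: fit the unit-treatment model (4.1) to the unreplicated design.
desilylation <- data.frame(
  trt   = factor(1:16),
  yield = c(82.93, 94.04, 88.07, 93.97, 77.21, 92.99, 83.60, 94.38,
            88.68, 94.30, 93.00, 93.42, 84.86, 94.26, 88.71, 94.66))
fit.trt <- aov(yield ~ trt, data = desilylation)
anova(fit.trt)  # the Residuals row has 0 degrees of freedom: the model is saturated
```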

All available degrees of freedom are being used to estimate parameters in the mean ( \(\mu\) and the treatment effects \(\tau_i\) ). There are no degrees of freedom left to estimate \(\sigma^2\) . This is due to a lack of treatment replication. Without replication in the design, model (4.1) is saturated , with as many treatments as there are observations and an unbiased estimate of \(\sigma^2\) cannot be obtained. We will return to this issue later.

4.1.1 Main effects

Studying Table 4.1 , there are some comparisons between treatments which are obviously of interest. For example, comparing the average effect from the first 8 treatments with the average effect of the second 8, using

\[ \boldsymbol{c}^{\mathrm{T}}\boldsymbol{\tau} = \sum_{i=1}^tc_i\tau_i\,, \] with

\[ \boldsymbol{c}^{\mathrm{T}} = (-\boldsymbol{1}_{2^{f-1}}^{\mathrm{T}}, \boldsymbol{1}_{2^{f-1}}^{\mathrm{T}}) / 2^{f-1} = (-\boldsymbol{1}_8^{\mathrm{T}}, \boldsymbol{1}_8^{\mathrm{T}}) / 8\,. \]

This contrast compares the average treatment effect from the 8 treatments which have reagent set to its low level (1 equiv.) to the average effect from the 8 treatments which have reagent set to its high level. This is a "fair" comparison, as both of these sets of treatments have each of the combinations of the factors temp, time and solvent occurring equally often (twice here). Hence, the main effect of reagent is averaged over the levels of the other three factors.

As in Chapter 2 , we can estimate this treatment contrast by applying the same contrast coefficients to the treatment means,

\[ \widehat{\boldsymbol{c}^{\mathrm{T}}\boldsymbol{\tau}} = \sum_{i=1}^tc_i\bar{y}_{i.}\,, \] where, for this experiment, each \(\bar{y}_{i.}\) is the mean of a single observation (as there is no treatment replication). We see that inference about this contrast is not possible, as no standard error can be obtained.

Definition 4.1 The main effect of a factor \(A\) is defined as the difference in the average response from the high and low levels of the factor

\[ \mbox{ME}(A) = \bar{y}(A+) - \bar{y}(A-)\,, \] where \(\bar{y}(A+)\) is the average response when factor \(A\) is set to its high level, averaged across all combinations of levels of the other factors (with \(\bar{y}(A-)\) defined similarly for the low level of \(A\)).

As we have averaged the response across the levels of the other factors, the interpretation of the main effect extends beyond this experiment. That is, we can use it to infer something about the system under study. Assuming model (4.1) is correct, any variation in the main effect can only come from random error in the observations. In fact,

\[\begin{align*} \mbox{var}\{ME(A)\} & = \frac{\sigma^2}{n/2} + \frac{\sigma^2}{n/2} \\ & = \frac{4\sigma^2}{n}\,, \end{align*}\]

and assuming \(r>1\) ,

\[\begin{equation} \hat{\sigma}^2 = \frac{1}{2^f(r-1)} \sum_{i=1}^{2^f}\sum_{j=1}^r(y_{ij} - \bar{y}_{i.})^2\,, \tag{4.2} \end{equation}\]

which is the residual mean square.

For Example 4.1 , we can also calculate main effect estimates for the other three factors by defining appropriate contrasts in the treatments.

Table 4.2: Desilylation experiment: main effect contrast coefficients
Temperature Time Solvent Reagent
Trt 1 -0.125 -0.125 -0.125 -0.125
Trt 2 0.125 -0.125 -0.125 -0.125
Trt 3 -0.125 0.125 -0.125 -0.125
Trt 4 0.125 0.125 -0.125 -0.125
Trt 5 -0.125 -0.125 0.125 -0.125
Trt 6 0.125 -0.125 0.125 -0.125
Trt 7 -0.125 0.125 0.125 -0.125
Trt 8 0.125 0.125 0.125 -0.125
Trt 9 -0.125 -0.125 -0.125 0.125
Trt 10 0.125 -0.125 -0.125 0.125
Trt 11 -0.125 0.125 -0.125 0.125
Trt 12 0.125 0.125 -0.125 0.125
Trt 13 -0.125 -0.125 0.125 0.125
Trt 14 0.125 -0.125 0.125 0.125
Trt 15 -0.125 0.125 0.125 0.125
Trt 16 0.125 0.125 0.125 0.125

Estimates can be obtained by applying these coefficients to the observed treatment means.
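For instance, the sketch below applies the four columns of Table 4.2 to the responses of Table 4.1 (each treatment mean is a single observation here); the main effect of temp comes out as 8.12.

```r
# Sketch: main effect estimates via the contrast coefficients of Table 4.2.
yield <- c(82.93, 94.04, 88.07, 93.97, 77.21, 92.99, 83.60, 94.38,
           88.68, 94.30, 93.00, 93.42, 84.86, 94.26, 88.71, 94.66)
pm <- c(-1, 1)
X  <- expand.grid(temp = pm, time = pm, solvent = pm, reagent = pm)  # standard order
C  <- as.matrix(X) / 2^(4 - 1)   # contrast coefficients of +/- 0.125, as in Table 4.2
t(C) %*% yield                   # the four main effect estimates
```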

Main effects are often displayed graphically, using main effect plots which simply plot the average response for each factor level, joined by a line. The larger the main effect, the larger the slope of the line (or the bigger the difference between the averages). Figure 4.1 presents the four main effect plots for Example 4.1 .


Figure 4.1: Desilylation experiment: main effect plots

4.1.2 Interactions

Another contrast that could be of interest in Example 4.1 has coefficients

\[ \boldsymbol{c}^{\mathrm{T}} = (\boldsymbol{1}_4^{\mathrm{T}}, -\boldsymbol{1}_8^{\mathrm{T}}, \boldsymbol{1}_4^{\mathrm{T}}) / 8 \,, \] where the divisor \(8 = 2^{f-1} = 2^3\) .

This contrast measures the difference between the average treatment effect from treatments 1-4, 13-16 and treatments 5-12. Checking back against Table 4.1 , we see this is comparing those treatments where solvent and reagent are both set to their low (1-4) or high (13-16) level against those treatments where one of the two factors is set high and the other is set low (5-12).

Focusing on reagent , if the effect of this factor on the response was independent of the level to which solvent has been set, you would expect this contrast to be zero - changing from the high to low level of reagent should affect the response in the same way, regardless of the setting of solvent . This argument can be reversed, focussing on the effect of solvent . Therefore, if this contrast is large, we say the two factors interact .

For Example 4.1 , this interaction contrast seems quite small, although of course without an estimate of the standard error we are still lacking a formal method to judge this.

It is somewhat more informative to consider the above interaction contrast as the average difference in two “sub-contrasts”

\[ \boldsymbol{c}^{\mathrm{T}}\boldsymbol{\tau} = \frac{1}{2}\left\{\frac{1}{4}\left(\tau_{13} + \tau_{14} + \tau_{15} + \tau_{16} - \tau_5 - \tau_6 - \tau_7 - \tau_8\right) - \frac{1}{4}\left(\tau_9 + \tau_{10} + \tau_{11} + \tau_{12} - \tau_1 - \tau_2 - \tau_3 - \tau_4\right) \right\}\,. \] The first component in the above expression is the effect of changing reagent from low to high given solvent is set to its high level. The second component is the effect of changing reagent from low to high given solvent is set to its low level. This leads to our definition of a two-factor interaction.

Definition 4.2 The two-factor interaction between factors \(A\) and \(B\) is defined as the average difference in main effect of factor \(A\) when computed at the high and low levels of factor \(B\) .

\[\begin{align*} \mbox{Int}(A, B) & = \frac{1}{2}\left\{\mbox{ME}(A\mid B+) - \mbox{ME}(A \mid B-)\right\} \\ & = \frac{1}{2}\left\{\mbox{ME}(B \mid A+) - \mbox{ME}(B \mid A-)\right\} \\ & = \frac{1}{2}\left\{\bar{y}(A+, B+) - \bar{y}(A-, B+) - \bar{y}(A+, B-) + \bar{y}(A-, B-)\right\}\,, \end{align*}\]

where \(\bar{y}(A+, B-)\) is the average response when factor \(A\) is set to its high level and factor \(B\) is set to its low level, averaged across all combinations of levels of the other factors, and other averages are defined similarly. The conditional main effect of factor \(A\) when factor \(B\) is set to its high level is defined as

\[ \mbox{ME}(A\mid B+) = \bar{y}(A+, B+) - \bar{y}(A-, B+)\,, \]

with similar definitions for other conditional main effects.

As the sum of the squared contrast coefficients is the same for two-factor interactions as for main effects, the variance of the contrast estimator is also the same.

\[ \mbox{var}\left\{\mbox{Int}(A, B)\right\} = \frac{4\sigma^2}{n}\,. \] For Example 4.1, we can calculate two-factor interactions for all \({4 \choose 2} = 6\) pairs of factors. The simplest way to calculate the contrast coefficients is as the elementwise, or Hadamard, product of the unscaled main effect contrasts (before dividing by \(2^{f-1}\)).
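The sketch below forms two of the columns of Table 4.3 in this way; the remaining columns follow the same pattern.

```r
# Sketch: two-factor interaction contrasts as Hadamard (elementwise) products
# of the unscaled main effect contrasts, rescaled by 2^(f-1).
pm <- c(-1, 1)
X  <- expand.grid(temp = pm, time = pm, solvent = pm, reagent = pm)
tem_x_tim <- X$temp * X$time / 2^(4 - 1)        # first column of Table 4.3
sol_x_rea <- X$solvent * X$reagent / 2^(4 - 1)  # last column of Table 4.3
cbind(tem_x_tim, sol_x_rea)
```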

Table 4.3: Desilylation experiment: two-factor interaction contrast coefficients
tem_x_tim tem_x_sol tem_x_rea tim_x_sol tim_x_rea sol_x_rea
Trt 1 0.125 0.125 0.125 0.125 0.125 0.125
Trt 2 -0.125 -0.125 -0.125 0.125 0.125 0.125
Trt 3 -0.125 0.125 0.125 -0.125 -0.125 0.125
Trt 4 0.125 -0.125 -0.125 -0.125 -0.125 0.125
Trt 5 0.125 -0.125 0.125 -0.125 0.125 -0.125
Trt 6 -0.125 0.125 -0.125 -0.125 0.125 -0.125
Trt 7 -0.125 -0.125 0.125 0.125 -0.125 -0.125
Trt 8 0.125 0.125 -0.125 0.125 -0.125 -0.125
Trt 9 0.125 0.125 -0.125 0.125 -0.125 -0.125
Trt 10 -0.125 -0.125 0.125 0.125 -0.125 -0.125
Trt 11 -0.125 0.125 -0.125 -0.125 0.125 -0.125
Trt 12 0.125 -0.125 0.125 -0.125 0.125 -0.125
Trt 13 0.125 -0.125 -0.125 -0.125 -0.125 0.125
Trt 14 -0.125 0.125 0.125 -0.125 -0.125 0.125
Trt 15 -0.125 -0.125 -0.125 0.125 0.125 0.125
Trt 16 0.125 0.125 0.125 0.125 0.125 0.125

Estimates of the interaction contrasts can again be found by considering the equivalent contrasts in the observed treatment means.

As with main effects, interactions are often displayed graphically using interaction plots, plotting average responses for each pairwise combination of factors, joined by lines.


Figure 4.2: Desilylation experiment: two-factor interaction plots

Parallel lines in an interaction plot indicate no (or a very small) interaction (time and solvent, time and reagent, solvent and reagent). The three interactions involving temp all show that the response is much more robust at the high level of temp: changing time, solvent or reagent makes little difference to the response at the high level of temp, and much less difference than at the low level of temp.

If a system displays important interactions, the main effects of factors involved in those interactions should no longer be interpreted. For example, it makes little sense to discuss the main effect of temp when it changes so much with the level of reagent (from strongly positive when reagent is low to quite small when reagent is high).

Higher order interactions can be defined similarly, as average differences in lower-order effects. For example, a three-factor interaction measures how a two-factor interaction changes with the levels of a third factor.

\[\begin{align*} \mbox{Int}(A, B, C) & = \frac{1}{2}\left\{\mbox{Int}(A, B \mid C+) - \mbox{Int}(A, B \mid C-)\right\} \\ & = \frac{1}{2}\left\{\mbox{Int}(A, C \mid B+) - \mbox{Int}(A, C \mid B-)\right\} \\ & = \frac{1}{2}\left\{\mbox{Int}(B, C \mid A+) - \mbox{Int}(B, C \mid A-)\right\}\,, \\ \end{align*}\]

where \[ \mbox{Int}(A, B \mid C+) = \frac{1}{2}\left\{\bar{y}(A+, B+, C+) - \bar{y}(A-, B+, C+) - \bar{y}(A+, B-, C+) + \bar{y}(A-, B-, C+)\right\} \] is the interaction between factors \(A\) and \(B\) using only those treatments where factor \(C\) is set to its high level. Higher order interaction contrasts can again be constructed by (multiple) Hadamard products of (unscaled) main effect contrasts.

Definition 4.3 A factorial effect is a main effect or interaction contrast defined on a factorial experiment. For a \(2^f\) factorial experiment with \(f\) factors, there are \(2^f-1\) factorial effects, ranging from main effects to the interaction between all \(f\) factors. The contrast coefficients in a factorial contrast all take the form \(c_i = \pm 1 / 2^{f-1}\) .

For Example 4.1 , we can now calculate all the factorial effects.
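One way to do this in R is sketched below: the columns of the model matrix for the full four-factor product (excluding the intercept) are the unscaled factorial contrasts, so dividing by \(2^{f-1}\) gives the effect estimates listed later in Table 4.4.

```r
# Sketch: estimate all 2^4 - 1 = 15 factorial effects for Example 4.1.
pm <- c(-1, 1)
X  <- expand.grid(temp = pm, time = pm, solvent = pm, reagent = pm)
yield <- c(82.93, 94.04, 88.07, 93.97, 77.21, 92.99, 83.60, 94.38,
           88.68, 94.30, 93.00, 93.42, 84.86, 94.26, 88.71, 94.66)
M <- model.matrix(~ temp * time * solvent * reagent, data = X)[, -1]  # entries +/- 1
theta.hat <- drop(crossprod(M, yield)) / 2^(4 - 1)  # c^T ybar for each factorial contrast
round(theta.hat, 4)
```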

4.2 Three principles for factorial effects

Empirical studies of many experiments (Box and Meyer, 1986; Li et al., 2006) have demonstrated that the following three principles often hold when analysing factorial experiments.

Definition 4.4 Effect hierarchy : lower-order factorial effects are more likely to be important than higher-order effects; factorial effects of the same order are equally likely to be important.

For example, we would anticipate more large main effects from the analysis of a factorial experiment than two-factor interactions.

Definition 4.5 Effect sparsity : the number of large factorial effects is likely to be small, relative to the total number under study.

This is sometimes called the Pareto principle.

Definition 4.6 Effect heredity : interactions are more likely to be important if at least one parent factor also has a large main effect.

These three principles will provide us with some useful guidelines when analysing, and eventually constructing, factorial experiments.

4.3 Normal effect plots for unreplicated factorial designs

The lack of an estimate for \(\sigma^2\) means alternatives to formal inference methods (e.g. hypothesis tests) must be found to assess the size of factorial effects. We will discuss a method that essentially treats the identification of large factorial effects as an outlier identification problem.

Let \(\hat{\theta}_j\) be the \(j\) th estimated factorial effect, with \(\hat{\theta}_j = \sum_{i=1}^tc_{ij}\bar{y}_{i.}\) for \(\boldsymbol{c}_j^{\mathrm{T}} = (c_{1j}, \ldots, c_{tj})\) a vector of factorial contrast coefficients (defining a main effect or interaction). Then the estimator follows a normal distribution

\[ \hat{\theta}_j \sim N\left(\theta_j, \frac{4\sigma^2}{n}\right)\,,\qquad j = 1, \ldots, 2^f-1\,, \] for \(\theta_j\) the true, unknown, value of the factorial effect, \(j = 1,\ldots, 2^f-1\). Furthermore, for \(j, l = 1, \ldots, 2^f-1; \, j\ne l\),

\[\begin{align*} \mbox{cov}(\hat{\theta}_j, \hat{\theta}_l) & = \mbox{cov}\left(\sum_{i=1}^tc_{ij}\bar{y}_{i.}, \sum_{i=1}^tc_{il}\bar{y}_{i.}\right) \\ & = \sum_{i=1}^tc_{ij}c_{il}\mbox{var}(\bar{y}_{i.}) \\ & = \frac{\sigma^2}{r} \sum_{i=1}^tc_{ij}c_{il} \\ & = 0\,, \\ \end{align*}\]

as \(\sum_{i=1}^tc_{ij}c_{il} = 0\) for \(j\ne l\) . That is, the factorial contrasts are independent as the contrast coefficient vectors are orthogonal.

Hence, under the null hypothesis \(H_0: \theta_1 = \cdots = \theta_{2^f-1} = 0\) (all factorial effects are zero), the \(\hat{\theta}_j\) form a sample from independent normally distributed random variables from the distribution

\[\begin{equation} \hat{\theta}_j \sim N\left(0, \frac{4\sigma^2}{n}\right)\,,\qquad j = 1, \ldots, 2^f-1\,. \tag{4.3} \end{equation}\]

To assess evidence against \(H_0\), we can plot the ordered estimates of the factorial effects against the ordered quantiles of a standard normal distribution. Under \(H_0\), the points in this plot should lie on a straight line (the slope of the line will depend on the unknown \(\sigma^2\)). We anticipate that the majority of the effects will be small (effect sparsity), and hence any large effects that lie away from the line are unlikely to come from distribution (4.3) and may be significantly different from zero. We have essentially turned the problem into an outlier identification problem.

For Example 4.1 , we can easily produce this plot in R . Table 4.4 gives the ordered factorial effects, which are then plotted against standard normal quantiles in Figure 4.3 .

Table 4.4: Desilylation experiment: sorted estimated factorial effects
contrast estimate
temp 8.1200
reagent 3.0875
time 2.5675
temp.solvent 2.3575
solvent.reagent 0.4900
time.solvent 0.4400
temp.time.solvent 0.2450
temp.time.reagent 0.1950
temp.time.solvent.reagent 0.1925
temp.solvent.reagent -0.0300
time.solvent.reagent -0.2375
time.reagent -0.6450
solvent -2.2175
temp.time -2.3575
temp.reagent -2.7725
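A basic version of this plot can be produced directly from the estimates in Table 4.4, as sketched below; the unrepx package used later in the chapter provides dedicated plotting functions that give a more polished version.

```r
# Sketch: normal quantile plot of the estimated factorial effects (Table 4.4).
theta <- c(temp = 8.12, reagent = 3.0875, time = 2.5675, temp.solvent = 2.3575,
           solvent.reagent = 0.49, time.solvent = 0.44, temp.time.solvent = 0.245,
           temp.time.reagent = 0.195, temp.time.solvent.reagent = 0.1925,
           temp.solvent.reagent = -0.03, time.solvent.reagent = -0.2375,
           time.reagent = -0.645, solvent = -2.2175, temp.time = -2.3575,
           temp.reagent = -2.7725)
qq <- qqnorm(theta, main = "Normal effects plot")   # ordered effects vs normal quantiles
text(qq$x, qq$y, labels = names(theta), pos = 4, cex = 0.7)
```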


Figure 4.3: Desilylation experiment: normal effects plot

In fact, it is more usual to use a half-normal plot to assess the size of factorial effects, where we plot the sorted absolute values of the estimated effects against the quantiles of a half-normal distribution.


Figure 4.4: Desilylation experiment: half-normal effects plot

The advantage of a half-normal plot such as Figure 4.4 is that we only need to look at effects appearing in the top right corner (significant effects will always appear "above" a hypothetical straight line) and we do not need to worry about comparing large positive and negative values. For these reasons, they are usually preferred over normal plots.

For the desilylation experiment, we can see the effects fall into three groups: one effect standing well away from the line, and almost certainly significant ( temp , from Table 4.4 ), then a group of six effects ( reagent , time , temp.solvent , solvent , temp.time , temp.reagent ) which may be significant, and then a group of 8 much smaller effects.

4.3.1 Lenth’s method for approximate hypothesis testing

The assessment of normal or half-normal effect plots can be quite subjective. Lenth ( 1989 ) introduced a simple method for conducting more formal hypothesis testing in unreplicated factorial experiments.

Lenth’s method uses a pseudo standard error (PSE):

\[ \mbox{PSE} = 1.5 \times \mbox{median}_{|\hat{\theta}_i| < 2.5s_0}|\hat{\theta}_i|\,, \] where \(s_0 = 1.5\times \mbox{median} |\hat{\theta}_i|\) is a consistent estimator of the standard deviation of the \(\hat{\theta}_i\) under \(H_0: \theta_1 = \cdots=\theta_{2^f-1}=0\). The PSE trims approximately 1% of the \(\hat{\theta}_i\) to produce a robust estimator of the standard deviation, in the sense that it is not influenced by large \(\hat{\theta}_i\) belonging to important effects.

For Example 4.1 , we can construct the PSE as follows.
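A direct implementation of the two formulae above, applied to the effect estimates of Table 4.4, is sketched below.

```r
# Sketch: Lenth's pseudo standard error from the estimated factorial effects.
theta <- c(8.12, 3.0875, 2.5675, 2.3575, 0.49, 0.44, 0.245, 0.195, 0.1925,
           -0.03, -0.2375, -0.645, -2.2175, -2.3575, -2.7725)
s0  <- 1.5 * median(abs(theta))                          # initial robust scale estimate
PSE <- 1.5 * median(abs(theta)[abs(theta) < 2.5 * s0])   # trimmed, robust version
PSE  # 0.66, the value appearing in Table 4.5
```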

The PSE can be used to construct test statistics

\[ t_{\mbox{PSE}, i} = \frac{\hat{\theta}_i}{\mbox{PSE}}\,, \]

which mimic the usual \(t\)-statistics used when an estimate of \(\sigma^2\) is available. These quantities can be compared to a reference distribution, which was tabulated by Lenth (1989) and can be simulated in R using the unrepx package.

Table 4.5: Desilylation experiment: hypothesis tests using Lenth’s method.
effect Lenth_PSE t.ratio p.value simult.pval
temp 8.1200 0.66 12.303 0.0001 0.0007
reagent 3.0875 0.66 4.678 0.0039 0.0322
temp.reagent -2.7725 0.66 -4.201 0.0059 0.0529
time 2.5675 0.66 3.890 0.0079 0.0724
temp.solvent 2.3575 0.66 3.572 0.0110 0.1016
temp.time -2.3575 0.66 -3.572 0.0110 0.1016
solvent -2.2175 0.66 -3.360 0.0138 0.1241
time.reagent -0.6450 0.66 -0.977 0.3057 0.9955
solvent.reagent 0.4900 0.66 0.742 0.4306 1.0000
time.solvent 0.4400 0.66 0.667 0.5393 1.0000
temp.time.solvent 0.2450 0.66 0.371 0.7299 1.0000
time.solvent.reagent -0.2375 0.66 -0.360 0.7384 1.0000
temp.time.reagent 0.1950 0.66 0.295 0.7827 1.0000
temp.time.solvent.reagent 0.1925 0.66 0.292 0.7849 1.0000
temp.solvent.reagent -0.0300 0.66 -0.045 0.9661 1.0000

The function eff.test calculates unadjusted p-values (p.value) and simultaneous p-values (simult.pval) adjusted to account for multiple testing. Using the latter, from Table 4.5 we see that the main effects of temp and reagent are significant at the experiment-wise 5% level and, obeying effect heredity, so is their interaction (its p-value is borderline, hovering around 0.05 depending on simulation error).

The package unrepx also provides the function hnplot to display these results graphically by adding a reference line to a half-normal plot; see Figure 4.5 . The ME and SME lines indicate the absolute size of effects that would be required to reject \(H_0: \theta_i = 0\) at an individual or experimentwise \(100\alpha\) % level, respectively.
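A hedged sketch of these two calls is given below; the effects are recomputed from Table 4.1, and the argument names ID and alpha are assumptions based on the package documentation rather than code taken from the notes.

```r
# Hedged sketch: Lenth-method tests and a half-normal plot via unrepx.
library(unrepx)
pm <- c(-1, 1)
X  <- expand.grid(temp = pm, time = pm, solvent = pm, reagent = pm)
yield <- c(82.93, 94.04, 88.07, 93.97, 77.21, 92.99, 83.60, 94.38,
           88.68, 94.30, 93.00, 93.42, 84.86, 94.26, 88.71, 94.66)
theta <- drop(crossprod(model.matrix(~ temp * time * solvent * reagent, X)[, -1],
                        yield)) / 2^(4 - 1)
eff.test(theta)                          # p.value and simult.pval, as in Table 4.5
hnplot(theta, ID = TRUE, alpha = 0.05)   # half-normal plot with ME and SME lines (Figure 4.5)
```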


Figure 4.5: Desilylation experiment: half-normal plot with reference lines from Lenth’s method.

Informally, factorial effects with estimates greater than SME are thought highly likely to be significant, and effects between ME and SME are considered somewhat likely to be significant (and still worthy of further investigation if the budget allows).

4.4 Regression modelling for factorial experiments

We have identified \(d = 2^f-1\) factorial effects that we wish to estimate from our experiment. As \(d < t = 2^f\) , we can estimate these factorial effects using a full-rank linear regression model.

Let \(t\times d\) matrix \(C\) hold each factorial contrast as a column. Then

\[ \hat{\boldsymbol{\theta}} = C^{\mathrm{T}}\bar{\boldsymbol{y}}\,, \]

with \(\hat{\boldsymbol{\theta}}^{\mathrm{T}} = (\hat{\theta}_1, \ldots, \hat{\theta}_d)\) being the vector of estimated factorial effects and \(\bar{\boldsymbol{y}}^{\mathrm{T}} = (\bar{y}_{1.}, \ldots, \bar{y}_{t.})\) being the vector of treatment means.

We can define an \(n\times d\) expanded contrast matrix as \(\tilde{C} = C \otimes \boldsymbol{1}_r\) , where each row of \(\tilde{C}\) gives the contrast coefficients for each run of the experiment. Then,

\[ \hat{\boldsymbol{\theta}} = \frac{1}{r}\tilde{C}^{\mathrm{T}}\boldsymbol{y}\,. \] To illustrate, we will imagine a hypothetical version of Example 4.1 where each treatment was repeated three times (with \(y_{i1} = y_{i2} = y_{i3}\) ).

If we define a model matrix \(X = \frac{2^{f}}{2}\tilde{C}\) , then \(X\) is a \(n\times d\) matrix with entries \(\pm 1\) and columns equal to unscaled factorial contrasts. Then

\[\begin{align} \left(X^{\mathrm{T}}X\right)^{-1}X^{\mathrm{T}}\boldsymbol{y}& = \frac{1}{n} \times \frac{2^f}{2}\tilde{C}^{\mathrm{T}}\boldsymbol{y}\tag{4.4}\\ & = \frac{1}{2r}\tilde{C}^{\mathrm{T}}\boldsymbol{y}\\ & = \frac{1}{2}\hat{\boldsymbol{\theta}}\,. \\ \end{align}\]

The left-hand side of equation (4.4) is the least squares estimator \(\hat{\boldsymbol{\beta}}\) from the model

\[ \boldsymbol{y}= \boldsymbol{1}_n\beta_0 + X\boldsymbol{\beta} + \boldsymbol{\varepsilon}\,, \] where \(\boldsymbol{y}\) is the response vector and \(\boldsymbol{\varepsilon}\) the error vector from unit-treatment model (4.1) . We have simply re-expressed the mean response as \(\mu + \tau_i = \beta_0 + \boldsymbol{x}_i^{\mathrm{T}}\boldsymbol{\beta}\) , where \(d\) -vector \(\boldsymbol{x}_i\) holds the unscaled contrast coefficients for the main effects and interactions.

We can illustrate these connections for Example 4.1 .

The more usual way to think about this modelling approach is as a regression model with \(f\) (quantitative) variables, labelled \(x_1, \ldots, x_f\), scaled to lie in the interval \([-1, 1]\) (in fact, they just take values \(\pm 1\)). We can then fit a regression model in these variables, and include products of these variables to represent interactions. We usually also include the intercept term. For Example 4.1:
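A sketch of this fit is given below; doubling the non-intercept coefficients recovers the factorial effects shown in Table 4.6.

```r
# Sketch: regression formulation of Example 4.1 with +/- 1 coded variables.
pm  <- c(-1, 1)
dat <- expand.grid(temp = pm, time = pm, solvent = pm, reagent = pm)
dat$yield <- c(82.93, 94.04, 88.07, 93.97, 77.21, 92.99, 83.60, 94.38,
               88.68, 94.30, 93.00, 93.42, 84.86, 94.26, 88.71, 94.66)
fit <- lm(yield ~ temp * time * solvent * reagent, data = dat)
2 * coef(fit)[-1]   # factorial effects, matching Table 4.6
```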

Table 4.6: Desilylation example: factorial effects calculated using a regression model.
x
temp 8.1200
time 2.5675
solvent -2.2175
reagent 3.0875
temp:time -2.3575
temp:solvent 2.3575
temp:reagent -2.7725
time:solvent 0.4400
time:reagent -0.6450
solvent:reagent 0.4900
temp:time:solvent 0.2450
temp:time:reagent 0.1950
temp:solvent:reagent -0.0300
time:solvent:reagent -0.2375
temp:time:solvent:reagent 0.1925

A regression modelling approach is usually more straightforward to apply than defining contrasts in the unit-treatment model, and makes clearer the connection between interaction contrasts and products of main effect contrasts (automatically defined in a regression model). It also enables us to make use of the effects package in R to quickly produce main effect and interaction plots.


Figure 4.6: Desilylation experiment: interaction plot generated using the effects package.

4.4.1 ANOVA for factorial experiments

The basic ANOVA table has the following form.

Table 4.7: The ANOVA table for a full factorial experiment
Source Degrees of Freedom (Sequential) Sum of Squares Mean Square
Regression \(2^f-1\) \(\sum_{j=1}^{2^f-1}n\hat{\beta}_j^2\) Reg SS/\((2^f-1)\)
Residual \(2^f(r-1)\) \((\boldsymbol{Y}-X\hat{\boldsymbol{\beta}})^{\textrm{T}}(\boldsymbol{Y}-X\hat{\boldsymbol{\beta}})\) RSS/\((2^f(r-1))\)
Total \(2^fr-1\) \(\boldsymbol{Y}^{\textrm{T}}\boldsymbol{Y}-n\bar{Y}^{2}\)

The regression sum of squares for a factorial experiment has a very simple form. If we include an intercept column in \(X\) , from Section 1.5.1 ,

\[\begin{align*} \mbox{Regression SS} & = \mbox{RSS(null)} - \mbox{RSS} \\ & = \hat{\boldsymbol{\beta}}^{\mathrm{T}}X^{\mathrm{T}}X\hat{\boldsymbol{\beta}} - n\bar{y}^2 \\ & = \sum_{j=0}^{2^f-1}n\hat{\beta}_j^2 - n\bar{y}^2 \\ & = \sum_{j=1}^{2^f-1}n\hat{\beta}_j^2\,, \end{align*}\]

as \(X^{\mathrm{T}}X = nI_{2^f}\) and \(\hat{\beta}_0 = \bar{y}\). Hence, the \(j\)th factorial effect contributes \(n\hat{\beta}_j^2\) to the regression sum of squares, and this quantity can be used to construct a test statistic if \(r>1\) and hence an estimate of \(\sigma^2\) is available. For Example 4.1, the regression sum of squares and ANOVA table are given in Tables 4.8 and 4.9.
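As a quick check of this connection, each effect's contribution is \(n\hat{\beta}_j^2\) with \(\hat{\beta}_j = \hat{\theta}_j/2\); the sketch below reproduces the temp entry of Table 4.8.

```r
# Sketch: contribution of the temp effect to the regression sum of squares.
n <- 16
theta.temp <- 8.12        # estimated main effect of temp (Table 4.4)
n * (theta.temp / 2)^2    # 263.7376, the temp row of Table 4.8
```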

Table 4.8: Desilylation experiment: regression sums of squares for each factorial effect calculated directly.
Sum Sq.
Regression 427.2837
temp 263.7376
time 26.3682
solvent 19.6692
reagent 38.1306
temp:time 22.2312
temp:solvent 22.2312
temp:reagent 30.7470
time:solvent 0.7744
time:reagent 1.6641
solvent:reagent 0.9604
temp:time:solvent 0.2401
temp:time:reagent 0.1521
temp:solvent:reagent 0.0036
time:solvent:reagent 0.2256
temp:time:solvent:reagent 0.1482
Table 4.9: Desilylation experiment: ANOVA table.
Df Sum Sq
temp 1 263.7376
time 1 26.3682
solvent 1 19.6692
reagent 1 38.1306
temp:time 1 22.2312
temp:solvent 1 22.2312
temp:reagent 1 30.7470
time:solvent 1 0.7744
time:reagent 1 1.6641
solvent:reagent 1 0.9604
temp:time:solvent 1 0.2401
temp:time:reagent 1 0.1521
temp:solvent:reagent 1 0.0036
time:solvent:reagent 1 0.2256
temp:time:solvent:reagent 1 0.1482
Residuals 0 0.0000

4.5 Exercises

A reactor experiment presented by Box, Hunter and Hunter (2005, pp. 259-261) used a full factorial design for \(m=5\) factors, each at two levels, to investigate the effect of feed rate (litres/min), catalyst (%), agitation rate (rpm), temperature (C) and concentration (%) on the percentage reacted. The levels of the experimental factors will be coded as \(-1\) for the low level and \(1\) for the high level. Table 4.10 presents the true factor settings corresponding to these coded values.

Table 4.10: Factor levels for the full factorial reactor experiment
Factor Low level (\(-1\)) High level (\(1\))
Feed Rate (litres/min) 10 15
Catalyst (%) 1 2
Agitation Rate (rpm) 100 120
Temperature (C) 140 180
Concentration (%) 3 6

The data from this experiment is given in Table 4.11 .

Table 4.11: Reactor experiment.
FR Cat AR Temp Conc pre.react
-1 -1 -1 -1 -1 61
1 -1 -1 -1 -1 53
-1 1 -1 -1 -1 63
1 1 -1 -1 -1 61
-1 -1 1 -1 -1 53
1 -1 1 -1 -1 56
-1 1 1 -1 -1 54
1 1 1 -1 -1 61
-1 -1 -1 1 -1 69
1 -1 -1 1 -1 61
-1 1 -1 1 -1 94
1 1 -1 1 -1 93
-1 -1 1 1 -1 66
1 -1 1 1 -1 60
-1 1 1 1 -1 95
1 1 1 1 -1 98
-1 -1 -1 -1 1 56
1 -1 -1 -1 1 63
-1 1 -1 -1 1 70
1 1 -1 -1 1 65
-1 -1 1 -1 1 59
1 -1 1 -1 1 55
-1 1 1 -1 1 67
1 1 1 -1 1 65
-1 -1 -1 1 1 44
1 -1 -1 1 1 45
-1 1 -1 1 1 78
1 1 -1 1 1 77
-1 -1 1 1 1 49
1 -1 1 1 1 42
-1 1 1 1 1 81
1 1 1 1 1 82

Estimate all the factorial effects from this experiment, and use a half-normal plot and Lenth’s method to decide which are significantly different from zero.

Use the effects package to produce main effect and/or interaction plots for each significant factorial effect from part a.

Now fit a regression model that only includes terms corresponding to main effects and two-factor interactions. How many degrees of freedom does this model use? What does this mean for the estimation of \(\sigma^2\) ? How does the estimate of \(\sigma^2\) from this model relate to your analysis in part a?

  • We will estimate the factorial effects as twice the corresponding regression parameters.
Table 4.12: Reactor experiment: estimated factorial effects.
x
FR -1.375
Cat 19.500
AR -0.625
Temp 10.750
Conc -6.250
FR:Cat 1.375
FR:AR 0.750
FR:Temp -0.875
FR:Conc 0.125
Cat:AR 0.875
Cat:Temp 13.250
Cat:Conc 2.000
AR:Temp 2.125
AR:Conc 0.875
Temp:Conc -11.000
FR:Cat:AR 1.500
FR:Cat:Temp 1.375
FR:Cat:Conc -1.875
FR:AR:Temp -0.750
FR:AR:Conc -2.500
FR:Temp:Conc 0.625
Cat:AR:Temp 1.125
Cat:AR:Conc 0.125
Cat:Temp:Conc -0.250
AR:Temp:Conc 0.125
FR:Cat:AR:Temp 0.000
FR:Cat:AR:Conc 1.500
FR:Cat:Temp:Conc 0.625
FR:AR:Temp:Conc 1.000
Cat:AR:Temp:Conc -0.625
FR:Cat:AR:Temp:Conc -0.500

There are several large factorial effects, including the main effects of Catalyst and Temperature and the interaction between these factors, and the interaction between Concentration and Temperature. We can assess their significance using a half-normal plot and Lenth’s method.
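A hedged sketch of this analysis is given below; it assumes the data of Table 4.11 are held in a data frame reactor with coded (\(\pm 1\)) columns FR, Cat, AR, Temp, Conc and response pre.react (the object names are illustrative).

```r
# Hedged sketch: factorial effects and Lenth's PSE for the reactor experiment,
# assuming a data frame `reactor` with the coded columns of Table 4.11.
fit   <- lm(pre.react ~ FR * Cat * AR * Temp * Conc, data = reactor)
theta <- 2 * coef(fit)[-1]     # the 31 factorial effects of Table 4.12
s0    <- 1.5 * median(abs(theta))
PSE   <- 1.5 * median(abs(theta)[abs(theta) < 2.5 * s0])
PSE                            # 1.3125, as quoted below
```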

Reactor experiment: half-normal plot of the factorial effects.

We see that PSE = 1.3125, giving individual and simultaneous margins of error of 2.7048 and 5.0625, respectively (where the latter is adjusted for multiple testing). There is a very clear distinction between the five effects that are largest in absolute value and the remaining factorial effects, which form a clear line. The five largest effects, given in Table 4.13, are all greater than both margins of error and can be declared significant.

Table 4.13: Reactor experiment: factorial effects significantly different from zero via Lenth’s method.
Estimate
Cat 19.50
Temp 10.75
Conc -6.25
Cat:Temp 13.25
Temp:Conc -11.00
  • We will produce plots for the interactions between Catalyst and Temperature, and between Temperature and Concentration. We will not produce main effect plots for Catalyst and Temperature, as these factors are involved in the large interactions.
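
A sketch of how such plots might be produced with the effects package; the model object and its formula below are illustrative assumptions (any fitted model containing the relevant interactions would do):

library(effects)

# Model containing the significant effects identified above.
sig.lm <- lm(pre.react ~ Cat * Temp + Temp * Conc, data = reactor)

# Interaction plots for Catalyst:Temperature and Temperature:Concentration.
plot(effect("Cat:Temp", sig.lm))
plot(effect("Temp:Conc", sig.lm))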

Reactor experiment: interaction plots.

Figure 4.7: Reactor experiment: interaction plots.

Notice that changing the level of Temperature substantially changes the effect of both Catalyst and Concentration on the response; in particular, the effect of Concentration changes sign depending on the level of Temperature.

  • We start by fitting the reduced regression model.
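
For example (a sketch, using the same assumed data frame reactor as above):

# Main effects plus all two-factor interactions: 5 + choose(5, 2) = 15 terms
# and the intercept, leaving 32 - 16 = 16 residual degrees of freedom.
react2.lm <- lm(pre.react ~ (FR + Cat + AR + Temp + Conc)^2, data = reactor)
anova(react2.lm)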

This model includes regression parameters corresponding to \(5 + {5 \choose 2} = 15\) factorial effects, plus the intercept, and hence uses 16 degrees of freedom. The remaining 16 degrees of freedom, which were previously used to estimate three-factor and higher-order interactions, are now used to estimate \(\sigma^2\) , the background variation.

The residual mean square in the reduced model, used to estimate \(\sigma^2\) , is the sum of the sums of squares for the higher-order interactions in the original model, divided by 16 (the remaining degrees of freedom).

(Adapted from Morris, 2011 ) Consider an unreplicated ( \(r=1\) ) \(2^6\) factorial experiment. The total sum of squares,

\[ \mbox{Total SS} = \sum_{i=1}^n(y_i - \bar{y})^2\,, \]

has value 2856. Using Lenth’s method, an informal analysis of the data suggests that there are only three important factorial effects, with least squares estimates

  • main effect of factor \(A\) = 3
  • interaction between factors \(A\) and \(B\) = 4
  • interaction between factors \(A\) , \(B\) and \(C\) = 2.

If a linear model including only an intercept and these three effects is fitted to the data, what is the value of the residual sum of squares?

The residual sum of squares has the form

\[ \mbox{RSS} = (\boldsymbol{y}- X\hat {\boldsymbol{\beta}})^{\mathrm{T}}(\boldsymbol{y}- X\hat {\boldsymbol{\beta}})\,, \]

where in this case \(X\) is a \(2^6\times 4\) model matrix, with columns corresponding to the intercept, the main effect of factor \(A\) , the interaction between factors \(A\) and \(B\) , and the interaction between factors \(A\) , \(B\) and \(C\) . We can rewrite the RSS as

\[\begin{equation*} \begin{split} \mbox{RSS} & = (\boldsymbol{y}- X\hat {\boldsymbol{\beta}})^{\mathrm{T}}(\boldsymbol{y}- X\hat {\boldsymbol{\beta}}) \\ & = \boldsymbol{y}^{\mathrm{T}}\boldsymbol{y}- 2\boldsymbol{y}^{\mathrm{T}}X\hat {\boldsymbol{\beta}} + \hat {\boldsymbol{\beta}}^{\mathrm{T}}X^{\mathrm{T}}X\hat {\boldsymbol{\beta}} \\ & = \boldsymbol{y}^{\mathrm{T}}\boldsymbol{y}- 2\hat {\boldsymbol{\beta}}^{\mathrm{T}}X^{\mathrm{T}}X\hat {\boldsymbol{\beta}} + \hat {\boldsymbol{\beta}}^{\mathrm{T}}X^{\mathrm{T}}X\hat {\boldsymbol{\beta}} \\ & = \boldsymbol{y}^{\mathrm{T}}\boldsymbol{y}- \hat {\boldsymbol{\beta}}^{\mathrm{T}}X^{\mathrm{T}}X\hat {\boldsymbol{\beta}}\,, \end{split} \end{equation*}\]

as \(\boldsymbol{y}^{\mathrm{T}}X = \hat{\boldsymbol{\beta}}^{\mathrm{T}}X^{\mathrm{T}}X\) .

Because the matrix \(X\) has orthogonal columns, \(X^{\mathrm{T}}X = 2^fI_{p+1}\) for a model containing coefficients corresponding to \(p\) factorial effects, where \(f\) is the number of factors; here \(f=6\) and \(p=3\) . Hence,

\[ \mbox{RSS} = \boldsymbol{y}^{\mathrm{T}}\boldsymbol{y}- 2^f \sum_{i=0}^{p}\hat{\beta}_i^2\,. \]

Finally, the estimate of the intercept takes the form \(\hat{\beta}_0 = \bar{Y}\) , and so

\[\begin{equation*} \begin{split} \mbox{RSS} & = \boldsymbol{y}^{\mathrm{T}}\boldsymbol{y}- 2^f\bar{y}^2 - 2^f\sum_{i=1}^{p}\hat{\beta}_i^2 \\ & = \sum_{i=1}^{2^f}(y_i - \bar{y})^2 - 2^f\sum_{i=1}^{p}\hat{\beta}_i^2 \\ & = \mbox{Total SS} - 2^f\sum_{i=1}^{p}\hat{\beta}_i^2\, \end{split} \end{equation*}\]

Recalling that each regression coefficient is one-half of the corresponding factorial effect, for this example we have:

\[ \mbox{RSS} = 2856 - 2^6(1.5^2 + 2^2 + 1^2) = 2392\,. \]
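
A quick numerical check of this calculation (halving the factorial effects to obtain the regression coefficients):

# Total SS minus 2^6 times the sum of squared regression coefficients.
2856 - 2^6 * sum((c(3, 4, 2) / 2)^2)   # = 2392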

(Adapted from Morris, 2011 ) Consider a \(2^7\) experiment with each treatment applied to two units ( \(r=2\) ). Assume a linear regression model will be fitted containing terms corresponding to all factorial effects.

What is the variance of the estimator of each factorial effect, up to a constant factor \(\sigma^2\) ?

What is the variance of the least squares estimator of \(E(y_{11})\) , the expected value of an observation with the first treatment applied? You can assume the treatments are given in standard order, so the first treatment is defined by setting all factors to their low level. [The answer is, obviously, the same for \(E(y_{12})\) ]. In a practical experimental setting, why is this not a useful quantity to estimate?

What is the variance of the least squares estimator of \(E(y_{11}) - E(y_{21})\) ? You may assume that the second treatment has all factors set to their low levels except for the seventh factor.

Each factorial contrast is scaled so the variance for the estimator is equal to \(4\sigma^2/n = \sigma^2 / 64\) .

\(E(y_{11}) = \boldsymbol{x}_1^{\mathrm{T}}\boldsymbol{\beta}\) , where \(\boldsymbol{x}_1^{\mathrm{T}}\) is the row of the \(X\) matrix corresponding to the first treatment and \(\boldsymbol{\beta}\) are the regression coefficients. The estimator is given by

\[ \hat{E}(y_{11}) = \boldsymbol{x}_1^{\mathrm{T}}\hat{\boldsymbol{\beta}}\,, \]

with variance

\[\begin{align*} \mathrm{var}\left\{\hat{E}(y_{11})\right\} & = \mathrm{var}\left\{\boldsymbol{x}_1^{\mathrm{T}}\hat{\boldsymbol{\beta}}\right\} \\ & = \boldsymbol{x}_1^{\mathrm{T}}\mbox{var}(\hat{\boldsymbol{\beta}})\boldsymbol{x}_1 \\ & = \boldsymbol{x}_1^{\mathrm{T}}\left(X^\mathrm{T}X\right)^{-1}\boldsymbol{x}_1\sigma^2 \\ & = \frac{\boldsymbol{x}_1^{\mathrm{T}}\boldsymbol{x}_1\sigma^2}{2^8} \\ & = \frac{2^7\sigma^2}{2^8} \\ & = \sigma^2 / 2\,. \end{align*}\]

This holds for the expected response from any treatment, as \(\boldsymbol{x}_j^{\mathrm{T}}\boldsymbol{x}_j = 2^7\) for all treatments, as each entry of \(\boldsymbol{x}_j\) is equal to \(\pm 1\) .

This would not be a useful quantity to estimate in a practical experiment, as it is not a contrast in the treatments. In particular, it depends on the estimate of the overall mean, \(\mu\) or \(\beta_0\) (in the unit-treatment or regression model, respectively), which will vary from experiment to experiment.

The expected values of \(y_{11}\) and \(y_{21}\) will only differ in terms involving the seventh factor, which is equal to its low level (-1) for the first treatment and its high level (+1) for the second treatment; all the other terms will cancel. Hence

\[ E(y_{11}) - E(y_{21}) = -2\left(\beta_7 + \sum_{j=1}^6\beta_{j7} + \sum_{j=1}^6\sum_{k=j+1}^6\beta_{jk7} + \ldots + \beta_{1234567}\right)\,. \]

The variance of the estimator has the form

\[\begin{align*} \mathrm{var}\left\{\widehat{E(y_{11}) - E(y_{21})}\right\} & = 4\times\mathrm{var}\bigg(\hat{\beta}_7 + \sum_{j=1}^6\hat{\beta}_{j7} + \sum_{j=1}^6\sum_{k=j+1}^6\hat{\beta}_{jk7} + \\ & \ldots + \hat{\beta}_{1234567}\bigg) \\ & = \frac{4\sigma^2}{2\times 2^7}\sum_{j=0}^6{6 \choose j} \\ & = \frac{\sigma^2}{2^6}\times 64 \\ & = \sigma^2\,. \end{align*}\]

Or, as this is a treatment comparison in a CRD, we have

\[ \hat{E}(y_{11}) - \hat{E}(y_{21}) = \widehat{\boldsymbol{c}^{\mathrm{T}}\boldsymbol{\tau}}\,, \]

where \(\boldsymbol{c}\) corresponds to a pairwise treatment comparison, and hence has one entry equal to +1 and one entry equal to -1. From Section 2.5 ,

\[\begin{align*} \mathrm{var}\left(\widehat{\boldsymbol{c}^{\mathrm{T}}\boldsymbol{\tau}}\right) & = \sum_{i=1}^tc_i^2\mathrm{var}(\bar{y}_{i.}) \\ & = \sigma^2\sum_{i=1}^tc_i^2/n_i\,, \end{align*}\]

where in this example \(n_i = 2\) for all \(i\) and \(\sum_{i=1}^tc_i^2 = 2\) . Hence, the variance is again equal to \(\sigma^2\) .

Desilylation is the process of removing a silyl group, SiH\(_3\) (a silicon hydride), from a compound. ↩︎

For two matrices \(A\) and \(B\) of the same dimension \(m\times n\) , the Hadamard product \(A\odot B\) is a matrix of the same dimension with elements given by the elementwise product, \((A\odot B)_{ij} = A_{ij}B_{ij}\) . ↩︎

The absolute value of a normally distributed random variable follows a half-normal distribution. ↩︎

Essentially, \(s_0\) tends in probability to \(\sigma\) as the number of factorial effects tends to infinity. ↩︎

Under \(H_0\) , the \(\hat{\theta}_i\) come from a mean-zero normal distribution, and about 1% of deviates fall outside \(\pm 2.57\sigma\) . ↩︎

When qualitative factors only have two levels, each regression term only has 1 degree of freedom, and so there is practically little difference from a quantitative variable. ↩︎


Chapter 9 Fractional factorial designs

9.1 Introduction

Factorial treatment designs are necessary for estimating factor interactions and offer additional advantages (Chapter 6 ). However, their implementation is challenging if we consider many factors or factors with many levels, because the number of treatments then might require prohibitive experiment sizes. Large factorial experiments also pose problems for blocking, since reasonable block sizes that ensure homogeneity of the experimental material within a block are often smaller than the number of treatment level combinations.

For example, a factorial treatment structure with five factors of two levels each already has \(2^5=32\) treatment combinations. An experiment with 32 experimental units then has no residual degrees of freedom, but two full replicates of this design already require 64 experimental units. If each factor has three levels, the number of treatment combinations increases drastically to \(3^5=243\) .

On the other hand, we can often justify the assumption of effect sparsity : effect sizes of high-order interactions are often negligible, especially if interactions of lower orders already have small effect sizes. The key observation for reducing the experiment size is that a large portion of model parameters relate to higher-order interactions: in our example, there are 32 model parameters: one grand mean, five main effects, ten two-way interactions, ten three-way interactions, five four-way interactions, and one five-way interaction. The number of higher-order interactions and their parameters grows fast with increasing number of factors as shown in Table 9.1 for factorials with two factor levels and 3 to 7 factors.

If we ignore three-way and higher interactions in the example, we remove 16 parameters from the model equation and only require 16 observations for estimating the remaining model parameters; this is known as a half-fraction of the \(2^5\) -factorial. Of course, the ignored interactions do not simply vanish, but their effects are now confounded with those of lower-order interactions or main effects. The question then arises: which 16 out of the 32 possible treatment combinations should we consider such that no effect of interest is confounded with another non-negligible effect?

Table 9.1: Number of parameters for effects of different order in \(2^k\) designs.
Factorial 0 1 2 3 4 5 6 7
3 1 3 3 1
4 1 4 6 4 1
5 1 5 10 10 5 1
6 1 6 15 20 15 6 1
7 1 7 21 35 35 21 7 1

In this chapter, we discuss the general construction and analysis of fractional replications of \(2^k\) -factorial designs where all factors have two levels. This restriction is often sufficient for practical experiments with many factors, where interest focuses on identifying relevant factors and low-order interactions. We first consider generic factors which we call A , B and so forth, and denote their levels as low (or \(-1\) ) and high (or \(+1\) ). Similar techniques to those discussed here are available for factorials with more than two factor levels and for combinations of factors with different numbers of levels, but the required mathematics is beyond our scope.

We further extend our ideas of fractional replication to deliberately confound some effects with blocks. This allows us to run a \(2^5\) -factorial in blocks of size 16, for example. By altering the confounding between pairs of blocks, we can still recover all effects, albeit with reduced precision.

9.2 Aliasing in the \(2^3\) factorial

9.2.1 Introduction

We begin our discussion with the simple example of a \(2^3\) -factorial treatment structure in a completely randomized design. We denote the treatment factors A , B , and C and their levels as \(A\) , \(B\) , and \(C\) with values \(-1\) and \(+1\) . Recall that for any \(2^k\) -factorial, all main effects and all interaction factors (of any order) have one degree of freedom. We can thus also encode the two independent levels of any interaction as \(-1\) and \(+1\) , and we define the level by multiplying the levels of the constituent factors: for \(A=-1\) , \(B=+1\) , \(C=-1\) , the level of A:B is \(AB=A\cdot B=-1\) and the level of A:B:C is \(ABC=A\cdot B\cdot C=+1\) .

It is also convenient to use an additional shorthand notation for a treatment combination, where we use a character string containing the lower-case letter of a treatment factor if it is present on its high level, and no letter if it is present on its low level. For example, we write \(abc\) if A , B , C are on level \(+1\) , and all potential other factors are on the low level \(-1\) , and \(ac\) if A and C are on the high level, and B on its low level. We denote a treatment combination with all factors on their low level by \((1)\) . For a \(2^3\) -factorial, the eight different treatments are then \((1)\) , \(a\) , \(b\) , \(c\) , \(ab\) , \(ac\) , \(bc\) , and \(abc\) .

For example, testing compositions for growth media with factors Carbon with levels glucose and fructose , Nitrogen with levels low and high , and Vitamin with levels Mix 1 and Mix 2 leads to a \(2^3\) -factorial with the 8 possible treatment combinations shown in Table 9.2 .

Table 9.2: Eight treatment level combinations for \(2^3\) factorial with corresponding level of interactions and shorthand notation.
A B C AB AC BC ABC Shorthand
\(-1\) \(-1\) \(-1\) \(+1\) \(+1\) \(+1\) \(-1\) \((1)\)
\(-1\) \(-1\) \(+1\) \(+1\) \(-1\) \(-1\) \(+1\) \(c\)
\(-1\) \(+1\) \(-1\) \(-1\) \(+1\) \(-1\) \(+1\) \(b\)
\(-1\) \(+1\) \(+1\) \(-1\) \(-1\) \(+1\) \(-1\) \(bc\)
\(+1\) \(-1\) \(-1\) \(-1\) \(-1\) \(+1\) \(+1\) \(a\)
\(+1\) \(-1\) \(+1\) \(-1\) \(+1\) \(-1\) \(-1\) \(ac\)
\(+1\) \(+1\) \(-1\) \(+1\) \(-1\) \(-1\) \(-1\) \(ab\)
\(+1\) \(+1\) \(+1\) \(+1\) \(+1\) \(+1\) \(+1\) \(abc\)

9.2.2 Effect estimates

In a \(2^k\) -factorial treatment structure, we estimate main effects and interactions as simple contrasts by subtracting the sum of responses of all observations with the corresponding factors on the low level from those with the factors on the high level. For our example, we estimate the main effect of C-Source (or generically A ) by subtracting all observations with fructose as our carbon source from those with glucose , and averaging: \[\begin{align*} \text{A main effect} &= \frac{1}{4}\left(\,(a-(1)) + (ab-b) + (ac-c) + (abc-bc)\,\right) \\ &= \frac{1}{4}\left(\underbrace{(a+ab+ac+abc)}_{A=+1}-\underbrace{((1)+b+c+bc)}_{A=-1}\right)\;. \end{align*}\] A two-way interaction is a difference of differences and we find the interaction of B with C by first finding the difference between them for A on the low level and for A on the high level: \[ \frac{1}{2}\underbrace{\left((abc-ab)\,-\,(ac-a)\right)}_{A=+1} \quad\text{and}\quad \frac{1}{2}\underbrace{\left((bc-b)\,-\,(c-(1))\right)}_{A=-1}\;. \] The interaction effect is then the averaged difference between the two \[\begin{align*} \text{B:C interaction} &= \frac{1}{4} \left(\;\left((abc-ab)-(ac-a)\right)+\left((bc-b)-(c-(1))\right)\;\right) \\ &= \frac{1}{4} \left(\; \underbrace{(abc+bc+a+(1))}_{BC=+1}\,-\,\underbrace{(ab+ac+b+c)}_{BC=-1}\; \right)\;. \end{align*}\] This value is equivalently found by taking the difference between observations with \(BC=+1\) (the interaction at its ‘high’ level) and \(BC=-1\) (the interaction at its ‘low’ level) and averaging. The other interaction effects are estimated by contrasting the corresponding observations for \(AB=\pm 1\) and \(AC=\pm 1\) , and \(ABC=\pm 1\) , respectively.
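
As a small illustration, these contrasts are easily computed from a vector of responses indexed by the shorthand treatment labels; the numbers below are invented purely to show the computation:

# Hypothetical responses for the eight runs of the 2^3 factorial.
y <- c("(1)" = 60, a = 72, b = 54, c = 68, ab = 52, ac = 83, bc = 45, abc = 80)

# A main effect: runs with A high minus runs with A low, averaged.
A.eff <- ((y["a"] + y["ab"] + y["ac"] + y["abc"]) -
          (y["(1)"] + y["b"] + y["c"] + y["bc"])) / 4

# B:C interaction: runs with BC = +1 minus runs with BC = -1, averaged.
BC.eff <- ((y["abc"] + y["bc"] + y["a"] + y["(1)"]) -
           (y["ab"] + y["ac"] + y["b"] + y["c"])) / 4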

9.2.3 Design with four treatment combinations

We are interested in reducing the size of the experiment and for reasons that will become clear shortly, we choose a design based on measuring the response for four out of the eight treatment combinations. This will only allow estimation of four parameters in the linear model, and exactly which parameters can be estimated depends on the treatments chosen. The question then is: which four treatment combinations should we select?

We investigate three specific choices to get a better understanding of the consequences for effect estimation. The designs are illustrated in Figure 9.1 , where treatment level combinations form a cube with eight vertices, from which four are selected in each case.

Some fractions of a $2^3$-factorial. A: Arbitrary choice of treatment combinations leads to problems in estimating any effects properly. B: One variable at a time (OVAT) design. C: Keeping one factor at a constant level confounds this factor with the grand mean and creates a $2^2$-factorial of the remaining factors.

Figure 9.1: Some fractions of a \(2^3\) -factorial. A: Arbitrary choice of treatment combinations leads to problems in estimating any effects properly. B: One variable at a time (OVAT) design. C: Keeping one factor at a constant level confounds this factor with the grand mean and creates a \(2^2\) -factorial of the remaining factors.

First, we arbitrarily select the four treatment combinations \((1), a, b, ac\) (Fig. 9.1 A). With this choice, none of the main effects or interaction effects can be estimated using all four data points. For example, an estimate of the A main effect involves \(a-(1)\) , \(ab-b\) , \(ac-c\) , and \(abc-bc\) , but only one of these— \(a-(1)\) —is available in this experiment. Compared to a factorial experiment in four runs, this choice of treatment combinations thus allows using only one-half of the available data for estimating this effect. If we followed the above logic and contrasted the observations with A at the high level with those with A at the low level, thereby using all data, the main effect would be estimated as \((ac+a)-(b+(1))\) , which obviously leads to a biased and incorrect estimate of the main effect, since the other factors are at ‘incompatible’ levels. Similar problems arise for the B and C main effects, where only \(b-(1)\) and \(ac-a\) , respectively, are available. None of the interactions can be estimated from these data and we are left with a very unsatisfactory muddle of conditional effect estimates that are valid only if other factors are kept at particular levels.

Next, we try to be more systematic and select the four treatment combinations \((1), a, b, c\) (Fig. 9.1 B) where each factor occurs at its low and high level. Again, main effect estimates are based on half of the data for each factor, but their calculation is now simpler: \(a-(1)\) , \(b-(1)\) , and \(c-(1)\) , respectively. We note that each estimate involves the same treatment combination \((1)\) . This design resembles a one variable at a time experiment, where effects can be estimated individually for each factor, but no estimates of interactions are available. All advantages of a factorial treatment design are then lost.

Finally, we select the four treatment combinations \((1), b, c, bc\) with A on the low level (Fig. 9.1 C). This design is effectively a \(2^2\) -factorial with treatment factors B and C and allows estimation of their main effects and their interaction, but no information is available on any effects involving the third treatment factor A . For example, we estimate the B main effect using \((bc+b)\,-\,(c+(1))\) , and the B:C interaction using \((bc-b)-(c-(1))\) . If we look more closely into Table 9.2 , we find a simple confounding structure: within this fraction, the level of B is always the opposite of that of A:B . In other words, the two effects are completely confounded in this design, and \((bc+b)\,-\,(c+(1))\) is in fact an estimate of the difference between the B main effect and the A:B interaction. Similarly, C is completely confounded with A:C , and B:C with A:B:C . Finally, the grand mean is confounded with the A main effect; this makes sense since any estimate of the overall average is based only on the low level of A .

9.2.4 The half-replicate or fractional factorial

None of the previous three choices provided a convincing reduction of the factorial design. We now discuss a fourth possibility, the half-replicate of the \(2^3\) -factorial, called a \(2^{3-1}\) -fractional factorial . The main idea is to deliberately alias a high-order interaction with the grand mean. For a \(2^3\) -factorial, we alias the three-way interaction A:B:C by selecting either those four treatment combinations that have \(ABC=-1\) or those that have \(ABC=+1\) . We call the corresponding equation the generator of the fractional factorial; the two possible sets are shown in Figure 9.2 . With either choice, we find three more effect aliases by consulting Table 9.2 . For example, using \(ABC=+1\) as our generator yields the four treatment combinations \(a, b, c, abc\) and we find that A is completely confounded with B:C , B with A:C , and C with A:B .

In this design, any estimate thus corresponds to the sum of two effects. For example, \((a+abc)-(b+c)\) estimates the sum of A and B:C : first, the main effect of A is found as the difference of the runs \(a\) and \(abc\) with A on its high level, and the runs \(b\) and \(c\) with A on its low level: \((a+abc)-(b+c)\) . Second, we contrast runs with B:C on the high level ( \(a\) and \(abc\) ) with those with B:C on its low level ( \(b\) and \(c\) ) for estimating the B:C interaction effect, which is again \((a+abc)-(b+c)\) .

The fractional factorial based on a generator deliberately aliases each main effect with a two-way interaction, and the grand mean with the three-way interaction. This yields a very simple aliasing of effects and each estimate is based on the full data. Moreover, we note that by pooling the treatment combinations over levels of one of the three factors, we create three different \(2^2\) -factorials based on the two remaining factors. For example, ignoring the level of C leads to the full factorial in A and B shown in Figure 9.2 . This is a consequence of the aliasing, as C is completely confounded with A:B .

The two half-replicates of a $2^3$-factorial with three-way interaction and grand mean confounded. Any projection of the design to two factors yields a full $2^2$-factorial design and main effects are confounded with two-way interactions. A: design based on low level of three-way interaction; B: complementary design based on high level.

Figure 9.2: The two half-replicates of a \(2^3\) -factorial with three-way interaction and grand mean confounded. Any projection of the design to two factors yields a full \(2^2\) -factorial design and main effects are confounded with two-way interactions. A: design based on low level of three-way interaction; B: complementary design based on high level.

Our full linear model for a three-factor factorial is \[ y_{ijkl} = \mu + \alpha_i + \beta_j + \gamma_k + (\alpha\beta)_{ij} + (\alpha\gamma)_{ik} + (\beta\gamma)_{jk} + (\alpha\beta\gamma)_{ijk} + e_{ijkl} \] and it contains eight sets of parameters plus the residual variance. In a half-replicate of the \(2^3\) -factorial, we can only estimate the four derived parameters \[ \mu + (\alpha\beta\gamma)_{ijk}, \quad \alpha_i + (\beta\gamma)_{jk}, \quad \beta_j + (\alpha\gamma)_{ik}, \quad \gamma_k + (\alpha\beta)_{ij}\;. \] These provide the alias sets of confounded parameters, where only the sum of parameters in each set can be estimated: \[ \{1, ABC\}, \quad \{A, BC\}, \quad \{B, AC\}, \quad \{C, AB\}\;. \]

If the three interactions are negligible, then our four estimates correspond exactly to the grand mean and the three main effects. This corresponds to an additive model without interactions and allows a simple and clean interpretation of the parameter estimates. For example, with \((\beta\gamma)_{jk}=0\) , the second derived parameter is now identical to \(\alpha_i\) .

It might also be the case that the A and B main effects and their interaction are the true effects, while the factor C plays no role. The estimates of the four derived parameters are now estimates of the parameters \(\mu\) , \(\alpha_i\) , \(\beta_j\) , and \((\alpha\beta)_{ij}\) , while \(\gamma_k=(\alpha\gamma)_{ik}=(\beta\gamma)_{jk}=(\alpha\beta\gamma)_{ijk}=0\) .

Many other combinations are possible, but the aliasing in the \(2^{3-1}\) -fractional factorial does not allow us to distinguish the different interpretations without additional experimentation.

9.3 Aliasing in the \(2^k\) -factorial

The half-replicate of a \(2^3\) -factorial does not provide an entirely convincing example for the usefulness of fractional factorial designs due to the complete confounding of main effects and two-way interactions, both of which are typically of great interest. With more factors in the treatment structure, however, we are able to alias interactions of higher order and confound low-order interactions of interest with high-order interactions that we might assume negligible.

9.3.1 Using generators

The generator or generating equation provides a convenient way for constructing fractional factorial designs. A word is written by concatenating factor letters, such that \(AB\) denotes a two-way interaction, and our previous example \(ABC\) is a three-way interaction; the special ‘word’ \(1\) denotes the grand mean. A generator is then a formal equation that identifies two words and enforces the equality of the corresponding treatment combinations. In our \(2^{3-1}\) design, the generator \[ ABC=+1\;, \] selects all those rows in Table 9.2 for which the relation is true, i.e., for which \(ABC\) is on the high level.

A generator determines the effect confounding of the experiment: the generator itself is one confounding, and \(ABC=+1\) describes the complete confounding of the three-way interaction A:B:C with the grand mean.

From the generator, we can derive all other confoundings by simple algebraic manipulation. By formally ‘multiplying’ the generator with an arbitrary word, we find a new relation between effects. In this manipulation, the multiplication with the letter \(+1\) leaves the equation unaltered, multiplication with \(-1\) inverses signs, and a product of two identical letters yields \(+1\) . For example, multiplying our generator \(ABC=+1\) with the word \(B\) yields \[ ABC\cdot B=(+1)\cdot B \iff AC=B\;. \] In other words, the B main effect is confounded with the A:C interaction. Similarly, we find \(AB=C\) and \(BC=A\) as two further confounding relations by multiplying the generator with \(C\) and \(A\) , respectively.

Further trials with manipulating the generator show that no further relations can be obtained. For example, multiplying \(ABC=+1\) with the word \(AB\) yields \(C=AB\) again, and multiplying this relation with \(C\) yields \(C\cdot C=AB\cdot C\iff +1=ABC\) , the original generator. This means that indeed, we have fully confounded four pairs of effects and no others. In general, a generator for a \(2^k\) factorial produces \(2^k/2=2^{k-1}\) such alias relations between factors, so we have a direct way to check if we found all. In our example, \(2^3/2=2^2=4\) , so our alias relations \(ABC=+1\) , \(AB=C\) , \(AC=B\) , and \(BC=A\) cover all existing confoundings.

This property also means that by choosing any of the implied relations as our generator, we get exactly the same set of treatment combinations. For example, instead of \(ABC=+1\) , we might equally well choose \(A=BC\) ; this selects the same set of rows and implies the same set of confounding relations. Usually, we use a generator that aliases a high-order interaction with the grand mean, simply because it is the most obvious and convenient thing to do.

Useful fractions of factorial designs with manageable aliasing are associated with a generator, because only then can effects be properly estimated and meaningful confounding arise. Each generator selects one-half of the possible treatment combinations and this is the reason why we set out to choose four rows for our examples, and not, say, six.

We briefly note that our first and second choice in Section 9.2.3 are not based on a generator, leaving us with a complex partial confounding of effects. In contrast, our third choice selected all treatments with A on the low level and does have a generator, namely \[ A=-1\;. \] Algebraic manipulation then shows that this design implies the additional three confounding relations \(AB=-B\) , \(AC=-C\) , and \(ABC=-BC\) . In other words, any effect involving the factor A is confounded with another effect not involving that factor, which we easily verify from Table 9.2 .

9.3.2 Half-fractions of higher \(2^k\) factorials

Generators and their algebraic manipulation provide an efficient way for finding the confoundings in higher-order factorials, where looking at the corresponding table of treatment combinations quickly becomes infeasible. As we can see from the algebra, the most useful generator always confounds the grand mean with the highest-order interaction.

For four factors, this generator is \(ABCD=+1\) and we expect that there are \(2^4/2=8\) relations in total. Multiplying with any letter reveals that main effects are then confounded with three-way interactions, such as \(ABCD=+1\iff BCD=A\) after multiplying with \(A\) , and similarly \(B=ACD\) , \(C=ABD\) , and \(D=ABC\) . Moreover, by multiplication with two-letter words we find that all two-way interactions are confounded with other two-way interactions, namely via the three relations \(AB=CD\) , \(AC=BD\) , and \(AD=BC\) . This is already an improvement over fractions of the \(2^3\) -factorial, especially if we can make the argument that three-way interactions can be neglected and we thus have direct estimates of all main effects. If we find a significant and large two-way interaction— A:B , say—then we cannot distinguish if it is A:B , its alias C:D , or a combination of the two that produces the effect. Subject-matter considerations might be available to separate these possibilities. If not, there is at least a clear goal for a subsequent experiment to disentangle the two interaction effects.

Things improve further for five factors and the generator \(ABCDE=+1\) which reduces the number of treatment combinations from \(2^5=32\) to \(2^{5-1}=16\) . Now, main effects are confounded with four-way interactions, and two-way interactions are confounded with three-way interactions. Invoking the principle of effect sparsity and neglecting the three- and four-way interactions yields estimable main effects and two-way interactions.

Starting from factorials with six factors, main effects and two-way interactions are confounded with interactions of order five and four, respectively, which in most cases can be assumed to be negligible.

A simple way for creating the design table of a fractional factorial using R exploits these algebraic manipulations: first, we define our generator. We then create the full design table with \(k\) columns, one for each treatment factor, and one row for each of the \(2^k\) combinations of treatment levels, where each cell is either \(-1\) or \(+1\) . Next, we create a new column for the generator and calculate its entries by multiplying the corresponding columns. Finally, we remove all rows for which the generator equation is not fulfilled and keep the remaining rows as our design table. For a 3-factor design with generator \(ABC=-1\) , we create three columns \(A\) , \(B\) , \(C\) and eight rows. The new column \(ABC\) has entries \(A\cdot B\cdot C\) , and we delete those rows for which \(A\cdot B\cdot C\not=-1\) .
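
A minimal base-R sketch of this construction for the generator \(ABC=-1\):

# Full 2^3 design table with factors coded as -1/+1.
design <- expand.grid(A = c(-1, 1), B = c(-1, 1), C = c(-1, 1))

# Generator column, and selection of the rows satisfying ABC = -1.
design$ABC <- design$A * design$B * design$C
fraction <- design[design$ABC == -1, c("A", "B", "C")]
fraction  # the four runs of the 2^(3-1) fraction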

9.4 A real-life example: yeast medium composition

As a larger example of a fractional factorial treatment design, we discuss an experiment conducted during the sequential optimization of a yeast growth medium. The overall aim was to find a medium composition that maximizes growth, and we discuss this aspect in more detail in Chapter 10 . Here, we concentrate on determining the individual and combined effects of five medium ingredients—glucose Glc , two different nitrogen sources N1 (monosodium glutamate) and N2 (an amino acid mixture), and two vitamin sources Vit1 and Vit2 —on the resulting number of yeast cells. Different combinations of concentrations of these ingredients are tested on a 48-well plate, and the growth curve is recorded for each well by measuring the optical density over time. We use the increase in optical density ( \(\Delta\text{OD}\) ) between onset of growth and flattening of the growth curve at the diauxic shift as a rough but sufficient approximation for increase in number of cells.

9.4.1 Experimental design

To determine how the five medium components influence the growth of the yeast culture, we used the composition of a standard medium as a reference point, and simultaneously altered the concentrations of the five components. For this, we selected two concentrations per component, one lower, the other higher than the standard, and considered these as two levels for each of five treatment factors. The treatment structure is then a \(2^5\) -factorial and would in principle allow estimation of the main effects and all two-, three-, four-, and five-factor interactions when all \(32\) possible combinations are used. However, a single replicate would require two-thirds of a plate and this is undesirable because we would like sufficient replication and also be able to compare several yeast strains in the same plate. Both requirements can be accommodated by using a half-replicate of the \(2^5\) -factorial with 16 treatment combinations, such that three independent experiments fit on a single plate.

A generator \(ABCDE=1\) confounds the main effects with four-way interactions, which we consider negligible for this experiment. Still, two-way interactions are confounded with three-way interactions, and in the first implementation we assume that three-way interactions are much smaller than two-way interactions. We can then interpret main effect estimates directly, and assume that derived parameters involving two-way interactions have only small contributions from the corresponding three-way interactions.
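
Such a half-fraction can also be generated with the FrF2 package; the call below is a sketch and relies on the package's default generator for 16 runs and five factors, which aliases the fifth factor with the four-way interaction of the others:

library(FrF2)

# 2^(5-1) design in coded -1/+1 units with the factor names of this experiment.
plan <- FrF2(nruns = 16, nfactors = 5,
             factor.names = c("Glc", "N1", "N2", "Vit1", "Vit2"))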

A single replicate of this \(2^{5-1}\) -fractional factorial generates 16 observations, sufficient for estimating the grand mean, five main effects, and the ten two-way interactions, but we are left with no degrees of freedom for estimating the residual variance. We say the design is saturated . This problem is circumvented by using two replicates of this design per plate. While this requires 32 wells, the same size as the full factorial, this strategy produces duplicate measurements of the same treatment combinations which we can manually inspect for detecting errors and aberrant observations. The 16 treatment combinations considered are shown in Table 9.3 together with the measured difference in OD for the first and second replicate, with higher differences indicating higher growth.

Table 9.3: Treatment combinations for half-replicate of \(2^5\)-factorial design for determining yeast growth medium composition. The measured growth is shown in the last two columns for two replicates
Glc N1 N2 Vit1 Vit2 Growth_1 Growth_2
20 1 0 1.5 4 1.7 35.68
60 1 0 1.5 0 0.1 67.88
20 3 0 1.5 0 1.5 27.08
60 3 0 1.5 4 0.0 80.12
20 1 2 1.5 0 120.2 143.39
60 1 2 1.5 4 140.3 116.30
20 3 2 1.5 4 181.0 216.65
60 3 2 1.5 0 40.0 47.48
20 1 0 4.5 0 5.8 41.35
60 1 0 4.5 4 1.4 5.70
20 3 0 4.5 4 1.5 84.87
60 3 0 4.5 0 0.6 8.93
20 1 2 4.5 4 106.4 117.48
60 1 2 4.5 0 90.9 104.46
20 3 2 4.5 0 129.1 157.82
60 3 2 4.5 4 131.5 143.33

Clearly, the medium composition has a huge impact on the resulting growth, ranging from a minimum of 0 to a maximum of 181 in the first replicate. The original medium has an average ‘growth’ of \(\Delta\text{OD}\approx 80\) , and this experiment already reveals a condition with approximately 2.3 fold increase. We also see that measurements with N2 at the low level are abnormally low in the first replicate. We remove these eight values from our analysis. 13

9.4.2 Analysis

Our fractional factorial design has five treatment factors and several interaction factors, and we use an analysis of variance initially to determine which of the medium components have an appreciable effect on growth, and how the components interact. The full model is growth~Glc*N1*N2*Vit1*Vit2 , but only half of its parameters can be estimated. Since we deliberately confounded effects in our fractional factorial treatment structure, we know which derived parameters are estimated, and can select one member of each alias set for our model. The model specification growth~(Glc+N1+N2+Vit1+Vit2)^2 asks for an ANOVA based on all main effects and all two-way interactions (it expands to growth~Glc+N1+N2+...+Glc:N1+...+Vit1:Vit2 ). After pooling the data from both replicates and excluding the aberrant N2 observations of the first replicate, the resulting ANOVA table is

Analysis of Variance Model
  Df Sum Sq Mean Sq F value Pr(>F)
Glc 1 6148 6148 26.49 0.0008772
N1 1 1038 1038 4.475 0.0673
N2 1 34298 34298 147.8 1.94e-06
Vit1 1 369.9 369.9 1.594 0.2423
Vit2 1 6040 6040 26.03 0.0009276
Glc:N1 1 3907 3907 16.84 0.003422
Glc:N2 1 1939 1939 8.357 0.02017
Glc:Vit1 1 264.8 264.8 1.141 0.3166
Glc:Vit2 1 753.3 753.3 3.247 0.1092
N1:N2 1 0.9298 0.9298 0.004007 0.9511
N1:Vit1 1 1450 1450 6.248 0.03697
N1:Vit2 1 9358 9358 40.33 0.0002204
N2:Vit1 1 277.9 277.9 1.198 0.3057
N2:Vit2 1 811.4 811.4 3.497 0.0984
Vit1:Vit2 1 1280 1280 5.515 0.0468
Residuals 8 1856 232

We find several substantial main effects in this analysis, with N2 the main contributor followed by Glc and Vit2 . Even though N1 has no significant main effect, it appears in several significant interactions; this also holds to a lesser degree for Vit1 . Several pronounced interactions demonstrate that optimizing individual components will not be a fruitful strategy, and we need to simultaneously change multiple factors to maximize the growth. This information can only be acquired by using a factorial design.

We do not discuss the necessary subsequent analyses of contrasts and effect sizes for the sake of brevity; they work exactly as for smaller factorial designs.

9.4.3 Alternative analysis of single replicate

Since the design is saturated, a single replicate does not provide information about uncertainty. If only the single replicate can be analyzed, we have to reduce the model to free up degrees of freedom from parameter estimation to estimate the residual variance. If subject-matter knowledge is available to decide which factors can be safely removed without missing important effects, then a single replicate can be successfully analysed. For example, knowing that the two nitrogen sources and the two vitamin components do not interact, we might specify the model Growth~(Glc+N1+N2+Vit1+Vit2)^2 - N1:N2 - Vit1:Vit2 that removes the two corresponding interactions while keeping the remaining ones. This strategy is somewhat unsatisfactory, since we now still only have two residual degrees of freedom and correspondingly low precision and power, and we cannot test whether removal of the factors was really justified. Without good subject-matter knowledge, this strategy can give very misleading results if significant and large effects are removed from the analysis.

9.5 Multiple aliasing

The definition of a single generator creates a half-replicate of the factorial design. For higher-order factorials starting with the \(2^5\) -factorials, useful designs are also available for higher fractions, such as quarter-replicates that would require only 8 of the 32 treatment combinations in a \(2^5\) -factorial. These designs are constructed by using more than one generator, which also leads to more complicated confounding.

For example, a quarter-fractional requires two generators: one generator to specify one-half of the treatment combinations, and a second generator to specify one-half of those. Both generators introduce their own aliases which we determine using the generator algebra. In addition, multiplying the two generators introduces further aliases through the generalized interaction .

9.5.1 A generic \(2^{5-2}\) fractional factorial

As a first example, we construct a quarter-replicate of a \(2^5\) -factorial. Which two generators should we use? Our first idea is probably to use the five-way interaction for defining the first set of aliases, and one of the four-way interactions for defining the second set. We might choose the two generators \(G_1\) and \(G_2\) as \[ G_1: ABCDE=1 \quad\text{and}\quad G_2: BCDE=1\;, \] for example. The resulting eight treatment combinations are shown in Table 9.4 (left). We see that in addition to the two generators, we also have a further highly undesirable confounding of the main effect of A with the grand mean: the column \(A\) only contains the high level. This is a consequence of the interplay of the two generators, and we find this additional confounding directly by comparing the left- and right-hand side of their generalized interaction: \[ G_1G_2 = ABCDE\cdot BCDE=ABBCCDDEE = A =1\;. \]

Table 9.4: Quarter-fractionals of \(2^5\) design. Left: \(ABCDE=1\) and \(BCDE=1\) confounds main effect of A with grand mean. Right: generators \(ABD=1\) and \(ACE=1\) confound main effects with two-way interactions.
A B C D E ABCDE BCDE
1 -1 -1 -1 -1 1 1
1 1 1 -1 -1 1 1
1 1 -1 1 -1 1 1
1 -1 1 1 -1 1 1
1 1 -1 -1 1 1 1
1 -1 1 -1 1 1 1
1 -1 -1 1 1 1 1
1 1 1 1 1 1 1
A B C D E ABD ACE
1 -1 -1 -1 -1 1 1
-1 1 1 -1 -1 1 1
1 1 -1 1 -1 1 1
-1 -1 1 1 -1 1 1
-1 1 -1 -1 1 1 1
1 -1 1 -1 1 1 1
-1 -1 -1 1 1 1 1
1 1 1 1 1 1 1

Some further trial-and-error reveals that no useful second generator is available if we confound the five-way interaction with the grand mean in our first generator. A reasonably good pair of generators uses two three-way interactions, such as \[ G_1: ABD=1 \quad\text{and}\quad G_2: ACE=1\;, \] with generalized interaction \[ G_1G_2 = AABCDE = BCDE = 1\;. \] The resulting treatment combinations are shown in Table 9.4 (right). We note that main effects and two-way interactions are now confounded.
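
The same row-selection idea as before extends directly to two generators (a base-R sketch):

# Full 2^5 design and the two generator conditions ABD = +1 and ACE = +1.
d5 <- expand.grid(A = c(-1, 1), B = c(-1, 1), C = c(-1, 1),
                  D = c(-1, 1), E = c(-1, 1))
quarter <- d5[with(d5, A * B * D == 1 & A * C * E == 1), ]

# The generalized interaction BCDE is constant (+1) on this quarter-fraction.
with(quarter, unique(B * C * D * E))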

Finding good pairs of generators is not entirely straightforward, and software or tabulated designs are often used. 14

9.5.2 A real-life \(2^{7-2}\) fractional factorial

The transformation of yeast cells is an important experimental technique, but many protocols have very low yield. In an attempt to define a more reliable and efficient protocol, seven treatment factors were considered in combination: Ion, PEG, DMSO, Glycerol, Buffer, EDTA, and amount of carrier DNA. With each component in two concentrations, the full treatment structure is a \(2^7\) -factorial with 128 treatment combinations. This experiment size is prohibitive since each treatment requires laborious subsequent steps, but 32 treatment combinations were considered reasonable for implementing this experiment. This requires a quarter-replicate of the full design.

Ideally, we want to find two generators that alias main effects and two-way interactions with interactions of order three and higher, but no such pair of generators exists in this case. We are confronted with the problem of confounding some two-way interactions with each other, while other two-way interactions are confounded with three-way interactions.

Preliminary experiments suggested that the largest interactions involve Ion, PEG, and potentially Glycerol, while the two-way interactions involving other components are all comparatively small. A reasonable design then uses the two generic generators \[ G_1: ABCDF=+1 \text{ and } G_2: ABDEG=+1 \] with generalized interaction \(CF=EG\) . The two-factor interactions involving the factors C , E , F , and G are then confounded with each other, but two-way interactions involving the remaining factors A , B , and D are confounded with interactions of order three or higher. Hence, selecting A , B , D as the factors Ion, PEG, and Glycerol allows us to create a design with 32 treatment combinations that reflects our subject-matter knowledge and allows estimation of all relevant two-way interactions while confounding those two-way interactions that we consider negligible. For example, we cannot disentangle an interaction of DMSO and EDTA from an interaction of Buffer and carrier DNA, but this does not jeopardize the interpretation of this experiment.

9.6 Characterizing fractional factorials

Two measures to describe the severity of confounding in a fractional factorial design are the resolution and the aberration .

9.6.1 Resolution

A fractional factorial design has resolution \(K\) if the grand mean is confounded with at least one effect of order \(K\) , and with no effect of lower order. The resolution is typically given as a Roman numeral. For example, a \(2^{3-1}\) design with generator \(ABC=1\) has resolution III, and we denote such a design as \(2^{3-1}_{\text{III}}\) .

Designs with more factors allow fractions of higher resolution. Our \(2^5\) -factorial example in the previous section admits a \(2^{5-1}_{\text{V}}\) design with 16 combinations, and a \(2^{5-2}_{\text{III}}\) design with 8 combinations. With the first design, we can estimate main effects and two-way interactions free of other main effects and two-way interactions, while the second design aliases main effects with two-way interactions. Our 7-factor example has resolution IV.

For an effect of any order \(N\) , the resolution also gives the lowest order of an effect confounded with it: a resolution-III design confounds main effects with two-way interactions ( \(\text{III}=1+2\) ), and the grand mean with a three-way interaction ( \(\text{III}=0+3\) ). A resolution-V design confounds main effects with four-way interactions ( \(\text{V}=1+4\) ), two-way interactions with three-way interactions ( \(\text{V}=2+3\) ), and the five-way interaction with the grand mean ( \(\text{V}=5+0\) ).

In general, resolutions \(\text{III}\) , \(\text{IV}\) , and \(\text{V}\) are the most common, and a resolution of \(\text{V}\) is often the most useful if it is achievable, since then main effects and two-way interactions are aliased only with interactions of order three and higher. Main effects and two-way interactions are confounded for resolution III, and these designs are useful for screening larger numbers of factors, but usually not for experiments where relevant information is expected in the two-way interactions. If a design has many treatment factors, we can also construct fractions with resolution higher than V, but it is usually more practical to use an additional generator to construct a design with resolution V and fewer treatment combinations.

Resolution IV confounds two-way interaction effects with each other. While this is rarely desirable, we might find multiple generators that leave some two-way interactions unconfounded with other two-way interactions, as in our 7-factor example. Such designs offer dramatic decreases in the experiment size for large numbers of factors. For example, full factorials for nine, ten, and eleven factors have 512, 1024, and 2048 treatment combinations, respectively. For most experiments, this is clearly not practically implementable. However, fractional factorials of resolution IV only require 32 runs in each case, which is a very practical proposition in most situations.

Similarly, a \(2^{7-2}\) design has resolution IV, since some of the two-way interactions are confounded. The maximal resolutions for the \(2^7\) series are \(2^{7-1}_{VII}\) , \(2^{7-2}_{IV}\) , \(2^{7-3}_{IV}\) , \(2^{7-4}_{III}\) . Thus, the resolution drops with increasing fraction, and not all resolutions might be achievable for a given number of factors (there is no resolution-VI design for seven factors, for example).

9.6.2 Aberration

For the \(2^7\) -factorial, both a reduction by \(1/4\) and by \(1/8\) leads to a resolution-IV design, but these designs are clearly not comparable in other aspects. For example, all two-way interactions are confounded in the \(2^{7-3}\) design, while we saw that only some are confounded in the \(2^{7-2}\) design.

The aberration provides an additional criterion to compare designs with identical resolution and is found as follows: we write down the generators and derive their generalized interactions. We then sort the resulting set of alias relations by word length and count how many relations there are of each length. The fewer short words occur, the better the set of generators. This criterion thus encodes that we prefer aliasing higher-order interactions to aliasing lower-order interactions.

For the two \(2^7\) fractions of resolution IV, we find one relation of length four for the \(2^{7-2}\) -design, while there are seven such relations for the \(2^{7-3}\) -design. The confounding of the former is therefore less severe than the confounding of the latter.

The aberration can also be used to compare different sets of generators for the same fractional factorial. For example, the following two sets of generators both yield a \(2^{7-2}_{\text{IV}}\) design: \[ ABCDF=1,\,ABCEG=1 \quad\text{and}\quad ABCF=1\,,ADEG=1\;. \] The first set of generators has generalized interaction \(ABCDF\cdot ABCEG=DEFG=1\) , so this design has a set of generating alias relations with one word of length four, and two words of length five. In contrast, the second set of generators has generalized interaction \(ABCF\cdot ADEG=BCDEFG=1\) , and contains two words of length four and one word of length six. We would therefore prefer to use the first set of generators, because it yields a less confounded set of aliases.
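
A small helper for computing the generalized interaction of two words (the letters that occur an odd number of times across both) can make such comparisons less error-prone; this is a sketch, not part of any package:

gen.int <- function(w1, w2) {
  letters12 <- c(strsplit(w1, "")[[1]], strsplit(w2, "")[[1]])
  paste(sort(names(which(table(letters12) %% 2 == 1))), collapse = "")
}
gen.int("ABCDF", "ABCEG")   # "DEFG"
gen.int("ABCF", "ADEG")     # "BCDEFG"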

9.7 Factor screening

A common problem, especially at the beginning of designing an assay or investigating any system, is to determine which of the vast number of possible factors actually have a relevant influence on the response. For example, let us say we want to design a toxicity assay with a luminescent readout on a 48-well plate, where luminescence is supposed to be directly related to the number of living cells in each well, and is thus a proxy for toxicity of a substance pipetted into a well. Apart from the substance’s concentration and toxicity, there are many other factors that one might imagine can influence the readout. Examples include the technician, amount of shaking before reading, the reader type, batch effects of chemicals, temperature, setting time, labware, type of pipette (small/large volume), and many others.

Before designing any experiment for more detailed analyses of relevant factors, we may want to conduct a factor screening to determine which factors are active and appreciably affect the response. Subsequent experimentation then only includes the active factors and, having reduced the number of treatment factors, can then be designed with the methods previously discussed.

Factor screening designs make extensive use of the assumption that the proportion of active factors among those considered is small. We usually also assume that we are only interested in the main effects and can ignore the interaction effects for the screening. This assumption is justified because we will not make any inference on how exactly the factors influence the response, but are for the moment only interested in discarding factors of no further interest.

9.7.1 Fractional factorials

One class of screening designs uses fractional factorial designs of resolution \(\text{III}\) . Noteworthy examples are the \(2^{15-11}_{\text{III}}\) design, which allows screening 15 factors in 16 runs, or the \(2^{31-26}_{\text{III}}\) design, which allows screening 31 factors in 32 runs!

A problem of this class of designs is that the ‘gap’ between useful screening designs increases with increasing number of factors, because we can only consider fractions that are powers of two: reducing a \(2^7\) design with 128 runs yields designs of 64 runs ( \(2^{7-1}\) ) and 32 runs ( \(2^{7-2}\) ), but we cannot find designs with more than 32 and fewer than 64 runs, for example. On the other hand, fractional factorials are familiar designs that are relatively easy to interpret, and if a reasonable design is available, there is no reason not to consider it.

Factor screening experiments will typically use a single replicate of the fractional factorial, and effects cannot be tested formally. If only a minority of factors is active, we can use a method by Lenth to still identify the active factors by more informal comparisons (Lenth 1989 ) . The main idea is to calculate a robust estimate of the standard error and use it to discard factors whose effects are not sufficiently larger than this estimate.

Specifically, we denote the estimated average difference between low and high level of the \(j\) th factor by \(c_j\) and estimate the standard error as 1.5 times the median of absolute effect estimates: \[ s_0 = 1.5 \cdot \text{median}_{j} |c_j|\;. \] If no effect were active, then \(s_0\) would already provide an estimate of the standard error. If some effects are active, they inflate the estimate by an unknown amount. We therefore restrict our estimation to those effects that are ‘small enough’ and do not exceed 2.5 times the current standard error estimate. The pseudo standard error is then \[ \text{PSE} = 1.5 \cdot \text{median}_{|c_j|<2.5\cdot s_0} |c_j|\;. \] The margin of error (ME) (i.e., the upper limit of a confidence interval) is then \[ \text{ME} = t_{0.975, d} \cdot \text{PSE}\;, \] and Lenth proposes to use \(d=m/3\) as the degrees of freedom, where \(m\) is the number of effects in the model. This limit is corrected for multiple comparisons by adjusting the confidence limit from \(\alpha=0.975\) to \(\gamma=(1+0.95^{1/m})/2\) . The resulting simultaneous margin of error (SME) is then \[ \text{SME} = t_{\gamma, d} \cdot \text{PSE}\;. \] Factors with effects exceeding SME in either direction are considered active, those between the ME limits are inactive, and those between ME and SME have unclear status. We therefore choose those factors that exceed SME as our safe choice, and might include those exceeding ME as well for subsequent experimentation.
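
A minimal sketch of these calculations in base R, for a vector c.hat of effect estimates (the names c.hat and lenth are assumptions for illustration):

lenth <- function(c.hat) {
  m   <- length(c.hat)
  s0  <- 1.5 * median(abs(c.hat))                          # initial robust estimate
  PSE <- 1.5 * median(abs(c.hat)[abs(c.hat) < 2.5 * s0])   # pseudo standard error
  d   <- m / 3                                             # Lenth's degrees of freedom
  ME  <- qt(0.975, d) * PSE                                 # margin of error
  gam <- (1 + 0.95^(1 / m)) / 2
  SME <- qt(gam, d) * PSE                                   # simultaneous margin of error
  c(PSE = PSE, ME = ME, SME = SME)
}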

In his paper, Lenth discusses a \(2^4\) full factorial experiment, where the effect of acid strength (S), time (t), amount of acid (A), and temperature (T) on the yield of isatin is studied (Davies 1954 ) . The experiment design and the resulting yield are shown in Table 9.5 .

Table 9.5: Experimental design and isatin yield of Davies’ experiment.
S t A T Yield
-1 -1 -1 -1 0.08
+1 -1 -1 -1 0.04
-1 +1 -1 -1 0.53
+1 +1 -1 -1 0.43
-1 -1 +1 -1 0.31
+1 -1 +1 -1 0.09
-1 +1 +1 -1 0.12
+1 +1 +1 -1 0.36
S t A T Yield
-1 -1 -1 +1 0.79
+1 -1 -1 +1 0.68
-1 +1 -1 +1 0.73
+1 +1 -1 +1 0.08
-1 -1 +1 +1 0.77
+1 -1 +1 +1 0.38
-1 +1 +1 +1 0.49
+1 +1 +1 +1 0.23

The results are shown in Figure 9.3 . No factor seems to be active, with temperature, acid strength, and the interaction of temperature and time coming closest.

Analysis of active effects in unreplicated $2^4$-factorial with Lenth's method.

Figure 9.3: Analysis of active effects in unreplicated \(2^4\) -factorial with Lenth’s method.

9.7.2 Plackett-Burman designs

A different idea for constructing screening designs was proposed by Plackett and Burman in a seminal paper (Plackett and Burman 1946 ) . These designs require that the number of runs is a multiple of four. The most commonly used are the designs in 12, 20, 24, and 28 runs, which can screen 11, 19, 23, and 27 factors, respectively. Plackett-Burman designs do not have a simple confounding structure that could be determined with generators. Rather, they are based on the idea of partially confounding some fraction of each effect with other effects. These designs are used for screening main effects only, as main effects are already confounded with two-way interactions in rather complicated ways that cannot be easily disentangled by follow-up experiments. Plackett-Burman designs considerably increase the available options for the experiment size, and offer several designs in the range of \(16, \dots, 32\) runs for which no fractional factorial design is available.

Tables of Plackett-Burman designs are found on the NIST website 15 and in many older texts on experimental design. In R , they can be constructed using the function pb() from package FrF2 , which requires the number of runs \(n\) (a multiple of four) as its only input and returns a design for \(n-1\) factors.
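For instance, a 12-run Plackett-Burman design can be generated directly (this assumes the FrF2 package is installed):

library(FrF2)
design <- pb(12)    # 12 runs, screening up to 11 two-level factors
summary(design)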

9.8 Blocking factorial experiments

With many treatments, blocking a design becomes challenging because the efficiency of blocking deteriorates with increasing block size, or there are other limits on the maximal number of units per block. The incomplete block designs in Section 7.3 are a remedy for this problem for unstructured treatment levels. The ideas behind fractional factorial designs are useful for blocking factorial treatment structures: we can explicitly exploit their properties by deliberately confounding (higher-order) interactions with block effects. This reduces the required block size to the size of the corresponding fractional factorial.

We can further extend this idea by using different confoundings for different sets of blocks, such that each set accommodates a different fraction of the same factorial treatment structure. We are then able to recover most of the effects of the full factorial, albeit with different precision.

We consider the \(2^3\) -factorial treatment structure as our main example, as it already allows discussion of all relevant ideas. We consider the case that our blocking factor only allows accommodating four out of the eight possible treatment combinations. This is a realistic scenario if studying combinations of three drug treatments on mice and blocking by litter, with typical litter sizes being below eight. Two questions arise: (i) which treatment combinations should we assign to the same block? and (ii) with replication of blocks, should we use the same assignment of treatment combinations to blocks? If not, how should we determine treatment combinations for sets of blocks?

9.8.1 Half-fraction

A first idea is to use a half-replicate of the \(2^3\) -factorial with four treatment combinations, and confound the generator with the block effect. If we use the generator \(ABC=+1\) and each block effect is confounded with the grand mean, so \(Block=+1\) , then we get the formal generator \(ABC=Block\) and assign only those four treatment combinations with \(ABC=+1\) to each block. With four blocks, this yields the following assignment:

Block Generator 1 2 3 4
I ABC=+1 a b c abc
II ABC=+1 a b c abc
III ABC=+1 a b c abc
IV ABC=+1 a b c abc

Within each block, we have the same one-half fraction of the \(2^3\) -factorial with runs \(\{a,b,c,abc\}\) , and this design resembles a four-fold replication of the same fractional factorial, where systematic differences between replicates are accounted for by the block effects. The fractional factorial has resolution \(\text{III}\) , and main effects are confounded with two-way interactions.

From the 16 observations, we required four degrees of freedom for estimating the treatment parameters, and three degrees of freedom for the block effects, leaving us with nine residual degrees of freedom. The latter can be increased by using more blocks, where we gain four observations with each block, and lose one degree of freedom per block for the block effect. Since the effect aliases are the same in each block, increasing the number of blocks does not change the confounding, and no matter how many blocks we use, we are unable to disentangle the main effect of A , say, and the B:C interaction in this design.
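A small base R sketch makes the assignment explicit: it enumerates the \(2^3\) treatment combinations, computes the ABC column, and keeps the half with \(ABC=+1\) that is used in every block above.

# Enumerate the 2^3 treatment combinations and their ABC product.
d <- expand.grid(A = c(-1, 1), B = c(-1, 1), C = c(-1, 1))
d$ABC <- d$A * d$B * d$C
subset(d, ABC == +1)    # the runs a, b, c, abc assigned to each block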

9.8.2 Half-fraction with alternating replication

We can improve the design substantially by noting that it is not required to use the same half-replicate in each block. Instead, we might use the generator \(ABC=+1\) with combinations \(\{a,b,c,abc\}\) for two of the four blocks, and the corresponding generator \(ABC=-1\) (the fold-over ) with combinations \(\{(1),ab,ac,bc\}\) for the other two blocks. The design is then

Block Generator 1 2 3 4
I ABC=+1 a b c abc
II ABC=+1 a b c abc
III ABC=-1 (1) ab ac bc
IV ABC=-1 (1) ab ac bc

With two replicates for each of the two levels of the three-way interaction, its parameters are estimable using the block totals. Somewhat loosely speaking, this resembles a split-unit design with A:B:C having blocks as experimental units, and all other effects randomized on units within blocks. All other effects can be estimated more precisely, since we now effectively have two replicates of the full factorial design after we account for the block effects. While the half-fraction of a \(2^3\) -factorial is not an interesting option in itself due to the severe confounding, it gives a very appealing design for reducing block sizes.

For example, we have confounding of A with B:C for observations based on the \(ABC=+1\) half-replicates (with \(A=BC\) ), but we can resolve this confounding using observations from the other half-replicate, for which \(A=-BC\) . Indeed, for blocks I and II, the estimate of the A main effect is \((a+abc)-(b+c)\) and for blocks III and IV it is \((ab+ac)-(bc+(1))\) . Similarly, the estimates for B:C are \((a+abc)-(b+c)\) and \((bc+(1))-(ab+ac)\) , respectively. Note that these estimates are all free of block effects. Then, the estimates of the two effects are also free of block effects and are proportional to \(\left[(a+abc)-(b+c)\right]\, +\, \left[(ab+ac)-(bc+(1))\right] = (a+ab+ac+abc)-((1)+b+c+bc)\) for A , respectively \(\left[(a+abc)-(b+c)\right]\, -\, \left[(ab+ac)-(bc+(1))\right]=((1)+a+bc+abc)-(b+c+ab+ac)\) for B:C . These are the same estimates as for two-fold replicate of the full factorial design. Somewhat simplified: the first two blocks allow estimation of \(A+BC\) , the second pair allows estimation of \(A-BC\) , the sum of the two is \(2\cdot A\) , while the difference is \(2\cdot BC\) .

The same argument does not hold for the A:B:C interaction, of course. Here, we have to contrast observations in \(ABC=+1\) blocks with observations in \(ABC=-1\) blocks, and block effects do not cancel. If instead of four blocks, our design only uses two blocks—one for each generator—then main effects and two-way interactions can still be estimated, but the three-way interaction is completely confounded with the block effect.

Using a classical ANOVA for the analysis, we indeed find two error strata for the inter- and intra-block errors, and the corresponding \(F\) -test for A:B:C in the inter-block stratum with two denominator degrees of freedom: we have four blocks, and lose one degree of freedom for the grand mean, and one degree of freedom for the A:B:C parameters. All other tests are in the intra-block stratum and based on six degrees of freedom: a total of \(4\times 4=16\) observations, with seven degrees of freedom spent on the model parameters except the three-way interaction, and three degrees of freedom spent on the block effects.
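As a sketch (assuming a hypothetical data frame dat with response y, factors A, B, C, and a blocking factor block with four levels), the two error strata appear directly in a classical ANOVA:

fit <- aov(y ~ A * B * C + Error(block), data = dat)
summary(fit)
# A:B:C is tested in the Error: block stratum (2 denominator d.f.);
# all other effects are tested in the Error: Within stratum (6 d.f.).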

In summary, we can exploit the factorial treatment structure to our advantage when blocking, with only slightly more complex logistics to organize different treatment combinations for different blocks. Using a generator and its fold-over to alias a high-order interaction with the block effect, we achieve precise estimation of all effects not aliased with the block effect, and we can estimate the confounded effect with a sufficient number of blocks based on the inter-block information.

9.8.3 Excursion: split-unit designs

While using the highest-order interaction to define the confounding with blocks is the natural choice, we could also use any other generator. In particular, we might use \(A=+1\) and \(A=-1\) as our two generators, thereby allocating half the blocks to the low level of A , and the other half to its high level. In other words, we randomize A on the block factor, and the remaining treatment factors are randomized within each block. This is precisely the split-unit design with the blocking factor as the whole-unit factor, and A randomized on it. With four blocks, the between-block stratum provides three degrees of freedom after estimating the grand mean; these are split into estimating the A main effect (1 d.f.) and the between-block residual variance (2 d.f.). All other treatment effects benefit from the removal of the block effect and are tested with 6 degrees of freedom for the within-block residual variance.

The use of generators offers more flexibility than a split-unit design, because it allows us to confound any effect with the blocking factor, not just a main effect. Whether this is an advantage depends on the experiment: if application of the treatment factors to experimental units is equally simple for all factors, then it is usually more helpful to confound a higher-order interaction with the blocking factor. This design then allows estimation of all main effects and their contrasts with equal precision, and lower-order interaction effects can also be estimated precisely. A split-unit design, however, offers advantages for the logistics of the experiment if levels of a treatment factor are more difficult to change than levels of the other factors. By confounding the hard-to-change factor with the blocking factor, the experiment becomes easier to implement. Split-unit designs are also conceptually simpler than confounding of interaction effects with blocks, but that should not be the sole motivation for using them.

9.8.4 Half-fraction with multiple generators

We are often interested in all effects of a factorial treatment design, especially if this design has only few factors. Using a single generator and a fold-over, however, provides much lower precision for the corresponding effect, which might be undesirable. An alternative strategy is then to use different generators and fold-overs for different pairs of blocks. In this partial confounding of effects with blocks, we confound a different effect in each pair of blocks, but can estimate the same effect with high precision from observations in the remaining blocks.

For example, we consider again the half-replicate of a \(2^3\) -factorial, with four units per block. If we have resources for 32 units in eight blocks, we can form four pairs of blocks with four units each. Then, we might use the previous generator \(G_1: ABC=\pm 1\) for our first pair of blocks, the generator \(G_2: AB=\pm 1\) for the second pair, \(G_3: AC=\pm 1\) for the third pair, and \(G_4: BC=\pm 1\) for the fourth pair of blocks. Each pair of blocks is then a fold-over pair for a specific generator with treatment combinations assigned as follows:

Block Generator 1 2 3 4
I ABC=+1 a b c abc
II ABC=-1 (1) ab ac bc
III AB=+1 (1) c ab abc
IV AB=-1 a b ac bc
V AC=+1 (1) b ac abc
VI AC=-1 a c ab bc
VII BC=+1 (1) a bc abc
VIII BC=-1 b c ab ac

In the resulting ANOVA, we clearly see how information about effects is present both between and within blocks: effects occurring in a generator now appear both in the inter-block error stratum and in the residual (intra-block) error stratum.

In this design, each two-way interaction can be estimated using within-block information from three pairs of blocks, and the same is true for the three-way interaction. Additional estimates can be defined based on the inter-block information, similar to a BIBD. The inter- and intra-block estimates can be combined, but this is rarely done in practice for a classic ANOVA, where the more precise within-block estimates are often used exclusively. In contrast, linear mixed models offer a direct way of basing all estimates on all available data; a corresponding model for this example is specified as y~A*B*C+(1|block) .
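A minimal sketch of this mixed-model analysis with lme4, again assuming a hypothetical data frame dat with the treatment factors and the blocking factor block:

library(lme4)
fit <- lmer(y ~ A * B * C + (1 | block), data = dat)
summary(fit)    # all estimates use all available data; blocks enter as a random effect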

9.8.5 Multiple aliasing

We can further reduce the required block size by considering higher fractions of a factorial. As we saw in Section 9.5 , these require several simultaneous generators, and additional aliasing occurs due to the generalized interaction between the generators.

For example, the half-fraction of a \(2^5\) -factorial still requires a block size of 16, which might not be practical. We further reduce the block size using the two pairs of generators \[ ABC=\pm 1\,,\quad ADE=\pm 1\,, \] with generalized interaction \(ABC\cdot ADE=BCDE\) , leading to a \(2^{5-2}\) treatment design (Finney 1955 , p101) . Each of the four combinations of these two pairs selects eight of the 32 possible treatment combinations and a single replicate of this design requires four blocks:

Block Generator 1 2 3 4 5 6 7 8
I ABC=-1, ADE=-1 (1) bc de bcde abd acd abe ace
II ABC=+1, ADE=-1 b c bde cde ad abcd ae abce
III ABC=-1, ADE=+1 d bcd e bce ab ac abde acde
IV ABC=+1, ADE=+1 bd cd be ce a abc ade abcde

In this design, the two three-way interactions A:B:C and A:D:E , and the four-way interaction B:C:D:E used in the generators are partially confounded with block effects. All other effects, and in particular all main effects and all two-way interactions, are free of block effects and estimated precisely. By carefully selecting the generators, we are often able to confound effects that are known to be of lesser interest to the researcher.
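The block assignment can be derived mechanically from the two generators; the following base R sketch computes the signs of ABC and ADE for all 32 runs and groups them into the four blocks of eight.

# Assign the 32 runs of a 2^5-factorial to four blocks via ABC and ADE.
d <- expand.grid(A = c(-1, 1), B = c(-1, 1), C = c(-1, 1),
                 D = c(-1, 1), E = c(-1, 1))
d$block <- interaction(d$A * d$B * d$C, d$A * d$D * d$E)   # sign combination defines the block
table(d$block)    # four blocks of eight runs each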

Similar partially confounded designs exist for higher-order factorials. A prominent example is a \(2^{7-4}\) design that allows block sizes of eight instead of 128 and requires eight blocks for one replicate. The \(2^{7-4}\) fractional factorial has resolution III and thus confounds main effects with two-way interactions. By choosing the three generators intelligently, however, the partial confounding with blocks leaves main effects and two-way interactions confounded only with three-way and higher-order interactions.

9.8.6 Example: proteomics experiment

As a concrete example of blocking a factorial design, we discuss a simplified variant of a proteomics study in mice. The main target of this study is the response to inflammation, and a drug is available to trigger this response. One pathway involved in the response is known, and many of the proteins involved as well as the receptor upstream of the pathway have been identified. However, related experiments suggested that the drug also activates alternative pathways involving other receptors, and one goal of the experiment is to identify proteins involved in these pathways.

The experiment has three factors in a \(2^3\) -factorial treatment design: administration of the drug or a placebo, a short or long waiting time between drug administration and measurements, and the use of the wild-type or a mutant receptor for the known pathway, where the mutant inhibits binding of the drug and hence deactivates the pathway.

Expected results

We can broadly distinguish three classes of proteins that we expect to find in this experiment.

The first class are proteins directly involved in the known pathway. For these, we expect low levels of abundance for a placebo treatment, because the placebo does not activate the pathway. For the drug treatment, we expect to see high abundance in the wild-type, as the pathway is then activated, but low abundance in the mutant, since the drug cannot bind to the receptor and thus pathway activation is impeded. In other words, we expect a large genotype-by-drug interaction.

The second class are proteins in the alternative pathway(s) activated by the drug but acting through a different receptor. Here, we would expect to see high abundance in both wild-type and mutant for the drug treatment and low abundance in both genotypes for a placebo treatment, since the mutation does not affect receptors in these pathways. This translates into a large drug main effect, but no genotype main effect and no genotype-by-drug interaction.

The third class are proteins unrelated to any mechanisms activated by the drug. Here, we expect to see the same abundance levels in both genotypes for both drug treatments, and no treatment factor should show a large and significant effect.

We are somewhat unsure what to expect for the duration. It seems plausible that a protein in an activated pathway will show lower abundance after longer time, since the pathway should trigger a response to the inflammation and lower the inflammation. This would mean that a three-way interaction exists at least for proteins involved in the known or alternative pathways. A different scenario results if one pathway takes longer to activate than another pathway, which would present as a two- or three-way interaction of drug and/or genotype with the duration.

Mass spectrometry using tags

Absolute quantification of protein abundances is very difficult to achieve in mass spectrometry. A common technique is to use tags , small molecules that attach to each protein and modify its mass by a known amount. With four different tags available, we can then pool all proteins from four different experimental conditions and determine their relative abundances by comparing the four resulting peaks in the mass spectrum for each protein.

We have 16 mice available, eight wild-type and eight mutant mice. Since we have eight treatment combinations but only four tags, we need to block the experiment in sets of four. An obvious candidate is confounding the block effect with the three-way interaction genotype-by-drug-by-time. This choice is shown in Figure 9.4 , and each label corresponds to a treatment combination in the first two blocks and the opposite treatment combination in the remaining two blocks.


Figure 9.4: Proteomics experiment. A: \(2^3\) -factorial treatment structure with three-way interaction confounded in two blocks. B: mass spectra with four tags (symbol) for same protein from two blocks (shading).

The main disadvantage of this choice is the confounding of the three-way interaction with the block effect, which only allows imprecise estimation, and it is unlikely that the effect sizes are large enough to allow reliable detection in this design. Alternatively, we can use two generators for the two pairs of blocks, the first confounding the three-way interaction, and the second confounding one of the three two-way interactions. A promising candidate is the drug-by-duration interaction, since we are very interested in the genotype-by-drug interaction and would like to detect different activation times between the known and alternative pathways, but we do not expect a drug-by-duration interaction of interest. This yields the data shown in Figure 9.5 , where the eight resulting protein abundances are shown separately for short and long duration between drug administration and measurement, and for three typical proteins in the known pathway, in an alternative pathway, and unrelated to the inflammation response.


Figure 9.5: Data of proteomics experiment. Round point: placebo, triangle: drug treatment. Panels show typical protein scenarios in columns and waiting duration in rows.

Davies, O. L. 1954. The Design and Analysis of Industrial Experiments . Oliver & Boyd, London.

Finney, David J. 1955. Experimental Design and its Statistical Basis . The University of Chicago Press.

Lenth, Russell V. 1989. “Quick and easy analysis of unreplicated factorials.” Technometrics 31 (4): 469–73. https://doi.org/10.1080/00401706.1989.10488595 .

Plackett, R L, and J P Burman. 1946. “The design of optimum multifactorial experiments.” Biometrika 33 (4): 305–25. https://doi.org/10.1093/biomet/33.4.305 .

It later transpired that the low level of N2 was zero in the first, but a low, non-zero concentration in the second replicate. ↩

The NIST provides helpful designs on their website http://www.itl.nist.gov/div898/handbook/pri/section3/pri3347.htm . ↩

http://www.itl.nist.gov/div898/handbook/pri/section3/pri335.htm ↩


ANOVA With Full Factorial Experiments

This lesson explains how to use analysis of variance (ANOVA) with balanced, completely randomized, full factorial experiments. The discussion covers general issues related to design, analysis, and interpretation with fixed factors and with random factors .

Future lessons expand on this discussion, using sample problems to demonstrate the analysis under the following scenarios:

  • Two-factor ANOVA: Fixed-effects model .
  • Two-factor ANOVA: Random-effects model .
  • Two-factor ANOVA: Mixed-effects model .
  • Two-factor ANOVA with Excel .

Design Considerations

Since this lesson is all about implementing analysis of variance with a balanced, completely randomized, full factorial experiment, we begin by answering four relevant questions:

  • What is a full factorial experiment?
  • What is a completely randomized design?
  • What are the data requirements for analysis of variance with a completely randomized, full factorial design?
  • What is a balanced design?

What is a Full Factorial Experiment?

A factorial experiment allows researchers to study the joint effect of two or more factors on a dependent variable .

With a full factorial design, the experiment includes a treatment group for every combination of factor levels. Therefore, the number of treatment groups is the product of factor levels. For example, consider the full factorial design shown below:

            C1       C2       C3       C4
A1   B1     Grp 1    Grp 2    Grp 3    Grp 4
A1   B2     Grp 5    Grp 6    Grp 7    Grp 8
A1   B3     Grp 9    Grp 10   Grp 11   Grp 12
A2   B1     Grp 13   Grp 14   Grp 15   Grp 16
A2   B2     Grp 17   Grp 18   Grp 19   Grp 20
A2   B3     Grp 21   Grp 22   Grp 23   Grp 24

Factor A has two levels, factor B has three levels, and factor C has four levels. Therefore, the full factorial design has 2 x 3 x 4 = 24 treatment groups.

Full factorial designs can be characterized by the number of treatment levels associated with each factor, or by the number of factors in the design. Thus, the design above could be described as a 2 x 3 x 4 design (number of treatment levels) or as a three-factor design (number of factors).

Note: Another type of factorial experiment is a fractional factorial. Unlike full factorial experiments, which include a treatment group for every combination of factor levels, fractional factorial experiments include only a subset of possible treatment groups. Our focus in this lesson is on full factorial experiments, rather than fractional factorial experiments.

Completely Randomized Design

With a full factorial experiment, a completely randomized design is distinguished by the following attributes:

  • The design has two or more factors (i.e., two or more independent variables ), each with two or more levels .
  • Treatment groups are defined by a unique combination of non-overlapping factor levels.
  • The number of treatment groups is the product of factor levels.
  • Experimental units are randomly selected from a known population .
  • Each experimental unit is randomly assigned to one, and only one, treatment group.
  • Each experimental unit provides one dependent variable score.

Data Requirements

Analysis of variance requires that the dependent variable be measured on an interval scale or a ratio scale . In addition, analysis of variance with a full factorial experiment makes three assumptions about dependent variable scores:

  • Independence . The dependent variable score for each experimental unit is independent of the score for any other unit.
  • Normality . In the population, dependent variable scores are normally distributed within treatment groups.
  • Equality of variance . In the population, the variance of dependent variable scores in each treatment group is equal. (Equality of variance is also known as homogeneity of variance or homoscedasticity.)

The assumption of independence is the most important assumption. When that assumption is violated, the resulting statistical tests can be misleading. This assumption is tenable when (a) experimental units are randomly sampled from the population and (b) sampled units are randomly assigned to treatments.

With respect to the other two assumptions, analysis of variance is more forgiving. Violations of normality are less problematic when the sample size is large. And violations of the equal variance assumption are less problematic when the sample size within groups is equal.

Before conducting an analysis of variance with data from a full factorial experiment, it is best practice to check for violations of normality and homogeneity assumptions. For further information, see:

  • How to Test for Normality: Three Simple Tests
  • How to Test for Homogeneity of Variance: Hartley's Fmax Test
  • How to Test for Homogeneity of Variance: Bartlett's Test

Balanced versus Unbalanced Design

A balanced design has an equal number of observations in all treatment groups. In contrast, an unbalanced design has an unequal number of observations in some treatment groups.

Balance is not required with one-way analysis of variance , but it is helpful with full-factorial designs because:

  • Balanced factorial designs are less vulnerable to violations of the equal variance assumption.
  • Balanced factorial designs have more statistical power .
  • Unbalanced factorial designs can produce confounded factors, making it hard to interpret results.
  • Unbalanced designs use special weights for data analysis, which complicates the analysis.

Note: Our focus in this lesson is on balanced designs.

Analytical Logic

To implement analysis of variance with a balanced, completely randomized, full factorial experiment, a researcher takes the following steps:

  • Specify a mathematical model to describe how main effects and interaction effects influence the dependent variable.
  • Write statistical hypotheses to be tested by experimental data.
  • Specify a significance level for a hypothesis test.
  • Compute the grand mean and the mean scores for each treatment group.
  • Compute sums of squares for each effect in the model.
  • Find the degrees of freedom associated with each effect in the model.
  • Based on sums of squares and degrees of freedom, compute mean squares for each effect in the model.
  • Find the expected value of the mean squares for each effect in the model.
  • Compute a test statistic for each effect, based on observed mean squares and their expected values.
  • Find the P value for each test statistic.
  • Accept or reject the null hypothesis for each effect, based on the P value and the significance level.
  • Assess the magnitude of effect, based on sums of squares.

If you are familiar with one-way analysis of variance (see One-Way Analysis of Variance ), you might notice that the analytical logic for a completely-randomized, single-factor experiment is very similar to the logic for a completely randomized, full factorial experiment. Here are the main differences:

  • Formulas for mean scores and sums of squares differ, depending on the number of factors in the experiment.
  • Expected mean squares differ, depending on whether the experiment tests fixed effects and/or random effects.

Below, we'll explain how to implement analysis of variance for fixed-effects models, random-effects models, and mixed models with a balanced, two-factor, completely randomized, full-factorial experiment.

Mathematical Model

For every experimental design, there is a mathematical model that accounts for all of the independent and extraneous variables that affect the dependent variable.

Fixed Effects

For example, here is the fixed-effects mathematical model for a two-factor, completely randomized, full-factorial experiment:

\(X_{ijm} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \varepsilon_{m(ij)}\)

where \(X_{ijm}\) is the dependent variable score for subject m in treatment group ij , \(\mu\) is the population mean, \(\alpha_i\) is the main effect of Factor A at level i ; \(\beta_j\) is the main effect of Factor B at level j ; \((\alpha\beta)_{ij}\) is the interaction effect of Factor A at level i and Factor B at level j ; and \(\varepsilon_{m(ij)}\) is the effect of all other extraneous variables on subject m in treatment group ij .

For this model, it is assumed that \(\varepsilon_{m(ij)}\) is normally and independently distributed with a mean of zero and a variance of \(\sigma^2_{\varepsilon}\) . The mean ( \(\mu\) ) is constant.

Note: The parentheses in \(\varepsilon_{m(ij)}\) indicate that subjects are nested under treatment groups. When a subject is assigned to only one treatment group, we say that the subject is nested under a treatment.
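As an illustration, the following R sketch simulates data from this fixed-effects model; all parameter values (mu, alpha, beta, the interaction matrix ab, and the error standard deviation) are hypothetical choices, not values from the lesson.

# Simulate a balanced two-factor fixed-effects layout (hypothetical parameters).
set.seed(1)
p <- 2; q <- 3; n <- 6                             # levels of A, levels of B, replicates per cell
mu    <- 50
alpha <- c(-5, 5)                                  # A main effects (sum to zero)
beta  <- c(-2, 0, 2)                               # B main effects (sum to zero)
ab    <- matrix(c(1, -1, -1, 1, 0, 0), nrow = p)   # interaction effects; rows and columns sum to zero
dat <- expand.grid(m = 1:n, B = 1:q, A = 1:p)
dat$X <- mu + alpha[dat$A] + beta[dat$B] + ab[cbind(dat$A, dat$B)] +
  rnorm(nrow(dat), sd = 4)                         # epsilon ~ N(0, sigma^2)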

Random Effects

The random-effects mathematical model for a completely randomized full factorial experiment has the same form as the fixed-effects model:

\(X_{ijm} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \varepsilon_{m(ij)}\)

Like the fixed-effects mathematical model, the random-effects model also assumes that (1) \(\varepsilon_{m(ij)}\) is normally and independently distributed with a mean of zero and a variance of \(\sigma^2_{\varepsilon}\) and (2) the mean ( \(\mu\) ) is constant.

Here's the difference between the two mathematical models. With a fixed-effects model, the experimenter includes all treatment levels of interest in the experiment. With a random-effects model, the experimenter includes a random sample of treatment levels in the experiment. Therefore, in the random-effects mathematical model, the following is true:

  • The main effect \(\alpha_i\) is a random variable with a mean of zero and a variance of \(\sigma^2_{\alpha}\) .
  • The main effect \(\beta_j\) is a random variable with a mean of zero and a variance of \(\sigma^2_{\beta}\) .
  • The interaction effect \((\alpha\beta)_{ij}\) is a random variable with a mean of zero and a variance of \(\sigma^2_{\alpha\beta}\) .

All three effects are assumed to be normally and independently distributed (NID).

Statistical Hypotheses

With a full factorial experiment, it is possible to test all main effects and all interaction effects. For example, here are the null hypotheses (H 0 ) and alternative hypotheses (H 1 ) for each effect in a two-factor full factorial experiment.

For fixed-effects models, it is common practice to write statistical hypotheses in terms of treatment effects:

H 0 : \(\alpha_i = 0\) for all i    vs.    H 1 : \(\alpha_i \neq 0\) for some i
H 0 : \(\beta_j = 0\) for all j    vs.    H 1 : \(\beta_j \neq 0\) for some j
H 0 : \((\alpha\beta)_{ij} = 0\) for all i, j    vs.    H 1 : \((\alpha\beta)_{ij} \neq 0\) for some i, j

For random-effects models, it is common practice to write statistical hypotheses in terms of the variance of treatment levels included in the experiment:

H 0 : \(\sigma^2_{\alpha} = 0\)    vs.    H 1 : \(\sigma^2_{\alpha} \neq 0\)
H 0 : \(\sigma^2_{\beta} = 0\)    vs.    H 1 : \(\sigma^2_{\beta} \neq 0\)
H 0 : \(\sigma^2_{\alpha\beta} = 0\)    vs.    H 1 : \(\sigma^2_{\alpha\beta} \neq 0\)

Significance Level

The significance level (also known as alpha or α) is the probability of rejecting the null hypothesis when it is actually true. The significance level for an experiment is specified by the experimenter, before data collection begins. Experimenters often choose significance levels of 0.05 or 0.01.

A significance level of 0.05 means that there is a 5% chance of rejecting the null hypothesis when it is true. A significance level of 0.01 means that there is a 1% chance of rejecting the null hypothesis when it is true. The lower the significance level, the more persuasive the evidence needs to be before an experimenter can reject the null hypothesis.

Mean Scores

Analysis of variance for a full factorial experiment begins by computing a grand mean, marginal means , and group means. Here are formulas for computing the various means for a balanced, two-factor, full factorial experiment:

  • Grand mean. The grand mean ( \(\bar{X}\) ) is the mean of all observations. With \(N = \sum_{i=1}^{p}\sum_{j=1}^{q} n = pqn\) , it is computed as follows: \(\bar{X} = \frac{1}{N}\sum_{i=1}^{p}\sum_{j=1}^{q}\sum_{m=1}^{n} X_{ijm}\)
  • Marginal means for Factor A. The mean for level i of Factor A is computed as follows: \(\bar{X}_{i\cdot} = \frac{1}{qn}\sum_{j=1}^{q}\sum_{m=1}^{n} X_{ijm}\)
  • Marginal means for Factor B. The mean for level j of Factor B is computed as follows: \(\bar{X}_{\cdot j} = \frac{1}{pn}\sum_{i=1}^{p}\sum_{m=1}^{n} X_{ijm}\)
  • Group means. The mean of all observations in group ij ( \(\bar{X}_{ij}\) ) is computed as follows: \(\bar{X}_{ij} = \frac{1}{n}\sum_{m=1}^{n} X_{ijm}\)

In the equations above, N is the total sample size across all treatment groups; n is the sample size in a single treatment group, p is the number of levels of Factor A, and q is the number of levels of Factor B.
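These means are easy to compute in R; the sketch below assumes a balanced data frame dat with response X and factors A and B, such as the simulated data above.

grand   <- mean(dat$X)                                # grand mean
mean_A  <- tapply(dat$X, dat$A, mean)                 # marginal means of Factor A
mean_B  <- tapply(dat$X, dat$B, mean)                 # marginal means of Factor B
mean_AB <- tapply(dat$X, list(dat$A, dat$B), mean)    # group (cell) means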

Sums of Squares

A sum of squares is the sum of squared deviations from a mean score. Two-way analysis of variance makes use of five sums of squares:

  • Factor A sum of squares. The sum of squares for Factor A (SSA) measures variation of the marginal means of Factor A ( \(\bar{X}_{i\cdot}\) ) around the grand mean ( \(\bar{X}\) ). It can be computed from the following formula: \(SSA = nq \sum_{i=1}^{p} (\bar{X}_{i\cdot} - \bar{X})^2\)
  • Factor B sum of squares. The sum of squares for Factor B (SSB) measures variation of the marginal means of Factor B ( \(\bar{X}_{\cdot j}\) ) around the grand mean ( \(\bar{X}\) ). It can be computed from the following formula: \(SSB = np \sum_{j=1}^{q} (\bar{X}_{\cdot j} - \bar{X})^2\)
  • Interaction sum of squares. The sum of squares for the interaction between Factor A and Factor B (SSAB) can be computed from the following formula: \(SSAB = n \sum_{i=1}^{p}\sum_{j=1}^{q} (\bar{X}_{ij} - \bar{X}_{i\cdot} - \bar{X}_{\cdot j} + \bar{X})^2\)
  • Within-groups sum of squares. The within-groups sum of squares (SSW) measures variation of all scores ( \(X_{ijm}\) ) around their respective group means ( \(\bar{X}_{ij}\) ). It can be computed from the following formula: \(SSW = \sum_{i=1}^{p}\sum_{j=1}^{q}\sum_{m=1}^{n} (X_{ijm} - \bar{X}_{ij})^2\) Note: The within-groups sum of squares is also known as the error sum of squares (SSE).
  • Total sum of squares. The total sum of squares (SST) measures variation of all scores ( \(X_{ijm}\) ) around the grand mean ( \(\bar{X}\) ). It can be computed from the following formula: \(SST = \sum_{i=1}^{p}\sum_{j=1}^{q}\sum_{m=1}^{n} (X_{ijm} - \bar{X})^2\)

In the formulas above, n is the sample size in each treatment group, p is the number of levels of Factor A, and q is the number of levels of Factor B.

It turns out that the total sum of squares is equal to the sum of the component sums of squares, as shown below:

SST = SSA + SSB + SSAB + SSW

As you'll see later on, this relationship will allow us to assess the relative magnitude of any effect (Factor A, Factor B, or the AB interaction) on the dependent variable.
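The decomposition can be checked numerically; the sketch below continues from the means computed above for the hypothetical balanced data frame dat.

SSA  <- n * q * sum((mean_A - grand)^2)               # Factor A sum of squares
SSB  <- n * p * sum((mean_B - grand)^2)               # Factor B sum of squares
SSAB <- n * sum((sweep(sweep(mean_AB, 1, mean_A), 2, mean_B) + grand)^2)   # interaction SS
SSW  <- sum((dat$X - mean_AB[cbind(dat$A, dat$B)])^2) # within-groups (error) SS
SST  <- sum((dat$X - grand)^2)                        # total SS
all.equal(SST, SSA + SSB + SSAB + SSW)                # the decomposition should hold exactly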

Degrees of Freedom

The term degrees of freedom (df) refers to the number of independent sample points used to compute a statistic minus the number of parameters estimated from the sample points.

The degrees of freedom used to compute the various sums of squares for a balanced, two-way factorial experiment are shown in the table below:

Sum of squares Degrees of freedom
Factor A p - 1
Factor B q - 1
AB interaction ( p - 1 )( q - 1)
Within groups pq( n - 1 )
Total npq - 1

Notice that, just as for the sums of squares, there is an additive relationship among the degrees of freedom. The degrees of freedom for total sum of squares (df TOT ) is equal to the degrees of freedom for the Factor A sum of squares (df A ) plus the degrees of freedom for the Factor B sum of squares (df B ) plus the degrees of freedom for the AB interaction sum of squares (df AB ) plus the degrees of freedom for within-groups sum of squares (df WG ). That is,

df TOT = df A + df B + df AB + df WG

Mean Squares

A mean square is an estimate of population variance. It is computed by dividing a sum of squares (SS) by its corresponding degrees of freedom (df), as shown below:

MS = SS / df

To conduct analysis of variance with a two-factor, full factorial experiment, we are interested in four mean squares:

MS A = SSA / df A

MS B = SSB / df B

MS AB = SSAB / df AB

MS WG = SSW / df WG

Expected Value

The expected value of a mean square is the average value of the mean square over a large number of experiments.

Statisticians have derived formulas for the expected value of mean squares for balanced, two-factor, full factorial experiments. The expected values differ, depending on whether the experiment uses all fixed factors, all random factors, or a mix of fixed and random factors.

Fixed-Effects Model

A fixed-effects model describes an experiment in which all factors are fixed factors. The table below shows the expected value of mean squares for a balanced, two-factor, full factorial experiment when both factors are fixed:

Mean square Expected value
MS A \(\sigma^2_{WG} + nq\,\sigma^2_{A}\)
MS B \(\sigma^2_{WG} + np\,\sigma^2_{B}\)
MS AB \(\sigma^2_{WG} + n\,\sigma^2_{AB}\)
MS WG \(\sigma^2_{WG}\)

In the table above, n is the sample size in each treatment group, p is the number of levels for Factor A, q is the number of levels for Factor B, \(\sigma^2_{A}\) is the variance of main effects due to Factor A, \(\sigma^2_{B}\) is the variance of main effects due to Factor B, \(\sigma^2_{AB}\) is the variance due to interaction effects, and \(\sigma^2_{WG}\) is the variance due to extraneous variables (also known as variance due to experimental error).

Random-Effects Model

A random-effects model describes an experiment in which all factors are random factors. The table below shows the expected value of mean squares for a balanced, two-factor, full factorial experiment when both factors are random:

Mean square Expected value
MS A \(\sigma^2_{WG} + n\,\sigma^2_{AB} + nq\,\sigma^2_{A}\)
MS B \(\sigma^2_{WG} + n\,\sigma^2_{AB} + np\,\sigma^2_{B}\)
MS AB \(\sigma^2_{WG} + n\,\sigma^2_{AB}\)
MS WG \(\sigma^2_{WG}\)

Mixed Model

A mixed model describes an experiment in which at least one factor is a fixed factor, and at least one factor is a random factor. The table below shows the expected value of mean squares for a balanced, two-factor, full factorial experiment, when Factor A is a fixed factor and Factor B is a random factor:

Mean square Expected value
MS A \(\sigma^2_{WG} + n\,\sigma^2_{AB} + nq\,\sigma^2_{A}\)
MS B \(\sigma^2_{WG} + np\,\sigma^2_{B}\)
MS AB \(\sigma^2_{WG} + n\,\sigma^2_{AB}\)
MS WG \(\sigma^2_{WG}\)

Note: The expected values shown in the tables are approximations. For all practical purposes, the values for the fixed-effects model will always be valid for computing test statistics (see below). The values for the random-effects model and the mixed model will be valid when random-effect levels in the experiment represent a small fraction of levels in the population.

Test Statistics

Suppose we want to test the significance of a main effect or the interaction effect in a two-factor, full factorial experiment. We can use the mean squares to define a test statistic F as follows:

\(F(v_1, v_2) = MS_{\text{EFFECT 1}} / MS_{\text{EFFECT 2}}\)

where MS EFFECT 1 is the mean square for the effect we want to test; MS EFFECT 2 is an appropriate mean square, based on the expected value of mean squares; v 1 is the degrees of freedom for MS EFFECT 1  ; and v 2 is the degrees of freedom for MS EFFECT 2 .

How do you choose an appropriate mean square for the denominator in an F ratio? The expected value of the denominator of the F ratio should be identical to the expected value of the numerator, except for one thing: the numerator should have an extra term that includes the variance of the effect being tested ( \(\sigma^2_{\text{EFFECT}}\) ).

The table below shows how to construct F ratios when an experiment uses a fixed-effects model.

Table 1. Fixed-Effects Model

Effect Mean square: Expected value F ratio
A \(\sigma^2_{WG} + nq\,\sigma^2_{A}\) \(F_A = MS_A / MS_{WG}\)
B \(\sigma^2_{WG} + np\,\sigma^2_{B}\) \(F_B = MS_B / MS_{WG}\)
AB \(\sigma^2_{WG} + n\,\sigma^2_{AB}\) \(F_{AB} = MS_{AB} / MS_{WG}\)
Error \(\sigma^2_{WG}\)

The table below shows how to construct F ratios when an experiment uses a Random-effects model.

Table 2. Random-Effects Model

Effect Mean square: Expected value F ratio
A \(\sigma^2_{WG} + n\,\sigma^2_{AB} + nq\,\sigma^2_{A}\) \(F_A = MS_A / MS_{AB}\)
B \(\sigma^2_{WG} + n\,\sigma^2_{AB} + np\,\sigma^2_{B}\) \(F_B = MS_B / MS_{AB}\)
AB \(\sigma^2_{WG} + n\,\sigma^2_{AB}\) \(F_{AB} = MS_{AB} / MS_{WG}\)
Error \(\sigma^2_{WG}\)

The table below shows how to construct F ratios when an experiment uses a mixed model. Here, Factor A is a fixed effect, and Factor B is a random effect.

Table 3. Mixed Model

Effect Mean square: Expected value F ratio
A (fixed) \(\sigma^2_{WG} + n\,\sigma^2_{AB} + nq\,\sigma^2_{A}\) \(F_A = MS_A / MS_{AB}\)
B (random) \(\sigma^2_{WG} + np\,\sigma^2_{B}\) \(F_B = MS_B / MS_{WG}\)
AB \(\sigma^2_{WG} + n\,\sigma^2_{AB}\) \(F_{AB} = MS_{AB} / MS_{WG}\)
Error \(\sigma^2_{WG}\)

How to Interpret F Ratios

For each F ratio in the tables above, notice that the numerator should equal the denominator when the variation due to the source effect ( \(\sigma^2_{\text{SOURCE}}\) ) is zero (i.e., when the source does not affect the dependent variable). And the numerator should be bigger than the denominator when the variation due to the source effect is not zero (i.e., when the source does affect the dependent variable).

Defined in this way, each F ratio is a convenient measure that we can use to test the null hypothesis about the effect of a source (Factor A, Factor B, or the AB interaction) on the dependent variable. Here's how to conduct the test:

  • When the F ratio is close to one, the numerator of the F ratio is approximately equal to the denominator. This indicates that the source did not affect the dependent variable, so we cannot reject the null hypothesis.
  • When the F ratio is significantly greater than one, the numerator is bigger than the denominator. This indicates that the source did affect the dependent variable, so we must reject the null hypothesis.

What does it mean for the F ratio to be significantly greater than one? To answer that question, we need to talk about the P-value.

In an experiment, a P-value is the probability of obtaining a result more extreme than the observed experimental outcome, assuming the null hypothesis is true.

With analysis of variance for a full factorial experiment, the F ratios are the observed experimental outcomes that we are interested in. So, the P-value would be the probability that an F ratio would be more extreme (i.e., bigger) than the actual F ratio computed from experimental data.

How does an experimenter attach a probability to an observed F ratio? Luckily, the F ratio is a random variable that has an F distribution . The degrees of freedom (v 1 and v 2 ) for the F ratio are the degrees of freedom associated with the effects used to compute the F ratio.

For example, consider the F ratio for Factor A in a fixed-effects model (both factors fixed). That F ratio (F A ) is computed from the following formula:

\(F_A = F(v_1, v_2) = MS_A / MS_{WG}\)

MS A (the numerator in the formula) has degrees of freedom equal to df A  ; so for F A  , v 1 is equal to df A  . Similarly, MS WG (the denominator in the formula) has degrees of freedom equal to df WG  ; so for F A  , v 2 is equal to df WG  . Knowing the F ratio and its degrees of freedom, we can use an F table or an online calculator to find the probability that an F ratio will be bigger than the actual F ratio observed in the experiment.
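In R, for example, the upper-tail probability of the F distribution gives the P-value directly; the values below are taken from the Factor A row of the example ANOVA table further down (F = 9.45 with 1 and 30 degrees of freedom).

pf(9.45, df1 = 1, df2 = 30, lower.tail = FALSE)   # compare with the P column of that table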

F Distribution Calculator

To find the P-value associated with an F ratio, use Stat Trek's free F distribution calculator .

For examples that show how to find the P-value for an F ratio, see Problem 1 or Problem 2 at the end of this lesson.

Hypothesis Test

Recall that the experimenter specified a significance level early on - before the first data point was collected. Once you know the significance level and the P-values, the hypothesis tests are routine. Here's the decision rule for accepting or rejecting a null hypothesis:

  • If the P-value is bigger than the significance level, accept the null hypothesis.
  • If the P-value is equal to or smaller than the significance level, reject the null hypothesis.

A "big" P-value for a source of variation (Factor A, Factor B, or the AB interaction) indicates that the source did not have a statistically significant effect on the dependent variable. A "small" P-value indicates that the source did have a statistically significant effect on the dependent variable.

Magnitude of Effect

The hypothesis tests tell us whether sources of variation in our experiment had a statistically significant effect on the dependent variable, but the tests do not address the magnitude of the effect. Here's the issue:

  • When the sample size is large, you may find that even small effects (indicated by a small F ratio) are statistically significant.
  • When the sample size is small, you may find that even big effects are not statistically significant.

With this in mind, it is customary to supplement analysis of variance with an appropriate measure of effect size. Eta squared (η 2 ) is one such measure. Eta squared is the proportion of variance in the dependent variable that is explained by a treatment effect. The eta squared formula for a main effect or an interaction effect is:

η 2 = SS EFFECT / SST

where SS EFFECT is the sum of squares for a particular treatment effect (i.e., Factor A, Factor B, or the AB interaction) and SST is the total sum of squares.

ANOVA Summary Table

It is traditional to summarize ANOVA results in an analysis of variance table. Here, filled with hypothetical data, is an analysis of variance table for a 2 x 3 full factorial experiment.

Analysis of Variance Table

Source SS df MS F P
A 13,225 p - 1 = 1 13,225 9.45 0.004
B 2450 q - 1 = 2 1225 0.88 0.427
AB 9650 (p-1)(q-1) = 2 4825 3.45 0.045
WG 42,000 pq(n - 1) = 30 1400
Total 67,325 npq - 1 = 35

In this experiment, Factors A and B were fixed effects; so F ratios were computed with that in mind. There were two levels of Factor A, so p equals two. And there were three levels of Factor B, so q equals three. And finally, each treatment group had six subjects, so n equals six. The table shows critical outputs for each main effect and for the AB interaction effect.

Many of the table entries are derived from the sum of squares (SS) and degrees of freedom (df), based on the following formulas:

MS A = SS A / df A = 13,225/1 = 13,225

MS B = SS B / df B = 2450/2 = 1225

MS AB = SS AB / df AB = 9650/2 = 4825

MS WG = SS WG / df WG = 42,000/30 = 1400

F A = MS A / MS WG = 13,225/1400 = 9.45

F B = MS B / MS WG = 2450/1400 = 0.88

F AB = MS AB / MS WG = 9650/1400 = 3.45

where MS A is mean square for Factor A, MS B is mean square for Factor B, MS AB is mean square for the AB interaction, MS WG is the within-groups mean square, F A is the F ratio for Factor A, F B is the F ratio for Factor B, and F AB is the F ratio for the AB interaction.

An ANOVA table provides all the information an experimenter needs to (1) test hypotheses and (2) assess the magnitude of treatment effects.
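As a sketch, the same kind of table can be produced in R with aov(); here it is applied to the hypothetical simulated data frame dat from the fixed-effects model sketch above (the factors must be coded as factors, not numbers).

dat$A <- factor(dat$A); dat$B <- factor(dat$B)
fit <- aov(X ~ A * B, data = dat)
summary(fit)    # reports SS, df, MS, F, and P for A, B, A:B, and residuals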

Hypothesis Tests

The P-value (shown in the last column of the ANOVA table) is the probability that an F statistic would be more extreme (bigger) than the F ratio shown in the table, assuming the null hypothesis is true. When a P-value for a main effect or an interaction effect is bigger than the significance level, we accept the null hypothesis for the effect; when it is smaller, we reject the null hypothesis.

For example, based on the F ratios in the table above, we can draw the following conclusions:

  • The P-value for Factor A is 0.004. Since the P-value is smaller than the significance level (0.05), we reject the null hypothesis that Factor A has no effect on the dependent variable.
  • The P-value for Factor B is 0.427. Since the P-value is bigger than the significance level (0.05), we cannot reject the null hypothesis that Factor B has no effect on the dependent variable.
  • The P-value for the AB interaction is 0.045. Since the P-value is smaller than the significance level (0.05), we reject the null hypothesis of no significant interaction. That is, we conclude that the effect of each factor varies, depending on the level of the other factor.

Magnitude of Effects

To assess the strength of a treatment effect, an experimenter can compute eta squared (η 2 ). The computation is easy, using sum of squares entries from an ANOVA table in the formula below:

η 2 = SS EFFECT / SST

where SS EFFECT is the sum of squares for the main or interaction effect being tested and SST is the total sum of squares.

To illustrate how this works, let's compute η 2 for the main effects and the interaction effect in the ANOVA table below:

Source SS df MS F P
A 100 2 50 2.5 0.09
B 180 3 60 3 0.04
AB 300 6 50 2.5 0.03
WG 960 48 20
Total 1540 59

Based on the table entries, here are the computations for eta squared (η 2 ):

η 2 A = SSA / SST = 100 / 1540 = 0.065

η 2 B = SSB / SST = 180 / 1540 = 0.117

η 2 AB = SSAB / SST = 300 / 1540 = 0.195

Conclusion: In this experiment, Factor A accounted for 6.5% of the variance in the dependent variable; Factor B, 11.7% of the variance; and the interaction effect, 19.5% of the variance.

Test Your Understanding

In the ANOVA table shown below, the P-value for Factor B is missing. Assuming Factors A and B are fixed effects , what is the correct entry for the missing P-value?

Source SS df MS F P
A 300 4 75 5.00 0.002
B 100 2 50 3.33 ???
AB 200 8 25 1.67 0.12
WG 900 60 15
Total 1500 74

Hint: Stat Trek's F Distribution Calculator may be helpful.

(A) 0.01 (B) 0.04 (C) 0.20 (D) 0.97 (E) 0.99

The correct answer is (B).

A P-value is the probability of obtaining a result more extreme (bigger) than the observed F ratio, assuming the null hypothesis is true. From the ANOVA table, we know the following:

  • The observed value of the F ratio for Factor B is 3.33.

F B = F(v 1 , v 2 ) = MS B / MS WG

  • The degrees of freedom (v 1 ) for the Factor B mean square (MS B ) is 2.
  • The degrees of freedom (v 2 ) for the within-groups mean square (MS WG ) is 60.

Therefore, the P-value we are looking for is the probability that an F with 2 and 60 degrees of freedom is greater than 3.33. We want to know:

P [ F(2, 60) > 3.33 ]

Now, we are ready to use the F Distribution Calculator . We enter the degrees of freedom (v1 = 2) for the Factor B mean square, the degrees of freedom (v2 = 60) for the within-groups mean square, and the F value (3.33) into the calculator; and hit the Calculate button.

The calculator reports that the probability that F is greater than 3.33 equals about 0.04. Hence, the correct P-value is 0.04.
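The same probability can be checked in R with the F distribution's upper tail:

pf(3.33, df1 = 2, df2 = 60, lower.tail = FALSE)   # approximately 0.04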

In the ANOVA table shown below, the P-value for Factor B is missing. Assuming Factors A and B are random effects , what is the correct entry for the missing P-value?

Source SS df MS F P
A 300 4 75 3.00 0.09
B 100 2 50 2.00 ???
AB 200 8 25 1.67 0.12
WG 900 60 15
Total 1500 74

(A) 0.01 (B) 0.04 (C) 0.20 (D) 0.80 (E) 0.96

The correct answer is (C).

  • The observed value of the F ratio for Factor B is 2.0.

F B = F(v 1 , v 2 ) = MS B / MS AB

  • The degrees of freedom (v 1 ) for the Factor B mean square (MS B ) is 2.
  • The degrees of freedom (v 2 ) for the AB interaction mean square (MS AB ) is 8.

Therefore, the P-value we are looking for is the probability that an F with 2 and 8 degrees of freedom is greater than 2.0. We want to know:

P [ F(2, 8) > 2.0 ]

Now, we are ready to use the F Distribution Calculator . We enter the degrees of freedom (v1 = 2) for the Factor B mean square, the degrees of freedom (v2 = 8) for the AB interaction mean square, and the F value (2.0) into the calculator; and hit the Calculate button.

The calculator reports that the probability that F is greater than 2.0 equals about 0.20. Hence, the correct P-value is 0.20.



[Figure: layout of the factor-level combinations for the three experimental factors A, B, and C]
Analysis of variance for the three factors and their interactions:

| Data Source | Degrees of Freedom | Adjusted Sum of Squares | Adjusted Mean Squares | T-Value | F-Value | p-Value |
| --- | --- | --- | --- | --- | --- | --- |
| A | 2 | 60,784 | 30,391.8 | −178.69 | 37,947.47 | 0.0011 |
| B | 4 | 37,638 | 9409.5 | −139.66 | 11,748.75 | 0.0013 |
| C | 4 | 26,362 | 6590.5 | −110.60 | 8229.01 | 0.0041 |
| A × B | 8 | 18,060 | 2257.5 | 102.17 | 2818.77 | 0.0062 |
| A × C | 8 | 6833 | 854.2 | 61.44 | 1066.52 | 0.0068 |
| B × C | 16 | 2766 | 172.9 | 31.16 | 215.83 | 0.0071 |
| A × B × C | 32 | 3070 | 95.9 | −7.45 | 119.80 | 0.0077 |
Soil conditioning test samples. The input variables are the three conditioning parameters, each recorded as c (%), the permeability coefficient before conditioning, and the resistivity of the sand; the output variables are the slump value and the permeability coefficient after conditioning. The outputs of samples 41–50 are to be predicted by the PSO–RVM model.

| Sample Number | c (%) | c (%) | c (%) | Permeability Coefficient Before Conditioning (×10 m/s) | Resistivity of Sand (Ω·m) | Slump Value (mm) | Permeability Coefficient After Conditioning (×10 m/s) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | 25 | 5 | 14 | 8.98 | 54.24 | 121 | 4.08 |
| 2 | 50 | 4 | 10 | 8.90 | 54.10 | 143 | 9.88 |
| 3 | 75 | 2 | 14 | 9.12 | 56.44 | 55 | 0.41 |
| 4 | 50 | 4 | 6 | 9.05 | 53.63 | 168 | 21.88 |
| 5 | 75 | 3 | 10 | 9.09 | 54.68 | 51 | 0.21 |
| 6 | 25 | 2 | 10 | 9.25 | 56.67 | 192 | 58.82 |
| 7 | 50 | 3 | 6 | 8.70 | 53.13 | 187 | 45.11 |
| 8 | 75 | 5 | 10 | 8.95 | 55.89 | 57 | 0.49 |
| 9 | 25 | 1 | 8 | 8.89 | 54.94 | 205 | 73.53 |
| 10 | 50 | 3 | 12 | 8.98 | 55.37 | 124 | 4.73 |
| 11 | 75 | 4 | 10 | 9.13 | 51.58 | 50 | 0.14 |
| 12 | 25 | 5 | 6 | 9.11 | 55.23 | 180 | 28.59 |
| 13 | 50 | 3 | 8 | 9.14 | 56.49 | 176 | 26.33 |
| 14 | 25 | 3 | 8 | 8.96 | 51.42 | 186 | 40.85 |
| 15 | 50 | 2 | 10 | 9.24 | 53.44 | 162 | 16.24 |
| 16 | 25 | 1 | 12 | 9.13 | 53.44 | 195 | 62.09 |
| 17 | 50 | 2 | 8 | 8.87 | 56.40 | 181 | 31.76 |
| 18 | 25 | 2 | 14 | 9.27 | 53.18 | 182 | 32.68 |
| 19 | 75 | 3 | 8 | 9.22 | 55.54 | 117 | 3.04 |
| 20 | 25 | 5 | 10 | 8.80 | 55.99 | 148 | 11.29 |
| 21 | 50 | 1 | 8 | 8.59 | 52.25 | 184 | 35.29 |
| 22 | 75 | 2 | 6 | 8.83 | 51.83 | 159 | 14.12 |
| 23 | 50 | 4 | 14 | 8.79 | 53.74 | 113 | 2.82 |
| 24 | 75 | 1 | 12 | 8.94 | 53.50 | 118 | 3.27 |
| 25 | 50 | 5 | 10 | 9.04 | 54.93 | 100 | 1.41 |
| 26 | 75 | 3 | 14 | 9.09 | 54.15 | 33 | 0.02 |
| 27 | 25 | 3 | 14 | 8.73 | 54.54 | 140 | 8.17 |
| 28 | 25 | 2 | 8 | 9.15 | 54.80 | 193 | 61.27 |
| 29 | 50 | 1 | 6 | 9.23 | 55.28 | 196 | 64.94 |
| 30 | 75 | 5 | 14 | 8.62 | 54.98 | 46 | 0.08 |
| 31 | 25 | 3 | 12 | 8.71 | 53.06 | 159 | 14.12 |
| 32 | 75 | 3 | 12 | 8.69 | 55.85 | 41 | 0.07 |
| 33 | 50 | 3 | 14 | 8.95 | 56.57 | 124 | 4.24 |
| 34 | 25 | 4 | 6 | 9.09 | 55.14 | 183 | 33.18 |
| 35 | 50 | 4 | 8 | 8.99 | 56.66 | 160 | 14.26 |
| 36 | 75 | 1 | 8 | 9.13 | 56.15 | 140 | 8.17 |
| 37 | 25 | 5 | 8 | 8.91 | 54.14 | 169 | 22.88 |
| 38 | 75 | 1 | 6 | 8.96 | 52.54 | 161 | 15.53 |
| 39 | 25 | 1 | 10 | 8.62 | 52.35 | 200 | 70.26 |
| 40 | 75 | 1 | 14 | 8.70 | 55.02 | 110 | 2.45 |
| 41 | 75 | 5 | 6 | 8.93 | 55.39 | To be predicted | To be predicted |
| 42 | 25 | 4 | 14 | 9.27 | 55.81 | To be predicted | To be predicted |
| 43 | 25 | 5 | 12 | 8.83 | 56.13 | To be predicted | To be predicted |
| 44 | 50 | 1 | 10 | 9.18 | 52.78 | To be predicted | To be predicted |
| 45 | 25 | 4 | 12 | 8.72 | 54.51 | To be predicted | To be predicted |
| 46 | 25 | 3 | 10 | 9.26 | 56.66 | To be predicted | To be predicted |
| 47 | 50 | 1 | 12 | 9.23 | 55.51 | To be predicted | To be predicted |
| 48 | 25 | 2 | 12 | 8.73 | 51.44 | To be predicted | To be predicted |
| 49 | 25 | 1 | 14 | 9.03 | 56.44 | To be predicted | To be predicted |
| 50 | 50 | 2 | 6 | 9.17 | 55.12 | To be predicted | To be predicted |
Summary statistics of the model variables:

| Layer | Variable | Minimum | Maximum | Standard Deviation | Dispersion Coefficient | Coefficient of Skewness | Coefficient of Kurtosis |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Input layer | c (%) | 25 | 75 | 20.530 | 0.416 | 0.046 | −1.516 |
| Input layer | c (%) | 1 | 5 | 1.382 | 0.481 | 0.111 | −1.164 |
| Input layer | c (%) | 6 | 14 | 2.810 | 0.282 | 0.153 | −1.253 |
| Input layer | Permeability coefficient (before conditioning) (m/s) | 8.59 | 9.27 | 0.190 | 0.021 | −0.302 | −0.933 |
| Input layer | Resistivity of sand (Ω·m) | 51.42 | 56.67 | 1.479 | 0.027 | −0.327 | −0.833 |
| Output layer | Slump value (mm) | 33 | 205 | 50.752 | 0.362 | −0.771 | −0.606 |
| Output layer | Permeability coefficient (after conditioning) (m/s) | 0.02 | 73.53 | 22.260 | 1.049 | 0.996 | −0.233 |


5.1 - Factorial Designs with Two Treatment Factors

For now we will just consider two treatment factors of interest. The model looks almost the same as the randomized block design model, only now we include an interaction term:

\(Y_{ijk} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + e_{ijk}\)

where \(i = 1, \dots, a\), \(j = 1, \dots, b\), and \(k = 1, \dots, n\). Thus we have two factors in a factorial structure with n observations per cell. As usual, we assume the \(e_{ijk} \sim N(0, \sigma^2)\) are independent and identically distributed normal errors. Although the notation \((\alpha\beta)_{ij}\) looks like a product, it does not imply a multiplicative interaction; it is simply a separate parameter for each combination of factor levels.
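To make the model concrete, here is a minimal sketch in Python (not from the original lesson) that simulates a small 2 × 2 data set with n = 3 replicates per cell and fits the effects model with an interaction term using statsmodels; the cell means and the random seed are hypothetical choices.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.stats.anova import anova_lm

rng = np.random.default_rng(42)

# Hypothetical true cell means (the same values used in Example 1 below)
true_mean = {("a1", "b1"): 5, ("a1", "b2"): 11,
             ("a2", "b1"): 9, ("a2", "b2"): 15}

# Simulate n = 3 replicates per cell: Y_ijk = mu_ij + e_ijk, with e ~ N(0, 1)
rows = [{"A": a, "B": b, "y": m + rng.normal(0, 1)}
        for (a, b), m in true_mean.items() for _ in range(3)]
df = pd.DataFrame(rows)

# Effects model with interaction: C(A)*C(B) expands to A + B + A:B
fit = smf.ols("y ~ C(A) * C(B)", data=df).fit()
print(anova_lm(fit, typ=2))  # rows: C(A), C(B), C(A):C(B), Residual
```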

The Effects Model vs. the Means Model

The cell means model is written:

\(Y_{ijk}=\mu_{ij} + e_{ijk}\)

Here the cell means are: \(\mu_{11}, \dots , \mu_{1b}, \dots , \mu_{a1}, \dots, \mu_{ab}\). Therefore we have a × b cell means, \(\mu_{ij}\). We will define our marginal means as the simple average over our cell means as shown below:

\(\bar{\mu}_{i.}=\frac{1}{b} \sum\limits_j \mu_{ij}\), \(\bar{\mu}_{.j}=\frac{1}{a} \sum\limits_i \mu_{ij}\)

From the cell means structure we can talk about marginal means and row and column means. But first we want to look at the effects model and define more carefully what the interactions are. We can write the cell means in terms of the full effects model:

\(\mu_{ij} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij}\)

It follows that the interaction terms \((\alpha \beta)_{ij}\) are defined as the difference between our cell means and the additive portion of the model:

\((\alpha\beta)_{ij} = \mu_{ij} - (\mu + \alpha_i + \beta_j) \)

If the true model structure is additive then the interaction terms \((\alpha \beta)_{ij}\) are equal to zero. Then we can say that the true cell means, \(\mu_{ij} = \mu + \alpha_i + \beta_j\), have additive structure.

Example 1

Let's illustrate this by considering the true means \(\mu_{ij} \colon\)

| \(\mu_{ij}\) | B = 1 | B = 2 | \(\bar{\mu}_{i.}\) | \(\alpha_i\) |
| --- | --- | --- | --- | --- |
| A = 1 | 5 | 11 | 8 | -2 |
| A = 2 | 9 | 15 | 12 | 2 |
| \(\bar{\mu}_{.j}\) | 7 | 13 | 10 |  |
| \(\beta_j\) | -3 | 3 |  |  |

Note that both a and b are 2, thus our marginal row means are 8 and 12, and our marginal column means are 7 and 13. Next, let's calculate the \(\alpha\) and the \(\beta\) effects; since the overall mean is 10, our \(\alpha\) effects are -2 and 2 (which sum to 0), and our \(\beta\) effects are -3 and 3 (which also sum to 0). If you plot the cell means you get two lines that are parallel.

The difference between the two means at the first \(\beta\) factor level is 9 - 5 = 4. The difference between the means for the second \(\beta\) factor level is 15 - 11 = 4. We can say that the effect of \(\alpha\) at the first level of \(\beta\) is the same as the effect of \(\alpha\) at the second level of \(\beta\). Therefore we say that there is no interaction and as we will see the interaction terms are equal to 0.

This example simply illustrates that the cell means, in this case, have additive structure. A complication with real data is that you do not know in advance whether the effects are additive or not. Because of random error, the interaction terms are seldom exactly zero. You may be faced with a situation that is either additive or non-additive, and the first task is to decide between them.
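The arithmetic in Example 1 is easy to do by hand, but a short sketch makes the definitions explicit. Assuming the cell means 5, 11, 9, 15 above, it computes the marginal means, the \(\alpha_i\) and \(\beta_j\) effects, and the interaction terms, and confirms that the interactions are all zero.

```python
import numpy as np

# True cell means from Example 1: rows are levels of A, columns are levels of B
mu = np.array([[5.0, 11.0],
               [9.0, 15.0]])

grand = mu.mean()                    # overall mean: 10
alpha = mu.mean(axis=1) - grand      # row (A) effects: [-2, 2]
beta = mu.mean(axis=0) - grand       # column (B) effects: [-3, 3]

# Interaction = cell mean minus the additive part mu + alpha_i + beta_j
additive = grand + alpha[:, None] + beta[None, :]
interaction = mu - additive

print(alpha, beta)
print(interaction)   # all zeros -> the cell means have additive structure
```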

Now consider the non-additive case, which we illustrate in Example 5.2 below.

Example 5.2

This example was constructed so that the marginal means and the overall means are the same as in Example 1. However, it does not have additive structure.

Using the definition of interaction:

\((\alpha \beta)_{ij} = \mu_{ij} - (\mu + \alpha_i + \beta_j)\)

which gives us \((\alpha \beta)_{ij}\) interaction terms that are -2, 2, 2, -2. Again, by the definition of our interaction effects, these \((\alpha \beta)_{ij}\) terms should sum to zero in both directions.
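The cell-means table for Example 5.2 did not survive extraction, but it can be rebuilt from what the text states: the same \(\mu\), \(\alpha_i\), and \(\beta_j\) as Example 1 plus interaction terms of -2, 2, 2, -2. The sketch below assumes those four values are listed in the order (1,1), (1,2), (2,1), (2,2); under that assumption the cell means come out to 3, 13, 11, 13, which reproduce the stated marginal means of 8, 12 and 7, 13.

```python
import numpy as np

grand = 10.0
alpha = np.array([-2.0, 2.0])    # A effects, as in Example 1
beta = np.array([-3.0, 3.0])     # B effects, as in Example 1
ab = np.array([[-2.0, 2.0],      # interaction terms, assumed order
               [2.0, -2.0]])     # (1,1), (1,2), (2,1), (2,2)

# mu_ij = mu + alpha_i + beta_j + (alpha*beta)_ij
mu = grand + alpha[:, None] + beta[None, :] + ab
print(mu)               # [[ 3. 13.], [11. 13.]]
print(mu.mean(axis=1))  # row marginal means: [ 8. 12.]
print(mu.mean(axis=0))  # column marginal means: [ 7. 13.]
```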

We generally call the \(\alpha_i\) terms the treatment effects for factor A, the \(\beta_j\) terms the treatment effects for factor B, and the \((\alpha \beta)_{ij}\) terms the interaction effects.

The model we have written gives us a way to represent in a mathematical form a two-factor design, whether we use the means model or the effects model, i.e.,

\(Y_{ijk} = \mu_{ij} + e_{ijk}\)

There is really no benefit to the effects model when there is interaction, except that it gives us a mechanism for partitioning the variation due to the two treatments and their interactions. Both models have the same number of distinct parameters. However, when there is no interaction then we can remove the interaction terms from the model and use the reduced additive model.

Now, we'll take a look at the strategy for deciding whether our model fits, whether the assumptions are satisfied and then decide whether we can go forward with an interaction model or an additive model. This is the first decision. When you can eliminate the interactions because they are not significantly different from zero, then you can use the simpler additive model. This should be the goal whenever possible because then you have fewer parameters to estimate, and a simpler structure to represent the underlying scientific process.

Before we get to the analysis, however, we want to introduce another definition of effects - rather than defining the \(\alpha_i\) effects as deviation from the mean, we can look at the difference between the high and the low levels of factor A . These are two different definitions of effects that will be introduced and discussed in this chapter and the next, the \(\alpha_i\) effects and the difference between the high and low levels, which we will generally denote as the A effect.

Factorial Designs with 2 Treatment Factors, cont'd

For a completely randomized design, which is what we discussed for the one-way ANOVA, we need to have n × a × b = N total experimental units available. We randomly assign n of those experimental units to each of the a × b treatment combinations. For the moment we will only consider the model with fixed effects and constant experimental random error.

The model is:

\(Y_{ijk} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + e_{ijk}\), where \(i = 1, \dots , a\), \(j = 1, \dots , b\), and \(k = 1, \dots , n\).

Read the text section 5.3.2 for the definitions of the means and the sum of squares.

Testing Hypotheses

We can test the hypotheses that the marginal means are all equal, or in terms of the definition of our effects that the \(\alpha_i\)'s are all equal to zero, and the hypothesis that the \(\beta_j\)'s are all equal to zero. And, we can test the hypothesis that the interaction effects are all equal to zero. The alternative hypotheses are that at least one of those effects is not equal to zero.

How do we do these tests, in what order, and how do we interpret the results?

One of the purposes of a factorial design is to be efficient about estimating and testing factors A and B in a single experiment. Often we are primarily interested in the main effects. Sometimes, we are also interested in knowing whether the factors interact. In either case, the first test we should do is the test on the interaction effects.

The Test of \(H_0\colon (\alpha\beta)_{ij}=0\)

If the interaction is significant, i.e. its p-value is less than your chosen cutoff, then what do we do? A significant interaction term tells us that the effect of A is different at each level of B. Or, saying it the other way around, the effect of B differs at each level of A. Therefore, when we have significant interaction, it is not very sensible to talk about the main effects of A and B, because these change depending on the level of the other factor. If the interaction is significant, then we want to estimate and focus our attention on the cell means. If the interaction is not significant, then we can test the main effects and focus on the main effect means.
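Here is a minimal sketch of that decision rule, reusing the hypothetical two-factor layout from the earlier code: test the interaction first, then either summarize the cell means or fall back to the main-effect (marginal) means. The function name and the 0.05 cutoff are illustrative choices, not part of the lesson.

```python
import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.stats.anova import anova_lm

def analyze_two_factor(df: pd.DataFrame, sig_level: float = 0.05) -> None:
    """Fit the interaction model and test the A x B interaction first."""
    fit = smf.ols("y ~ C(A) * C(B)", data=df).fit()
    table = anova_lm(fit, typ=2)
    print(table)

    if table.loc["C(A):C(B)", "PR(>F)"] < sig_level:
        # Significant interaction: main effects are not separately interpretable,
        # so summarize and compare the cell means instead.
        print(df.groupby(["A", "B"])["y"].mean())
    else:
        # No evidence of interaction: interpret the main-effect (marginal) means.
        print(df.groupby("A")["y"].mean())
        print(df.groupby("B")["y"].mean())
```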

The estimates of the interaction and main effects are given in the text in section 5.3.4.

Note that the estimates of the marginal means for A are the corresponding sample marginal means:

\(\bar{y}_{i..}=\dfrac{1}{bn} \sum\limits_j \sum\limits_k y_{ijk}\), with \(var(\bar{y}_{i..})=\dfrac{\sigma^2}{bn}\)

A similar formula holds for factor B , with

\(var(\bar{y}_{.j.})=\dfrac{\sigma^2}{an}\)

Just the form of these variances tells us something about the efficiency of the two-factor design. A benefit of a two-factor design is that each marginal mean is based on n × b observations for factor A and n × a observations for factor B. The factorial structure, when you do not have interactions, gives us the efficiency benefit of this additional replication: the number of observations per cell times the number of levels of the other factor. This benefit arises from factorial experiments rather than single-factor experiments with n observations per cell. An alternative design choice could have been to do two one-way experiments, one with a treatments and the other with b treatments, each with n observations per cell. However, these two experiments would not have provided the same level of precision, nor the ability to test for interactions.

Another practical question: If the interaction test is not significant what should we do?

Do we remove the interaction term from the model? You might consider dropping that term. If n is very small and your df for error are small, then this may be a critical issue. There is a 'rule of thumb' that I sometimes use in these cases: if the p-value for the interaction test is greater than 0.25, then you can drop the interaction term. This is not an exact cutoff but a general rule. Remember, if you drop the interaction term, the variation accounted for by SSAB becomes part of the error, increasing SSE; however, your error df also become larger, in some cases enough to increase the power of the tests for the main effects. Statistical theory shows that, in general, dropping the interaction term increases your false rejection rate for subsequent tests. Hence we usually do not drop nonsignificant terms when there are adequate sample sizes. However, if we were running an independent experiment with the same factors, we might not include the interaction in the model for that experiment.

What if n = 1, and we have only 1 observation per cell? If n = 1 then we have 0 df for SSerror and we cannot estimate the error variance with MSE. What should we do in order to test our hypothesis? We obviously cannot perform the test for interaction because we have no error term.

If you are willing to assume, and if it is true, that there is no interaction, then you can use the interaction mean square as your F-test denominator for testing the main effects. This is a fairly safe and conservative thing to do. If the assumption is wrong, MSAB will tend to be larger than it should be, so the F-test is conservative: you are unlikely to declare a main effect significant when it is not real. You will not inflate the Type I error rate, but you are more likely to miss a real effect, i.e. to make a Type II error.
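A hedged sketch of the n = 1 situation follows, using a hypothetical unreplicated 3 × 4 layout. With one observation per cell the saturated model leaves zero degrees of freedom for error, so we fit the additive model instead; its residual is exactly the A × B interaction, and the main-effect F-tests then use that mean square as the denominator.

```python
import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.stats.anova import anova_lm

# Hypothetical unreplicated 3 x 4 layout: exactly one observation per cell
data = pd.DataFrame({
    "A": ["a1"] * 4 + ["a2"] * 4 + ["a3"] * 4,
    "B": ["b1", "b2", "b3", "b4"] * 3,
    "y": [12.1, 14.0, 15.2, 16.8,
          13.3, 15.1, 16.0, 18.2,
          14.8, 16.9, 17.5, 19.9],
})

# Additive model: the residual sum of squares here is the A x B interaction SS,
# so the main-effect F-tests implicitly use MS_AB as the error term.
fit = smf.ols("y ~ C(A) + C(B)", data=data).fit()
print(anova_lm(fit, typ=2))   # Residual df = (a-1)(b-1) = 6
```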

Extension to a 3-Factor Model

The factorial model with three factors can be written as:

\(Y_{ijkl} = \mu + \alpha_i + \beta_j + \gamma_k + (\alpha \beta)_{ij} + (\alpha \gamma)_{ik} + (\beta \gamma)_{jk} + (\alpha \beta \gamma)_{ijk} + e_{ijkl}\)

where \(i = 1, \dots , a, j = 1 , \dots , b, k = 1 , \dots , c, l = 1 , \dots , n\)

We extend the model in the same way. Our analysis of variance has three main effects, three two-way interactions, a three-way interaction and error. If this were conducted as a Completely Randomized Design experiment, each of the a × b × c treatment combinations would be randomly assigned to n of the experimental units.
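In formula notation the three-factor model is a one-line extension of the two-factor fit; the sketch below uses a hypothetical 2 × 3 × 2 layout with two replicates per cell and pure-noise responses, just to show the structure of the resulting ANOVA table.

```python
import itertools
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.stats.anova import anova_lm

rng = np.random.default_rng(0)

# Hypothetical 2 x 3 x 2 factorial with n = 2 replicates per cell
levels = {"A": ["a1", "a2"], "B": ["b1", "b2", "b3"], "C": ["c1", "c2"]}
rows = [{"A": a, "B": b, "C": c, "y": rng.normal(50, 2)}
        for a, b, c in itertools.product(*levels.values()) for _ in range(2)]
df = pd.DataFrame(rows)

# C(A)*C(B)*C(C) expands to the three main effects, the three two-way
# interactions, and the three-way interaction of the model above.
fit = smf.ols("y ~ C(A) * C(B) * C(C)", data=df).fit()
print(anova_lm(fit, typ=2))
```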

Sample Size Determination [Section 5.3.5]

We first consider the two-factor case where N = a × b × n (n = the number of replicates per cell). The non-centrality parameter for calculating sample size for the A factor is:

\(\phi^2 = ( nb \times D^{2}) / ( 2a \times \sigma^2)\)

where D is the difference between the maximum and the minimum of the \(\bar{\mu}_{i.}\), and where nb is the number of observations at each level of factor A.

Actually, at the beginning of our design process, we should decide how many observations we should take, if we want to find a difference of D , between the maximum and the minimum of the true means for the factor A . There is a similar equation for factor B .

\(\phi^{2} = ( na \times D^{2} ) / ( 2b \times \sigma^{2})\)

where na is the number of observations in each level of factor B .

In the two-factor case, this is just an extension of what we did in the one-factor case. But now the marginal means benefit from both the number of observations per cell and the number of levels of the other factor. Here we have n observations per cell and b cells at each level of factor A, so each marginal mean for A is based on nb observations.
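As a rough sketch of how this planning calculation might be automated, the code below evaluates \(\Phi^2\) for factor A over a range of n and converts it to power using the noncentral F distribution, taking the noncentrality parameter to be \(\lambda = a\Phi^2\) (the usual relationship between \(\Phi^2\) and \(\lambda\)). The values of a, b, D, \(\sigma\), and the significance level are hypothetical planning inputs.

```python
from scipy.stats import f, ncf

# Hypothetical planning values
a, b = 3, 4            # levels of factor A and factor B
D = 2.0                # smallest difference in A marginal means worth detecting
sigma = 1.5            # guess at the error standard deviation
alpha_level = 0.05     # significance level of the F-test

for n in range(2, 7):                     # candidate replicates per cell
    phi_sq = (n * b * D**2) / (2 * a * sigma**2)
    nc = a * phi_sq                       # noncentrality parameter lambda
    df1, df2 = a - 1, a * b * (n - 1)     # numerator / error degrees of freedom
    f_crit = f.ppf(1 - alpha_level, df1, df2)
    power = ncf.sf(f_crit, df1, df2, nc)  # P(reject H0 | difference D)
    print(f"n = {n}: Phi^2 = {phi_sq:.2f}, power = {power:.3f}")
```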

