Read the first two articles in the series:
Interaction Effects in Designed Experiments
Some Comments on "Historical" Designed Experiments

Randomization in Designed Experiments

by Keith M. Bower

As part of a Six Sigma project, a practitioner may have to perform a designed experiment to assess the effect of a change in one or more variables on a response. The need to randomize the order of experimental runs is not always obvious. Using factor level settings in a "natural" sequence (running the experiment with the lowest settings first and increasing to the highest settings) may, for example, seem simpler from a practical perspective. This discussion, the third and final article in a series on misconceptions regarding designed experiments, will show, however, that neglecting randomization can lead to incorrect conclusions.

True, for certain experiments randomization may incur practical problems, including high financial costs. An obvious example involves the use of a blast furnace, where temperature alterations would be time-consuming and very expensive. Fortunately, there are approaches that address such a restriction on randomization.1

Essentially, randomization allows for the valid comparison of effects when other, possibly unknown, factors may impact the response variable. To illustrate the advantages of randomizing an experimental design, consider the following hypothetical example.

Example

A chemical reaction is under study, with the resulting yield measured using an adequate measurement system. The goal is to assess the effect on yield at different temperatures. Four fixed temperatures (100°C, 120°C, 140°C, and 160°C) are to be investigated. Ten runs will be performed using each of the four temperature settings, leading to forty runs in total.

There is some concern that other factors, including ambient humidity in the test laboratory, may affect the results. Unfortunately, these variables cannot explicitly be accounted for in the experiment. Consider these effects as adding a nonstationary disturbance to the resulting yield.2 Figure 1 shows the magnitude of the disturbance, which is unknown to the experimenter. The disturbance lowers the resulting yield for most of the day, and most strongly in the late afternoon (see the Disturbance column of Table 1 in the Appendix).

Figure 1

This example uses simulated data. The results associated with the temperature settings of 100°C, 120°C, and 140°C are thirty random values from a Normal distribution with mean of 50 and standard deviation of 5 units.

The ten results for the temperature setting of 160°C are randomly sampled from a Normal distribution with mean of 57 and standard deviation of 5 units. That is, the mean is 1.4 standard deviations greater than the other factor levels.
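The simulation setup described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the article does not say which software or random seed produced the simulated values, so the function name `simulate_yields` and the seed are hypothetical, and the numbers generated will not match those in the Appendix.

```python
import random

def simulate_yields(seed=0):
    """Draw ten Normal values per temperature setting (hypothetical helper).

    Means and standard deviation follow the article: mean 50 for the
    100, 120, and 140 settings, mean 57 for the 160 setting, SD 5 throughout.
    The seed is an assumption made only for reproducibility.
    """
    rng = random.Random(seed)
    means = {100: 50.0, 120: 50.0, 140: 50.0, 160: 57.0}
    sd = 5.0
    return {temp: [rng.gauss(mu, sd) for _ in range(10)]
            for temp, mu in means.items()}

simulated = simulate_yields()

# Size of the built-in shift, expressed in standard deviations.
effect_in_sd = (57.0 - 50.0) / 5.0
print(effect_in_sd)
```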

Two scenarios will help investigate the ability of an experiment to detect the difference between the 160°C setting and the other temperature settings:

  • Scenario 1 uses the approach of running the experiment in the order of increasing temperature over time. Therefore, the ten runs with temperature at 100°C will be run first, followed by the ten runs at 120°C, then the ten runs at 140°C, and finally the ten runs at 160°C.
  • Scenario 2 uses the randomization procedure. The forty simulated values are randomized throughout the study period. Adding the disturbances to the simulated values then returns the "actual" results which would ultimately be analyzed.
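The two run orders can be generated as follows. This is a minimal sketch, not the procedure actually used to build Table 2: the helper names `natural_order` and `randomized_order` and the seed value are assumptions introduced for illustration.

```python
import random

TEMPS = [100, 120, 140, 160]   # temperature settings
RUNS_PER_TEMP = 10             # ten runs at each setting

def natural_order():
    """Scenario 1: all runs at the lowest setting first, then the next, and so on."""
    return [t for t in TEMPS for _ in range(RUNS_PER_TEMP)]

def randomized_order(seed=None):
    """Scenario 2: the same forty runs, shuffled into a random sequence."""
    order = natural_order()
    random.Random(seed).shuffle(order)
    return order

scenario1 = natural_order()
scenario2 = randomized_order(seed=42)   # seed is arbitrary
print(scenario2)
```

Shuffling the complete list of forty runs keeps the design balanced: every setting still appears exactly ten times, and only the time slot assigned to each run changes.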

It is crucial to keep in mind that these are fake data; their sole purpose is to illustrate what would have been observed under the two scenarios.

Example Scenario 1: No Randomization

In Scenario 1 of the experiment to test the effect of temperature on the yield of a chemical reaction, the first experimental runs use the lowest temperature setting (100°C); subsequent runs build up to the highest temperature setting (160°C) over the course of a day. For this simulation, the nonstationary disturbance (the size of which is unknown to the experimenters) shown in Figure 1 is added to the simulated values. The final column of Table 1 of the Appendix lists the resulting yields (in mg), also represented in Figures 2 and 3.

Figure 2

Figure 3

A one-way Analysis of Variance (ANOVA) tests the null hypothesis that the mean yields resulting from the chemical reactions are equal.3

As Figure 4 shows, the P-value for the null hypothesis of equal means exceeds the 0.05 significance level (P-value = 0.226 > 0.05). We therefore fail to reject the null hypothesis of equal means; that is, the experiment provides no evidence that Temperature affects the response. The test thus failed to detect the difference in response between the 160°C setting and the other temperature settings, even though that difference was built into the simulated data.

Figure 4

General Linear Model: Yield (mg) versus Temp (C)

Factor     Type    Levels   Values
Temp (C)   fixed   4        100, 120, 140, 160

Analysis of Variance for Yield (mg), using Adjusted SS for Tests

Source     DF   Seq SS    Adj SS    Adj MS   F      P
Temp (C)    3    128.34    128.34    42.78   1.52   0.226
Error      36   1014.09   1014.09    28.17
Total      39   1142.43
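The entries in the ANOVA table are linked by simple arithmetic: each mean square is a sum of squares divided by its degrees of freedom, and F is the ratio of the two mean squares. The sketch below rechecks the Scenario 1 figures from the sums of squares reported in Figure 4; the function name `anova_row` is hypothetical, and the P-value is not recomputed because it requires the F distribution.

```python
def anova_row(ss_factor, df_factor, ss_error, df_error):
    """Return (MS factor, MS error, F) from sums of squares and degrees of freedom."""
    ms_factor = ss_factor / df_factor
    ms_error = ss_error / df_error
    return ms_factor, ms_error, ms_factor / ms_error

# Degrees of freedom for 4 temperature levels and 40 runs in total.
df_temp = 4 - 1
df_error = 40 - 4

# Sums of squares taken from Figure 4 (Scenario 1).
ms_temp, ms_err, f_stat = anova_row(128.34, df_temp, 1014.09, df_error)
print(round(ms_temp, 2), round(ms_err, 2), round(f_stat, 2))
```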

Example Scenario 2: Randomization Used

The experimenter uses the four temperature settings (100°C, 120°C, 140°C, and 160°C) in a random sequence over the course of the day. The same simulated results are employed as in Scenario 1, though the resulting yields will differ owing to the addition of the nonstationary disturbance. In Table 2 of the Appendix, note that the disturbance column is identical to that used in Scenario 1; however, the "simulated values" have been randomized throughout the period of study. Figure 5 shows the time series plot of the resulting yields.

Figure 5

As shown in Figure 6, a statistically significant difference exists between the treatment means (P-value = 0.003 < 0.05). Using John Tukey's multiple comparison test, differences occur at the α = 0.05 significance level between:

Temp = 120°C and 160°C (P-value = 0.0023)
Temp = 140°C and 160°C (P-value = 0.0178)

A weaker difference, suggestive but not significant at the α = 0.05 level, also occurs between Temp = 100°C and 160°C:

Temp = 100°C and 160°C (P-value = 0.0649)

No other differences between the Temperature levels appear significant.

Figure 6

General Linear Model: Yield (mg) versus Temp (C)

Factor     Type    Levels   Values
Temp (C)   fixed   4        100, 120, 140, 160

Analysis of Variance for Yield (mg), using Adjusted SS for Tests

Source     DF   Seq SS    Adj SS    Adj MS   F      P
Temp (C)    3    457.28    457.28   152.43   5.68   0.003
Error      36    966.64    966.64    26.85
Total      39   1423.91

Tukey Simultaneous Tests
Response Variable Yield (mg)
All Pairwise Comparisons among Levels of Temp (C)

Temp (C) = 100 subtracted from:

Temp (C)   Difference of Means   SE of Difference   T-Value   Adjusted P-Value
120        -3.028                2.317              -1.307    0.5648
140        -1.258                2.317              -0.543    0.9479
160         5.974                2.317               2.578    0.0649

Temp (C) = 120 subtracted from:

Temp (C)   Difference of Means   SE of Difference   T-Value   Adjusted P-Value
140         1.770                2.317               0.7638   0.8701
160         9.002                2.317               3.8846   0.0023

Temp (C) = 140 subtracted from:

Temp (C)   Difference of Means   SE of Difference   T-Value   Adjusted P-Value
160         7.232                2.317               3.121    0.0178
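The "SE of Difference" and "T-Value" columns can be rechecked from the error mean square in Figure 6. The sketch below assumes the standard balanced-design formula, SE = sqrt(MSE × (1/n + 1/n)) with n = 10 runs per level, which is consistent with the reported value of 2.317. The adjusted P-values are not reproduced, since they depend on the studentized range distribution; small discrepancies in the last digit can arise because the rounded MSE is used.

```python
import math

mse = 26.85        # Adj MS for Error, from Figure 6
n_per_level = 10   # runs at each temperature setting

# Standard error of a difference between two level means (balanced design).
se_diff = math.sqrt(mse * (1 / n_per_level + 1 / n_per_level))

# Differences of means reported in Figure 6 for comparisons with the 160 setting.
differences = {(100, 160): 5.974, (120, 160): 9.002, (140, 160): 7.232}
t_values = {pair: diff / se_diff for pair, diff in differences.items()}

print(round(se_diff, 3))
for pair, t in t_values.items():
    print(pair, round(t, 2))
```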

In essence, by randomizing the sequence of runs over the course of the day, the effect of the nonstationary disturbance is "averaged out," leading to a good indication of the actual differences. The main effects plot in Figure 7 shows the resulting treatment means.
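The "averaging out" can be made concrete by computing the mean disturbance that falls on each temperature setting under the two run orders. The sketch below uses the Disturbance column and the Temp (C) columns of Tables 1 and 2 in the Appendix; the helper names `mean_disturbance_by_temp` and `spread` are hypothetical.

```python
from collections import defaultdict

# Disturbance column (identical in Tables 1 and 2), in time order from 9:00 am.
DISTURBANCE = [
    -2.314, -3.334, -4.520, -4.238, -3.182, -3.029, -1.263, 0.264, -0.699, -0.917,
    -0.970, -1.330, -2.335, -2.119, -3.403, -3.618, -3.255, -5.041, -5.579, -4.743,
    -5.880, -5.311, -5.757, -5.370, -3.998, -3.945, -2.855, -3.464, -3.133, -4.351,
    -5.429, -5.981, -6.319, -6.760, -7.649, -9.767, -9.643, -9.343, -8.432, -7.294,
]

# Temp (C) column of Table 1 (no randomization) and of Table 2 (randomized).
SCENARIO1_TEMPS = [100] * 10 + [120] * 10 + [140] * 10 + [160] * 10
SCENARIO2_TEMPS = [
    160, 160, 100, 100, 100, 140, 140, 140, 100, 100,
    160, 120, 140, 160, 140, 100, 140, 120, 100, 120,
    120, 140, 120, 160, 140, 160, 100, 140, 160, 160,
    160, 120, 120, 120, 160, 140, 100, 120, 100, 120,
]

def mean_disturbance_by_temp(temps, disturbance):
    """Average the disturbance over the runs assigned to each temperature."""
    groups = defaultdict(list)
    for temp, d in zip(temps, disturbance):
        groups[temp].append(d)
    return {temp: sum(vals) / len(vals) for temp, vals in sorted(groups.items())}

def spread(means):
    """Gap between the most and least disturbed temperature settings."""
    return max(means.values()) - min(means.values())

s1 = mean_disturbance_by_temp(SCENARIO1_TEMPS, DISTURBANCE)
s2 = mean_disturbance_by_temp(SCENARIO2_TEMPS, DISTURBANCE)
print(s1, spread(s1))
print(s2, spread(s2))
```

With these values, the 160 setting in Scenario 1 carries a mean disturbance of roughly -7.7 against roughly -2.3 for the 100 setting, which offsets most of the 7-unit shift built into the simulation. In Scenario 2 the mean disturbances lie much closer together (roughly -3.6 to -5.8), so the built-in shift remains visible.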

Figure 7

Summary

A nonstationary disturbance that affects the results of a designed experiment can lead practitioners to draw erroneous conclusions when the runs are performed in a systematic order. From a practical perspective, randomizing the sequence in which runs are performed is therefore imperative.


References

  1. For more information on experimental designs with hard-to-change variables, see "One Hard-to-Change Factor,"
  2. For information on nonstationary disturbances, see George E. P. Box, "Must We Randomize Our Experiment?" Part B.7. in Box on Quality and Discovery: With Design, Control, and Robustness (New York: John Wiley and Sons, Inc., 2000), 84-89.
  3. For information on the ANOVA procedure and multiple comparison tests, see Keith M. Bower, "Analysis of Variance (ANOVA) Using MINITAB," Scientific Computing & Instrumentation 17, no. 3 (2000): 64-65.

Appendix

Table 1

Temp (C) Time of Day Disturbance Simulation ID Simulated values Yield (Disturbance + Simulated values)
100 9:00 am -2.314 100(1) 50.325 48.01
100 9:10 am -3.334 100(2) 55.921 52.59
100 9:20 am -4.520 100(3) 50.624 46.10
100 9:30 am -4.238 100(4) 60.596 56.36
100 9:40 am -3.182 100(5) 52.601 49.42
100 9:50 am -3.029 100(6) 46.921 43.89
100 10:00 am -1.263 100(7) 51.841 50.58
100 10:10 am 0.264 100(8) 46.234 46.50
100 10:20 am -0.699 100(9) 50.263 49.56
100 10:30 am -0.917 100(10) 54.533 53.62
120 10:40 am -0.970 120(1) 56.248 55.28
120 10:50 am -1.330 120(2) 50.956 49.63
120 11:00 am -2.335 120(3) 48.664 46.33
120 11:10 am -2.119 120(4) 53.125 51.01
120 11:20 am -3.403 120(5) 50.041 46.64
120 11:30 am -3.618 120(6) 59.641 56.02
120 11:40 am -3.255 120(7) 48.663 45.41
120 11:50 am -5.041 120(8) 44.936 39.90
120 12:00 pm -5.579 120(9) 48.531 42.95
120 12:10 pm -4.743 120(10) 43.526 38.78
140 12:20 pm -5.880 140(1) 39.956 34.08
140 12:30 pm -5.311 140(2) 47.240 41.93
140 12:40 pm -5.757 140(3) 50.698 44.94
140 12:50 pm -5.370 140(4) 50.999 45.63
140 1:00 pm -3.998 140(5) 47.570 43.57
140 1:10 pm -3.945 140(6) 49.836 45.89
140 1:20 pm -2.855 140(7) 48.033 45.18
140 1:30 pm -3.464 140(8) 49.423 45.96
140 1:40 pm -3.133 140(9) 52.527 49.39
140 1:50 pm -4.351 140(10) 62.856 58.50
160 2:00 pm -5.429 160(1) 59.223 53.79
160 2:10 pm -5.981 160(2) 54.013 48.03
160 2:20 pm -6.319 160(3) 62.523 56.20
160 2:30 pm -6.760 160(4) 53.557 46.80
160 2:40 pm -7.649 160(5) 58.865 51.22
160 2:50 pm -9.767 160(6) 57.358 47.59
160 3:00 pm -9.643 160(7) 59.269 49.63
160 3:10 pm -9.343 160(8) 48.715 39.37
160 3:20 pm -8.432 160(9) 65.688 57.26
160 3:30 pm -7.294 160(10) 55.322 48.03

Table 2

Temp (C) Time of Day Disturbance Simulation ID Simulated values Yield (Disturbance + Simulated values)
160 9:00 am -2.314 160(9) 65.688 63.37
160 9:10 am -3.334 160(3) 62.523 59.19
100 9:20 am -4.520 100(9) 50.263 45.74
100 9:30 am -4.238 100(7) 51.841 47.60
100 9:40 am -3.182 100(10) 54.533 51.35
140 9:50 am -3.029 140(2) 47.240 44.21
140 10:00 am -1.263 140(8) 49.423 48.16
140 10:10 am 0.264 140(6) 49.836 50.10
100 10:20 am -0.699 100(2) 55.921 55.22
100 10:30 am -0.917 100(5) 52.601 51.68
160 10:40 am -0.970 160(8) 48.715 47.75
120 10:50 am -1.330 120(10) 43.526 42.20
140 11:00 am -2.335 140(7) 48.033 45.70
160 11:10 am -2.119 160(5) 58.865 56.75
140 11:20 am -3.403 140(4) 50.999 47.60
100 11:30 am -3.618 100(3) 50.624 47.01
140 11:40 am -3.255 140(10) 62.856 59.60
120 11:50 am -5.041 120(6) 59.641 54.60
100 12:00 pm -5.579 100(8) 46.234 40.66
120 12:10 pm -4.743 120(4) 53.125 48.38
120 12:20 pm -5.880 120(8) 44.936 39.06
140 12:30 pm -5.311 140(3) 50.698 45.39
120 12:40 pm -5.757 120(5) 50.041 44.28
160 12:50 pm -5.370 160(2) 54.013 48.64
140 1:00 pm -3.998 140(1) 39.956 35.96
160 1:10 pm -3.945 160(6) 57.358 53.41
100 1:20 pm -2.855 100(6) 46.921 44.07
140 1:30 pm -3.464 140(5) 47.570 44.11
160 1:40 pm -3.133 160(4) 53.557 50.42
160 1:50 pm -4.351 160(1) 59.223 54.87
160 2:00 pm -5.429 160(7) 59.269 53.84
120 2:10 pm -5.981 120(9) 48.531 42.55
120 2:20 pm -6.319 120(7) 48.663 42.34
120 2:30 pm -6.760 120(2) 50.956 44.20
160 2:40 pm -7.649 160(10) 55.322 47.67
140 2:50 pm -9.767 140(9) 52.527 42.76
100 3:00 pm -9.643 100(1) 50.325 40.68
120 3:10 pm -9.343 120(1) 56.248 46.91
100 3:20 pm -8.432 100(4) 60.596 52.16
120 3:30 pm -7.294 120(3) 48.664 41.37


© Keith M. Bower. All rights reserved.