AnoGen: A Program for
Generating ANOVA Data Sets
Version 1.3
Jeff Miller
Department of Psychology
University of Otago
Dunedin, New Zealand
August, 1998
1 Introduction
This program was designed for use in teaching the statistical
procedure known as Analysis of Variance (ANOVA).
In brief, it generates appropriate data sets for use as
examples or practice problems, and it computes the
correct ANOVA for each data set.
It handles between-subjects, within-subjects, and mixed designs, and can go up
to six factors, with some restrictions on the numbers of levels, subjects, and
model terms.
The program can be run in either of two modes:
one designed for use by students; the other, by teachers.
The student mode is simpler: The student simply specifies the
experimental design, and AnoGen generates an appropriate
random data set.
The student can then view the data set and answers,
and save them to a file.
Thus, students can easily get
as much computational practice as they want.
The teacher mode is more complicated: The teacher not only
specifies the experimental design, but also controls
the cell means and error variance to obtain whatever
F values are desired for the example.
Considerable familiarity with ANOVA is needed to use this mode.
2 Step-by-Step Instructions: Student Mode
- Start the program as is appropriate on your computer system
(e.g., by typing AnoGen at a DOS prompt).
- Once AnoGen is running, type S to enter the student mode.
- Specify the design:
  - To set the number of within-subjects factors, type W,
    and then type the number you want, followed by enter.
  - Similarly, type B to set the number of between-subjects factors.
  - Similarly, type S to set the number of subjects per group.
    A "group" is defined by one combination of levels of
    the between-subjects factors. For example, if you have
    between-subjects factors of Male/Female and Young/Old,
    then there are four groups corresponding to the four combinations.
  Note that you can set these numbers in any order, and you can
  change each one as often as you like.
  After you have the settings you want,
  type ctrl-Q to move on to the next step.
- Now specify the number of levels of each factor.
Type the letter corresponding to the factor you want to change
(A, B, ...), and then enter the number of levels you want.
Again, after you have the settings
as you want them, type ctrl-Q to move on to the next step.
- Type P to display the problem (i.e., the data set).
Ideally, you would now do the computations by hand, for practice.
(The information given in the problem display is intended to
be self-explanatory, but some explanation is given in
Section 3.)
- Type S to display the solution (i.e., cell means, ANOVA table, etc).
This is where you check your solution and make sure you've done
it correctly. The solution contains the various parts that I use
in teaching ANOVA using the general linear model.
(More explanation of the information given in the
solution display is given in
Section 4.)
- If you want, type F to save the problem and solution to a file.
(The main reason for doing this is to get a printed
version of the problem and solution.)
Enter the name of the file to which you want the information saved.
- Type ctrl-Q to quit when you are done with this problem.
AnoGen will then ask if you want to start over again:
Type Y if you want to do another problem, or N to quit.
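The definition of a "group" in the design step above can be illustrated with a short sketch. This is not part of AnoGen; the factor names, level labels, and the value of three subjects per group are assumptions chosen to match the Male/Female and Young/Old example.

```python
from itertools import product

# Hypothetical between-subjects factors, taken from the example in the text:
# factor name -> list of level labels.
between_factors = {
    "Sex": ["Male", "Female"],
    "Age": ["Young", "Old"],
}
subjects_per_group = 3  # illustrative value for the S setting

# A group is one combination of levels of the between-subjects factors.
groups = list(product(*between_factors.values()))
total_subjects = len(groups) * subjects_per_group

for group in groups:
    print(" / ".join(group))
print("groups:", len(groups), "total subjects:", total_subjects)
```

The number of groups is the product of the numbers of levels of the between-subjects factors, so the total number of subjects grows quickly as factors are added.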
3 Explanation of Problem Display
Table 1 shows an example of a problem display.
There is one line per subject, and the different groups
correspond to the different levels of the between-subjects factor(s).
For this example, the problem display fits on a single screen;
with larger designs (i.e., more groups or more subjects per group),
the problem display may be split across several screens.
Table 1: An example of a problem display.
This design has
two between-subjects factors (A and B) with two levels each,
and three subjects per group.
Group A1B1:
Sub 1: |  95
Sub 2: |  78
Sub 3: |  97

Group A2B1:
Sub 1: | -19
Sub 2: | -37
Sub 3: | -10

Group A1B2:
Sub 1: |  55
Sub 2: |  64
Sub 3: |  73

Group A2B2:
Sub 1: |  58
Sub 2: |  63
Sub 3: |  71
Table 2 shows an example of a problem display
for a more complex experimental design.
Note that the different conditions tested within-subjects are listed
across the line, and the different subjects and groups are organized
as in the between-subjects design.
Table 2: An example of a problem display.
This design has a within-subjects factor (A) with two levels,
two between-subjects factors (B and C) with two levels each,
and three subjects per group.
Group B1C1:
       |  A1 |  A2
Sub 1: |  77 |  53
Sub 2: |  84 |  56
Sub 3: | 103 |  41

Group B2C1:
       |  A1 |  A2
Sub 1: |  77 |  65
Sub 2: |  54 |  64
Sub 3: |  73 |  69

Group B1C2:
       |  A1 |  A2
Sub 1: | 103 |  75
Sub 2: | 100 |  78
Sub 3: |  97 |  57

Group B2C2:
       |  A1 |  A2
Sub 1: |  72 |  10
Sub 2: |  74 |  18
Sub 3: |  58 |   2
4 Explanation of Solution Display
The solution display has several components, as described below.
Some of these components may be omitted if they do not fit well with
the way your instructor teaches the material.
4.1 Design
This shows a list of factors with the number of levels per factor.
Also shown is the number of subjects per group.
4.2 Cell Means
The cell means are given in a table of this form
(these are the means for the problem in Table 2):
Cell:  | Mean
u      | 65
A1     | 81
A2     | 49
B1     | 77
B2     | 53
A1 B1  | 94
A1 B2  | 68
A2 B1  | 60
A2 B2  | 38
C1     | 68
C2     | 62
A1 C1  | 78
...
The first line (u) shows the overall mean across all conditions.
The next two lines (A1 and A2) show the means of all scores at levels 1 and 2
of factor A, respectively.
The next two lines (B1 and B2) show the means of all scores at levels 1 and 2
of factor B, respectively.
The next line (A1 B1) shows the mean of all scores at level 1 of factor A
and level 1 of factor B, and then the next three lines show means for
the other combinations of levels on these two factors.
And so on.
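As a check on how these means are formed, the following sketch recomputes several of them from the scores in Table 2. It is an illustration only, not AnoGen's own code; the data layout and the helper name cell_mean are assumptions.

```python
# Scores from Table 2, keyed by (B level, C level); each row holds one
# subject's scores at (A1, A2).
data = {
    (1, 1): [(77, 53), (84, 56), (103, 41)],
    (2, 1): [(77, 65), (54, 64), (73, 69)],
    (1, 2): [(103, 75), (100, 78), (97, 57)],
    (2, 2): [(72, 10), (74, 18), (58, 2)],
}

# Flatten to one record per score: (a, b, c, score).
scores = [
    (a, b, c, row[a - 1])
    for (b, c), rows in data.items()
    for row in rows
    for a in (1, 2)
]

def cell_mean(a=None, b=None, c=None):
    """Mean of all scores at the given levels; None means 'average over'."""
    picked = [
        y for (sa, sb, sc, y) in scores
        if (a is None or sa == a)
        and (b is None or sb == b)
        and (c is None or sc == c)
    ]
    return sum(picked) / len(picked)

print("u     ", cell_mean())
print("A1    ", cell_mean(a=1))
print("A2    ", cell_mean(a=2))
print("B1    ", cell_mean(b=1))
print("A1 B1 ", cell_mean(a=1, b=1))
print("A1 C1 ", cell_mean(a=1, c=1))
```

Each mean averages over every factor that is not named in the cell label, and over subjects, which is why the A1 line uses 12 scores while the A1 B1 line uses 6.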
4.3 Model
The model section shows the form of the general linear model
appropriate for this design.
The main effect and interaction terms are denoted by capital letters
(A, B, AB, etc), S is for subjects,
and the subscripts are denoted by lower-case letters (i, j, k, etc).
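As an illustration, for the two-factor between-subjects design of Table 1 a model of this kind could be written as follows. The exact layout AnoGen prints is not reproduced here; the notation, including writing the subjects term as S(AB), is an assumption based on the description above.

```latex
Y_{ijk} = \mu + A_i + B_j + AB_{ij} + S(AB)_{ijk}
```

Here i and j index the levels of factors A and B, and k indexes the subject within group AB_{ij}.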
4.4 Estimation Equations
The estimation equations section shows the equation used to estimate
each term in the linear model.
The period subscript is used to denote averaging across levels of
the factor corresponding to that subscript.
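For a two-factor between-subjects design with terms A, B, AB, and a subjects term written here as S(AB), the standard estimation equations take the following form. This is a sketch in assumed notation, not a copy of AnoGen's output.

```latex
\begin{align*}
\hat{\mu} &= Y_{...} \\
\hat{A}_i &= Y_{i..} - \hat{\mu} \\
\hat{B}_j &= Y_{.j.} - \hat{\mu} \\
\widehat{AB}_{ij} &= Y_{ij.} - \hat{\mu} - \hat{A}_i - \hat{B}_j \\
\widehat{S(AB)}_{ijk} &= Y_{ijk} - \hat{\mu} - \hat{A}_i - \hat{B}_j - \widehat{AB}_{ij}
\end{align*}
```

For example, Y_{i..} is the mean of all scores at level i of factor A, averaged over the levels of B and over subjects.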
4.5 Decomposition Matrix
The decomposition matrix shows the breakdown of all data values
(numbers on the left sides of the equals signs)
into the estimated values corresponding
to each term in the linear model.
The order of the numbers on the line is the same as the order of the
terms in the model.
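A decomposition of this kind can be sketched for the data in Table 1. The code below is an illustration rather than AnoGen's implementation; it assumes a model with the terms mean, A, B, AB, and a subjects (error) term, in that order, and the printed layout is only an approximation of the program's display.

```python
# Scores from Table 1, keyed by (A level, B level).
data = {
    (1, 1): [95, 78, 97],
    (2, 1): [-19, -37, -10],
    (1, 2): [55, 64, 73],
    (2, 2): [58, 63, 71],
}

def mean(values):
    values = list(values)
    return sum(values) / len(values)

# Estimate each term: overall mean, main effects, interaction.
mu = mean(y for ys in data.values() for y in ys)
a_eff = {i: mean(y for (a, _), ys in data.items() if a == i for y in ys) - mu
         for i in (1, 2)}
b_eff = {j: mean(y for (_, b), ys in data.items() if b == j for y in ys) - mu
         for j in (1, 2)}
ab_eff = {(i, j): mean(ys) - mu - a_eff[i] - b_eff[j]
          for (i, j), ys in data.items()}

# One row per score: the score, then the estimate for each model term.
rows = []
for (i, j), ys in data.items():
    for y in ys:
        err = y - mu - a_eff[i] - b_eff[j] - ab_eff[i, j]
        terms = (mu, a_eff[i], b_eff[j], ab_eff[i, j], err)
        rows.append((y, *terms))
        print(y, "=", " + ".join(f"{t:g}" for t in terms))
```

Every row sums back to the original score, which is a useful check when doing the decomposition by hand.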
4.6 ANOVA Table
The ANOVA table is in a relatively standard format.
The F value is marked with one asterisk if it is significant at the
level of p < .05 and two asterisks if significant at p < .01.
The error term used to compute each F is shown at the far right
side of the table.
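The sketch below computes an ANOVA table for the data in Table 1 and marks the F values with asterisks as described. It is not AnoGen's code, and the printed layout is an assumption. The critical values 5.32 and 11.26 are the tabled F values for 1 and 8 degrees of freedom at p = .05 and p = .01, hard-coded for this design only.

```python
# Scores from Table 1, keyed by (A level, B level).
data = {
    (1, 1): [95, 78, 97],
    (2, 1): [-19, -37, -10],
    (1, 2): [55, 64, 73],
    (2, 2): [58, 63, 71],
}

all_scores = [y for ys in data.values() for y in ys]
N = len(all_scores)
mu = sum(all_scores) / N

def level_mean(pos, level):
    """Mean of all scores at one level of factor A (pos=0) or B (pos=1)."""
    ys = [y for key, vals in data.items() if key[pos] == level for y in vals]
    return sum(ys) / len(ys)

cell = {k: sum(v) / len(v) for k, v in data.items()}

# Sums of squares for a balanced 2 x 2 between-subjects design.
ss_a = sum((level_mean(0, i) - mu) ** 2 for i in (1, 2)) * (N // 2)
ss_b = sum((level_mean(1, j) - mu) ** 2 for j in (1, 2)) * (N // 2)
ss_cells = sum(len(v) * (cell[k] - mu) ** 2 for k, v in data.items())
ss_ab = ss_cells - ss_a - ss_b
ss_err = sum((y - cell[k]) ** 2 for k, v in data.items() for y in v)

df_err = N - len(data)
ms_err = ss_err / df_err

# Tabled critical values of F(1, 8); valid only for this design.
CRIT_05, CRIT_01 = 5.32, 11.26

def stars(f):
    return "**" if f >= CRIT_01 else "*" if f >= CRIT_05 else ""

f_values = {}
print("Source   df        SS        MS        F   Error term")
for name, ss in (("A", ss_a), ("B", ss_b), ("AB", ss_ab)):
    f = ss / ms_err  # each effect has 1 df here, so MS equals SS
    f_values[name] = f
    print(f"{name:<6} {1:4d} {ss:9.1f} {ss:9.1f} {f:8.2f}{stars(f):<3} S(AB)")
print(f"{'S(AB)':<6} {df_err:4d} {ss_err:9.1f} {ms_err:9.1f}")
```

In a between-subjects design every effect is tested against the same error term; in within-subjects and mixed designs the error term differs from effect to effect, which is why the table lists it for each F.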