PACKAGE |STAT Data Manipulation and Analysis, by Gary Perlman
NAME contab - contingency table and chi-square analysis
SYNOPSIS contab [-bsy] [-i nfactors] [-c entries] [factor names]
OPTIONS
-b
Use blank lines to make frequency tables more readable. This option is recommended when the cells are filled with values from the cell entry format option.
-c cell-entries
Choose table cell entries from:
e
expected values
d
differences of obtained and expected frequencies
p
all percentages
r
row percentages
c
column percentages
t
cell percentages of the total count
-i nfactors
Maximum number of factors in interactions. By default, all interactions are analyzed, however, it may be that only those contingency tables involving one or two factors will be of interest.
-s
Do not print significance tests (only print tables).
-y
Do not apply Yates' correction for continuity for chi-square tests with one degree of freedom.

The following standard help options are supported. The program exits after displaying the help.
-L
Display limits
-O
Display options and values
-V
Display version number and date
DESCRIPTION Hays (1973) warns ``...there is probably no other statistical method that has been so widely misapplied.'' (p. 735). Contingency tables and chi-square are used to summarize and test for association between frequencies of events.

Input Format
The input format is simple. Each cell frequency is preceded by codes of the levels of factors under which it was obtained. For example, if rats are categorized by age and are further categorized by the length of their tails, the contingency data would look like:

young     long      12
old       long      10
young     short     30
old       short      2
young     short      3

Note that the cell frequencies for young short-tailed rats add up to 33.

The most important assumption for the input data is that all frequencies are independent. For example, each subject in an experiment must contribute one and only one count to one cell. Also, if more than a small percentage of expected cell frequencies are small, then the chi-square statistic may be invalid. A text should be consulted.

Output
contab prints a summary of the names and number of levels of factors. For main effects, a frequency table and significance test of deviation from equal cell frequencies is printed. For each two-way interaction, a contingency table and test for independence is printed. Yates' correction for continuity is applied whenever there is one degree of freedom. For total cell frequencies in 2x2 tables up to 100, the Fisher exact test is computed for both one and two tailed cases. A text should be used to interpret the output statistics.

ALGORITHM The calculation of chi-square is standard for designs with more than one degree of freedom and adequate expected cell frequencies. Different texts have different methods for corrections for small designs. The methods used here reflect several sources.

When there is one degree of freedom, small cell frequencies can bias the chi-square test. Yates' correction for continuity reduces the absolute differences of obtained and expected frequencies by up to 0.5, following the discussion by Fisher (1970) in ``Statistical Methods for Research Workers.'' The Fisher Exact test for 2x2 tables is drawn from Bradley (1968) ``Distribution-Free Statistical Tests.'' The two-tailed calculation has only been tested against Bradley's one example.

FILES
UNIX           /tmp/contab.????
MSDOS          contab.tmp
LIMITS Use the -L option to determine the program limits.
SEE ALSO anova for multi-factor analysis of variance.
rankind for analysis of rank ordinal data for independent groups.
UPDATED February 3, 1987