CTI_TEST

The CTI_TEST function constructs a "contingency table" from an array of observed frequencies and tests the hypothesis that the rows and columns are independent using an extension of the chi-square goodness-of-fit test. The result is a two-element vector containing the chi-square test statistic X2 and the one-tailed probability of obtaining a value of X2 or greater.

This routine is written in the IDL language. Its source code can be found in the file cti_test.pro in the lib subdirectory of the IDL distribution.

Calling Sequence

Result = CTI_TEST( Obfreq )

Arguments

Obfreq

An m x n array containing observed frequencies. Obfreq can contain integer, single-precision, or double-precision floating-point values.

Keywords

COEFF

Set this keyword to a named variable that will contain the Coefficient of Contingency. The Coefficient of Contingency is a non-negative scalar, in the interval [0.0, 1.0], which measures the degree of dependence within a contingency table. The larger the value of COEFF, the greater the degree of dependence.
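
For reference, the textbook (Pearson) definition of the Coefficient of Contingency is SQRT(X2 / (X2 + N)), where N is the overall total of the observed frequencies. Whether cti_test.pro uses exactly this expression is an assumption, although it is consistent with the coeff value quoted in the Example section. A minimal sketch that compares the two (the names x2 and n_obs are illustrative):

```idl
; Sketch: compare COEFF with the textbook Pearson definition (assumed formula).
obfreq = [[748, 821, 786, 720, 672], $
          [ 74,  60,  51,  66,  50], $
          [ 31,  25,  22,  16,  15], $
          [  9,  10,   6,   5,   7]]

result = CTI_TEST(obfreq, COEFF = coeff)

x2    = result[0]                 ; chi-square test statistic
n_obs = TOTAL(obfreq, /DOUBLE)    ; overall total of observed frequencies

; The two printed values are expected to agree if the assumption holds.
PRINT, coeff, SQRT(x2 / (x2 + n_obs))
```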

CORRECTED

Set this keyword to use "Yates' Correction for Continuity" when computing the Chi-squared test statistic, X2. Yates' correction typically decreases the magnitude of X2. In general, this keyword should be set for small sample sizes.
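
The continuity correction is commonly written as subtracting 0.5 from the absolute difference between each observed and expected cell before squaring. The sketch below assumes that form; it is not taken from cti_test.pro, though it is consistent with the corrected statistic quoted in the Example section. The variable x2_yates is illustrative.

```idl
; Sketch: recompute the corrected statistic with the usual Yates form (assumed).
obfreq = [[748, 821, 786, 720, 672], $
          [ 74,  60,  51,  66,  50], $
          [ 31,  25,  22,  16,  15], $
          [  9,  10,   6,   5,   7]]

result = CTI_TEST(obfreq, /CORRECTED, EXFREQ = exfreq)

; Subtract 0.5 from each absolute deviation, square, and scale by the
; expected frequency of the cell.
x2_yates = TOTAL((ABS(obfreq - exfreq) - 0.5D)^2 / exfreq)

PRINT, result[0], x2_yates
```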

CRAMV

Set this keyword to a named variable that will contain Cramer's V. Cramer's V is a non-negative scalar, in the interval [0.0, 1.0], which measures the degree of dependence within a contingency table.
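
As a point of comparison, the textbook definition of Cramer's V is SQRT(X2 / (N * (k - 1))), where N is the overall total of the observed frequencies and k is the smaller of the number of rows and the number of columns. The sketch below assumes that definition; the routine's own implementation may differ in detail. The names dims, n_obs, and v_manual are illustrative.

```idl
; Sketch: compare CRAMV with the textbook definition of Cramer's V (assumed).
obfreq = [[748, 821, 786, 720, 672], $
          [ 74,  60,  51,  66,  50], $
          [ 31,  25,  22,  16,  15], $
          [  9,  10,   6,   5,   7]]

result = CTI_TEST(obfreq, CRAMV = cramv)

dims  = SIZE(obfreq, /DIMENSIONS)   ; [columns, rows] = [5, 4]
n_obs = TOTAL(obfreq, /DOUBLE)      ; overall total of observed frequencies

v_manual = SQRT(result[0] / (n_obs * (MIN(dims) - 1)))

PRINT, cramv, v_manual
```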

DF

Set this keyword to a named variable that will contain the number of degrees of freedom used to compute the probability of obtaining the value of the Chi-squared test statistic or greater. DF = (n - 1) * (m - 1), where m and n are the number of columns and rows of the contingency table, respectively.
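
For a table with 5 columns and 4 rows, DF = (4 - 1) * (5 - 1) = 12. The sketch below uses the statistic quoted in the Example section and the CHISQR_PDF function to illustrate how an upper-tail probability follows from X2 and DF. That CTI_TEST obtains its probability in exactly this way is an assumption.

```idl
; Sketch: upper-tail chi-square probability from X2 and DF.
x2 = 14.3953                       ; statistic quoted in the Example section
df = (4 - 1) * (5 - 1)             ; 4 rows, 5 columns -> 12 degrees of freedom

; CHISQR_PDF returns the probability of a value less than or equal to x2,
; so the one-tailed probability of x2 or greater is its complement.
prob = 1.0 - CHISQR_PDF(x2, df)

PRINT, df, prob
```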

EXFREQ

Set this keyword to a named variable that will contain an array of m columns and n rows containing expected frequencies. The elements of this array are often referred to as the "cells" of the expected frequencies. The expected frequency of each cell is computed as the product of row and column marginal frequencies divided by the overall total of observed frequencies.

RESIDUAL

Set this keyword to a named variable that will contain an array of m columns and n rows containing signed differences between corresponding cells of observed frequencies and expected frequencies.
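
The expected frequencies and residuals can also be reproduced by hand from the marginal totals, as in the following sketch. It assumes that the residual is taken as observed minus expected; the sign convention is not stated explicitly above, so treat it as an assumption. The names col_tot, row_tot, n_obs, manual_exfreq, and manual_resid are illustrative.

```idl
; Sketch: expected frequencies and residuals from the marginal totals.
obfreq = [[748, 821, 786, 720, 672], $
          [ 74,  60,  51,  66,  50], $
          [ 31,  25,  22,  16,  15], $
          [  9,  10,   6,   5,   7]]

col_tot = TOTAL(obfreq, 2, /DOUBLE)   ; 5 column marginal totals
row_tot = TOTAL(obfreq, 1, /DOUBLE)   ; 4 row marginal totals
n_obs   = TOTAL(obfreq, /DOUBLE)      ; overall total

; Outer product: manual_exfreq[i, j] = col_tot[i] * row_tot[j] / n_obs
manual_exfreq = (col_tot # row_tot) / n_obs

; Assumed sign convention: observed minus expected.
manual_resid = obfreq - manual_exfreq

result = CTI_TEST(obfreq, EXFREQ = exfreq, RESIDUAL = residual)

; Both differences are expected to be near zero if the assumptions hold.
PRINT, MAX(ABS(manual_exfreq - exfreq)), MAX(ABS(manual_resid - residual))

; The uncorrected statistic is the sum of squared residuals over expected.
PRINT, TOTAL(manual_resid^2 / manual_exfreq), result[0]
```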

Example

Define a 5-column and 4-row array of observed frequencies.

obfreq = [[748, 821, 786, 720, 672], $
          [ 74,  60,  51,  66,  50], $
          [ 31,  25,  22,  16,  15], $
          [  9,  10,   6,   5,   7]]

Test the hypothesis that the rows and columns of "obfreq" contain independent data at the 0.05 significance level.

result = CTI_TEST(obfreq, COEFF = coeff)

The result should be the two-element vector [14.3953, 0.276181].

The computed probability of 0.276181 exceeds 0.05, so there is no reason to reject the hypothesis of independence at the 0.05 significance level. The Coefficient of Contingency returned in the named variable coeff (0.0584860) also indicates a lack of dependence between the rows and columns of the observed frequencies. Setting the CORRECTED keyword returns the two-element vector [12.0032, 0.445420] and coeff = 0.0534213, leading to the same conclusion of independence.
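
The decision rule described above can be made explicit by comparing the returned probability with the chosen significance level. The following is a minimal sketch; the variable alpha and the printed messages are illustrative and not part of CTI_TEST.

```idl
; Sketch: apply the 0.05 significance level to the CTI_TEST result.
obfreq = [[748, 821, 786, 720, 672], $
          [ 74,  60,  51,  66,  50], $
          [ 31,  25,  22,  16,  15], $
          [  9,  10,   6,   5,   7]]

alpha  = 0.05
result = CTI_TEST(obfreq, DF = df)

; result[0] is the test statistic, result[1] the one-tailed probability.
PRINT, 'X2 = ', result[0], '  DF = ', df, '  probability = ', result[1]

IF result[1] LT alpha THEN PRINT, 'Reject the hypothesis of independence.' $
  ELSE PRINT, 'No reason to reject the hypothesis of independence.'
```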

See Also

CORRELATE, M_CORRELATE, XSQ_TEST