XSQ_TEST

The XSQ_TEST function computes the Chi-square goodness-of-fit test between observed frequencies and the expected frequencies of a theoretical distribution. The result is a two-element vector containing the Chi-square test statistic X2 and the one-tailed probability of obtaining a value of X2 or greater.

Expected frequencies of magnitude less than 5 are combined with adjacent elements, which reduces the number of cells used to formulate the Chi-square test statistic. If the observed frequencies differ significantly from the expected frequencies, the Chi-square test statistic will be large and the fit is poor. In that case, the hypothesis that the observed frequencies are an accurate approximation to the expected frequency distribution is rejected.
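The text above does not spell out the statistic itself. As a sketch, assuming the routine uses the standard Pearson form over the k cells that remain after combining, with O_i and E_i denoting the observed and expected cell frequencies (the symbols k, O_i, E_i, and \nu are introduced here for illustration only):

```latex
X^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i},
\qquad
p = P\left(\chi^2_{\nu} \ge X^2\right), \quad \nu = k - 1
```

The choice of k - 1 degrees of freedom is an assumption. It is consistent with the values printed in the Example section, but it is not stated in this description.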

This routine is written in the IDL language. Its source code can be found in the file xsq_test.pro in the lib subdirectory of the IDL distribution.

Calling Sequence

Result = XSQ_TEST( Obfreq, Exfreq )

Arguments

Obfreq

An n-element integer, single-, or double-precision floating-point vector containing observed frequencies.

Exfreq

An n-element integer, single-, or double-precision floating-point vector containing expected frequencies.

Keywords

EXCELL

Set this keyword to a named variable that will contain a vector of expected frequencies used to formulate the Chi-square test statistic. If each of the expected frequencies contained in Exfreq has a magnitude of 5 or greater, then this vector is identical to Exfreq. If Exfreq contains elements of magnitude less than 5, adjacent expected frequencies are combined. The identical combinations are performed on the corresponding elements of Obfreq.

OBCELL

Set this keyword to a named variable that will contain a vector of observed frequencies used to formulate the Chi-square test statistic. The elements of this vector are often referred to as the "cells" of the observed frequencies. The length of this vector is determined by the length of EXCELL described above.

RESIDUAL

Set this keyword to a named variable that will contain a vector of signed differences between corresponding cells of observed frequencies and expected frequencies:

RESIDUAL[i] = OBCELL[i] - EXCELL[i]

The length of this vector is determined by the length of EXCELL described above.
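The three keywords can be requested in a single call. The following is a minimal sketch using only the calling sequence and keywords documented here. The variable names ex, ob, and res are arbitrary, and the input values are the ones from the Example section.

```idl
; Illustrative input vectors (same values as in the Example section).
obfreq = [2, 1, 4, 15, 10, 5, 3]
exfreq = [0.5, 2.1, 5.9, 10.3, 10.7, 7.0, 3.5]

; Return the combined cells and the residuals in named variables.
result = XSQ_TEST(obfreq, exfreq, EXCELL=ex, OBCELL=ob, RESIDUAL=res)

; Per the keyword descriptions, all three vectors share one length.
PRINT, N_ELEMENTS(ex), N_ELEMENTS(ob), N_ELEMENTS(res)

; Per the RESIDUAL description, res should equal ob - ex cell by cell,
; so this difference is expected to be zero (up to rounding).
PRINT, MAX(ABS(res - (ob - ex)))
```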

Example

Define the vectors of observed and expected frequencies.

obfreq = [2, 1, 4, 15, 10, 5, 3]

exfreq = [0.5, 2.1, 5.9, 10.3, 10.7, 7.0, 3.5]

Test the hypothesis that the given observed frequencies are an accurate approximation to the expected frequency distribution.

result = XSQ_TEST(obfreq, exfreq)

PRINT, result

IDL prints:

3.05040 0.383920

Since the vector of expected frequencies contains elements of magnitude less than 5, adjacent expected frequencies are combined, resulting in fewer cells. The identical combinations are performed on the corresponding elements of observed frequencies. The computed probability of 0.383920 is greater than 0.05, so there is no reason to reject the proposed hypothesis at the 0.05 significance level.
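As a cross-check, the printed statistic can be recomputed from the combined cells. Combining adjacent elements until each expected cell reaches 5 would give expected cells of 8.5, 10.3, 10.7, and 10.5 with observed cells of 7, 15, 10, and 8, and summing the squared residuals divided by the expected cells then yields approximately 3.0504, in agreement with the output above. Whether the routine combines cells in exactly this order is an assumption. The sketch below also assumes the Pearson form of the statistic and one fewer degrees of freedom than the number of cells. CHISQR_PDF is the IDL routine that returns the cumulative Chi-square probability.

```idl
obfreq = [2, 1, 4, 15, 10, 5, 3]
exfreq = [0.5, 2.1, 5.9, 10.3, 10.7, 7.0, 3.5]

; Retrieve the combined expected cells and the residuals.
result = XSQ_TEST(obfreq, exfreq, EXCELL=ex, RESIDUAL=res)

; Assumed Pearson form: sum of squared residuals over expected cells.
x2 = TOTAL(res^2 / ex)

; Assumed degrees of freedom: number of cells minus one.
df = N_ELEMENTS(ex) - 1

; One-tailed probability of a value of x2 or greater.
p = 1.0 - CHISQR_PDF(x2, df)

; If the assumptions hold, these two lines should agree.
PRINT, x2, p
PRINT, result
```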

See Also

CTI_TEST