PROGRAMMER: Paul von Hippel
*/
/*
PURPOSE: Researchers often wish to test whether two groups differ in
central tendency. Several tests are available, but researchers should be
aware of the assumptions on which each test is based:
(1) The pooled t-test assumes normality and equal variances.
(2) Welch's t-test assumes normality but does not assume equal variances.
(3) The Wilcoxon rank-sum test and the equivalent Mann-Whitney test do not
assume normality, but do assume equal spread and equal shape.
If the populations are not normal, the t-tests may approach validity for
large samples, but they are not valid for small samples.
When the researcher wishes to assume neither normality nor equal variances,
none of the tests above is appropriate. Instead, the Fligner-Policello test
should be considered. It assumes neither normality nor equal variances, and
it does not even assume that the two distributions have similar shape. It
does make one assumption, which is satisfied when (but not only when) the
distributions are symmetric.
Unfortunately, the Fligner-Policello test has not, to my knowledge, been
implemented in commercial software. The SAS macro below is intended to make
the Fligner-Policello test as accessible to SAS users as the t-tests and the
Wilcoxon test (which are implemented in the SAS procedures TTEST and
NPAR1WAY, respectively).
*/
/*
USE: The macro is called as follows:
%fligner_policello (data=mydataset, group=a_vs_b, variable=y)
where mydataset is the name of your data set, a_vs_b is the name of a
variable indicating which observations are in which group, and y is the
name of the variable that you think may differ across the two groups.
The group variable should take exactly two values, because the macro
compares only the first two groups it finds.
As an illustration, at the bottom of this file the macro is applied to data
from a study of plasma glucose values in healthy and lead-poisoned Canada
geese. The data are taken from Hollander & Wolfe (1999), Table 4.6, who in
turn took them from March et al. (1976). For this data set, the results
obtained from the macro agree with those obtained in Hollander & Wolfe's
(1999, pp. 136-8) illustrative hand calculation.
*/
/*
REFERENCES
Fligner, MA & Policello, GE. (1981). Robust rank procedures for the
Behrens-Fisher problem. _Journal of the American Statistical Association_
76(373): 162-168.
Hollander, M & Wolfe, DA. (1999). Nonparametric statistical methods
(2nd ed.). New York: Wiley.
March, GL, John, TM, McKeown, BA, Sileo, L, & George, JC. (1976). The
effects of lead poisoning on various plasma constituents in the Canada
goose. _Journal of Wildlife Diseases_ 12: 14-19.
*/
%macro fligner_policello (data=, group=, variable=);
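/* Suppress listing output from the intermediate steps. */
ods listing close;
/* Sort a copy by group so that the BY-group steps below work and the
   caller's data set is left untouched. Fixed work data set names (_fp_...)
   are used so that two-level names such as mylib.mydata are accepted. */
proc sort data=&data out=_fp_sorted;
by &group;
run;
/* Rank each observation within the combined sample (midranks for ties). */
proc rank data=_fp_sorted out=_fp_combined_rank ties=mean;
var &variable;
ranks rank_combined;
run;
/* Rank each observation within its own group. */
proc rank data=_fp_combined_rank out=_fp_grouped_rank ties=mean;
var &variable;
ranks rank_group;
by &group;
run;
/* Placement = number of observations in the other group that are smaller
   (a tie with an observation in the other group counts one half). */
data placements;
set _fp_grouped_rank;
placement = rank_combined - rank_group;
run;
/* Per-group mean, sum, corrected sum of squares, and count of placements. */
proc means data=placements mean sum css n;
var placement;
by &group;
ods output Summary=summary_stats;
run;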
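proc iml;
use summary_stats;
/* One row per group, in sorted order of the grouping variable:
   column 1 = mean, column 2 = sum, column 3 = corrected sum of squares. */
read all var {placement_mean placement_sum placement_css} into mean_sum_css;
read all var {&group} into group;
read all var {placement_n} into sample_size;
close summary_stats;
/* Difference between the placement sums of the first and second groups. */
numerator = mean_sum_css[1,2] - mean_sum_css[2,2];
denominator = 2 * sqrt(
mean_sum_css[1,3] + mean_sum_css[2,3]
+ mean_sum_css[1,1]*mean_sum_css[2,1]
);
fligner_policello = numerator / denominator;
/* P-values from the standard normal approximation. */
asymptotic_one_tailed_p = 1 - probnorm(abs(fligner_policello));
asymptotic_two_tailed_p = 2*asymptotic_one_tailed_p;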
ods listing;
title "For variable &variable...";
print group sample_size;
print fligner_policello;
print "A positive value suggests that the first group listed above has a larger median.";
print "A negative value suggests that the second group has a larger median.";
print asymptotic_one_tailed_p;
print asymptotic_two_tailed_p;
print "NOTE: If both sample sizes are less than 12, the asymptotic p-value may be a poor approximation.";
print "Correct one-tailed critical values for small samples can be found in Hollander & Wolfe, _Nonparametric statistical methods_, 2nd ed., Table A7.";
print "(For this statistic, one-tailed p-values are half as large as two-tailed p-values.)";
quit;
%mend fligner_policello;
data geese;
input health $ plasma;
datalines;
healthy 297
healthy 340
healthy 325
healthy 227
healthy 277
healthy 337
healthy 250
healthy 290
poisoned 293
poisoned 291
poisoned 289
poisoned 430
poisoned 510
poisoned 353
poisoned 318
;
run;
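/* For comparison, a minimal sketch (not part of the original program) that
   runs the conventional tests named in the PURPOSE section on the same data.
   PROC TTEST reports both the pooled and the Satterthwaite (Welch) t-tests;
   PROC NPAR1WAY with the WILCOXON option reports the Wilcoxon rank-sum test.
   Their assumptions differ from those of the Fligner-Policello test, so the
   p-values need not agree. */
proc ttest data=geese;
class health;
var plasma;
run;
proc npar1way data=geese wilcoxon;
class health;
var plasma;
run;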
%fligner_policello (data=geese, group=health, variable=plasma);
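/* Cross-check: a sketch, not part of the original macro, that computes the
   statistic directly from its definition in PROC IML. The vectors x and y
   simply repeat the geese data above, and every matrix name here (x, y, p,
   q, v1, v2, u_hat) is illustrative. With healthy as the first group, the
   value should agree with the macro's output, roughly -1.47. */
proc iml;
x = {297, 340, 325, 227, 277, 337, 250, 290};  /* healthy  */
y = {293, 291, 289, 430, 510, 353, 318};       /* poisoned */
m = nrow(x);
n = nrow(y);
/* Placement of each x: number of y values below it (ties count one half). */
p = j(m, 1, 0);
do i = 1 to m;
p[i] = sum(y < x[i]) + 0.5*sum(y = x[i]);
end;
/* Placement of each y: number of x values below it (ties count one half). */
q = j(n, 1, 0);
do k = 1 to n;
q[k] = sum(x < y[k]) + 0.5*sum(x = y[k]);
end;
/* Corrected sums of squares of the placements; p[:] is the mean of p. */
v1 = ssq(p - p[:]);
v2 = ssq(q - q[:]);
/* Same sign convention as the macro: first group minus second group. */
u_hat = (sum(p) - sum(q)) / (2*sqrt(v1 + v2 + p[:]*q[:]));
print u_hat;
quit;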