Brown–Forsythe test

The Brown–Forsythe test is a statistical test for the equality of group variances based on performing an Analysis of Variance (ANOVA) on a transformation of the response variable. When a one-way ANOVA is performed, samples are assumed to have been drawn from distributions with equal variance. If this assumption is not valid, the resulting F-test is invalid. The Brown–Forsythe test statistic is the F statistic resulting from an ordinary one-way analysis of variance on the absolute deviations of the groups or treatments data from their individual medians.[1]

Transformation

The transformed response variable is constructed to measure the spread in each group. Let

where is the median of group j. The Brown–Forsythe test statistic is the model F statistic from a one way ANOVA on zij:

where p is the number of groups, nj is the number of observations in group j, and N is the total number of observations. Also are the group means of the and is the overall mean of the . This F-statistic follows the F-distribution with degrees of freedom and under the null hypothesis.

If the variances are indeed heterogeneous, techniques that allow for this (such as the Welch one-way ANOVA) may be used instead of the usual ANOVA.

Good [1994,2005], noting that the deviations are linearly dependent, has modified the test so as to drop the redundant deviations.

Comparison with Levene's test

Levene's test uses the mean instead of the median. Although the optimal choice depends on the underlying distribution, the definition based on the median is recommended as the choice that provides good robustness against many types of non-normal data while retaining good statistical power (Derrick et.al.,2018). If one has knowledge of the underlying distribution of the data, this may indicate using one of the other choices. Brown and Forsythe performed Monte Carlo studies that indicated that using the trimmed mean performed best when the underlying data followed a Cauchy distribution (a heavy-tailed distribution) and the median performed best when the underlying data followed a χ2 distribution with four degrees of freedom (a sharply skewed distribution). Using the mean provided the best power for symmetric, moderate-tailed, distributions.

See also

References

  1. "plot.hov function | R Documentation". www.rdocumentation.org. DataCamp.

 This article incorporates public domain material from the National Institute of Standards and Technology website https://www.nist.gov.

This article is issued from Wikipedia. The text is licensed under Creative Commons - Attribution - Sharealike. Additional terms may apply for the media files.