Biweight midcorrelation

From Wikipedia, the free encyclopedia
This is an old revision of this page, as edited by Michael Hardy (talk | contribs) at 22:28, 28 August 2015 (Derivation). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

In statistics, biweight midcorrelation (also called bicor) is a measure of similarity between samples. It is median-based rather than mean-based, and is therefore less sensitive to outliers. It can serve as a robust alternative to other similarity metrics, such as Pearson correlation or mutual information.[1]

Derivation

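Here we find the biweight midcorrelation of two vectors $x$ and $y$, each with $m$ items, writing the items of the vectors as $x_1, x_2, \ldots, x_m$ and $y_1, y_2, \ldots, y_m$. First, we define $\operatorname{med}(x)$ as the median of a vector $x$ and $\operatorname{mad}(x)$ as its median absolute deviation (MAD), that is, the median of $|x_i - \operatorname{med}(x)|$. Then we define $u_i$ and $v_i$ as

$$u_i = \frac{x_i - \operatorname{med}(x)}{9 \operatorname{mad}(x)}, \qquad v_i = \frac{y_i - \operatorname{med}(y)}{9 \operatorname{mad}(y)}.$$

Now we define the weights $w_i^{(x)}$ and $w_i^{(y)}$ as

$$w_i^{(x)} = \left(1 - u_i^2\right)^2 I\left(1 - |u_i|\right), \qquad w_i^{(y)} = \left(1 - v_i^2\right)^2 I\left(1 - |v_i|\right),$$

where $I$ is the indicator function

$$I(z) = \begin{cases} 1, & \text{if } z > 0, \\ 0, & \text{otherwise,} \end{cases}$$

so items with $|u_i| \ge 1$ (or $|v_i| \ge 1$) receive zero weight.

Then we normalize the weighted deviations so that their squares sum to 1:

$$\tilde{x}_i = \frac{\left(x_i - \operatorname{med}(x)\right) w_i^{(x)}}{\sqrt{\sum_{j=1}^{m} \left[\left(x_j - \operatorname{med}(x)\right) w_j^{(x)}\right]^2}}, \qquad \tilde{y}_i = \frac{\left(y_i - \operatorname{med}(y)\right) w_i^{(y)}}{\sqrt{\sum_{j=1}^{m} \left[\left(y_j - \operatorname{med}(y)\right) w_j^{(y)}\right]^2}}.$$

Finally, we define biweight midcorrelation as

$$\operatorname{bicor}(x, y) = \sum_{i=1}^{m} \tilde{x}_i \tilde{y}_i.$$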
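The steps of the derivation can be sketched in Python with NumPy. This is a minimal illustration, not the WGCNA implementation. It assumes the standard formulation, in which deviations from the median are scaled by nine times the raw (unscaled) MAD and the function $I$ is an indicator that is 1 for positive arguments and 0 otherwise. The helper name weighted_deviations is hypothetical, and raising an error when the MAD is zero is a choice made for this sketch.

```python
import numpy as np


def bicor(x, y):
    """Biweight midcorrelation of two equal-length 1-D samples (sketch)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if x.ndim != 1 or x.shape != y.shape:
        raise ValueError("x and y must be 1-D arrays of equal length")

    def weighted_deviations(a):
        # Hypothetical helper: returns the normalized weighted deviations.
        med = np.median(a)
        mad = np.median(np.abs(a - med))  # raw MAD, no consistency constant
        if mad == 0:
            # Assumption of this sketch: treat a zero MAD as an error.
            raise ValueError("MAD is zero; bicor is undefined for this sample")
        u = (a - med) / (9.0 * mad)
        # Biweight: (1 - u^2)^2 where |u| < 1, and 0 elsewhere.
        w = (1.0 - u**2) ** 2 * (np.abs(u) < 1.0)
        d = (a - med) * w
        # Scale so that the squared values sum to 1.
        return d / np.sqrt(np.sum(d**2))

    return float(np.sum(weighted_deviations(x) * weighted_deviations(y)))


# Usage: a single gross outlier flips the sign of Pearson correlation,
# while bicor assigns the outlier zero weight and stays strongly positive.
x = np.arange(1.0, 21.0)
y = x.copy()
y[-1] = -1000.0
pearson = float(np.corrcoef(x, y)[0, 1])
robust = bicor(x, y)
```

In this usage example the outlying value lies far more than nine MADs from the median of y, so its weight is zero and it contributes nothing to the sum.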

Applications

Biweight midcorrelation has been shown to be more robust in evaluating similarity in gene expression networks[2], and is often used for weighted correlation network analysis.

Implementations

Biweight midcorrelation has been implemented in the R statistical programming language as the function bicor as part of the WGCNA package[3].

References

  1. ^ Wilcox, Rand (January 12, 2012). Introduction to Robust Estimation and Hypothesis Testing (3rd ed.). Academic Press. p. 455. ISBN 978-0123869838.
  2. ^ Song, Lin (9 December 2012). "Comparison of co-expression measures: mutual information, correlation, and model based indices". BMC Bioinformatics. 13 (328). doi:10.1186/1471-2105-13-328. PMID 23217028.
  3. ^ Langfelder, Peter. "bicor {WGCNA}". Inside R. Revolution Analytics. Retrieved 18 August 2015.