BFR algorithm

From Wikipedia, the free encyclopedia
This is an old revision of this page, as edited by Nbro (talk | contribs) at 16:42, 18 May 2018 (Introduction to the algorithm added from the book "Mining of massive datasets" (by Rajaraman, Anand and Ullman, Jeffrey David).). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

The BFR algorithm, named after its inventors Bradley, Fayyad and Reina, is a variant of k-means that is designed to cluster data in a high-dimensional Euclidean space. It makes a very strong assumption about the shape of clusters: they must be normally distributed about a centroid. The mean and standard deviation for a cluster may differ for different dimensions, but the dimensions must be independent.
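
As an illustration of what the independence assumption allows, the following minimal Python sketch computes a separate mean and standard deviation for each dimension of a cluster and uses them to measure how far a point lies from the centroid, scaling each axis by its own standard deviation. The function names `axis_stats` and `normalized_distance` and the sample points are hypothetical, chosen only for this example; it is not the algorithm itself, only a sketch of the cluster shape it assumes.

```python
import math


def axis_stats(points):
    """Return per-dimension means and population standard deviations."""
    n = len(points)
    dims = len(points[0])
    means = [sum(p[i] for p in points) / n for i in range(dims)]
    stds = [
        math.sqrt(sum((p[i] - means[i]) ** 2 for p in points) / n)
        for i in range(dims)
    ]
    return means, stds


def normalized_distance(point, means, stds):
    """Distance to the centroid with each axis scaled by its own std.

    Assumption: dimensions are independent, so no cross-dimension
    covariance terms appear. A zero standard deviation in any
    dimension would cause a division by zero and is not handled here.
    """
    return math.sqrt(
        sum(((x - m) / s) ** 2 for x, m, s in zip(point, means, stds))
    )


# Hypothetical cluster: narrow along the first axis, wide along the second.
cluster = [(0.0, 0.0), (2.0, 0.0), (0.0, 8.0), (2.0, 8.0)]
means, stds = axis_stats(cluster)

# Both points below are 4 units from the centroid in plain Euclidean terms.
along_wide_axis = normalized_distance((1.0, 8.0), means, stds)
along_narrow_axis = normalized_distance((5.0, 4.0), means, stds)
```

Here the two test points are equally far from the centroid in ordinary Euclidean distance, but the point displaced along the narrow axis is four standard deviations away while the point displaced along the wide axis is only one. Because each dimension is treated separately, a cluster under this assumption is stretched only along the coordinate axes, never along a diagonal.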