BFR algorithm: Difference between revisions
Tioaeu8943 (talk | contribs) Adding short description: "Vector clustering algorithms" |
Tioaeu8943 (talk | contribs) clarify "independent dimensions" |
||
Line 1: | Line 1: | ||
{{Short description|Vector clustering algorithms}} |
{{Short description|Vector clustering algorithms}} |
||
{{more citations needed|date=May 2018}} |
{{more citations needed|date=May 2018}} |
||
The '''BFR algorithm''', named after its inventors Bradley, Fayyad and Reina, is a variant of [[k-means clustering|k-means algorithm]] that is designed to cluster data in a high-dimensional [[Euclidean space]]. It makes a very strong assumption about the shape of clusters: they must be [[Normal distribution|normally distributed]] about a [[centroid]]. The [[mean]] and [[standard deviation]] for a cluster may differ for different dimensions, but the dimensions must be independent.<ref>{{Cite book|title=Mining of Massive Datasets|last=Rajaraman|first=Anand|last2=Ullman|first2=Jeffrey|last3=Leskovec|first3=Jure|publisher=Cambridge University Press|year=2011|isbn=1107015359|location=New York, NY, USA|pages=257–258}}</ref> |
The '''BFR algorithm''', named after its inventors Bradley, Fayyad and Reina, is a variant of [[k-means clustering|k-means algorithm]] that is designed to cluster data in a high-dimensional [[Euclidean space]]. It makes a very strong assumption about the shape of clusters: they must be [[Normal distribution|normally distributed]] about a [[centroid]]. The [[mean]] and [[standard deviation]] for a cluster may differ for different dimensions, but the dimensions must be independent.<ref>{{Cite book|title=Mining of Massive Datasets|last=Rajaraman|first=Anand|last2=Ullman|first2=Jeffrey|last3=Leskovec|first3=Jure|publisher=Cambridge University Press|year=2011|isbn=1107015359|location=New York, NY, USA|pages=257–258}}</ref> In other words, the data must take the shape of axis-aligned ellipses. |
||
==References== |
==References== |
Latest revision as of 14:51, 11 May 2025
This article needs additional citations for verification. (May 2018)
The BFR algorithm, named after its inventors Bradley, Fayyad and Reina, is a variant of the k-means algorithm designed to cluster data in a high-dimensional Euclidean space. It makes a very strong assumption about the shape of clusters: each cluster must be normally distributed about a centroid. The mean and standard deviation of a cluster may differ from one dimension to another, but the dimensions must be independent.[1] In other words, each cluster must take the shape of an axis-aligned ellipse (an axis-aligned ellipsoid in more than two dimensions).
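The independence assumption can be illustrated with a minimal sketch. It is not taken from the cited source: the function names `cluster_stats` and `normalized_distance` and the sample points are hypothetical, and the sketch assumes every dimension has a nonzero standard deviation. Because the dimensions are independent, a cluster is described by one mean and one standard deviation per dimension, and the distance from a point to the centroid can be scaled dimension by dimension. Points at the same scaled distance then lie on an axis-aligned ellipse.

```python
import math


def cluster_stats(points):
    """Return the per-dimension means and (population) standard deviations."""
    n = len(points)
    dims = len(points[0])
    means = [sum(p[d] for p in points) / n for d in range(dims)]
    stds = [
        math.sqrt(sum((p[d] - means[d]) ** 2 for p in points) / n)
        for d in range(dims)
    ]
    return means, stds


def normalized_distance(point, means, stds):
    """Distance to the centroid, scaling each dimension by its own standard deviation.

    Assumes every entry of stds is nonzero (an assumption of this sketch).
    """
    return math.sqrt(
        sum(((x - m) / s) ** 2 for x, m, s in zip(point, means, stds))
    )


# Hypothetical cluster: twice as spread out along the second axis as the first.
points = [(0, 0), (2, 0), (0, 4), (2, 4)]
means, stds = cluster_stats(points)  # means [1.0, 2.0], stds [1.0, 2.0]

# These two points are at Euclidean distances 2 and 4 from the centroid,
# but both are exactly two standard deviations away along a single axis.
print(normalized_distance((3, 2), means, stds))  # 2.0
print(normalized_distance((1, 6), means, stds))  # 2.0
```

The two sample points share a scaled distance even though their Euclidean distances differ, which is what an axis-aligned elliptical contour looks like in two dimensions. A cluster whose spread ran along a diagonal would need covariance terms between dimensions, which this per-dimension description cannot express.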
References
1. Rajaraman, Anand; Ullman, Jeffrey; Leskovec, Jure (2011). Mining of Massive Datasets. New York, NY, USA: Cambridge University Press. pp. 257–258. ISBN 1107015359.