BFR algorithm: Difference between revisions


Revision as of 16:43, 18 May 2018

The BFR algorithm, named after its inventors Bradley, Fayyad and Reina, is a variant of k-means that is designed to cluster data in a high-dimensional Euclidean space. It makes a very strong assumption about the shape of clusters: they must be normally distributed about a centroid. The mean and standard deviation for a cluster may differ for different dimensions, but the dimensions must be independent.
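The cluster-shape assumption above can be illustrated with a short sketch. It is not the BFR algorithm itself, only a hedged illustration of what "a per-dimension mean and standard deviation with independent dimensions" implies for measuring how far a point lies from a centroid. The function names `cluster_stats` and `normalized_distance` are hypothetical, chosen for this example, and the sketch assumes every dimension has a non-zero standard deviation.

```python
import math


def cluster_stats(points):
    """Return (means, stds): per-dimension mean and population standard deviation.

    `points` is a non-empty list of equal-length numeric tuples.
    """
    n = len(points)
    dims = len(points[0])
    means = [sum(p[d] for p in points) / n for d in range(dims)]
    stds = [
        math.sqrt(sum((p[d] - means[d]) ** 2 for p in points) / n)
        for d in range(dims)
    ]
    return means, stds


def normalized_distance(point, means, stds):
    """Distance from `point` to the centroid, scaling each axis by its own std.

    Because the dimensions are assumed independent, no cross-dimension
    (covariance) terms are needed. Assumes every entry of `stds` is non-zero.
    """
    return math.sqrt(
        sum(((x - m) / s) ** 2 for x, m, s in zip(point, means, stds))
    )


# A cluster that is narrow along the first axis and wide along the second.
cluster = [(0.0, 0.0), (2.0, 0.0), (0.0, 10.0), (2.0, 10.0)]
means, stds = cluster_stats(cluster)
```

In this example the cluster has standard deviation 1 along the first axis and 5 along the second, so a point offset by 1 unit along the first axis and a point offset by 5 units along the second are both one standard deviation from the centroid.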

Revision as of 16:42, 18 May 2018 by Nbro. Edit summary: Introduction to the algorithm added from the book "Mining of massive datasets" (by Rajaraman, Anand and Ullman, Jeffrey David).		Revision as of 16:43, 18 May 2018 by Nbro. No edit summary.
Line 1:		Line 1:
	The '''BFR algorithm''', named after its inventors ��Bradley, Fayyad and Reina, is a variant of [[k-means clustering\|k-means]] that is designed to cluster data in a high-dimensional [[Euclidean space]]. It makes a very strong assumption about the shape of clusters: they must be [[Normal distribution\|normally distributed]] about a centroid. The mean and standard deviation for a cluster may differ for different dimensions, but the dimensions must be independent.		The '''BFR algorithm''', named after its inventors Bradley, Fayyad and Reina, is a variant of [[k-means clustering\|k-means]] that is designed to cluster data in a high-dimensional [[Euclidean space]]. It makes a very strong assumption about the shape of clusters: they must be [[Normal distribution\|normally distributed]] about a centroid. The mean and standard deviation for a cluster may differ for different dimensions, but the dimensions must be independent.