Sparse binary polynomial hashing

From Wikipedia, the free encyclopedia
Latest revision as of 06:57, 18 May 2024

Sparse binary polynomial hashing (SBPH) is a generalization of Bayesian spam filtering that can match mutating phrases as well as single words.

SBPH automatically generates a large number of features from an incoming text, then uses statistics to weight each feature according to how well it predicts whether the text is spam or non-spam.
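The feature-generation step can be illustrated with a minimal sketch. It is not taken from the article; it assumes the scheme commonly described for SBPH, in which a sliding window of five tokens is examined and every combination of the earlier tokens is either kept or replaced by a placeholder, while the newest token is always kept. The function name `sbph_features`, the `window` parameter, and the `<skip>` marker are hypothetical names chosen for illustration, and the hashing of each phrase into a bucket, as well as the statistical weighting, is left out.

```python
from itertools import product


def sbph_features(tokens, window=5, skip="<skip>"):
    """Return sparse-phrase features for a list of tokens.

    Illustrative sketch only: the window size of 5 and the "<skip>"
    placeholder are assumptions, not details given in the article.
    """
    features = []
    for i in range(len(tokens)):
        # Tokens that precede the current one inside the sliding window.
        prior = tokens[max(0, i - window + 1):i]
        # Each prior token is either kept or replaced by the skip marker;
        # the current (newest) token is always kept.
        for mask in product((False, True), repeat=len(prior)):
            phrase = [tok if keep else skip for tok, keep in zip(prior, mask)]
            phrase.append(tokens[i])
            features.append(" ".join(phrase))
    return features


if __name__ == "__main__":
    for feature in sbph_features("click here to buy".split()):
        print(feature)
```

Under these assumptions, a position with a full window of five tokens yields 16 features, which is how a short message can produce a large feature set. Phrases such as `click <skip> <skip> buy` are what would let a filter match a phrase even when the words in between change.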

  • A paper on the subject as it relates to spam (some article text comes from this document, which is under the GFDL)
  • Ending Spam: Bayesian Content Filtering and the Art of Statistical Language Classification. No Starch Press. 2005. p. 108. ISBN 978-1-59327-052-0.