Patent · US Active

Spam filtering based on statistics and token frequency modeling

US8364766B2 · kind B2 · utility

Cited by: 5
References: 12
Claims: 15
Family size: 0

Assignee

Inventors

Key dates

Filing date: Dec 4, 2008
Grant date: Jan 29, 2013
Priority date
Expiry date: Mar 26, 2031

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06N7/01
  • WIPO field: Digital communication
  • WIPO sector: Electrical engineering

Abstract

Embodiments are directed towards classifying messages as spam using a two-phase approach. The first phase employs a statistical classifier to classify messages based on message content. The second phase targets specific message types to capture dynamic characteristics of the messages and identifies spam messages using a token-frequency-based approach. A client component receives messages and sends them to the statistical classifier, which determines the probability that a message belongs to a particular message class. The statistical classifier also provides other information about the message, including a token list and token thresholds. The message class, token list, and thresholds are provided to the second phase, where the number of spam tokens in a given message for a given message class is determined. Based on the threshold, the client component then determines whether the message is spam or non-spam.
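
The two-phase flow described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the class profiles, the keyword-overlap scoring used as a stand-in for the statistical classifier, the function names, and the rule that a count at or above the threshold means spam are all assumptions made for illustration.

```python
import re
from dataclasses import dataclass


@dataclass
class PhaseOneResult:
    message_class: str      # most probable message class
    probability: float      # estimated probability of that class
    spam_tokens: frozenset  # token list associated with the class
    threshold: int          # spam-token count treated as spam (assumed: >=)


# Hypothetical per-class data; a real system would learn these from mail.
CLASS_PROFILES = {
    "pharma": {
        "keywords": {"pills", "pharmacy", "prescription"},
        "spam_tokens": frozenset({"cheap", "discount", "pills", "order", "now"}),
        "threshold": 3,
    },
    "finance": {
        "keywords": {"loan", "credit", "bank"},
        "spam_tokens": frozenset({"guaranteed", "approval", "instant", "free"}),
        "threshold": 2,
    },
}


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def phase_one(text):
    """Stand-in for the statistical classifier (assumption: keyword overlap
    with add-one smoothing, normalized so the class scores sum to 1)."""
    tokens = set(tokenize(text))
    scores = {
        name: 1 + len(tokens & profile["keywords"])
        for name, profile in CLASS_PROFILES.items()
    }
    best = max(scores, key=scores.get)
    profile = CLASS_PROFILES[best]
    return PhaseOneResult(
        message_class=best,
        probability=scores[best] / sum(scores.values()),
        spam_tokens=profile["spam_tokens"],
        threshold=profile["threshold"],
    )


def phase_two(text, result):
    """Count the tokens in the message that appear in the class token list."""
    return sum(1 for token in tokenize(text) if token in result.spam_tokens)


def is_spam(text):
    """Client-side decision: run both phases and compare against the threshold."""
    result = phase_one(text)
    return phase_two(text, result) >= result.threshold
```

In this sketch the client calls `is_spam(message_text)`; phase one selects the class and supplies its token list and threshold, and phase two only counts matching tokens, so the per-class token lists could be updated without retraining the classifier.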

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.