Patent · US Active

Linguistic nonsense detection for undesirable message classification

US7809795B1 · kind B1 · utility

Cited by: 43
References: 3
Claims: 18
Family size: 0

Assignee

Inventors

Key dates

Filing date: Sep 26, 2006
Grant date: Oct 5, 2010
Priority date
Expiry date: Feb 1, 2028

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06Q10/107
  • WIPO field: IT methods for management
  • WIPO sector: Electrical engineering

Abstract

Nonsense words are removed from incoming emails and visually similar (look-alike) characters are replaced with the actual, corresponding characters, so that the emails can be more accurately analyzed to see if they are spam. More specifically, an incoming email stream is filtered, and the emails are normalized to enable more accurate spam detection. In some embodiments, the normalization comprises the removal of nonsense words and/or the replacement of look-alike characters according to a set of rules. In other embodiments, more and/or different normalization techniques are utilized. In some embodiments, the language in which an email is written is identified in order to aid in the normalization. Once incoming emails are normalized, they are then analyzed to detect spam or other forms of undesirable email, such as phishing emails.
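
The normalization step described in the abstract can be illustrated with a minimal sketch. It is not the patented method: the abstract does not disclose the rule set or how nonsense words are recognized, so the look-alike table (LOOKALIKES), the tiny vocabulary (VOCAB), and the function names below are all hypothetical stand-ins chosen for illustration. A word is treated as nonsense here simply because it is alphabetic and missing from the vocabulary, which is an assumption of this sketch.

```python
# Hypothetical look-alike table; the actual rules are not given in the abstract.
LOOKALIKES = str.maketrans({"0": "o", "1": "i", "3": "e", "@": "a", "$": "s"})

# Hypothetical vocabulary standing in for a real per-language dictionary.
VOCAB = {"buy", "cheap", "pills", "now", "free", "meeting", "at", "noon"}


def replace_lookalikes(token: str) -> str:
    """Map visually similar characters to their assumed intended letters."""
    # Purely numeric tokens (e.g. a year) are left untouched.
    if token.isdigit():
        return token
    return token.translate(LOOKALIKES)


def is_nonsense(word: str) -> bool:
    """Assumed heuristic: an alphabetic word absent from the vocabulary."""
    return word.isalpha() and word not in VOCAB


def normalize(text: str) -> str:
    """Replace look-alike characters, then drop words judged to be nonsense."""
    kept = []
    for token in text.split():
        word = replace_lookalikes(token).lower()
        if not is_nonsense(word):
            kept.append(word)
    return " ".join(kept)


if __name__ == "__main__":
    print(normalize("Buy ch3ap p1lls xqzvk n0w"))
```

The normalized text would then be passed to whatever spam or phishing classifier is in use. Per the abstract, some embodiments first identify the message's language, which in a sketch like this would amount to selecting a language-specific vocabulary and rule table.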

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.