Patent · US Active

System and method for improving feature selection for a spam filtering model

US8417783B1 · kind B1 · utility

Cited by: 5
References: 4
Claims: 23
Family size: 0

Assignee

Inventors

Key dates

Filing date: May 31, 2006
Grant date: Apr 9, 2013
Priority date
Expiry date: Jan 13, 2030

Classification

  • Technology area (CPC H): Electricity
  • CPC primary: H04L51/212
  • WIPO field: Digital communication
  • WIPO sector: Electrical engineering

Abstract

A system and method for removing ineffective features from a spam feature set. In particular, in one embodiment of the invention, an entropy value is calculated for the feature set based on the effectiveness of the feature set at differentiating between ham and spam. Features are then removed one at a time and the entropy is recalculated. Features whose presence increases the overall entropy are removed, and features whose presence decreases the overall entropy are retained. In another embodiment of the invention, the value of certain types of time-consuming features (e.g., rules) is determined based on both the information gain associated with the features and the time consumed implementing the features. Those features which have relatively low information gain and which consume a significant amount of time to implement are removed from the feature set.
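
The two embodiments in the abstract can be illustrated with a minimal Python sketch. The abstract does not say how the entropy of the feature set is computed, so this sketch assumes one plausible reading: the average cross-entropy on held-out messages of a smoothed estimate of P(label | feature pattern) fitted on training messages. The feature names (`has_keyword`, `slow_rule`), the toy data, the thresholds and every function name are hypothetical and do not come from the patent.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy in bits of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def heldout_entropy(train, valid, features):
    """Assumed entropy measure: mean cross-entropy (bits) on `valid` of
    P(label | feature pattern) estimated from `train`."""
    counts = {}
    for row, label in train:
        key = tuple(row[f] for f in features)
        counts.setdefault(key, Counter())[label] += 1
    total = 0.0
    for row, label in valid:
        c = counts.get(tuple(row[f] for f in features), Counter())
        p = (c[label] + 1) / (sum(c.values()) + 2)  # Laplace smoothing, two classes
        total -= math.log2(p)
    return total / len(valid)

def backward_eliminate(train, valid, features):
    """Remove features one at a time; drop a feature only if the entropy
    falls without it (that is, the feature was increasing the entropy)."""
    kept = list(features)
    for f in list(features):
        trial = [g for g in kept if g != f]
        if heldout_entropy(train, valid, trial) < heldout_entropy(train, valid, kept):
            kept = trial
    return kept

def information_gain(data, feature):
    """H(label) - H(label | feature), estimated from counts in `data`."""
    labels = [label for _, label in data]
    split = {}
    for row, label in data:
        split.setdefault(row[feature], []).append(label)
    remainder = sum(len(g) / len(labels) * entropy(g) for g in split.values())
    return entropy(labels) - remainder

def prune_slow_features(data, cost_ms, min_gain, max_cost_ms):
    """Drop features that are both low in information gain and slow.
    `min_gain` and `max_cost_ms` are hypothetical thresholds."""
    return [f for f in cost_ms
            if not (information_gain(data, f) < min_gain and cost_ms[f] > max_cost_ms)]

# Toy data: has_keyword separates spam from ham; slow_rule is mostly noise.
train = [
    ({"has_keyword": 1, "slow_rule": 0}, "spam"),
    ({"has_keyword": 1, "slow_rule": 0}, "spam"),
    ({"has_keyword": 1, "slow_rule": 1}, "spam"),
    ({"has_keyword": 0, "slow_rule": 0}, "ham"),
    ({"has_keyword": 0, "slow_rule": 1}, "ham"),
    ({"has_keyword": 0, "slow_rule": 1}, "ham"),
]
valid = [
    ({"has_keyword": 1, "slow_rule": 1}, "spam"),
    ({"has_keyword": 1, "slow_rule": 0}, "spam"),
    ({"has_keyword": 0, "slow_rule": 0}, "ham"),
    ({"has_keyword": 0, "slow_rule": 1}, "ham"),
]

print(backward_eliminate(train, valid, ["has_keyword", "slow_rule"]))  # ['has_keyword']
print(prune_slow_features(train, {"has_keyword": 0.1, "slow_rule": 5.0}, 0.2, 1.0))  # ['has_keyword']
```

Held-out data is used here because a conditional entropy measured on the same messages used to fit the estimate can never rise when a feature is added, so no feature would ever qualify for removal. A real filter would likely score a trained model instead of raw pattern counts.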

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.