Patent · US Active

Automated data classification

US9483740B1 · kind B1 · utility

Cited by: 19
References: 31
Claims: 18
Family size: 0

Assignee

Inventors

Key dates

Filing date: Dec 16, 2013
Grant date: Nov 1, 2016
Priority date
Expiry date: Sep 22, 2034

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06F16/285
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

A system and method for data classification are presented. A plurality of training tokens are identified by at least one server communicatively coupled to a network. Each training token includes a token retrieved from a content source and a classification of the token. For each training token in the plurality of training tokens, a plurality of n-gram sequences are identified, a plurality of features for the plurality of n-gram sequences are generated, and first training data is generated using the token retrieved from the content source, the plurality of features, and the classification of the token. A first classifier is trained with the first training data, and the first classifier is stored into a storage system in communication with the at least one server.
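The pipeline in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the patent: the character-level n-grams with boundary markers, the count-based features, the multinomial naive Bayes model with add-one smoothing, the JSON file standing in for the storage system, and all function names are hypothetical. The abstract itself does not specify the n-gram type, the feature set, or the classifier.

```python
import json
import math
import os
import tempfile
from collections import Counter


def char_ngrams(token, n_values=(2, 3)):
    """Identify character n-gram sequences of a token (assumed n-gram type)."""
    padded = f"^{token.lower()}$"  # boundary markers are an illustrative choice
    return [padded[i:i + n] for n in n_values for i in range(len(padded) - n + 1)]


def build_training_data(training_tokens):
    """Turn (token, classification) pairs into (token, features, classification)."""
    return [(tok, Counter(char_ngrams(tok)), label) for tok, label in training_tokens]


def train_classifier(training_data):
    """Fit multinomial naive Bayes counts (a stand-in for the 'first classifier')."""
    label_counts = Counter()
    feature_counts = {}
    vocabulary = set()
    for _token, features, label in training_data:
        label_counts[label] += 1
        feature_counts.setdefault(label, Counter()).update(features)
        vocabulary.update(features)
    return {
        "label_counts": dict(label_counts),
        "feature_counts": {lbl: dict(c) for lbl, c in feature_counts.items()},
        "vocabulary_size": len(vocabulary),
    }


def classify(model, token):
    """Return the label with the highest smoothed log-probability."""
    features = Counter(char_ngrams(token))
    total = sum(model["label_counts"].values())
    best_label, best_score = None, -math.inf
    for label, count in model["label_counts"].items():
        counts = model["feature_counts"][label]
        denom = sum(counts.values()) + model["vocabulary_size"]
        score = math.log(count / total)
        for gram, freq in features.items():
            score += freq * math.log((counts.get(gram, 0) + 1) / denom)  # add-one smoothing
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def store_classifier(model, path):
    """Persist the trained classifier; a JSON file stands in for the storage system."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(model, fh)


def load_classifier(path):
    """Read a classifier previously written by store_classifier."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


if __name__ == "__main__":
    # Hypothetical training tokens: (token retrieved from a content source, classification).
    tokens = [("plumbing", "trade"), ("roofing", "trade"), ("bakery", "food"), ("pizzeria", "food")]
    model = train_classifier(build_training_data(tokens))
    path = os.path.join(tempfile.mkdtemp(), "first_classifier.json")
    store_classifier(model, path)
    print(classify(load_classifier(path), "painting"))
```

Storing the model as plain counts keeps the saved file readable and lets any process that can reach the storage location reload it without the training code's class definitions.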

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.