Patent · US Active

Identifying and processing a number of features identified in a document to determine a type of the document

US9516089B1 · kind B1 · utility

Cited by: 6
References: 31
Claims: 14
Family size: 0

Assignee

Inventors

Key dates

Filing date: Dec 16, 2013
Grant date: Dec 6, 2016
Priority date
Expiry date: Oct 21, 2034

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06F40/117
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

A system and method for document classification are presented. An input document is received (e.g., by at least one server communicatively coupled to a network). A plurality of features are identified in the input document. The plurality of features include sequences of text extracted from the input document. A feature vector of the input document is generated based upon the sequences of text, and the feature vector of the input document is compared to each of a plurality of signature vectors to determine a primary type of the input document. The primary type of the input document is stored into a storage system in communication with the at least one server.
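
The pipeline the abstract describes (extract text sequences, build a feature vector, compare it to per-type signature vectors, store the best match) can be illustrated with a minimal sketch. The abstract does not say how features are weighted or how vectors are compared, so everything specific here is an assumption: word bigrams as the "sequences of text", raw counts as vector weights, cosine similarity as the comparison, and a plain dictionary standing in for the storage system. All function names and the two sample signatures are hypothetical and not taken from the patent.

```python
import math
import re
from collections import Counter


def extract_features(text, n=2):
    """Return word n-grams (assumed form of the 'sequences of text')."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]


def feature_vector(text, n=2):
    """Sparse feature vector mapping each n-gram to its occurrence count."""
    return Counter(extract_features(text, n))


def cosine_similarity(a, b):
    """Cosine similarity of two sparse vectors (assumed comparison metric)."""
    dot = sum(count * b.get(term, 0) for term, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def classify(text, signatures, n=2):
    """Compare the document to every signature; return best type and all scores."""
    vec = feature_vector(text, n)
    scores = {doc_type: cosine_similarity(vec, sig)
              for doc_type, sig in signatures.items()}
    primary = max(scores, key=scores.get)
    return primary, scores


# Hypothetical signature vectors, built here from made-up sample phrases.
SIGNATURES = {
    "invoice": feature_vector(
        "invoice number amount due payment terms total amount due"),
    "resume": feature_vector(
        "work experience education skills professional experience references"),
}

# A dict stands in for the storage system named in the abstract.
storage = {}
primary, scores = classify(
    "Please remit the total amount due listed on invoice number 1042",
    SIGNATURES)
storage["doc-1"] = primary
```

In this sketch the sample document shares the bigrams "total amount", "amount due", and "invoice number" with the invoice signature and none with the resume signature, so "invoice" is stored as its primary type. A real system would likely derive signatures from labeled training documents and might weight features differently.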

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.