Identifying and processing a number of features identified in a document to determine a type of the document
US9516089B1 · kind B1 · utility
Assignee
Inventors
Key dates
| Filing date | Dec 16, 2013 |
| Grant date | Dec 6, 2016 |
| Priority date | — |
| Expiry date | Oct 21, 2034 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G06F40/117
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
A system and method for document classification are presented. An input document is received (e.g., by at least one server communicatively coupled to a network). A plurality of features are identified in the input document. The plurality of features include sequences of text extracted from the input document. A feature vector of the input document is generated based upon the sequences of text, and the feature vector of the input document is compared to each of a plurality of signature vectors to determine a primary type of the input document. The primary type of the input document is stored into a storage system in communication with the at least one server.
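The comparison step described in the abstract can be illustrated with a minimal sketch. The abstract does not specify how text sequences are chosen or how vectors are compared, so everything concrete here is an assumption made for illustration: word bigrams as the "sequences of text", raw counts as the feature vector, cosine similarity as the comparison, and an in-memory dict standing in for the storage system. The function names (`extract_features`, `feature_vector`, `classify`) and the sample signatures are hypothetical and do not come from the patent.

```python
import math
from collections import Counter


def extract_features(text, n=2):
    """Return word n-grams as the 'sequences of text' (assumed feature choice)."""
    words = text.lower().split()
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]


def feature_vector(text, n=2):
    """Build a sparse count vector keyed by n-gram (assumed representation)."""
    return Counter(extract_features(text, n))


def cosine_similarity(a, b):
    """Cosine similarity between two sparse count vectors (assumed metric)."""
    dot = sum(count * b[key] for key, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def classify(text, signatures, n=2):
    """Compare the document vector to every signature vector; return the best type."""
    vec = feature_vector(text, n)
    scores = {
        doc_type: cosine_similarity(vec, sig)
        for doc_type, sig in signatures.items()
    }
    return max(scores, key=scores.get), scores


# Hypothetical signature vectors, one per document type.
signatures = {
    "invoice": feature_vector(
        "invoice number amount due payment terms amount due total"
    ),
    "resume": feature_vector(
        "work experience education skills work experience references"
    ),
}

# A dict stands in for the storage system named in the abstract.
store = {}
primary_type, scores = classify(
    "please remit the amount due per payment terms", signatures
)
store["doc-001"] = primary_type
```

In this sketch the input shares the bigrams "amount due" and "payment terms" with the invoice signature and none with the resume signature, so the invoice type scores highest and is the value stored. A real system would likely weight features and derive signatures from labeled documents, which the abstract leaves open.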
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.