Patent · US Active

Regularities and trends discovery in a flow of business documents

US10789281B2 · kind B2 · utility

Cited by: 2
References: 4
Claims: 20
Family size: 0

Assignee

Inventor

Key dates

Filing date: Jun 29, 2017
Grant date: Sep 29, 2020
Priority date
Expiry date: Apr 11, 2038

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06V30/10
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

A method for encoding documents includes building or otherwise providing a condensed dictionary that includes identifiers for block headers identified in text blocks extracted from a collection of training documents. For at least one test document, a set of text content blocks is identified. For each of the text content blocks in the set, a block header is identified. Each block header in the training and test documents includes a sequence of no more than a predetermined maximum number of characters. An encoding of the test document is generated based on the identifiers of those block headers identified in the test document that are in the condensed dictionary.
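
The steps in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation, and several details are assumptions that the abstract does not specify: text blocks are taken to be separated by blank lines, a block header is taken to be the first `MAX_HEADER_CHARS` characters of a block's first line (lowercased), and the dictionary is "condensed" by keeping only headers that appear in at least `min_docs` training documents. All function and parameter names are hypothetical.

```python
from collections import Counter

MAX_HEADER_CHARS = 10  # assumed value for the "predetermined maximum number of characters"


def split_blocks(document: str) -> list[str]:
    """Assumed block extraction: text blocks are separated by blank lines."""
    return [block.strip() for block in document.split("\n\n") if block.strip()]


def block_header(block: str, max_chars: int = MAX_HEADER_CHARS) -> str:
    """Assumed header rule: at most max_chars leading characters of the first line."""
    return block.splitlines()[0][:max_chars].strip().lower()


def build_condensed_dictionary(training_docs: list[str], min_docs: int = 2) -> dict[str, int]:
    """Map each retained header to an integer identifier.

    Assumed condensing rule: keep headers seen in at least min_docs training documents.
    """
    doc_freq: Counter[str] = Counter()
    for doc in training_docs:
        # Count each header once per document.
        doc_freq.update({block_header(b) for b in split_blocks(doc)})
    kept = sorted(h for h, n in doc_freq.items() if n >= min_docs)
    return {header: idx for idx, header in enumerate(kept)}


def encode_document(document: str, dictionary: dict[str, int]) -> list[int]:
    """Encode a document as the sorted identifiers of its headers found in the dictionary."""
    headers = (block_header(b) for b in split_blocks(document))
    return sorted({dictionary[h] for h in headers if h in dictionary})


training_docs = [
    "Invoice No: 123\nsome text\n\nTotal due: 40 EUR\n\nThanks, Bob",
    "Invoice No: 456\n\nTotal due: 12 EUR\n\nRegards, Alice",
]
dictionary = build_condensed_dictionary(training_docs)
encoding = encode_document("Total due: 99 EUR\n\nShip to: Paris", dictionary)
```

In this toy run, only the two headers shared by both training documents are retained, so the test document is encoded by the single identifier of its "total due:" header. Its unseen "ship to:" block contributes nothing to the encoding.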

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.