Patent · US Active

Automated workflows for identification of reading order from text segments using probabilistic language models

US10713519B2 · kind B2 · utility

Cited by	2

References	13

Claims	20

Family size	0

Assignee

Adobe Inc. · US

Inventors

Trung Bui · San Jose, US
Hung Bui · Sunnyvale, US
Shawn A. Gaither · Raleigh, US
Walter Chang · San Jose, US
Michael Kraley · Lexington, US
Pranjal Daga · West Lafayette, US

Key dates

Filing date	Jun 22, 2017
Grant date	Jul 14, 2020
Priority date	—
Expiry date	Oct 20, 2037

Classification

Technology area (CPC G)	Physics
CPC primary	G06V30/413
WIPO field	Computer technology
WIPO sector	Electrical engineering

Abstract

The present invention is directed towards providing automated workflows for identifying a reading order from text segments extracted from a document. The text segments are ordered based on trained natural language models. In some embodiments, the workflows are enabled to perform a method for identifying a sequence associated with a portable document. The method includes iteratively generating a probabilistic language model, receiving the portable document, and selectively extracting features (such as, but not limited to, text segments) from the document. The method may generate pairs of features (or feature pairs) from the extracted features. The method may further generate a score for each of the pairs based on the probabilistic language model and determine an order of the features based on the scores. The method may provide the extracted features in the determined order.
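
Illustrative sketch (not taken from the patent text): the pair-scoring and ordering steps described in the abstract could look roughly like the Python below. Everything in it is an assumption made for illustration. The toy add-one-smoothed bigram model stands in for whatever trained probabilistic language model an embodiment uses, the function names (train_bigram_model, pair_score, reading_order) are hypothetical, and the exhaustive search over permutations is only practical for a handful of segments.

```python
import math
from collections import Counter
from itertools import permutations


def train_bigram_model(corpus):
    """Count unigrams and bigrams over whitespace tokens.

    Assumption: a toy stand-in for a trained probabilistic language model.
    """
    unigrams, bigrams = Counter(), Counter()
    for sentence in corpus:
        tokens = sentence.lower().split()
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams


def pair_score(model, first, second):
    """Log-probability that `second` follows `first`.

    Only the boundary bigram (last word of `first`, first word of `second`)
    is scored, with add-one smoothing so unseen pairs get a finite score.
    """
    unigrams, bigrams = model
    vocab_size = len(unigrams) + 1  # +1 reserves mass for unseen words
    prev_word = first.lower().split()[-1]
    next_word = second.lower().split()[0]
    return math.log((bigrams[(prev_word, next_word)] + 1)
                    / (unigrams[prev_word] + vocab_size))


def reading_order(model, segments):
    """Return the segments in the order with the highest total pair score."""
    indices = range(len(segments))
    # Score every ordered pair of distinct segments once.
    scores = {(i, j): pair_score(model, segments[i], segments[j])
              for i, j in permutations(indices, 2)}
    # Brute force: pick the permutation whose adjacent pairs score best.
    best = max(permutations(indices),
               key=lambda order: sum(scores[p] for p in zip(order, order[1:])))
    return [segments[i] for i in best]


corpus = ["the quick brown fox jumps over the lazy dog"]
model = train_bigram_model(corpus)
segments = ["over the lazy dog", "the quick brown", "fox jumps"]
ordered = reading_order(model, segments)
print(ordered)
```

Scoring only the boundary words keeps the example short; a fuller model would presumably score longer contexts, and a greedy or beam search would replace the permutation search for documents with many segments.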

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.