Document layout extraction
US8250469B2 · kind B2 · utility
Assignee
Inventors
Key dates
| Filing date | Dec 3, 2007 |
| Grant date | Aug 21, 2012 |
| Priority date | — |
| Expiry date | Jul 10, 2030 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G06F40/151
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
Computer-readable media, systems, and methods for document layout extraction are described. In embodiments, textual data in an electronic format is received and converted from that format to an independent interface format, which includes coordinates for one or more structural elements of the textual data. Further, in embodiments, a structure and layout analysis of the textual data is performed to generate a set of structure and layout information. Still further, in embodiments, the textual data and the set of structure and layout information are stored in an enriched interface format, which provides for search and navigation of the textual data.
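The abstract describes a pipeline of three stages: conversion to a coordinate-bearing interface format, structure and layout analysis, and storage in an enriched format that supports search and navigation. The following is a minimal sketch of that flow, not the patented method. The names `Element`, `EnrichedDocument`, `to_interface_format`, and `analyze_layout` are hypothetical, as are the tuple input schema, the top-left coordinate origin, and the height-based heading heuristic.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Element:
    """One structural element with page coordinates (hypothetical schema)."""
    text: str
    x: float
    y: float
    width: float
    height: float
    role: str = "unknown"  # assigned by the analysis step


@dataclass
class EnrichedDocument:
    """Text plus structure information, queryable for search and navigation."""
    elements: List[Element] = field(default_factory=list)

    def search(self, term: str) -> List[Element]:
        # Case-insensitive substring search over element text.
        needle = term.lower()
        return [e for e in self.elements if needle in e.text.lower()]

    def headings(self) -> List[Element]:
        # Navigation aid: the elements labeled as headings, in reading order.
        return [e for e in self.elements if e.role == "heading"]


def to_interface_format(
    raw_lines: List[Tuple[str, float, float, float, float]]
) -> List[Element]:
    """Convert (text, x, y, width, height) tuples into Element records."""
    return [Element(*line) for line in raw_lines]


def analyze_layout(
    elements: List[Element], heading_height: float = 14.0
) -> EnrichedDocument:
    """Toy analysis (an assumption): tall lines are headings, the rest paragraphs."""
    for e in elements:
        e.role = "heading" if e.height >= heading_height else "paragraph"
    # Assumes a top-left origin, so sorting by (y, x) gives reading order.
    ordered = sorted(elements, key=lambda e: (e.y, e.x))
    return EnrichedDocument(elements=ordered)


raw = [
    ("Introduction", 72.0, 40.0, 200.0, 18.0),
    ("Layout extraction recovers structure.", 72.0, 70.0, 400.0, 10.0),
    ("Methods", 72.0, 120.0, 120.0, 18.0),
]
doc = analyze_layout(to_interface_format(raw))
print([h.text for h in doc.headings()])            # ['Introduction', 'Methods']
print([e.role for e in doc.search("layout")])      # ['paragraph']
```

Keeping coordinates on every element lets the enriched form answer both content queries (`search`) and structural ones (`headings`) without re-parsing the original file. A real system would replace the height threshold with a proper layout analysis.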
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.