Methods and apparatus for obtaining structured information in fixed layout documents
US9773009B2 · kind B2 · utility
Assignees
Inventors
Key dates
| Filing date | Dec 7, 2012 |
| Grant date | Sep 26, 2017 |
| Priority date | — |
| Expiry date | Jul 1, 2033 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G06F40/131
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
The present application discloses a method and an apparatus for obtaining structured information from a fixed layout document, with the aim of speeding up the structuring step in information management for such documents. The method may comprise: determining initial page number information corresponding to the current directory entry of the document; segmenting the first article content of the page corresponding to the initial page number information into at least one structured-characters-block; searching the structured-characters-blocks for a first structured-characters-block that matches the name string of the current directory entry, and obtaining first position information indicating where that block is located in the first article content; and obtaining the initial position information of the current directory entry and the end position information of the previous directory entry based on the first position information.
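The steps listed in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the names `Block`, `segment_blocks`, `find_entry_block`, and `locate_entry` are hypothetical, blocks are assumed to be non-empty text lines, matching is assumed to be a whitespace- and case-insensitive string comparison, and the previous entry is assumed to end exactly where the current entry starts.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Block:
    text: str    # characters in the block, surrounding whitespace stripped
    offset: int  # character offset of the block within the page content


def segment_blocks(page_text: str) -> List[Block]:
    """Split page content into blocks. Assumption: one block per non-empty line."""
    blocks: List[Block] = []
    offset = 0
    for line in page_text.split("\n"):
        if line.strip():
            indent = len(line) - len(line.lstrip())
            blocks.append(Block(text=line.strip(), offset=offset + indent))
        offset += len(line) + 1  # +1 accounts for the newline separator
    return blocks


def normalize(text: str) -> str:
    """Drop all whitespace and lowercase, so layout differences do not block a match."""
    return "".join(text.split()).lower()


def find_entry_block(blocks: List[Block], entry_name: str) -> Optional[Block]:
    """Return the first block whose text matches the directory entry name."""
    wanted = normalize(entry_name)
    for block in blocks:
        if normalize(block.text) == wanted:
            return block
    return None


Position = Tuple[int, int]  # (page number, character offset on that page)


def locate_entry(
    pages: Dict[int, str], initial_page: int, entry_name: str
) -> Optional[Tuple[Position, Position]]:
    """Return (start of current entry, end of previous entry), or None if no block matches."""
    block = find_entry_block(segment_blocks(pages[initial_page]), entry_name)
    if block is None:
        return None
    start = (initial_page, block.offset)
    # Assumption: the previous entry ends (exclusive) where the current one begins.
    previous_end = (initial_page, block.offset)
    return start, previous_end


# Usage with made-up page content: page 5 holds the end of one section
# and the heading of the next.
pages = {5: "end of chapter one\n  2 Methods\nbody text"}
result = locate_entry(pages, 5, "2 Methods")
```

In this sketch the position is a page number paired with a character offset; a real fixed layout document would more likely carry page coordinates, which the abstract does not specify.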
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.