Method and device for parsing tables in PDF document
US10592184B2 · kind B2 · utility
Assignee
Inventors
Key dates
| Filing date | May 18, 2018 |
| Grant date | Mar 17, 2020 |
| Priority date | — |
| Expiry date | May 18, 2038 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G06V30/416
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
The present application relates to a method performed at an electronic device for parsing tables in a PDF document. The method includes the following steps: receiving the PDF document containing a table area; extracting horizontal lines, vertical lines and text blocks in the table area; determining the types of tables in the table area according to the extracted horizontal lines and vertical lines; if the table is a quasi full-line table, determining the structure of the quasi full-line table in the table area according to the horizontal lines and the vertical lines in the table area with the assistance of the text blocks in the table area; and if the table is a quasi non-line table, determining the structure of the quasi non-line table in the table area according to the text blocks in the table area with the assistance of the horizontal lines and/or the vertical lines in the table area.
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.