Patent · US Active

Method and device for parsing tables in PDF document

US10592184B2 · kind B2 · utility

Cited by: 3
References: 0
Claims: 18
Family size: 0

Assignee

Inventors

Key dates

Filing date: May 18, 2018
Grant date: Mar 17, 2020
Priority date
Expiry date: May 18, 2038

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06V30/416
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

The present application relates to a method performed at an electronic device for parsing tables in a PDF document. The method includes the following steps: receiving the PDF document containing a table area; extracting horizontal lines, vertical lines and text blocks in the table area; determining the types of tables in the table area according to the extracted horizontal lines and vertical lines; if the table is a quasi full-line table, determining the structure of the quasi full-line table in the table area according to the horizontal lines and the vertical lines in the table area with the assistance of the text blocks in the table area; and if the table is a quasi non-line table, determining the structure of the quasi non-line table in the table area according to the text blocks in the table area with the assistance of the horizontal lines and/or the vertical lines in the table area.
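The branching described in the abstract can be illustrated with a minimal sketch. It is not the patented method: the abstract does not state how a table is judged to be quasi full-line or quasi non-line, so the rule below (at least two horizontal and two vertical lines) is an assumption, as are the names `Line`, `TextBlock`, `classify_table`, `parse_table` and the `min_lines` parameter. The "assistance" steps (text blocks refining a ruled grid, lines refining a text-based layout) are omitted for brevity.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Line:
    """A ruling line segment extracted from the table area (hypothetical type)."""
    x0: float
    y0: float
    x1: float
    y1: float


@dataclass(frozen=True)
class TextBlock:
    """A text block with its bounding box (hypothetical type)."""
    text: str
    x0: float
    y0: float
    x1: float
    y1: float


def classify_table(h_lines: List[Line], v_lines: List[Line], min_lines: int = 2) -> str:
    # Assumed rule, not taken from the patent: a table counts as
    # "quasi full-line" when enough ruling lines exist in both directions.
    if len(h_lines) >= min_lines and len(v_lines) >= min_lines:
        return "quasi full-line"
    return "quasi non-line"


def parse_table(
    h_lines: List[Line], v_lines: List[Line], blocks: List[TextBlock]
) -> Tuple[str, int, int]:
    """Return (table type, row count, column count) for one table area."""
    kind = classify_table(h_lines, v_lines)
    if kind == "quasi full-line":
        # Line-driven structure: distinct line positions bound the cells.
        ys = sorted({line.y0 for line in h_lines})
        xs = sorted({line.x0 for line in v_lines})
        rows, cols = len(ys) - 1, len(xs) - 1
    else:
        # Text-driven structure: group blocks by rounded top and left edges.
        rows = len({round(b.y0) for b in blocks})
        cols = len({round(b.x0) for b in blocks})
    return kind, rows, cols
```

For example, three horizontal and three vertical lines forming a grid yield a 2 x 2 quasi full-line table, while four text blocks aligned on two rows and two columns under a single rule yield a 2 x 2 quasi non-line table. A real parser would also need tolerance for near-coincident lines and for merged cells, which this sketch ignores.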

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.