Patent · US Active

Table recognition in portable document format documents

US11200413B2 · kind B2 · utility

Cited by: 10
References: 32
Claims: 20
Family size: 0

Assignee

Inventors

Key dates

Filing date: Jul 31, 2018
Grant date: Dec 14, 2021
Priority date
Expiry date: Feb 20, 2039

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06V30/416
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

Methods, systems, and computer program products for table recognition in PDF documents are provided herein. A computer-implemented method includes discretizing one or more contiguous areas of a PDF document; identifying one or more white-space separator lines within the one or more discretized contiguous areas of the PDF document; detecting one or more candidate table regions within the one or more discretized contiguous areas of the PDF document by clustering the one or more white-space separator lines into one or more grids; and outputting at least one of the candidate table regions as a finalized table in accordance with scores assigned to each of the one or more candidate table regions based on (i) border information and (ii) cell structure information.
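
The first two steps named in the abstract (discretizing an area and identifying white-space separator lines) can be illustrated with a minimal Python sketch. This is not the patented method. It assumes text is available as axis-aligned bounding boxes, rasterizes them onto a coarse occupancy grid, and reports fully empty rows and columns as separator candidates. The names `Box`, `discretize`, `empty_runs`, and `separators`, the box format, and the cell size are all hypothetical choices made for illustration. Clustering separators into grids and scoring candidates on border and cell structure information are not shown.

```python
import math
from typing import List, Tuple

# Assumed input format (hypothetical): text bounding boxes as (x0, y0, x1, y1).
Box = Tuple[float, float, float, float]
Run = Tuple[int, int]  # (start, end) grid indices, end inclusive


def discretize(boxes: List[Box], width: float, height: float,
               cell: float) -> List[List[bool]]:
    """Rasterize text boxes onto an occupancy grid (True = covered by text)."""
    cols = math.ceil(width / cell)
    rows = math.ceil(height / cell)
    grid = [[False] * cols for _ in range(rows)]
    for x0, y0, x1, y1 in boxes:
        r0, r1 = max(0, int(y0 // cell)), min(rows, math.ceil(y1 / cell))
        c0, c1 = max(0, int(x0 // cell)), min(cols, math.ceil(x1 / cell))
        for r in range(r0, r1):
            for c in range(c0, c1):
                grid[r][c] = True
    return grid


def empty_runs(flags: List[bool]) -> List[Run]:
    """Merge consecutive empty indices into (start, end) runs."""
    runs: List[Run] = []
    start = None
    for i, empty in enumerate(flags):
        if empty and start is None:
            start = i
        elif not empty and start is not None:
            runs.append((start, i - 1))
            start = None
    if start is not None:
        runs.append((start, len(flags) - 1))
    return runs


def separators(grid: List[List[bool]]) -> Tuple[List[Run], List[Run]]:
    """Return (horizontal, vertical) white-space separator candidates."""
    row_empty = [not any(row) for row in grid]
    col_empty = [not any(col) for col in zip(*grid)]
    return empty_runs(row_empty), empty_runs(col_empty)


# Usage: four text boxes laid out as a 2x2 table on a 100x50 area.
cells = [(10, 10, 40, 20), (60, 10, 90, 20),
         (10, 30, 40, 40), (60, 30, 90, 40)]
h, v = separators(discretize(cells, width=100, height=50, cell=10))
print(h)  # empty row bands, as grid-row index ranges
print(v)  # empty column bands, as grid-column index ranges
```

In this toy layout the empty band between the two text rows and the empty band between the two text columns appear as interior runs. A fuller pipeline would then cluster such runs into a grid before scoring the candidate region.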

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.