Patent · US Active

Identifying matching canonical documents consistent with visual query structural information

US8811742B2 · kind B2 · utility

Cited by: 19
References: 15
Claims: 24
Family size: 0

Assignee

Inventors

Key dates

Filing date: Dec 1, 2011
Grant date: Aug 19, 2014
Priority date
Expiry date: Jul 26, 2032

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06V30/10
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

A server system receives a visual query from a client system and performs optical character recognition (OCR) on the visual query to produce text recognition data representing textual characters, including a plurality of textual characters in a contiguous region of the visual query. The server system also produces structural information associated with the textual characters in the visual query. Textual characters in the plurality of textual characters are scored. The method further includes identifying, in accordance with the scoring, one or more high quality textual strings, each comprising a plurality of high quality textual characters from among the plurality of textual characters in the contiguous region of the visual query. A canonical document that includes the one or more high quality textual strings and that is consistent with the structural information is retrieved. At least a portion of the canonical document is sent to the client system.
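
The steps in the abstract (score characters, extract high quality strings, retrieve a structurally consistent canonical document) can be illustrated with a minimal Python sketch. It is not the patented method. Every name in it (`OcrChar`, `CanonicalDoc`, `high_quality_strings`, `find_canonical`), the score threshold, the minimum string length, and the use of a line count as "structural information" are assumptions made for illustration only. The abstract does not say how characters are scored or what form the structural information takes.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class OcrChar:
    char: str
    score: float  # hypothetical per-character quality score in [0, 1]


@dataclass
class CanonicalDoc:
    doc_id: str
    text: str
    line_count: int  # assumed stand-in for structural information


def high_quality_strings(chars: List[OcrChar],
                         min_score: float = 0.8,
                         min_len: int = 3) -> List[str]:
    """Return maximal runs of consecutive characters scoring >= min_score.

    Runs shorter than min_len are discarded (both thresholds are assumptions).
    """
    runs: List[str] = []
    current: List[str] = []
    for c in chars:
        if c.score >= min_score:
            current.append(c.char)
        else:
            if len(current) >= min_len:
                runs.append("".join(current))
            current = []
    if len(current) >= min_len:
        runs.append("".join(current))
    return runs


def find_canonical(docs: List[CanonicalDoc],
                   strings: List[str],
                   observed_line_count: int,
                   tolerance: int = 1) -> Optional[CanonicalDoc]:
    """Return the first document that contains every string and whose
    line count is within `tolerance` of the observed one, else None."""
    if not strings:
        return None  # nothing reliable to match on
    for doc in docs:
        has_all = all(s in doc.text for s in strings)
        consistent = abs(doc.line_count - observed_line_count) <= tolerance
        if has_all and consistent:
            return doc
    return None


# Usage: the '?' stands for a character the OCR step recognised poorly.
ocr = [OcrChar(ch, 0.2 if ch == "?" else 0.9) for ch in "quick?fox"]
strings = high_quality_strings(ocr)
docs = [
    CanonicalDoc("B", "quick fox", line_count=10),
    CanonicalDoc("A", "the quick brown fox", line_count=1),
]
match = find_canonical(docs, strings, observed_line_count=1)
```

In this toy run both documents contain the extracted strings, so the structural check is what separates them. A real system would presumably use an index instead of a linear scan.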

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.