Patent · US Active

Query generation using structural similarity between documents

US8346792B1 · kind B1 · utility

Cited by: 32
References: 50
Claims: 28
Family size: 0

Assignee

Inventors

Key dates

Filing date: Nov 9, 2010
Grant date: Jan 1, 2013
Priority date
Expiry date: Jan 8, 2031

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06F40/186
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

Methods, systems, and apparatus, including computer program products, for generating synthetic queries using seed queries and structural similarity between documents are described. In one aspect, a method includes identifying embedded coding fragments (e.g., HTML tags) from a structured document and a seed query; generating one or more query templates, each query template corresponding to at least one coding fragment, the query template including a generative rule to be used in generating candidate synthetic queries; generating the candidate synthetic queries by applying the query templates to other documents that are hosted on the same web site as the document; identifying terms that match the structure of the query templates as candidate synthetic queries; measuring a performance for each of the candidate synthetic queries; and designating as synthetic queries the candidate synthetic queries that have performance measurements exceeding a performance threshold.
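
The steps in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the template representation (tag name, class attribute, and the text before and after the seed query), the function names, and the toy query-log counts used as the performance measure are all assumptions made for illustration. The parser is also simplified and does not handle void elements such as `<br>`.

```python
from html.parser import HTMLParser


class TagTextCollector(HTMLParser):
    """Collects (tag, class attribute, text) for each text node (simplified)."""

    def __init__(self):
        super().__init__()
        self.stack = []      # currently open tags as (tag, class) pairs
        self.fragments = []  # collected (tag, class, text) triples

    def handle_starttag(self, tag, attrs):
        self.stack.append((tag, dict(attrs).get("class")))

    def handle_endtag(self, tag):
        # Pop back to the most recent matching open tag, if there is one.
        for i in range(len(self.stack) - 1, -1, -1):
            if self.stack[i][0] == tag:
                del self.stack[i:]
                break

    def handle_data(self, data):
        text = data.strip()
        if text and self.stack:
            tag, cls = self.stack[-1]
            self.fragments.append((tag, cls, text))


def collect(html):
    parser = TagTextCollector()
    parser.feed(html)
    parser.close()
    return parser.fragments


def make_templates(seed_html, seed_query):
    """Hypothetical template: (tag, class, prefix, suffix) around the seed query."""
    templates = []
    for tag, cls, text in collect(seed_html):
        start = text.lower().find(seed_query.lower())
        if start >= 0:
            prefix = text[:start]
            suffix = text[start + len(seed_query):]
            templates.append((tag, cls, prefix, suffix))
    return templates


def apply_templates(templates, other_html):
    """Extract terms from another document that match a template's structure."""
    candidates = set()
    for tag, cls, text in collect(other_html):
        for t_tag, t_cls, prefix, suffix in templates:
            if (tag, cls) != (t_tag, t_cls):
                continue
            if text.startswith(prefix) and text.endswith(suffix):
                end = len(text) - len(suffix)
                candidate = text[len(prefix):end].strip().lower()
                if candidate:
                    candidates.add(candidate)
    return candidates


def select_synthetic_queries(candidates, measure, threshold):
    """Keep candidates whose measured performance exceeds the threshold."""
    return sorted(c for c in candidates if measure(c) > threshold)


# Toy pages standing in for documents hosted on the same web site.
seed_page = ('<html><head><title>Acme Store: red widget</title></head>'
             '<body><h1 class="product">red widget</h1><p>Ships fast.</p></body></html>')
other_pages = [
    '<html><head><title>Acme Store: blue gadget</title></head>'
    '<body><h1 class="product">blue gadget</h1><p>Ships slow.</p></body></html>',
    '<html><head><title>Acme Store: Contact us</title></head>'
    '<body><h1 class="info">Contact</h1></body></html>',
]

templates = make_templates(seed_page, "red widget")
candidates = set()
for page in other_pages:
    candidates |= apply_templates(templates, page)

# Assumed performance measure: made-up query-log counts, not real data.
toy_query_log = {"blue gadget": 120, "contact us": 3}
synthetic = select_synthetic_queries(
    candidates, lambda q: toy_query_log.get(q, 0), threshold=10)
print(synthetic)
```

In this sketch the seed query yields two templates (one for the `title` text after "Acme Store: ", one for `h1` elements with class `product`). Applying them to the other pages gives the candidates "blue gadget" and "contact us", and only the first passes the assumed threshold.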

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.