Query generation using structural similarity between documents
US9436747B1 · kind B1 · utility
Assignee
Inventors
Key dates
| Event | Date |
| --- | --- |
| Filing date | Jun 25, 2015 |
| Grant date | Sep 6, 2016 |
| Priority date | — |
| Expiry date | Jun 25, 2035 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G06F40/186
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
Methods, systems, and apparatus, including computer program products, for generating synthetic queries using seed queries and structural similarity between documents are described. In one aspect, a method includes identifying embedded coding fragments (e.g., HTML tags) from a structured document and a seed query; generating one or more query templates, each query template corresponding to at least one coding fragment and including a generative rule to be used in generating candidate synthetic queries; generating the candidate synthetic queries by applying the query templates to other documents that are hosted on the same web site as the document; identifying terms that match the structure of the query templates as candidate synthetic queries; measuring a performance for each of the candidate synthetic queries; and designating as synthetic queries the candidate synthetic queries that have performance measurements exceeding a performance threshold.
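The steps named in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the function names (`build_templates`, `candidate_queries`, `select_synthetic_queries`), the use of regular expressions as the "generative rule", and the `performance` callable are all assumptions made for illustration. The sketch also assumes the caller has already restricted `other_documents` to pages hosted on the same web site as the seed document, and it only handles a seed query that is the entire text of a single HTML element.

```python
import re
from typing import Callable, Iterable, List, Set


def build_templates(document_html: str, seed_query: str) -> List[re.Pattern]:
    """Derive one template per distinct HTML element that wraps the seed query.

    Hypothetical simplification: a "coding fragment" is the opening and
    closing tag pair that directly encloses the seed query text.
    """
    finder = re.compile(
        r"<(?P<tag>[a-zA-Z][a-zA-Z0-9]*)(?P<attrs>[^<>]*)>\s*"
        + re.escape(seed_query)
        + r"\s*</(?P=tag)>",
        re.IGNORECASE,
    )
    templates: List[re.Pattern] = []
    seen = set()
    for match in finder.finditer(document_html):
        opening = "<" + match.group("tag") + match.group("attrs") + ">"
        closing = "</" + match.group("tag") + ">"
        key = (opening.lower(), closing.lower())
        if key in seen:
            continue  # same fragment already produced a template
        seen.add(key)
        # The template keeps the surrounding markup fixed and captures
        # whatever tag-free text appears in the seed query's position.
        templates.append(
            re.compile(
                re.escape(opening) + r"\s*([^<>]+?)\s*" + re.escape(closing),
                re.IGNORECASE,
            )
        )
    return templates


def candidate_queries(
    templates: Iterable[re.Pattern], other_documents: Iterable[str]
) -> Set[str]:
    """Apply every template to every other document and collect matching terms."""
    templates = list(templates)
    candidates: Set[str] = set()
    for html in other_documents:
        for template in templates:
            for match in template.finditer(html):
                candidates.add(match.group(1).strip())
    return candidates


def select_synthetic_queries(
    candidates: Iterable[str],
    performance: Callable[[str], float],
    threshold: float,
) -> List[str]:
    """Keep candidates whose measured performance exceeds the threshold.

    `performance` is a placeholder; the abstract does not say how the
    measurement is computed.
    """
    return sorted(q for q in candidates if performance(q) > threshold)
```

In this sketch, a seed document containing `<h1 class="title">blue widget</h1>` with the seed query `blue widget` yields a single template, and any sibling page with text inside the same `<h1 class="title">` element contributes that text as a candidate. A production system would more plausibly work on a parsed DOM than on regular expressions; the regex form is used here only to keep the example self-contained.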
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.