Phrase extraction using subphrase scoring
US8166045B1 · kind B1 · utility
Assignee
Inventors
Key dates
| Filing date | Mar 30, 2007 |
| Grant date | Apr 24, 2012 |
| Priority date | — |
| Expiry date | Mar 30, 2027 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G06F16/951
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
An information retrieval system uses phrases to index, retrieve, organize, and describe documents. Phrases are extracted from the document collection. Documents are then indexed according to their included phrases, using phrase posting lists. The phrase posting lists are stored in a cluster of index servers. The phrase posting lists can be tiered into groups and sharded into partitions. Phrases in a query are identified based on possible phrasifications. A query schedule is created from the phrases and then optimized to reduce query processing and communication costs. The execution of the query schedule is managed to further reduce or eliminate query processing operations at various ones of the index servers.
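To make the abstract's terms concrete, the following is a minimal illustrative sketch, not the patented method. It assumes a fixed set of known phrases, shards posting lists by `doc_id % num_shards`, and answers a query by intersecting posting lists inside each shard. The function names (`phrasifications`, `build_sharded_index`, `search`) and the sharding rule are hypothetical; tiering and query schedule optimization are left out.

```python
from collections import defaultdict


def phrasifications(words, known_phrases):
    """List every split of `words` into consecutive known phrases.

    Single words are always allowed as a fallback (an assumption of
    this sketch, so that every query has at least one phrasification).
    """
    if not words:
        return [[]]
    results = []
    for end in range(1, len(words) + 1):
        head = " ".join(words[:end])
        if end == 1 or head in known_phrases:
            for rest in phrasifications(words[end:], known_phrases):
                results.append([head] + rest)
    return results


def build_sharded_index(docs, known_phrases, num_shards):
    """Build one phrase -> posting list map per shard.

    A document lives in shard `doc_id % num_shards` (hypothetical rule).
    """
    shards = [defaultdict(list) for _ in range(num_shards)]
    for doc_id in sorted(docs):
        words = docs[doc_id].lower().split()
        found = set()
        # Collect every contiguous word span that is a known phrase.
        for start in range(len(words)):
            for end in range(start + 1, len(words) + 1):
                candidate = " ".join(words[start:end])
                if candidate in known_phrases:
                    found.add(candidate)
        for phrase in found:
            shards[doc_id % num_shards][phrase].append(doc_id)
    return shards


def search(shards, phrases):
    """Intersect the posting lists of `phrases` within each shard."""
    hits = []
    for shard in shards:
        lists = [set(shard.get(p, ())) for p in phrases]
        if lists:
            hits.extend(set.intersection(*lists))
    return sorted(hits)


docs = {
    1: "new york restaurants",
    2: "new restaurants in york",
    3: "best new york restaurants",
}
known = {"new", "york", "restaurants", "new york", "new york restaurants"}
shards = build_sharded_index(docs, known, num_shards=2)
candidates = phrasifications(["new", "york", "restaurants"], known)
```

In this toy data, searching with the phrasification `["new york", "restaurants"]` skips document 2, where the words appear but not as the phrase "new york", while the all-single-words phrasification matches all three documents. Because each shard holds the full posting data for its own documents, the intersection can run locally on every shard and only the matching document ids need to be merged.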
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.