Selection of atoms for search engine retrieval
US9342582B2 · kind B2 · utility
Assignee
Inventors
Key dates
| Filing date | Mar 10, 2011 |
| Grant date | May 17, 2016 |
| Priority date | — |
| Expiry date | Mar 10, 2031 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G06F16/41
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
Methods are provided for populating search indexes with atoms identified in documents. Documents that are to be indexed are identified, and for each document, atoms are identified and are categorized as unigrams, n-grams, and n-tuples. A list of atom/document pairs is generated such that an information metric can be computed for each pair. An information metric represents a ranking of the atom in relation to the particular document. Based on the information metric, some atom/document pairs are discarded and others are indexed.
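The flow described in the abstract can be illustrated with a minimal Python sketch. The abstract does not define the information metric or how n-tuples are formed, so everything concrete here is an assumption made for illustration: the metric is a simple TF-IDF-style score, n-tuples are modeled as unordered pairs of distinct words that co-occur within a small window, and the names `extract_atoms`, `select_pairs`, `build_index`, `max_n`, `window`, and `threshold` are hypothetical.

```python
import math
from collections import Counter


def extract_atoms(text, max_n=2, window=3):
    """Count the atoms of one document, keyed by (category, atom)."""
    words = text.lower().split()
    atoms = Counter()
    # Unigrams: single words.
    for w in words:
        atoms[("unigram", w)] += 1
    # N-grams: contiguous word sequences of length 2..max_n.
    for n in range(2, max_n + 1):
        for i in range(len(words) - n + 1):
            atoms[("ngram", " ".join(words[i:i + n]))] += 1
    # N-tuples (assumption): unordered pairs of distinct words that
    # co-occur within `window` consecutive positions.
    for i, w in enumerate(words):
        for other in words[i + 1:i + window]:
            if other != w:
                atoms[("ntuple", tuple(sorted((w, other))))] += 1
    return atoms


def select_pairs(docs, threshold=0.5):
    """Score every atom/document pair and split into kept and discarded."""
    per_doc = {doc_id: extract_atoms(text) for doc_id, text in docs.items()}
    # Document frequency: in how many documents each atom appears.
    df = Counter()
    for atoms in per_doc.values():
        df.update(atoms.keys())
    n_docs = len(docs)
    kept, discarded = [], []
    for doc_id, atoms in per_doc.items():
        for atom, tf in atoms.items():
            # Stand-in information metric (assumption): TF-IDF-style score.
            score = tf * math.log(n_docs / df[atom])
            target = kept if score >= threshold else discarded
            target.append((atom, doc_id, score))
    return kept, discarded


def build_index(kept):
    """Build an inverted index from the surviving atom/document pairs."""
    index = {}
    for atom, doc_id, score in kept:
        index.setdefault(atom, []).append((doc_id, score))
    return index


docs = {
    "d1": "search engine index",
    "d2": "search engine ranking",
}
kept, discarded = select_pairs(docs)
index = build_index(kept)
```

In this toy corpus, atoms shared by both documents (such as the unigram `search`) score zero under the stand-in metric and are discarded, while atoms specific to one document (such as `index` in `d1`) are kept and indexed. A real system would likely use a different metric and threshold.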
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.