Page selection for indexing
US8645288B2 · kind B2 · utility
Assignee
Inventors
Key dates
| Filing date | Dec 2, 2010 |
| Grant date | Feb 4, 2014 |
| Priority date | — |
| Expiry date | Dec 17, 2031 |
Classification
- Technology area (CPC G)Physics
- CPC primaryG06F16/9535
- WIPO fieldComputer technology
- WIPO sectorElectrical engineering
Abstract
Some implementations provide techniques for selecting web pages for inclusion in an index. For example, some implementations apply regularization to select a subset of the crawled web pages for indexing based on link relationships between the crawled web pages, features extracted from the crawled web pages, and user behavior information determined for at least some of the crawled web pages. Further, in some implementations, the user behavior information may be used to sort a training set of crawled web pages into a plurality of labeled groups. The labeled groups may be represented in a directed graph that indicates relative priorities for being selected for indexing.
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.