Patent · US Expired

Locating meaningful stopwords or stop-phrases in keyword-based retrieval systems

US7409383B1 · kind B1 · utility

Cited by: 36
References: 4
Claims: 28
Family size: 0

Assignee

Inventors

Key dates

Filing date: Mar 31, 2004
Grant date: Aug 5, 2008
Priority date
Expiry date: May 25, 2025

Classification

  • Technology area (CPC Y): Emerging Cross-Sectional Technologies
  • CPC primary: Y10S707/99943
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

A stopword detection component detects stopwords (also stop-phrases) in search queries input to keyword-based information retrieval systems. Potential stopwords are initially identified by comparing the terms in the search query to a list of known stopwords. Context data is then retrieved based on the search query and the identified stopwords. In one implementation, the context data includes documents retrieved from a document index. In another implementation, the context data includes categories relevant to the search query. Sets of retrieved context data are compared to one another to determine if they are substantially similar. If the sets of context data are substantially similar, this fact may be used to infer that the removal of the potential stopword(s) is not material to the search. If the sets of context data are not substantially similar, the potential stopword can be considered material to the search and should not be removed from the query.
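
The flow described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. The stopword list, the toy document index, the use of Jaccard overlap as the test for "substantially similar", the 0.8 threshold, and the names retrieve and material_stopwords are all assumptions made for illustration. The sketch covers only the document-retrieval variant of context data, not the category variant.

```python
from typing import Callable, List, Set

# Hypothetical list of known stopwords (assumption, not from the patent).
KNOWN_STOPWORDS = {"the", "a", "of", "in", "to"}

# Toy document index standing in for a real retrieval backend (assumption).
TOY_INDEX = {
    "d1": "the matrix film review",
    "d2": "matrix multiplication tutorial",
    "d3": "sparse matrix storage formats",
    "d4": "history of the roman empire",
    "d5": "a short history of rome",
    "d6": "history of rome and its emperors",
}


def retrieve(terms: List[str]) -> Set[str]:
    """Return ids of documents that contain every query term."""
    return {
        doc_id
        for doc_id, text in TOY_INDEX.items()
        if all(term in text.split() for term in terms)
    }


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Overlap of two result sets; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def material_stopwords(
    query: str,
    retrieve_fn: Callable[[List[str]], Set[str]],
    stopwords: Set[str] = KNOWN_STOPWORDS,
    threshold: float = 0.8,  # assumed cutoff for "substantially similar"
) -> List[str]:
    """Return the potential stopwords whose removal changes the results."""
    terms = query.lower().split()
    baseline = retrieve_fn(terms)  # context data for the full query
    material = []
    for i, term in enumerate(terms):
        if term not in stopwords:
            continue  # not a potential stopword
        reduced = terms[:i] + terms[i + 1:]
        context = retrieve_fn(reduced)  # context data without the stopword
        if jaccard(baseline, context) < threshold:
            material.append(term)  # removal is material: keep the term
    return material


print(material_stopwords("the matrix", retrieve))       # ['the']
print(material_stopwords("history of rome", retrieve))  # []
```

In this toy index, dropping "the" from "the matrix" widens the results from one document to three, so the term is treated as material and kept. Dropping "of" from "history of rome" returns the same two documents, so it can be removed safely.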

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.