Patent · US Active

Method and system for determining relevance of terms in text documents

US8321398B2 · kind B2 · utility

Cited by: 17
References: 12
Claims: 49
Family size: 0

Assignee

Inventors

Key dates

Filing date: Jul 1, 2009
Grant date: Nov 27, 2012
Priority date:
Expiry date: Apr 17, 2030

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06F16/334
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

The present invention provides a corpus-independent method for determining relevancy of terms to content of text appearing in a document by analyzing the document itself. Conventional information extraction, or other methods, may be applied to a document to generate a list of terms. The invention analyzes the document using relevancy scoring algorithms to determine a term relevancy score representing the term's relevance to the text contained in the document. The scores, including an aggregate score, may be normalized in the process. Based on relevancy scoring, terms are then ranked and further processed. In this manner relevancy is determined based on the subject document itself and by analyzing the occurrences and locations of the terms within the document. Additional techniques may be applied to relate the relevancy scores generated by the present invention to a corpus or collection of documents.
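
The general idea in the abstract (scoring each term from its occurrences and locations within the single document, normalizing the scores, then ranking) can be illustrated with a minimal sketch. This is not the patented scoring algorithm, which the abstract does not specify. The function name `score_terms`, the linear position weight, and the normalization by the top score are all assumptions chosen purely for illustration.

```python
import re


def score_terms(text, terms):
    """Rank terms by a toy relevance score computed from one document only.

    Hypothetical scheme (an assumption, not taken from the patent): each
    occurrence contributes a weight that falls linearly from 1.0 at the
    start of the document toward 0.5 at the end, so early mentions count
    more. Raw scores are then divided by the highest raw score.
    """
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    n = len(tokens)
    raw = {}
    for term in terms:
        words = re.findall(r"[a-z0-9]+", term.lower())
        total = 0.0
        k = len(words)
        if k > 0:
            # Slide a window over the tokens to find single- or multi-word terms.
            for i in range(n - k + 1):
                if tokens[i:i + k] == words:
                    total += 1.0 - 0.5 * (i / n)
        raw[term] = total

    # Normalize so the most relevant term scores 1.0 (all zeros if nothing matched).
    top = max(raw.values(), default=0.0)
    scores = {t: (raw[t] / top if top > 0 else 0.0) for t in terms}

    # Rank by descending score, breaking ties alphabetically.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


if __name__ == "__main__":
    doc = ("Solar panels convert light. Solar power is cheap. "
           "Wind is another source.")
    for term, score in score_terms(doc, ["solar", "wind", "coal"]):
        print(f"{term}: {score:.3f}")
```

No corpus statistics (such as inverse document frequency) are used here, which mirrors the corpus-independent framing of the abstract; relating the scores to a collection of documents would be a separate step.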

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.