Patent · US Active

Document analyzer and metadata generation and use

US7849081B1 · kind B1 · utility

Cited by: 33
References: 11
Claims: 26
Family size: 0

Assignee

Inventors

Key dates

Filing date: Nov 28, 2007
Grant date: Dec 7, 2010
Priority date
Expiry date: Jun 10, 2029

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06F16/355
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

A document analyzer receives a collection of text-based terms associated with a document. The document analyzer performs a statistical analysis on the text-based terms to identify a distribution of where the text-based terms appear in the document and a relative frequency indicating how often the text-based terms appear in the document. The document analyzer utilizes the distribution and relative frequency information derived from the statistical analysis to rank multiple themes associated with the document. For example, a received listing of multiple themes may not be presented in any useful order, although it can be assumed that the themes in the listing are present in the document. Based on application of the distribution and relative frequency information derived from the analysis, the document analyzer can identify which themes are most relevant to the document as a whole and/or which of the themes correspond to different portions (e.g., pages or sections) of the document.
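
The ranking idea in the abstract can be illustrated with a minimal Python sketch. It is not the patented method: the abstract does not state a scoring formula, so the product of relative frequency and page spread used here is an assumption, as are the function names (`tokenize`, `rank_themes`, `top_theme_per_page`), the use of pages as the unit of distribution, and the restriction to single-word terms.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def rank_themes(pages, themes):
    """Rank themes for the whole document.

    pages:  list of page strings (hypothetical unit of "distribution").
    themes: dict mapping a theme name to a list of single-word terms.
    Returns a list of (theme, score) pairs, highest score first.
    """
    page_counts = [Counter(tokenize(p)) for p in pages]
    total_tokens = sum(sum(c.values()) for c in page_counts) or 1
    scores = {}
    for theme, terms in themes.items():
        terms = [t.lower() for t in terms]
        hits_per_page = [sum(c[t] for t in terms) for c in page_counts]
        # Relative frequency: share of all tokens that belong to the theme.
        rel_freq = sum(hits_per_page) / total_tokens
        # Distribution: fraction of pages on which the theme appears at all.
        spread = sum(1 for h in hits_per_page if h > 0) / max(len(pages), 1)
        # Assumed combination rule; the source does not specify one.
        scores[theme] = rel_freq * spread
    # Sort by descending score, breaking ties alphabetically.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


def top_theme_per_page(pages, themes):
    """Return the theme with the most term hits on each page, or None."""
    result = []
    for page in pages:
        counts = Counter(tokenize(page))
        hits = {
            theme: sum(counts[t.lower()] for t in terms)
            for theme, terms in themes.items()
        }
        # Most hits wins; ties are broken alphabetically by theme name.
        best = min(hits, key=lambda th: (-hits[th], th)) if hits else None
        result.append(best if best is not None and hits[best] > 0 else None)
    return result
```

Calling `rank_themes(pages, themes)` orders an unordered theme listing by relevance to the document as a whole, while `top_theme_per_page(pages, themes)` gives one plausible way to associate themes with individual portions of the document.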

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.