Patent · US Active

Document analyzer and metadata generation

US8060506B1 · kind B1 · utility

Cited by: 14
References: 14
Claims: 20
Family size: 0

Assignee

Inventors

Key dates

Filing date: Nov 2, 2010
Grant date: Nov 15, 2011
Priority date
Expiry date: Nov 2, 2030

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06F16/355
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

A document analyzer receives a collection of text-based terms associated with a document. The document analyzer performs a statistical analysis on the text-based terms to identify a distribution of where the text-based terms appear in the document and a relative frequency indicating how often the text-based terms appear in the document. The document analyzer utilizes the distribution and relative frequency information derived from the statistical analysis to rank multiple themes associated with the document. For example, a received listing of multiple themes may not be presented in any useful order, although it can be assumed that the themes in the listing are present in the document. Based on application of distribution and relative frequency information derived from the analysis, the document analyzer can identify which themes are most relevant to the document as a whole and/or which of the themes correspond to different portions (e.g., pages or sections) of the document.
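
The ranking idea in the abstract can be illustrated with a minimal Python sketch. It is not the patented method: the abstract does not say how distribution and relative frequency are combined, so the scoring rule here (relative frequency multiplied by the share of sections in which a theme appears) is an assumption, as are the function names `tokenize` and `rank_themes`, the single-word terms, and the input layout (one string per page or section, plus a mapping from theme name to terms).

```python
from collections import Counter
import re


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def rank_themes(sections, themes):
    """Rank themes for a whole document and pick a top theme per section.

    sections: list of strings, one per page or section (assumed layout).
    themes:   dict mapping a theme name to a list of single-word terms.
    Returns (overall_ranking, per_section_best).
    """
    counts = [Counter(tokenize(s)) for s in sections]
    total_tokens = sum(sum(c.values()) for c in counts) or 1
    n_sections = len(counts) or 1

    overall = {}
    for theme, terms in themes.items():
        terms = [t.lower() for t in terms]
        # Relative frequency: share of all tokens that belong to the theme.
        hits = sum(c[t] for c in counts for t in terms)
        rel_freq = hits / total_tokens
        # Distribution: share of sections in which the theme appears at all.
        spread = sum(1 for c in counts if any(c[t] for t in terms)) / n_sections
        # Assumed scoring rule (not from the patent): frequency times spread.
        overall[theme] = rel_freq * spread

    ranking = sorted(overall, key=overall.get, reverse=True)

    per_section = []
    for c in counts:
        section_total = sum(c.values()) or 1
        scores = {
            theme: sum(c[t.lower()] for t in terms) / section_total
            for theme, terms in themes.items()
        }
        best = max(scores, key=scores.get) if scores else None
        # Report None when no theme term occurs in this section.
        per_section.append(best if best is not None and scores[best] > 0 else None)

    return ranking, per_section
```

Calling `rank_themes(sections, themes)` returns the themes ordered by the assumed document-level score, followed by a list naming the highest-scoring theme for each section (or `None` where no theme term occurs). A different weighting of frequency against spread would give a different ordering.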

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.