Document analyzer and metadata generation and use
US7849081B1 · kind B1 · utility
Assignee
Inventors
Key dates
| Filing date | Nov 28, 2007 |
| Grant date | Dec 7, 2010 |
| Priority date | — |
| Expiry date | Jun 10, 2029 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G06F16/355
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
A document analyzer receives a collection of text-based terms associated with a document. The document analyzer performs a statistical analysis on the text-based terms to identify a distribution of where the text-based terms appear in the document and a relative frequency indicating how often the text-based terms appear in the document. The document analyzer uses the distribution and relative frequency information derived from the statistical analysis to rank multiple themes associated with the document. For example, a received listing of multiple themes may not be presented in any useful order, although it can be assumed that the themes in the listing are present in the document. By applying the distribution and relative frequency information derived from the analysis, the document analyzer can identify which themes are most relevant to the document as a whole and/or which of the themes correspond to different portions (e.g., pages or sections) of the document.
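The ranking idea in the abstract can be illustrated with a minimal sketch. It is not the patented method: the abstract does not say how distribution and relative frequency are combined, so the scoring rule here (relative frequency multiplied by the fraction of pages on which a theme appears) is an assumption chosen for simplicity. The function names `tokenize`, `rank_themes`, and `themes_per_page` are hypothetical, themes are assumed to be single words, and the document is assumed to arrive already split into pages.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def rank_themes(pages, themes):
    """Order themes from most to least relevant to the whole document.

    Assumed scoring rule (not taken from the patent):
    score = relative frequency * fraction of pages containing the theme.
    """
    page_tokens = [tokenize(page) for page in pages]
    total_tokens = sum(len(tokens) for tokens in page_tokens) or 1
    page_count = len(pages) or 1
    scores = {}
    for theme in themes:
        term = theme.lower()
        per_page = [tokens.count(term) for tokens in page_tokens]
        relative_frequency = sum(per_page) / total_tokens
        spread = sum(1 for count in per_page if count > 0) / page_count
        scores[theme] = relative_frequency * spread
    return sorted(themes, key=lambda theme: scores[theme], reverse=True)


def themes_per_page(pages, themes):
    """For each page, list the themes present on it, most frequent first."""
    result = []
    for page in pages:
        tokens = tokenize(page)
        counts = {theme: tokens.count(theme.lower()) for theme in themes}
        present = [theme for theme in themes if counts[theme] > 0]
        result.append(sorted(present, key=lambda t: counts[t], reverse=True))
    return result


pages = [
    "solar panels and solar cells",
    "solar energy storage with batteries",
    "batteries and wind",
]
unordered_themes = ["wind", "batteries", "solar"]

ranked = rank_themes(pages, unordered_themes)
by_page = themes_per_page(pages, unordered_themes)
print(ranked)
print(by_page)
```

In this toy input the themes arrive in no useful order, and the sketch places "solar" first because it is both the most frequent term and present on two of the three pages. A real analyzer would likely need multi-word theme matching and a more careful weighting of distribution against frequency.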
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.