Patent · US Active

Methods and systems for the analysis of large text corpora

US9135242B1 · kind B1 · utility

Cited by: 21
References: 0
Claims: 20
Family size: 0

Assignee

Inventors

Key dates

Filing date: Mar 15, 2013
Grant date: Sep 15, 2015
Priority date
Expiry date: Nov 15, 2033

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06F40/30
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

Computerized methods and systems for the analysis of textual data, including: receiving, from one or more memories at one or more processors, textual data; using the processors, formatting the textual data for analysis and applying a probabilistic topic model to the textual data to extract semantically meaningful topics that collectively describe it; using a keyword weighting module, generating a topic cloud view representing the topics as a tagcloud with each being associated with a plurality of keywords; using a topic ordering module, generating a document distribution view representing a distribution of the textual data across multiple topics; using a document entropy calculation module, generating a document scatterplot view representing how many topics are attributable to the textual data; using a temporal topic trend calculation module, generating a temporal view representing changes in the occurrence of topics over time; and displaying one or more of the views to a user.
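
To make two of the modules named in the abstract more concrete, the following is a minimal sketch, not the patented implementation. The abstract does not state any formulas, so two assumptions are made here: that the document entropy calculation is the Shannon entropy of a document's topic distribution, and that the temporal topic trend is the mean topic weight per time period. The function names `document_entropy` and `temporal_topic_trend` and the input layout are hypothetical.

```python
import math
from collections import defaultdict


def document_entropy(topic_weights):
    """Shannon entropy (in bits) of one document's topic distribution.

    ASSUMPTION: the abstract does not define the entropy measure; Shannon
    entropy is used here for illustration. Low values suggest a document
    concentrated on few topics, high values a document spread over many.
    """
    total = sum(topic_weights)
    if total <= 0:
        raise ValueError("topic weights must sum to a positive value")
    entropy = 0.0
    for weight in topic_weights:
        if weight > 0:  # zero-weight topics contribute nothing
            p = weight / total  # normalize so the weights sum to 1
            entropy -= p * math.log2(p)
    return entropy


def temporal_topic_trend(dated_docs):
    """Mean topic weight per time period.

    ASSUMPTION: averaging per period is one simple way to track changes in
    topic occurrence over time; the source does not specify the method.
    dated_docs is an iterable of (period, topic_weights) pairs, where every
    topic_weights list has the same length.
    """
    sums = {}
    counts = defaultdict(int)
    for period, weights in dated_docs:
        if period not in sums:
            sums[period] = [0.0] * len(weights)
        for i, weight in enumerate(weights):
            sums[period][i] += weight
        counts[period] += 1
    return {
        period: [s / counts[period] for s in sums[period]]
        for period in sorted(sums)
    }


if __name__ == "__main__":
    # Hypothetical per-document topic distributions from some topic model.
    docs = [
        ("2013", [0.8, 0.2]),
        ("2013", [0.4, 0.6]),
        ("2014", [0.1, 0.9]),
    ]
    for period, weights in docs:
        print(period, round(document_entropy(weights), 3))
    print(temporal_topic_trend(docs))
```

In a scatterplot view of the kind the abstract describes, the entropy value could serve as one axis, so that single-topic documents and mixed-topic documents separate visually; the per-period averages could feed a line chart for the temporal view.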

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.