Patent · US Active

Efficient lexical trending topic detection over streams of data using a modified sequitur algorithm

US8838599B2 · kind B2 · utility

Cited by: 14
References: 2
Claims: 19
Family size: 0

Assignee

Inventors

Key dates

Filing date: May 14, 2010
Grant date: Sep 16, 2014
Priority date
Expiry date: Nov 16, 2031

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06F16/313
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

Embodiments are directed towards a Modified Sequitur algorithm (MSA) that uses pipelining and indexed arrays to identify trending topics within a plurality of documents having user generated content (UGC). The documents are parallelized and distributed across a plurality of network devices, which place at least some of the received documents into a buffer. The MSA may then be applied to the documents within the buffer to identify n-grams or phrases within the documents' contents. The identified phrases are further analyzed to remove extraneous co-occurrences of phrases and/or to remove words based on a part-of-speech analysis. A weighting of the remaining phrases is used to identify trending topic phrases. Links to content in the plurality of UGC documents that is associated with the trending topic phrases may then be displayed to a client device.
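
The stages named in the abstract (buffer the documents, find repeated phrases, prune extraneous ones, weight what remains) can be illustrated with a minimal Python sketch. It is not the patented Modified Sequitur algorithm: plain n-gram counting stands in for the MSA, a small stopword list stands in for part-of-speech analysis, and the weighting (count times phrase length) is an assumption made only for illustration. All function names and parameters here are hypothetical.

```python
from collections import Counter

# Assumption: a tiny stopword list stands in for part-of-speech analysis.
STOPWORDS = {"the", "a", "an", "of", "in", "on", "to", "is", "and", "at"}


def ngrams(tokens, n):
    """Return every contiguous n-token window as a tuple."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def repeated_phrases(buffer, max_n=3, min_count=2):
    """Count n-grams (n = 1..max_n) over the buffered documents.

    Plain counting is a simplification; the patent applies a modified
    Sequitur algorithm at this step.
    """
    counts = Counter()
    for doc in buffer:
        tokens = doc.lower().split()
        for n in range(1, max_n + 1):
            counts.update(ngrams(tokens, n))
    return {p: c for p, c in counts.items() if c >= min_count}


def contains(longer, shorter):
    """True if `shorter` appears as a contiguous run inside `longer`."""
    k = len(shorter)
    return any(longer[i:i + k] == shorter for i in range(len(longer) - k + 1))


def drop_subsumed(phrases):
    """Drop a phrase when a longer phrase containing it occurs equally often."""
    return {
        p: c
        for p, c in phrases.items()
        if not any(
            len(q) > len(p) and cq == c and contains(q, p)
            for q, cq in phrases.items()
        )
    }


def drop_stopword_edges(phrases):
    """Drop phrases that start or end with a stopword."""
    return {
        p: c
        for p, c in phrases.items()
        if p[0] not in STOPWORDS and p[-1] not in STOPWORDS
    }


def trending(buffer, top_k=3):
    """Rank surviving phrases by an assumed weight: count * phrase length."""
    phrases = drop_stopword_edges(drop_subsumed(repeated_phrases(buffer)))
    ranked = sorted(phrases.items(), key=lambda kv: (-kv[1] * len(kv[0]), kv[0]))
    return [" ".join(p) for p, _ in ranked[:top_k]]


buffer = [
    "world cup final tonight",
    "watching the world cup final",
    "world cup final was great",
]
print(trending(buffer))
```

In this toy buffer the shorter phrases ("world", "world cup", and so on) occur exactly as often as "world cup final", so the pruning step keeps only the longest phrase.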

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.