Patent · US Active

Streaming text data mining method and apparatus using multidimensional subspaces

US8234279B2 · kind B2 · utility

Cited by: 3
References: 8
Claims: 31
Family size: 0

Assignee

Inventors

Key dates

Filing date: Oct 11, 2005
Grant date: Jul 31, 2012
Priority date
Expiry date: Apr 23, 2027

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06F16/313
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

A streaming text data comparator performs real-time text data mining on streaming text data. The comparator receives a streaming text data document and generates a vector representation of the term frequencies relating to an existing document collection. The comparator then transforms the term frequency vector into a projection in a precomputed multidimensional subspace that represents the original document collection. The comparator further calculates a relationship value representing the similarities or differences between the vector representation and the subspace, and compares the relationship value to a predetermined threshold to determine whether the streaming text data document is related to the original document collection. If the streaming text data document is related, the streaming text data comparator intercalates the new document into the document collection. If the new document is not related, the comparator may store or delete the unrelated document.
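
The pipeline described in the abstract (term frequency vector, projection into a precomputed subspace, relationship value, threshold test) can be illustrated with a minimal sketch. This is not the patented implementation. The abstract does not say how the relationship value is computed, so the cosine between the vector and its projection is used here as an assumption. The vocabulary, the orthonormal basis, the threshold, and all function names are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter

def term_frequency_vector(text, vocabulary):
    """Count how often each vocabulary term occurs in the text."""
    counts = Counter(text.lower().split())
    return [float(counts[term]) for term in vocabulary]

def project(vector, basis):
    """Project a vector onto the subspace spanned by orthonormal basis rows."""
    coords = [sum(b_i * v_i for b_i, v_i in zip(b, vector)) for b in basis]
    projection = [0.0] * len(vector)
    for coord, b in zip(coords, basis):
        for i, b_i in enumerate(b):
            projection[i] += coord * b_i
    return projection

def relationship_value(vector, projection):
    """Cosine between the vector and its projection (assumed measure)."""
    norm_v = math.sqrt(sum(x * x for x in vector))
    norm_p = math.sqrt(sum(x * x for x in projection))
    if norm_v == 0.0 or norm_p == 0.0:
        return 0.0  # nothing of the document lies in the subspace
    dot = sum(x * y for x, y in zip(vector, projection))
    return dot / (norm_v * norm_p)

def is_related(text, vocabulary, basis, threshold):
    """Return True if the document is close enough to the subspace."""
    vector = term_frequency_vector(text, vocabulary)
    return relationship_value(vector, project(vector, basis)) >= threshold

# Hypothetical setup: a four-term vocabulary and a two-dimensional subspace
# standing in for one precomputed from an existing document collection.
VOCABULARY = ["engine", "wing", "fuel", "recipe"]
BASIS = [[1.0, 0.0, 0.0, 0.0],
         [0.0, 1.0, 0.0, 0.0]]
THRESHOLD = 0.5  # assumed value; the abstract only says "predetermined"

related = is_related("engine wing engine", VOCABULARY, BASIS, THRESHOLD)
unrelated = is_related("recipe recipe fuel", VOCABULARY, BASIS, THRESHOLD)
```

In this toy setup the first document lies entirely inside the subspace and would be added to the collection, while the second has no component in it and would be stored or deleted. A real subspace would typically come from a decomposition of the collection's term-document matrix, which the sketch does not attempt.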

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.