Patent · US Active

Methods, apparatus, systems and computer readable media for use in keyword extraction

US9384287B2 · kind B2 · utility

Cited by: 1
References: 13
Claims: 17
Family size: 0

Assignee

Inventors

Key dates

Filing date: Jan 15, 2014
Grant date: Jul 5, 2016
Priority date
Expiry date: Mar 23, 2035

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06F16/2468
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

In one embodiment, a method includes: receiving data representing a plurality of corpora, each of the plurality of corpora including a set of documents; receiving data representing terms that appear in the corpora; for each one of the terms, determining a plurality of inverse document frequency values each associated with a respective one of the plurality of corpora; receiving data representing a subset of the terms that also appear in a document; for each term in the subset, determining a term frequency for the term in the document; and for each term in the subset, determining an augmented term frequency-inverse document frequency value based on: (i) the term frequency, and (ii) the plurality of inverse document frequency values that were determined for the term in the subset.
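
The steps in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the abstract does not say how the per-corpus inverse document frequency values are combined, so the arithmetic mean used here is an assumption, as are the smoothed IDF formula, the raw-count term frequency, the function names idf_per_corpus and augmented_tfidf, and the toy corpora.

```python
import math
from collections import Counter
from statistics import mean


def idf_per_corpus(corpora):
    """Map each term seen in any corpus to a list of IDF values, one per corpus."""
    vocabulary = {term for corpus in corpora for doc in corpus for term in doc}
    table = {}
    for term in vocabulary:
        values = []
        for corpus in corpora:
            doc_freq = sum(1 for doc in corpus if term in doc)
            # Smoothed IDF (an assumption): stays finite when a term is
            # absent from one of the corpora.
            values.append(math.log((1 + len(corpus)) / (1 + doc_freq)) + 1.0)
        table[term] = values
    return table


def augmented_tfidf(document, corpora, combine=mean):
    """Score each document term that also appears in the corpora.

    `combine` reduces the per-corpus IDF values to one number; the mean is
    a placeholder choice, not something the abstract specifies.
    """
    idfs = idf_per_corpus(corpora)
    # Term frequency as a raw count, restricted to terms known to the corpora.
    counts = Counter(term for term in document if term in idfs)
    return {term: count * combine(idfs[term]) for term, count in counts.items()}


# Hypothetical toy data: each corpus is a list of tokenized documents.
news = [["market", "stocks", "rally"], ["market", "election"]]
tech = [["python", "market"], ["python", "compiler"], ["compiler", "rally"]]
scores = augmented_tfidf(["market", "python", "python", "unseen"], [news, tech])
```

In this sketch a term that occurs in the document but in none of the corpora (here "unseen") receives no score, which mirrors the abstract's restriction to the subset of corpus terms that also appear in the document. Passing a different `combine` function, such as `max` or a weighted sum, changes how the several IDF values influence the result.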

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.