Patent · US Active

Automatic artist and content breakout prediction

US10366334B2 · kind B2 · utility

Cited by: 16
References: 7
Claims: 9
Family size: 0

Assignee

Inventors

Key dates

Filing date: Jul 21, 2016
Grant date: Jul 30, 2019
Priority date
Expiry date: Jan 28, 2037

Classification

  • Technology area (CPC H): Electricity
  • CPC primary: H04L67/06
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

Methods, systems and computer program products for clustering pages into headline clusters are provided by collecting web data, identifying pages from the web data, tokenizing unique words in each page, recognizing unique entities in each page, detecting media links in each page, and constructing a plurality of vector representations of each page. A first dimension of each vector representation includes the unique words tokenized in each page, a second dimension of each vector representation includes the unique entities recognized in each page, and a third dimension of each vector representation includes the media links detected in each page. The vector representations are, in turn, clustered.
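
The pipeline the abstract describes (tokenize words, recognize entities, detect media links, build a three-dimension representation per page, then cluster) can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration and is not taken from the patent: the regex-based "entity recognizer" (runs of capitalized words), the definition of a media link as a URL ending in a media file extension, the use of sets and a weighted Jaccard similarity for each dimension, the weights, the threshold, and the greedy single-pass clustering. The abstract does not specify any of these choices.

```python
import re
from dataclasses import dataclass

WORD_RE = re.compile(r"[a-z0-9']+")
# Assumption: a naive stand-in for entity recognition (runs of capitalized words).
ENTITY_RE = re.compile(r"\b[A-Z][a-z]+(?:\s+[A-Z][a-z]+)*\b")
# Assumption: a "media link" is a URL ending in a common media file extension.
MEDIA_RE = re.compile(r"https?://\S+\.(?:mp3|mp4|wav|jpg|png)\b", re.IGNORECASE)
URL_RE = re.compile(r"https?://\S+")


@dataclass(frozen=True)
class PageVector:
    """One page, with the three dimensions named in the abstract."""
    words: frozenset        # first dimension: unique words
    entities: frozenset     # second dimension: unique entities
    media_links: frozenset  # third dimension: media links


def vectorize(page_text):
    """Build the three-dimension representation of a single page."""
    media = frozenset(MEDIA_RE.findall(page_text))
    text = URL_RE.sub(" ", page_text)  # drop URLs so they are not tokenized as words
    words = frozenset(WORD_RE.findall(text.lower()))
    entities = frozenset(ENTITY_RE.findall(text))
    return PageVector(words, entities, media)


def jaccard(a, b):
    """Jaccard similarity of two sets; two empty sets count as 0.0."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def similarity(p, q, weights=(0.4, 0.4, 0.2)):
    """Weighted sum of per-dimension Jaccard similarities (hypothetical weights)."""
    w_words, w_entities, w_media = weights
    return (w_words * jaccard(p.words, q.words)
            + w_entities * jaccard(p.entities, q.entities)
            + w_media * jaccard(p.media_links, q.media_links))


def cluster(pages, threshold=0.3):
    """Greedy single-pass clustering: a page joins the first cluster whose
    first member is similar enough, otherwise it starts a new cluster."""
    vectors = [vectorize(p) for p in pages]
    clusters = []  # each cluster is a list of page indices
    for i, vec in enumerate(vectors):
        for members in clusters:
            if similarity(vec, vectors[members[0]]) >= threshold:
                members.append(i)
                break
        else:
            clusters.append([i])
    return clusters
```

For example, calling `cluster` on two pages that mention the same capitalized name and share a media URL, plus one unrelated page, groups the first two together and leaves the third on its own. A production system would likely replace the regex heuristics with a real tokenizer and named-entity recognizer and use a more robust clustering method.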

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.