Automatic artist and content breakout prediction
US10366334B2 · kind B2 · utility
Assignee
Inventors
Key dates
| Filing date | Jul 21, 2016 |
| Grant date | Jul 30, 2019 |
| Priority date | — |
| Expiry date | Jan 28, 2037 |
Classification
- Technology area (CPC H): Electricity
- CPC primary: H04L67/06
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
Methods, systems and computer program products for clustering pages into headline clusters are provided by collecting web data, identifying pages from the web data, tokenizing unique words in each page, recognizing unique entities in each page, detecting media links in each page, and constructing a plurality of vector representations of each page. A first dimension of each vector representation includes the unique words tokenized in each page, a second dimension of each vector representation includes the unique entities recognized in each page, and a third dimension of each vector representation includes the media links detected in each page. The vector representations are, in turn, clustered.
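The steps named in the abstract (tokenize unique words, recognize unique entities, detect media links, build a three-dimension representation per page, then cluster) can be sketched roughly as follows. This is an illustrative sketch only, not the patented method. The abstract does not say how entities are recognized, how media links are detected, or which clustering algorithm is used, so everything specific here is an assumption: the regex heuristics, the hypothetical names `page_vector`, `similarity`, and `cluster_pages`, the use of sets in place of numeric vectors, the averaged Jaccard similarity, the greedy single-pass clustering, and the `threshold` value.

```python
import re

# Assumed heuristics; the abstract does not specify any of these.
MEDIA_RE = re.compile(r"https?://\S+\.(?:jpg|jpeg|png|gif|mp3|mp4)\b", re.IGNORECASE)
URL_RE = re.compile(r"https?://\S+")
ENTITY_RE = re.compile(r"\b[A-Z][a-z]+(?:\s+[A-Z][a-z]+)+\b")  # runs of capitalized words
WORD_RE = re.compile(r"[a-z0-9']+")


def page_vector(text):
    """Return (unique words, unique entities, media links) for one page."""
    media = frozenset(MEDIA_RE.findall(text))
    prose = URL_RE.sub(" ", text)  # keep URLs out of the word and entity sets
    entities = frozenset(ENTITY_RE.findall(prose))
    words = frozenset(WORD_RE.findall(prose.lower()))
    return (words, entities, media)


def jaccard(a, b):
    """Jaccard similarity of two sets; two empty sets count as 0.0."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def similarity(u, v):
    """Average the per-dimension Jaccard similarity (an assumed weighting)."""
    return sum(jaccard(a, b) for a, b in zip(u, v)) / len(u)


def cluster_pages(pages, threshold=0.3):
    """Greedy single-pass clustering: join the first cluster whose seed page
    is similar enough, otherwise start a new cluster."""
    vectors = [page_vector(p) for p in pages]
    clusters = []  # each cluster is a list of page indices; index 0 is the seed
    for i, vec in enumerate(vectors):
        for cluster in clusters:
            if similarity(vec, vectors[cluster[0]]) >= threshold:
                cluster.append(i)
                break
        else:
            clusters.append([i])
    return clusters


# Made-up example pages, for illustration only.
pages = [
    "Nova Kline releases new single https://example.com/nova.mp3",
    "New single from Nova Kline out now https://example.com/nova.mp3",
    "City council approves budget for road repairs",
]
clusters = cluster_pages(pages)
print(clusters)
```

In this toy input the first two pages share words, an entity, and a media link, so they land in the same cluster, while the third page starts its own. A production system would likely swap in a real named-entity recognizer and a weighted numeric vector space; the set-based version only keeps the three dimensions visible.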
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.