System and methods for URL entity extraction
US9286378B1 · kind B1 · utility
Assignee
Inventors
Key dates
| Filing date | Aug 31, 2012 |
| Grant date | Mar 15, 2016 |
| Priority date | — |
| Expiry date | Jan 19, 2034 |
Classification
- Technology area (CPC H): Electricity
- CPC primary: H04L63/1441
- WIPO field: Digital communication
- WIPO sector: Electrical engineering
Abstract
Entities are identified from uniform resource locators (URLs) so that links to spam can be detected within a social networking system. Identifiers are extracted from the URLs, including at least one parent entity identifier representing a parent entity and at least one child entity identifier representing a child entity. An identifier sequence that includes the at least one parent entity identifier is designated as an attributable sequence when the child entity accounts for a value of the traffic received by the parent entity that does not satisfy a threshold. In an embodiment, the child entity may be identified as an actionable target. The URLs may be classified within a hierarchical structure based on the identifiers. The hierarchical structure may comprise at least one parent node representing the parent entity and at least one child node representing the child entity, and may be a tree.
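The flow described in the abstract can be illustrated with a minimal Python sketch. It is not the patented method; it rests on several assumptions that the abstract does not state: identifiers are taken to be the last two host labels (treated as the parent domain), then subdomains, then path segments; "traffic" is approximated by counting how many URLs pass through each identifier sequence; and "does not satisfy a threshold" is read as "every child's share of the parent's traffic is below the threshold". The function names `extract_identifiers`, `build_counts`, and `attributable_sequences`, and the default threshold of 0.5, are hypothetical.

```python
from collections import Counter
from urllib.parse import urlsplit


def extract_identifiers(url):
    """Return identifiers ordered from parent entity to child entity.

    Assumption: the last two host labels form the parent domain. A real
    system would likely consult a public-suffix list instead.
    """
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    labels = host.split(".") if host else []
    domain = ".".join(labels[-2:])
    subdomains = list(reversed(labels[:-2]))  # nearest to the domain first
    segments = [s for s in parts.path.split("/") if s]
    return [x for x in [domain] + subdomains + segments if x]


def build_counts(urls):
    """Count URLs per identifier sequence.

    Each key is a tuple that acts as a path from the root of a tree,
    so the Counter is a flat encoding of the hierarchical structure.
    """
    counts = Counter()
    for url in urls:
        ids = extract_identifiers(url)
        for depth in range(1, len(ids) + 1):
            counts[tuple(ids[:depth])] += 1
    return counts


def attributable_sequences(counts, threshold=0.5):
    """Return sequences where no single child reaches the traffic threshold.

    Assumption: a child's "value of traffic" is its share of the parent's
    URL count, and the sequence is attributable when that share stays
    below `threshold` for every child.
    """
    result = []
    for seq, total in counts.items():
        children = [
            c for s, c in counts.items()
            if len(s) == len(seq) + 1 and s[:len(seq)] == seq
        ]
        if children and max(children) / total < threshold:
            result.append(seq)
    return sorted(result)


if __name__ == "__main__":
    urls = [
        "http://example.com/a/1",
        "http://example.com/b/2",
        "http://example.com/c/3",
        "http://spam.test/x/1",
        "http://spam.test/x/2",
        "http://spam.test/x/3",
    ]
    for seq in attributable_sequences(build_counts(urls)):
        print("/".join(seq))
```

In this sample, `example.com` spreads its traffic evenly across three children, so the sequence stops at the domain, whereas all traffic to `spam.test` flows through `x`, so the sequence extends one level deeper. Under the same assumptions, a child that does reach the threshold would be the candidate "actionable target".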
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.