Identifying duplicate electronic content based on metadata
US8266115B1 · kind B1 · utility
Assignee
Inventors
Key dates
| Filing date | Jan 14, 2011 |
| Grant date | Sep 11, 2012 |
| Priority date | — |
| Expiry date | Jan 14, 2031 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G06F16/951
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for de-duplicating electronic content based on comparing metadata. In one aspect, a method includes comparing first metadata associated with a first item of electronic content to second metadata associated with a second item of electronic content, and generating a score based on the comparison. The method also includes establishing that the first and second items of electronic content comprise potentially duplicate content when the score is greater than a predetermined threshold value, and providing information identifying either the first or second items of electronic content for display when establishing that the first and second items of electronic content comprise potentially duplicate content.
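The flow the abstract describes (compare two metadata records, generate a score, flag the pair when the score exceeds a threshold, then identify one item for display) can be illustrated with a minimal Python sketch. Nothing below is taken from the patent's claims or description: the metadata fields (`title`, `author`, `duration_seconds`), the point weights, the two-second duration tolerance, and the threshold of 70 are all hypothetical choices made for illustration only.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Metadata:
    # Hypothetical metadata fields; the abstract does not name specific ones.
    title: str
    author: str
    duration_seconds: int


def _normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial differences are ignored."""
    return " ".join(text.lower().split())


def similarity_score(first: Metadata, second: Metadata) -> int:
    """Return an integer score from 0 to 100 based on matching fields.

    The weights (50/30/20) are assumptions for this sketch.
    """
    score = 0
    if _normalize(first.title) == _normalize(second.title):
        score += 50
    if _normalize(first.author) == _normalize(second.author):
        score += 30
    if abs(first.duration_seconds - second.duration_seconds) <= 2:
        score += 20
    return score


THRESHOLD = 70  # assumed value; the abstract only says "predetermined"


def potentially_duplicate(first: Metadata, second: Metadata,
                          threshold: int = THRESHOLD) -> bool:
    """Flag the pair when the score is strictly greater than the threshold."""
    return similarity_score(first, second) > threshold


def item_to_display(first: Metadata, second: Metadata) -> Optional[Metadata]:
    """Return one item of a flagged pair for display, or None if not flagged.

    Choosing the first item is arbitrary here.
    """
    return first if potentially_duplicate(first, second) else None


a = Metadata("Song A", "Artist", 200)
b = Metadata("  song a", "ARTIST", 201)
c = Metadata("Song A", "Other", 200)
```

In this sketch `a` and `b` match on all three fields and are flagged, while `a` and `c` score exactly 70 and are not, because the comparison is strict ("greater than"), mirroring the abstract's wording.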
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.