Patent · US Active

Identifying duplicate electronic content based on metadata

US8266115B1 · kind B1 · utility

Cited by: 68
References: 4
Claims: 12
Family size: 0

Assignee

Inventors

Key dates

Filing date: Jan 14, 2011
Grant date: Sep 11, 2012
Priority date
Expiry date: Jan 14, 2031

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06F16/951
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for de-duplicating electronic content based on comparing metadata. In one aspect, a method includes comparing first metadata associated with a first item of electronic content to second metadata associated with a second item of electronic content, and generating a score based on the comparison. The method also includes establishing that the first and second items of electronic content comprise potentially duplicate content when the score is greater than a predetermined threshold value, and providing information identifying either the first or second items of electronic content for display when establishing that the first and second items of electronic content comprise potentially duplicate content.
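The flow described in the abstract (compare metadata, generate a score, test it against a threshold, identify one item for display) can be illustrated with a minimal sketch. The abstract does not say how the score is computed, so the scoring rule here (fraction of metadata fields with matching normalized values), the threshold of 0.75, and all names such as ContentItem, metadata_score, and find_potential_duplicate are assumptions made for illustration, not the patented method.

```python
from dataclasses import dataclass, field


@dataclass
class ContentItem:
    """Hypothetical item of electronic content with free-form metadata."""
    item_id: str
    metadata: dict = field(default_factory=dict)


def _normalize(value):
    # Assumed normalization: compare values case-insensitively, ignoring outer whitespace.
    return str(value).strip().lower()


def metadata_score(first: dict, second: dict) -> float:
    """Assumed scoring rule: share of all observed fields whose values match."""
    keys = set(first) | set(second)
    if not keys:
        return 0.0
    matches = sum(
        1
        for key in keys
        if key in first and key in second
        and _normalize(first[key]) == _normalize(second[key])
    )
    return matches / len(keys)


def find_potential_duplicate(a: ContentItem, b: ContentItem, threshold: float = 0.75):
    """Return (item_id to display, score) when the score is strictly greater
    than the threshold, else (None, score)."""
    score = metadata_score(a.metadata, b.metadata)
    if score > threshold:
        # The abstract says either item may be identified; this sketch picks the first.
        return a.item_id, score
    return None, score


first = ContentItem("a1", {"title": "Song A", "artist": "X", "duration": 200, "album": "Y"})
second = ContentItem("b7", {"title": " song a", "artist": "x", "duration": 200, "album": "Y"})
flagged_id, score = find_potential_duplicate(first, second)
```

Note that the comparison is strict: following the abstract's wording ("greater than a predetermined threshold value"), a score exactly equal to the threshold does not flag the pair.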

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.