Patent · US Active

Identifying duplicate electronic content based on metadata

US8280861B1 · kind B1 · utility

Cited by: 16
References: 4
Claims: 18
Family size: 0

Assignee

Inventors

Key dates

Filing date: Sep 30, 2011
Grant date: Oct 2, 2012
Priority date
Expiry date: Sep 30, 2031

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06F16/951
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for de-duplicating electronic content based on comparing metadata. In one aspect, a method includes comparing first metadata associated with a first item of electronic content to second metadata associated with a second item of electronic content, and generating a score based on the comparison. The method also includes establishing that the first and second items of electronic content comprise potentially duplicate content when the score is greater than a predetermined threshold value, and providing information identifying either the first or second items of electronic content for display when establishing that the first and second items of electronic content comprise potentially duplicate content.
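
The flow described in the abstract (compare metadata, generate a score, test it against a threshold, then identify one item for display) can be illustrated with a minimal sketch. Everything concrete here is an assumption for illustration only: the abstract does not say how the score is computed, so this sketch uses the fraction of shared metadata fields whose values match, and the field names, the `id` key, and the 0.5 threshold are hypothetical.

```python
from typing import Mapping, Optional


def metadata_score(first: Mapping[str, str], second: Mapping[str, str]) -> float:
    """Score two metadata records as the fraction of shared fields that match.

    Assumption: this scoring rule is illustrative; the abstract only says a
    score is generated from the comparison.
    """
    shared = set(first) & set(second)
    if not shared:
        return 0.0
    matches = sum(
        1
        for key in shared
        # Compare case-insensitively and ignore surrounding whitespace.
        if first[key].strip().lower() == second[key].strip().lower()
    )
    return matches / len(shared)


def is_potential_duplicate(
    first: Mapping[str, str],
    second: Mapping[str, str],
    threshold: float = 0.5,  # hypothetical predetermined threshold
) -> bool:
    """Flag the pair when the score is greater than the threshold."""
    return metadata_score(first, second) > threshold


def item_to_display(
    first_id: str,
    first: Mapping[str, str],
    second_id: str,
    second: Mapping[str, str],
    threshold: float = 0.5,
) -> Optional[str]:
    """Return the identifier of one item when the pair is flagged, else None.

    Assumption: returning the first item's identifier is an arbitrary choice;
    the abstract says only that either item is identified for display.
    """
    if is_potential_duplicate(first, second, threshold):
        return first_id
    return None


# Hypothetical sample records: title and artist match, year differs.
track_a = {"title": "Abbey Road", "artist": "The Beatles", "year": "1969"}
track_b = {"title": "abbey road ", "artist": "The Beatles", "year": "2009"}
flagged = item_to_display("item-1", track_a, "item-2", track_b)
```

With these sample records, two of the three shared fields match, so the score exceeds the assumed threshold and `flagged` holds the first item's identifier. A real system would likely weight fields differently and use fuzzy matching rather than normalized equality.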

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.