Apparatus and method to sequentially deduplicate groups of files comprising the same file name but different file version numbers
US8719240B2 · kind B2 · utility
Assignee
Inventors
Key dates
| Filing date | Jun 19, 2009 |
| Grant date | May 6, 2014 |
| Priority date | — |
| Expiry date | Mar 7, 2031 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G06F16/9014
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
A method for sequentially deduplicating data. The method receives a plurality of computer files, each comprising a label that includes a file name, a file type, a version number, and a file size, and stores the plurality of computer files in a deduplication queue. The method then identifies a subset of the plurality of computer files, wherein each file of the subset comprises the same file name but a different version number, the subset comprises a maximum count of version numbers, and the subset comprises a portion of the plurality of computer files. The method deduplicates the subset using a hash algorithm and removes the subset from the deduplication queue. During the deduplicating, the method receives new computer files comprising the same file name and stores those new computer files in the deduplication queue, but does not add them to the subset.
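The flow described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. The names `FileRecord`, `select_subset`, `deduplicate_subset`, and the `on_step` callback are hypothetical. Whole-file SHA-256 hashing is an assumption (the abstract says only "a hash algorithm"), and the "maximum count of version numbers" is read here as a simple cap on subset size.

```python
import hashlib
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class FileRecord:
    """Hypothetical label plus payload: file name, type, version number, content."""
    name: str
    file_type: str
    version: int
    data: bytes

    @property
    def size(self) -> int:
        return len(self.data)


def select_subset(queue, name, max_versions):
    """Snapshot up to max_versions queued files that share `name`,
    taking at most one file per version number."""
    subset, seen_versions = [], set()
    for rec in queue:
        if rec.name == name and rec.version not in seen_versions:
            subset.append(rec)
            seen_versions.add(rec.version)
            if len(subset) == max_versions:
                break
    return subset


def deduplicate_subset(queue, subset, store, on_step=None):
    """Hash every file in the frozen subset, keep each unique payload once
    in `store`, then remove the subset from the queue.

    `on_step` is an assumed hook that simulates files arriving during
    deduplication: it may append to `queue`, but `subset` is never extended.
    """
    refs = {}
    for rec in subset:
        digest = hashlib.sha256(rec.data).hexdigest()
        store.setdefault(digest, rec.data)  # identical content is stored once
        refs[rec.version] = digest
        if on_step is not None:
            on_step(rec)
    for rec in subset:
        queue.remove(rec)  # late arrivals stay queued for a later pass
    return refs
```

In this sketch, a file with the same name that is appended to the queue while `deduplicate_subset` is running remains in the queue afterwards and is only picked up by a later call to `select_subset`, which mirrors the abstract's statement that new files are queued but not added to the subset.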
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.