Method and system for efficiently handling small files in a single instance storage data store
US8572055B1 · kind B1 · utility
Assignee
Inventors
Key dates
| Filing date | Jun 30, 2008 |
| Grant date | Oct 29, 2013 |
| Priority date | — |
| Expiry date | Feb 23, 2031 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G06F16/137
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
A method, system and apparatus for efficient storage of small files in a segment-based deduplication scheme by allocating multiple small files to a single data segment is provided. A mechanism for distinguishing between large files (e.g., files that are on the order of the size of a segment or larger) and smaller files, and starting a new segment at the beginning of a large file is also provided. A file attribute-based system for determining an identity of a small file at which to begin a new segment and then allocating subsequent small files to that segment and contiguous segments until a next small file having an appropriate attribute subsequently is encountered to begin a new segment is further provided. In one aspect of the present invention a filename hash is used for file attribute analysis to determine when a new segment should begin. Using such a mechanism, multiple small files can be allocated to a data segment and at the same time continue to provide for efficient storage of large files within separate data segments. The file attribute analysis further provides for an increase in deduplication rate for subsequently provided copies of the small files (e.g., in a backup) si…
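The segment-boundary rule described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the segment size, the use of SHA-1, the modulus test on the filename hash, and all function names (`filename_hash`, `starts_new_segment`, `group_files`) are assumptions chosen for illustration. The sketch treats a file as large when its size is at least one segment, starts a new segment at every large file, and starts a new segment at a small file only when its filename hash meets the assumed criterion; all other small files join the current run of contiguous segments.

```python
import hashlib

# Assumed values for illustration only; the abstract does not specify them.
SEGMENT_SIZE = 128 * 1024   # segment size in bytes
BOUNDARY_MODULUS = 4        # roughly 1 in 4 small files begins a new segment


def filename_hash(name: str) -> int:
    """Map a filename to a stable integer (SHA-1 is an assumed choice)."""
    digest = hashlib.sha1(name.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def is_large(size: int) -> bool:
    """Treat files at least one segment in size as large (assumed threshold)."""
    return size >= SEGMENT_SIZE


def starts_new_segment(name: str, size: int) -> bool:
    """A large file always begins a new segment; a small file does so only
    when its filename hash satisfies the assumed modulus criterion."""
    if is_large(size):
        return True
    return filename_hash(name) % BOUNDARY_MODULUS == 0


def group_files(files):
    """Split an ordered list of (name, size) pairs into runs.

    Each run begins at a segment boundary and would be written to one or
    more contiguous segments. Input order is preserved.
    """
    runs = []
    for name, size in files:
        if not runs or starts_new_segment(name, size):
            runs.append([])
        runs[-1].append((name, size))
    return runs


if __name__ == "__main__":
    backup = [
        ("notes.txt", 900),
        ("disk.img", 10 * SEGMENT_SIZE),
        ("a.cfg", 120),
        ("b.cfg", 340),
    ]
    for run in group_files(backup):
        print([name for name, _ in run])
```

Because the boundary decision depends only on a file's own name and size, not on its position in the stream, a later backup containing the same small files would tend to cut segments at the same files, which is the property the abstract ties to a higher deduplication rate.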
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.