Patent · US Active

Hash-based duplicate data element systems and methods

US11789916B2 · kind B2 · utility

Cited by	0

References	0

Claims	18

Family size	0

Assignee

Bank of America Corporation · US

Inventors

Linda Haddad · San Francisco, US
Casey Andrew Augustine · Charlotte, US
Katherine Jameson · New York, US
Lauren K. Alleman · Alameda, US
Neha Joshi · Chicago, US

Key dates

Filing date	Dec 14, 2021
Grant date	Oct 17, 2023
Priority date	—
Expiry date	Jan 21, 2042

Classification

Technology area (CPC G)	Physics
CPC primary	G06F16/93
WIPO field	Computer technology
WIPO sector	Electrical engineering

Abstract

A method for reducing the storage of duplicate documents is provided. Methods may include hashing each document stored in a centralized data repository by executing a hashing algorithm on the document, outputting a hash-value, and adding the hash-value and a hash pointer to a hash table. Methods may further include crawling the hash table to identify duplicate hash-values. For each hash-value recorded on the hash table two or more times, methods may include combining the duplicate hash-values into a cluster and, for each cluster, identifying on the hash table a unique hash-value. For the unique hash-value, methods may include maintaining the unique hash-value on the hash table and maintaining the document corresponding to the unique hash-value at its memory address. For each remaining duplicate hash-value in the cluster, methods may include deleting the corresponding document from its memory address and storing a reference pointer at that memory address.
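
The flow in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. It makes several assumptions that the abstract does not state: the repository is modeled as a dictionary from a memory address (a string) to document bytes, SHA-256 stands in for the unspecified hashing algorithm, the first address seen in each cluster is the copy that is kept, and the reference pointer points to that retained copy. The names `Ref`, `build_hash_table`, and `deduplicate` are hypothetical.

```python
import hashlib
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Ref:
    """Hypothetical reference pointer left where a duplicate was deleted."""
    target: str  # assumed: address of the retained copy


def build_hash_table(repo):
    """Hash every stored document and record (hash-value, address) pairs."""
    table = []
    for address, doc in repo.items():
        if isinstance(doc, bytes):  # skip slots that already hold a Ref
            # SHA-256 is an assumption; the abstract names no algorithm.
            table.append((hashlib.sha256(doc).hexdigest(), address))
    return table


def deduplicate(repo):
    """Replace duplicate documents in `repo` with Ref pointers, in place.

    Returns the reduced hash table: one (hash-value, address) per cluster.
    """
    # Crawl the hash table and group addresses that share a hash-value.
    clusters = defaultdict(list)
    for hash_value, address in build_hash_table(repo):
        clusters[hash_value].append(address)

    kept_table = []
    for hash_value, addresses in clusters.items():
        # Assumed policy: the first address in the cluster keeps the document.
        keeper, *duplicates = addresses
        kept_table.append((hash_value, keeper))
        for address in duplicates:
            # Delete the duplicate and store a reference pointer in its place.
            repo[address] = Ref(target=keeper)
    return kept_table


# Usage: three addresses, two of which hold identical content.
repo = {"addr-1": b"report", "addr-2": b"memo", "addr-3": b"report"}
table = deduplicate(repo)
```

After the call, `repo["addr-3"]` holds a `Ref` to `"addr-1"`, while `addr-1` and `addr-2` still hold their documents. Comparing hash-values alone treats a hash collision as a duplicate; a more cautious variant would compare the document bytes before deleting anything.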

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.