Data retention management for partitioned datasets
US12072868B1 · kind B1 · utility
Assignee
Inventors
Key dates
| Filing date | Apr 7, 2021 |
| Grant date | Aug 27, 2024 |
| Priority date | — |
| Expiry date | Jan 18, 2042 |
Classification
- Technology area (CPC G)Physics
- CPC primaryG06F16/2228
- WIPO fieldComputer technology
- WIPO sectorElectrical engineering
Abstract
Systems and methods are disclosed to implement a data storage system that manages data retention for partitioned datasets. A received data retention policy specifies to selectively delete data from a dataset based on a set of data retention attributes. If the data retention attributes are part of the dataset's partition key, a first type of data deletion job is configured to selectively delete entire partitions of the dataset. Otherwise, the system will generate a retention attribute index for the dataset, which will be used by a second type of data deletion job to selectively delete individual records within the partitions. In embodiments, the retention attribute index is implemented as Bloom filters that track retention attribute values in each partition. Advantageously, the disclosed system is able to automatically configure deletion jobs for any dataset schema that avoids full scans of the dataset partitions.
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.