Method and apparatus for screening TB-scale incremental data
US11789639B1 · kind B1 · utility
Assignee
Inventors
Key dates
| Filing date | Mar 30, 2023 |
| Grant date | Oct 17, 2023 |
| Priority date | — |
| Expiry date | Mar 30, 2043 |
Classification
- Technology area (CPC Y)Emerging Cross-Sectional Technologies
- CPC primaryY02D10/00
- WIPO fieldComputer technology
- WIPO sectorElectrical engineering
Abstract
A method and an apparatus for screening TB-scale of incremental data. In the present application, according to the memory capacity of the device, the raw data is divided into a plurality of raw data blocks, and the data is cleaned. By adopting a single-block index sorting algorithm, the de-duplicating ordering in the data blocks is completed without dropping operation, and the processed data blocks and a matrix hash index table are respectively generated and saved as initial data after completion. For the subsequent incremental data, the inter-block index-sorting algorithm is adopted, and the processed data blocks and the matrix hash index table are loaded in turn. The data is preliminarily screened on the basis of the matrix hash index table, and an incremental binary search algorithm is used for fine screening. Finally, the indexing and de-duplication screening of all data are completed.
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.