Large-scale, high-dimensional similarity clustering in linear time with error-free retrieval
US10216829B2 · kind B2 · utility
Assignee
Inventors
Key dates
| Filing date | Jan 19, 2017 |
| Grant date | Feb 26, 2019 |
| Priority date | — |
| Expiry date | Jan 19, 2037 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G06F16/00
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
A method for a processing device to determine whether to assign a data item to at least one cluster of data items is disclosed. The processing device may identify a signature of the data item, the signature including a set of elements. The processing device may select a subset of the set of elements to form at least one partial signature. The processing device may combine the selected subset of elements into at least one token. The processing device may determine whether the at least one token is present in a memory. The memory may be configured to contain an existing set of tokens. The processing device may determine whether to assign the data item to at least one cluster based on whether the at least one token is present in the memory.
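The steps in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. The abstract does not say how subsets of elements are chosen, how they are combined into tokens, or what form the memory takes, so the sketch assumes the following: every combination of `subset_size` signature positions forms a partial signature, a token is a tuple of those positions and their values, and the memory is an in-process dictionary mapping tokens to cluster labels. The names `partial_signature_tokens`, `assign_to_cluster`, `token_index`, and `subset_size` are hypothetical.

```python
from itertools import combinations


def partial_signature_tokens(signature, subset_size):
    """Yield one hashable token per subset of signature positions.

    Assumption: all position subsets of a fixed size are used. The abstract
    only says that "a subset of the set of elements" is selected.
    """
    for positions in combinations(range(len(signature)), subset_size):
        # Keep the positions in the token so equal values at different
        # positions do not collide.
        yield (positions, tuple(signature[i] for i in positions))


def assign_to_cluster(item_id, signature, token_index, subset_size):
    """Return the cluster label for an item and record its tokens.

    token_index plays the role of the memory holding the existing set of
    tokens (assumed here to be a dict from token to cluster label).
    """
    tokens = list(partial_signature_tokens(signature, subset_size))

    # If any token is already present in memory, join that token's cluster.
    cluster = None
    for token in tokens:
        if token in token_index:
            cluster = token_index[token]
            break

    # Otherwise the item starts a new cluster labeled with its own id.
    if cluster is None:
        cluster = item_id

    # Store tokens not yet in memory so later items can match them.
    for token in tokens:
        token_index.setdefault(token, cluster)
    return cluster


token_index = {}
first = assign_to_cluster("a", (1, 2, 3, 4), token_index, subset_size=3)
second = assign_to_cluster("b", (1, 2, 3, 9), token_index, subset_size=3)
third = assign_to_cluster("c", (7, 8, 9, 0), token_index, subset_size=3)
print(first, second, third)
```

In this sketch, items "a" and "b" agree on three of their four signature elements, so they share a token and land in the same cluster, while "c" shares no token with either and starts its own. Each item costs a fixed number of dictionary lookups for a given signature length and subset size.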
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.