Method and system for automatically tagging data
US11397716B2 · kind B2 · utility
Assignee
Inventors
Key dates
| Filing date | Nov 19, 2020 |
| Grant date | Jul 26, 2022 |
| Priority date | — |
| Expiry date | Nov 19, 2040 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G06F16/2365
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
Systems and methods relate to auto-tagging of data in a data lake or other data storage. The disclosure addresses efficient and accurate auto-tagging by generating a statistical summary of the data lake and interactively receiving a selected column of an exemplar data file. The statistical summary is generated automatically using lightweight offline processing. A graphical user interface interactively receives the exemplar data file together with a selection of a column in that file. A list of candidate data-tagging patterns is generated based on the statistical summary, and the list is then updated by removing candidate patterns that under-generalize the data. A data-tagging pattern is determined by selecting, from the list, the candidate pattern that matches the fewest columns in the data lake.
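The selection step described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. Several things here are assumptions made only for illustration: the statistical summary is modeled as a plain mapping from column name to sample values, candidate patterns are regular expressions built at three hypothetical generalization levels, a pattern "under-generalizes" when it fails to match every exemplar value, and a column "matches" when all of its sampled values match. The names `generalize`, `matching_columns`, `select_pattern`, and the sample data are all hypothetical.

```python
import re
from typing import Dict, List

# Assumption: the "statistical summary" is modeled as column name -> sample values.
Summary = Dict[str, List[str]]


def generalize(value: str, level: int) -> str:
    """Build a regex for `value` at a hypothetical generalization level.

    Digits always become \\d. ASCII letters stay literal at level 0, become a
    same-case class at level 1, and an any-case class at level 2 or above.
    """
    parts = []
    for ch in value:
        if ch.isascii() and ch.isdigit():
            parts.append(r"\d")
        elif ch.isascii() and ch.isalpha() and level == 1:
            parts.append("[A-Z]" if ch.isupper() else "[a-z]")
        elif ch.isascii() and ch.isalpha() and level >= 2:
            parts.append("[A-Za-z]")
        else:
            parts.append(re.escape(ch))  # literal character
    return "".join(parts)


def matching_columns(pattern: str, summary: Summary) -> List[str]:
    """Names of columns whose sampled values all match the pattern."""
    regex = re.compile(pattern)
    return sorted(
        name
        for name, samples in summary.items()
        if samples and all(regex.fullmatch(s) for s in samples)
    )


def select_pattern(exemplar: List[str], summary: Summary) -> str:
    """Pick the candidate that covers the exemplar and matches the fewest columns."""
    candidates = {generalize(v, lvl) for v in exemplar for lvl in range(3)}
    # Drop candidates that under-generalize, i.e. miss some exemplar value.
    kept = sorted(
        p for p in candidates if all(re.fullmatch(p, v) for v in exemplar)
    )
    if not kept:
        raise ValueError("no candidate pattern covers every exemplar value")
    # Fewest matching columns wins; ties resolve by sorted pattern order.
    return min(kept, key=lambda p: len(matching_columns(p, summary)))


# Hypothetical exemplar column and data-lake summary.
exemplar_column = ["AB-1234", "CD-5678"]
lake_summary = {
    "order_id": ["AB-1234", "ZZ-0001"],
    "promo_code": ["ab-1234", "xy-9999"],
    "zip": ["12345", "90210"],
}

pattern = select_pattern(exemplar_column, lake_summary)
print(pattern, matching_columns(pattern, lake_summary))
```

In this toy data the literal-letter candidates are discarded because each covers only one exemplar value, and the uppercase-only candidate is preferred over the any-case one because it matches fewer columns in the summary. Preferring the fewest matching columns keeps the chosen pattern as specific as the exemplar allows.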
Source: USPTO / EPO open patent data (objective bibliographic data and citation counts).