Patent · US Active

Method and system for automatically tagging data

US11397716B2 · kind B2 · utility

Cited by: 1
References: 0
Claims: 20
Family size: 0

Assignee

Inventors

Key dates

Filing date: Nov 19, 2020
Grant date: Jul 26, 2022
Priority date
Expiry date: Nov 19, 2040

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06F16/2365
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

Systems and methods for auto-tagging data in a data lake or other data storage. Auto-tagging is made efficient and accurate by combining a statistical summary of the data lake with a column that a user interactively selects from exemplar data. The statistical summary is generated automatically by lightweight offline processing. A graphical user interface interactively receives an exemplar data file together with a selection of a column in that file. A list of candidate data-tagging patterns is generated based on the statistical summary, and the list is updated by removing candidate patterns that under-generalize the data. A data-tagging pattern is then determined by selecting, from the list, the candidate with the fewest matching columns in the data lake.
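
The selection step described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the four generalization levels, the token names (`<digit>`, `<letter>`, `<any>*`), the function names, the tie-break rule, and the sample data are all assumptions made for illustration. The sketch builds an offline summary that counts, per pattern, the columns whose values all fit that pattern, drops candidates that fail to cover every exemplar value (treated here as under-generalizing), and picks the remaining candidate with the fewest matching columns.

```python
from collections import Counter
from itertools import groupby

LEVELS = 4  # hypothetical levels: 0 literal, 1 per-character class, 2 class runs, 3 anything
CLASS_TOKENS = ("<digit>", "<letter>")

def char_class(ch):
    """Map a character to a class token; other characters stay literal."""
    if ch.isdigit():
        return "<digit>"
    if ch.isalpha():
        return "<letter>"
    return ch

def pattern_of(value, level):
    """Return the (level, tokens) pattern describing value at the given level."""
    if level == 0:
        return (0, value)
    classes = [char_class(ch) for ch in value]
    if level == 1:
        return (1, tuple(classes))
    if level == 2:
        runs = []
        for token, group in groupby(classes):
            if token in CLASS_TOKENS:
                runs.append(token + "+")   # collapse a run of one class
            else:
                runs.extend(group)         # keep literal characters one by one
        return (2, tuple(runs))
    return (3, ("<any>*",))

def summarize_lake(lake):
    """Offline step: count, per pattern, the columns whose values all fit it."""
    summary = Counter()
    for values in lake.values():
        for level in range(LEVELS):
            patterns = {pattern_of(v, level) for v in values}
            if len(patterns) == 1:
                summary[patterns.pop()] += 1
    return summary

def select_pattern(exemplar_values, summary):
    """Pick the covering candidate with the fewest matching lake columns."""
    candidates = {pattern_of(v, lvl) for v in exemplar_values for lvl in range(LEVELS)}
    # Drop candidates that do not cover every exemplar value.
    covering = [p for p in candidates
                if all(pattern_of(v, p[0]) == p for v in exemplar_values)]
    # Assumed tie-break: prefer the more specific (lower) level.
    return min(covering, key=lambda p: (summary[p], p[0]))

# Made-up sample data: column name -> values.
lake = {
    "orders.order_id": ["AB-1001", "CD-2040", "ZZ-0007"],
    "orders.zip": ["98052", "10001"],
    "orders.qty": ["1", "22", "333"],
    "users.zip": ["94105", "60601"],
    "users.name": ["Ann", "Bob"],
}
summary = summarize_lake(lake)
chosen = select_pattern(["98101", "30301"], summary)
print(chosen)
```

With this sample data the five-digit pattern matches two columns, the looser "one or more digits" pattern matches three, and the catch-all matches all five, so the five-digit pattern is chosen. `select_pattern` raises `ValueError` on an empty exemplar column; a fuller version would handle that case explicitly.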

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.