Semantic data type classification in rectangular datasets
US11556514B2 · kind B2 · utility
Assignee
Inventors
Key dates
| Filing date | Feb 24, 2021 |
| Grant date | Jan 17, 2023 |
| Priority date | — |
| Expiry date | Apr 17, 2041 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G06V30/413
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
Provided is a method, computer program product, and system for automatically predicting unknown semantic data types in a rectangular dataset using a holistic knowledge of said dataset. A processor may receive one or more rectangular datasets, the one or more rectangular datasets comprising a plurality of columns having a set of known semantic data types. The processor may extract a set of features from the plurality of columns, where the set of features is used to determine a relationship among each column of the plurality of columns. The processor may construct a set of training data based on the extracted set of features. Using the training data, the processor may train a machine learning model to predict a semantic data type of a target column in a rectangular dataset having an unknown semantic data type.
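The pipeline in the abstract (extract per-column features, relate each column to the others, build training data, train a model, predict the type of an unlabeled column) can be illustrated with a minimal Python sketch. The abstract does not say which features or which model are used, so everything below is an assumption made for illustration: the four per-column statistics, the "context" vector that averages the other columns' features, the nearest-centroid classifier, and all function names and sample data are hypothetical and are not taken from the patent.

```python
from collections import defaultdict
from statistics import mean


def column_features(values):
    """Per-column statistics. The feature choice is an illustrative assumption."""
    strs = [str(v) for v in values]
    n = len(strs)
    return [
        sum(s.replace(".", "", 1).isdigit() for s in strs) / n,  # share of numeric cells
        sum("@" in s for s in strs) / n,                         # share of cells with '@'
        mean(len(s) for s in strs) / 20.0,                       # mean cell length, roughly scaled
        len(set(strs)) / n,                                      # share of distinct values
    ]


def table_features(table):
    """Each column's own features plus the mean features of the other columns.

    The second half is one simple (assumed) way to encode a column's
    relationship to the rest of the rectangular dataset.
    """
    own = {name: column_features(vals) for name, vals in table.items()}
    out = {}
    for name in table:
        others = [own[o] for o in table if o != name]
        if others:
            context = [mean(col) for col in zip(*others)]
        else:
            context = [0.0] * len(own[name])
        out[name] = own[name] + context
    return out


def build_training_data(labeled_tables):
    """Turn (table, {column: known semantic type}) pairs into feature rows and labels."""
    X, y = [], []
    for table, types in labeled_tables:
        feats = table_features(table)
        for name, label in types.items():
            X.append(feats[name])
            y.append(label)
    return X, y


def train_centroids(X, y):
    """Stand-in model: one mean feature vector per semantic type."""
    groups = defaultdict(list)
    for x, label in zip(X, y):
        groups[label].append(x)
    return {label: [mean(c) for c in zip(*rows)] for label, rows in groups.items()}


def predict(model, x):
    """Return the semantic type whose centroid is closest to x (squared distance)."""
    return min(model, key=lambda lbl: sum((a - b) ** 2 for a, b in zip(x, model[lbl])))


# Made-up dataset with known semantic types.
train_table = {
    "c1": ["alice@example.com", "bob@example.org", "carol@example.net"],
    "c2": ["34", "27", "45"],
    "c3": ["Alice", "Bob", "Carol"],
}
train_types = {"c1": "email", "c2": "age", "c3": "first_name"}

X, y = build_training_data([(train_table, train_types)])
model = train_centroids(X, y)

# Made-up dataset whose columns have unknown semantic types.
new_table = {
    "a": ["dave@example.com", "erin@example.org"],
    "b": ["51", "19"],
    "c": ["Dave", "Erin"],
}
new_feats = table_features(new_table)
predictions = {name: predict(model, vec) for name, vec in new_feats.items()}
print(predictions)
```

The nearest-centroid model is used here only to keep the sketch dependency-free; any classifier that accepts fixed-length feature vectors could take its place.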
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.