Method and system for identifying labels of unlabelled column data
US12380136B2 · kind B2 · utility
Assignee
Inventors
Key dates
| Filing date | Jul 2, 2024 |
| Grant date | Aug 5, 2025 |
| Priority date | — |
| Expiry date | Jul 2, 2044 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G06N3/0475
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
Existing techniques for labelling unlabelled tabular data rely on semi-supervised models, which require sample labelled data for training. Moreover, the same labelling model or technique cannot be used for all data types. The present disclosure provides a method and system for identifying labels of unlabelled column data. The system uses a hybrid approach that combines language models, regular expressions, and known dictionaries. To perform labelling, the system first classifies the received unlabelled tabular data into one or more data buckets. It then applies techniques appropriate to each data type to identify labels for the unlabelled data in those buckets. Thereafter, the system uses a feedback mechanism that matures the system over time. Once mature, the system can identify labels for all types of data.
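The pipeline outlined in the abstract (bucket the column, apply a technique suited to the bucket, then learn from feedback) might look roughly like the Python sketch below. This is an illustration under stated assumptions, not the patented implementation: the bucket names, the regular expressions, the dictionary contents, and the functions `bucket_of`, `label_column`, and `record_feedback` are all hypothetical, and the language-model step is represented only by an optional `fallback` callable.

```python
import re

# Hypothetical patterns and dictionaries, chosen only for illustration.
REGEX_LABELS = {
    "email": re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+"),
    "date": re.compile(r"\d{4}-\d{2}-\d{2}"),
}
DICTIONARY_LABELS = {
    "country": {"india", "france", "japan", "brazil"},
    "currency": {"usd", "eur", "inr", "jpy"},
}


def is_number(value):
    """Return True if the string parses as a float."""
    try:
        float(value)
        return True
    except ValueError:
        return False


def bucket_of(values):
    """Assign a column of strings to a coarse, assumed data bucket."""
    if all(is_number(v) for v in values):
        return "numeric"
    if all("@" in v or any(ch.isdigit() for ch in v) for v in values):
        return "patterned"
    return "text"


def match_regex(values):
    """Return the first label whose pattern fully matches every value."""
    for label, pattern in REGEX_LABELS.items():
        if all(pattern.fullmatch(v) for v in values):
            return label
    return None


def match_dictionary(values):
    """Return the first label whose dictionary contains every value."""
    for label, known in DICTIONARY_LABELS.items():
        if all(v.lower() in known for v in values):
            return label
    return None


def label_column(values, fallback=None):
    """Bucket the column, then apply the technique assumed for that bucket.

    `fallback` stands in for a language-model call; it is only used when
    the regex and dictionary steps do not produce a label.
    """
    bucket = bucket_of(values)
    label = None
    if bucket == "patterned":
        label = match_regex(values)
    elif bucket == "text":
        label = match_dictionary(values)
    if label is None and fallback is not None:
        label = fallback(values)
    return bucket, label


def record_feedback(values, label):
    """Feedback step: add confirmed values to the dictionary for that label."""
    DICTIONARY_LABELS.setdefault(label, set()).update(v.lower() for v in values)
```

In this sketch, a column such as `["Peru", "Chile"]` is not recognised at first; after `record_feedback(["Peru", "Chile"], "country")`, later columns containing those values resolve to `country`. That is one simple way to read the "maturity over time" described in the abstract.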
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.