Text classification system based on feature selection and method thereof
US11960521B2 · kind B2 · utility
Inventors
Key dates
| Filing date | Jan 16, 2023 |
| Grant date | Apr 16, 2024 |
| Priority date | — |
| Expiry date | Jan 16, 2043 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G06F16/313
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
The present disclosure provides a feature-selection-based text classification system and method in the technical field of natural language processing and short text classification. The method comprises: acquiring a text classification data set; dividing the data set into a training text set and a test text set, and pre-processing both; extracting feature entries from the pre-processed training text set using improved chi-square statistics to form feature subsets; assigning weights to the extracted feature entries using the TF-IWF algorithm; establishing a support vector machine-based short text classification model from the weighted feature entries; and classifying the pre-processed test text set with that model. The present disclosure mitigates, to some extent, the sparsity of short text content, thereby improving short text classification performance.
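The feature selection and weighting steps named in the abstract can be illustrated with a minimal Python sketch. It rests on several assumptions: the abstract does not define the "improved" chi-square statistic, so the standard chi-square score stands in for it; TF-IWF is taken in a commonly used form (term frequency within a document times the log of total corpus tokens over the term's corpus count); and the function names, toy documents, and labels are hypothetical. The resulting weighted vectors are what a support vector machine would then be trained on; that step is omitted here.

```python
import math
from collections import Counter

def chi_square_scores(docs, labels):
    """Standard chi-square score per term, taking the maximum over classes.
    Assumption: a stand-in for the patent's unspecified 'improved' statistic."""
    n = len(docs)
    doc_sets = [set(d) for d in docs]
    vocab = set().union(*doc_sets)
    scores = {}
    for term in vocab:
        best = 0.0
        for cls in set(labels):
            pairs = list(zip(doc_sets, labels))
            a = sum(1 for s, y in pairs if term in s and y == cls)      # term present, in class
            b = sum(1 for s, y in pairs if term in s and y != cls)      # term present, other class
            c = sum(1 for s, y in pairs if term not in s and y == cls)  # term absent, in class
            d = n - a - b - c                                           # term absent, other class
            denom = (a + c) * (b + d) * (a + b) * (c + d)
            if denom:
                best = max(best, n * (a * d - b * c) ** 2 / denom)
        scores[term] = best
    return scores

def select_features(docs, labels, k):
    """Keep the k highest-scoring terms; ties are broken alphabetically."""
    scores = chi_square_scores(docs, labels)
    return sorted(scores, key=lambda t: (-scores[t], t))[:k]

def tf_iwf_vectors(docs, features):
    """Weight each selected feature by TF * IWF (assumed formulation)."""
    corpus_counts = Counter(tok for d in docs for tok in d)
    total = sum(corpus_counts.values())
    iwf = {t: math.log(total / corpus_counts[t]) for t in features if corpus_counts[t]}
    vectors = []
    for d in docs:
        tf = Counter(d)
        length = len(d) or 1  # guard against empty documents
        vectors.append([tf[t] / length * iwf.get(t, 0.0) for t in features])
    return vectors

# Hypothetical pre-processed (tokenized) training texts and their labels.
docs = [
    ["cheap", "pills", "buy"],
    ["buy", "cheap", "watches"],
    ["meeting", "at", "noon"],
    ["lunch", "meeting", "today"],
]
labels = ["spam", "spam", "ham", "ham"]

features = select_features(docs, labels, k=3)
vectors = tf_iwf_vectors(docs, features)
```

Each row of `vectors` is a fixed-length weighted representation of one short text, restricted to the selected features. Restricting the vocabulary this way is one plausible reading of how feature selection counters the sparsity the abstract mentions.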
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.