Detecting abusive language using character N-gram features
US11010687B2 · kind B2 · utility
Assignee
Inventors
Key dates
| Filing date | Jul 29, 2016 |
| Grant date | May 18, 2021 |
| Priority date | — |
| Expiry date | Jul 22, 2038 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G06F40/211
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
Methods and apparatus for detecting abusive language are disclosed. In one embodiment, a set of character N-grams is ascertained for a set of text. Feature values for a plurality of features of the set of text are determined, based, at least in part, on the set of character N-grams. A computer-generated model is applied to the feature values for the plurality of features to generate a score for the set of text, where the model includes a plurality of weights, each of the weights corresponding to one of the features. It may then be determined whether the set of text includes abusive language based, at least in part, on the score.
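The pipeline described in the abstract (character N-grams, feature values, a weighted model, a score, and a decision) can be illustrated with a minimal sketch. This is not the patented implementation. The function names, the N-gram length range of 3 to 5, the use of raw counts as feature values, the logistic squashing, the 0.5 threshold, and the toy weights are all assumptions made for illustration; the abstract only states that the model has one weight per feature and produces a score.

```python
import math
from collections import Counter


def char_ngrams(text, n_min=3, n_max=5):
    """Count every character N-gram of length n_min..n_max in the text.

    Lowercasing and the 3..5 range are illustrative assumptions.
    """
    text = text.lower()
    grams = Counter()
    for n in range(n_min, n_max + 1):
        for i in range(len(text) - n + 1):
            grams[text[i:i + n]] += 1
    return grams


def score_text(text, weights, bias=0.0):
    """Apply a linear model with one weight per N-gram feature.

    Feature values are raw N-gram counts (an assumption); the weighted
    sum is mapped to (0, 1) with a logistic function (also an assumption).
    """
    features = char_ngrams(text)
    z = bias + sum(weights.get(gram, 0.0) * count
                   for gram, count in features.items())
    return 1.0 / (1.0 + math.exp(-z))


def is_abusive(text, weights, bias=0.0, threshold=0.5):
    """Decide whether the text is abusive by thresholding the score."""
    return score_text(text, weights, bias) >= threshold


# Hypothetical hand-picked weights; a real model would learn them
# from labeled training data.
WEIGHTS = {
    "idi": 1.5, "dio": 1.5, "iot": 1.5,
    "tha": -0.5, "han": -0.5, "ank": -0.5,
}
BIAS = -1.0

if __name__ == "__main__":
    for sample in ("you idiot", "thank you"):
        print(sample, round(score_text(sample, WEIGHTS, BIAS), 3),
              is_abusive(sample, WEIGHTS, BIAS))
```

Character N-grams are a common choice for this kind of task because deliberately obfuscated spellings still share many substrings with the original word, so the features degrade gradually rather than disappearing as whole-word features would.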
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.