Patent · US Active

Detecting abusive language using character N-gram features

US11010687B2 · kind B2 · utility

Cited by	5
References	4
Claims	20
Family size	0

Assignee

VERIZON MEDIA INC. · US

Inventors

Yashar Mehdad · Sunnyvale, US
Joel Tetreault · New York, US

Key dates

Filing date	Jul 29, 2016
Grant date	May 18, 2021
Priority date	—
Expiry date	Jul 22, 2038

Classification

Technology area (CPC G)	Physics
CPC primary	G06F40/211
WIPO field	Computer technology
WIPO sector	Electrical engineering

Abstract

Methods and apparatus for detecting abusive language are disclosed. In one embodiment, a set of character N-grams is ascertained for a set of text. Feature values for a plurality of features of the set of text are determined, based, at least in part, on the set of character N-grams. A computer-generated model is applied to the feature values for the plurality of features to generate a score for the set of text, where the model includes a plurality of weights, each of the weights corresponding to one of the features. It may then be determined whether the set of text includes abusive language based, at least in part, on the score.
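
The pipeline in the abstract (character N-grams, then feature values, then a weighted model, then a score and a decision) can be illustrated with a minimal Python sketch. Everything specific in it is an assumption made for illustration and is not taken from the patent: the function names, the use of raw N-gram counts as feature values, the N-gram lengths of 2 to 3, the logistic squashing of the weighted sum, the 0.5 threshold, and the hand-picked toy weights. The abstract says only that the model has one weight per feature and produces a score.

```python
import math
from collections import Counter


def char_ngrams(text, n_min=2, n_max=3):
    """Count every character n-gram of length n_min..n_max in the text."""
    text = text.lower()
    counts = Counter()
    for n in range(n_min, n_max + 1):
        for i in range(len(text) - n + 1):
            counts[text[i:i + n]] += 1
    return counts


def score_text(text, weights, bias=0.0):
    """Weighted sum of n-gram counts (one weight per feature), mapped to (0, 1).

    The logistic function is an assumed choice; the abstract does not name one.
    """
    features = char_ngrams(text)
    z = bias + sum(weights.get(gram, 0.0) * count
                   for gram, count in features.items())
    return 1.0 / (1.0 + math.exp(-z))


def is_abusive(text, weights, bias=0.0, threshold=0.5):
    """Decide from the score; the threshold value is hypothetical."""
    return score_text(text, weights, bias) >= threshold


# Toy weights invented for this example; a real model would learn them
# from labeled training data.
TOY_WEIGHTS = {"id": 0.8, "di": 0.6, "io": 0.9, "ot": 0.7,
               "idi": 1.5, "iot": 1.5}
TOY_BIAS = -3.0

if __name__ == "__main__":
    for sample in ("you idiot", "hello there"):
        print(sample, round(score_text(sample, TOY_WEIGHTS, TOY_BIAS), 3),
              is_abusive(sample, TOY_WEIGHTS, TOY_BIAS))
```

Because the features are built from characters rather than whole words, a sketch like this still assigns weight to partial matches such as obfuscated or misspelled variants that share N-grams with a known abusive term.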

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.