Patent · US Active

Models for classifying documents

US11341194B1 · kind B1 · utility

Cited by: 0
References: 1
Claims: 18
Family size: 0

Assignee

Inventors

Key dates

Filing date: Nov 22, 2019
Grant date: May 24, 2022
Priority date
Expiry date: Nov 22, 2039

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06F16/353
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

Some embodiments provide a method for defining a content relevance model for determining whether a content segment is relevant to a particular category. The method receives a first set of content segments that contain content relevant to the particular category and a second set of content segments that contain content not relevant to the particular category. The method identifies a set of key word sets more likely to appear in the first set of content segments than the second set of content segments. The method defines a content relevance model that comprises a set of groups of word sets and a score for each group, each of the groups of word sets comprising a key word set from the set of key word sets and at least one word set found in a context of the key word set in at least one of the received content segments.
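The steps in the abstract can be illustrated with a minimal sketch. It is not the patented implementation. The abstract does not say how "more likely to appear" is measured, how context is delimited, or how group scores are computed, so the sketch makes these assumptions: each word set is a single word, context is a fixed token window, and both key word selection and group scores use add-one smoothed frequency ratios. All function names and parameters (`find_key_words`, `build_model`, `score_segment`, `min_ratio`, `window`) are hypothetical.

```python
import math
from collections import Counter


def tokenize(segment):
    # Assumption: lowercase whitespace tokenization is enough for illustration.
    return segment.lower().split()


def smoothed_rate(count, total):
    # Add-one smoothed fraction of segments containing an item (an assumption).
    return (count + 1) / (total + 2)


def find_key_words(relevant, irrelevant, min_ratio=2.0):
    """Words that appear in relevant segments at least min_ratio times as
    often (smoothed) as in irrelevant segments."""
    rel_df = Counter(w for seg in relevant for w in set(tokenize(seg)))
    irr_df = Counter(w for seg in irrelevant for w in set(tokenize(seg)))
    keys = set()
    for word, count in rel_df.items():
        ratio = smoothed_rate(count, len(relevant)) / smoothed_rate(
            irr_df[word], len(irrelevant)
        )
        if ratio >= min_ratio:
            keys.add(word)
    return keys


def count_groups(segments, keys, window):
    """Count, per (key word, context word) group, how many segments contain it."""
    counts = Counter()
    for seg in segments:
        tokens = tokenize(seg)
        seen = set()
        for i, tok in enumerate(tokens):
            if tok not in keys:
                continue
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    seen.add((tok, tokens[j]))
        counts.update(seen)
    return counts


def build_model(relevant, irrelevant, window=2, min_ratio=2.0):
    """Map each (key word, context word) group to a log-ratio score."""
    keys = find_key_words(relevant, irrelevant, min_ratio)
    rel_groups = count_groups(relevant, keys, window)
    irr_groups = count_groups(irrelevant, keys, window)
    model = {}
    for group, count in rel_groups.items():
        p_rel = smoothed_rate(count, len(relevant))
        p_irr = smoothed_rate(irr_groups[group], len(irrelevant))
        model[group] = math.log(p_rel / p_irr)
    return model


def score_segment(model, segment, window=2):
    """Sum the scores of every model group found in the segment."""
    keys = {key for key, _ in model}
    found = count_groups([segment], keys, window)
    return sum(model[group] for group in found if group in model)


relevant = [
    "the bank raised interest rates today",
    "central bank cut interest rates",
]
irrelevant = [
    "we sat on the river bank today",
    "the river flooded the town",
]
model = build_model(relevant, irrelevant)
finance_score = score_segment(model, "bank interest rates rose")
river_score = score_segment(model, "the river bank flooded")
```

In this toy data, "bank" occurs in both sets and is not selected as a key word on its own, but it still contributes as context for "interest". Pairing a key word with its context is what lets a model of this kind separate different uses of an ambiguous word.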

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.