Patent · US Active

Systems and methods for generating labeled short text sequences

US11797594B2 · kind B2 · utility

0Cited by
1References
17Claims
0Family size

Assignee

Inventors

Key dates

Filing dateNov 10, 2020
Grant dateOct 24, 2023
Priority date
Expiry dateJan 11, 2042

Classification

  • Technology area (CPC G)Physics
  • CPC primaryG06F40/30
  • WIPO fieldComputer technology
  • WIPO sectorElectrical engineering

Abstract

A set of documents related to a particular topic, industry, or entity are received. Sentences are extract from each document. The sentences are grouped into tuples of one, two, or three consecutive sentences (i.e., short text sequences). The sentence tuples are clustered based on vector representations of the sentences. For each cluster, a set of tuples that best represents or best fits the cluster is selected. These sentence tuples are fed to an ontology to determine ontological entities associated with each tuple. These determined ontological entities are associated with the clusters corresponding to each tuple. The sentence tuples associated with each cluster are labeled based on the ontological entities associated with the cluster. The labeled sentence tuples may then be used for a variety of purposes such as training a model to determine the topic of short text sequences.

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.