Learning thematic similarity metric from article text units
US10831793B2 · kind B2 · utility
Assignee
Inventors
Key dates
| Filing date | Oct 23, 2018 |
| Grant date | Nov 10, 2020 |
| Priority date | — |
| Expiry date | Feb 14, 2039 |
Classification
- Technology area (CPC G)Physics
- CPC primaryG06N3/044
- WIPO fieldComputer technology
- WIPO sectorElectrical engineering
Abstract
A method of estimating a thematic similarity of sentences, comprising receiving a corpus of a plurality of documents describing a plurality of topics where each document comprises a plurality of sentences arranged in a plurality of sections, constructing sentence triplets for at least some of the sentences, each sentence triplet comprising a respective sentence, a respective positive sentence selected randomly from the section comprising the respective sentence and a respective negative sentence selected randomly from another section, training a first neural network with the sentence triplets to identify sentence-sentence vectors mapping each sentence with a shorter distance to its respective positive sentence compared to the distance to its respective negative sentence and outputting the first neural network for estimating thematic similarity between a pair of sentences by computing a distance between the sentence-sentence vectors produced for each sentence of the pair by the first neural network.
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.