Language-agnostic understanding
US10657332B2 · kind B2 · utility
Assignee
Inventors
Key dates
| Filing date | Dec 21, 2017 |
| Grant date | May 19, 2020 |
| Priority date | — |
| Expiry date | Jun 12, 2038 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G06F40/58
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
Exemplary embodiments relate to techniques to classify or detect the intent of content written in a language for which a classifier does not exist. These techniques involve building a code-switching corpus via machine translation, generating a universal embedding for words in the code-switching corpus, training a classifier on the universal embeddings to generate an embedding mapping/table, accessing new content written in a language for which a specific classifier may not exist, and mapping entries in the embedding mapping/table to the universal embeddings. Using these techniques, a classifier can be applied to the universal embedding without needing to be trained on a particular language. Exemplary embodiments may be applied to recognize similarities in two content items, make recommendations, find similar documents, perform deduplication, and perform topic tagging for stories in foreign languages.
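The pipeline in the abstract can be illustrated with a toy sketch. This is not the patented implementation: a small hypothetical English-to-Spanish lexicon (`LEXICON`) stands in for machine translation, sentence-level co-occurrence counts stand in for learned universal embeddings, and a nearest-centroid rule stands in for the classifier. All names, data, and labels below are assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict

# Hypothetical English -> Spanish lexicon standing in for machine translation.
LEXICON = {
    "team": "equipo", "wins": "gana", "match": "partido",
    "chef": "cocinero", "cooks": "cocina", "dinner": "cena",
}

# Labeled training data exists only in English (toy data, assumed).
TRAIN = [
    ("team wins match", "sports"),
    ("chef cooks dinner", "food"),
]

def code_switch(tokens, lexicon):
    """Return the original sentence plus one variant per translatable word."""
    variants = [list(tokens)]
    for i, tok in enumerate(tokens):
        if tok in lexicon:
            variants.append(tokens[:i] + [lexicon[tok]] + tokens[i + 1:])
    return variants

def build_embeddings(corpus):
    """Map each word to a sparse vector of sentence-level co-occurrence counts."""
    table = defaultdict(Counter)
    for sent in corpus:
        for i, word in enumerate(sent):
            for j, context in enumerate(sent):
                if i != j:
                    table[word][context] += 1
    return table

def embed(tokens, table):
    """Sum the vectors of known words; unknown words contribute nothing."""
    vec = Counter()
    for tok in tokens:
        vec.update(table.get(tok, {}))
    return vec

def cosine(a, b):
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def train(train_data, table):
    """Nearest-centroid classifier: sum the sentence vectors of each label."""
    centroids = defaultdict(Counter)
    for text, label in train_data:
        centroids[label].update(embed(text.split(), table))
    return centroids

def classify(text, table, centroids):
    vec = embed(text.split(), table)
    return max(centroids, key=lambda label: cosine(vec, centroids[label]))

# Build the code-switching corpus, the shared embedding table, and the classifier.
corpus = [v for text, _ in TRAIN for v in code_switch(text.split(), LEXICON)]
table = build_embeddings(corpus)
centroids = train(TRAIN, table)

# The classifier never saw a labeled Spanish sentence.
print(classify("equipo gana partido", table, centroids))  # sports
```

Because each Spanish word appears in the code-switched corpus next to the English neighbors of its translation, its vector lands near the English training data, which is why the English-trained centroids can label an all-Spanish sentence. A realistic system would presumably use a trained embedding model and a learned classifier in place of these stand-ins.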
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.