Domain concept discovery and clustering using word embedding in dialogue design
US11048870B2 · kind B2 · utility
Assignee
Inventors
Key dates
| Filing date | Dec 14, 2017 |
| Grant date | Jun 29, 2021 |
| Priority date | — |
| Expiry date | Nov 17, 2038 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G06F40/30
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
A system and method perform automated domain concept discovery and clustering using word embeddings by receiving a set of documents for natural language processing for a domain, representing a plurality of entries in the set of documents as continuous vectors in a high-dimensional continuous space, applying a clustering algorithm based on a mutual information optimization criterion to form a set of clusters, associating each entry of the plurality of entries with each cluster in the set of clusters through formalizing an evidence-based model of each cluster given each entry, calculating a mutual information metric between each entry and each cluster using the evidence-based model, and identifying a nominal center of each cluster by maximizing the mutual information.
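The steps named in the abstract can be illustrated with a small sketch. This is not the patented method: the abstract does not give formulas, so everything concrete below is an assumption made for illustration. The toy 2-D vectors in `EMBEDDINGS` are invented, plain k-means stands in for the mutual-information-based clustering criterion, the "evidence-based model" is assumed to be a softmax over negative squared distances, the mutual information metric is assumed to be pointwise mutual information with a uniform prior over entries, and all function names are hypothetical.

```python
import math

# Hypothetical 2-D "embeddings"; real word embeddings have hundreds of dimensions.
EMBEDDINGS = {
    "flight": (0.9, 0.1),
    "airline": (1.0, 0.0),
    "ticket": (0.8, 0.2),
    "pizza": (0.1, 0.9),
    "pasta": (0.0, 1.0),
    "menu": (0.2, 0.8),
}


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean(points):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))


def kmeans(vectors, init_centroids, iters=10):
    """Plain k-means, used only as a stand-in for the clustering step."""
    centroids = list(init_centroids)
    for _ in range(iters):
        groups = [[] for _ in centroids]
        for v in vectors:
            nearest = min(range(len(centroids)), key=lambda c: sq_dist(v, centroids[c]))
            groups[nearest].append(v)
        # Keep the old centroid if a cluster ends up empty.
        centroids = [mean(g) if g else centroids[i] for i, g in enumerate(groups)]
    return centroids


def cluster_given_entry(vec, centroids, temperature=0.1):
    """Assumed evidence model: p(cluster | entry) as a softmax over negative distances."""
    scores = [math.exp(-sq_dist(vec, c) / temperature) for c in centroids]
    total = sum(scores)
    return [s / total for s in scores]


def discover_concepts(embeddings, init_words):
    words = list(embeddings)
    centroids = kmeans([embeddings[w] for w in words], [embeddings[w] for w in init_words])
    k = len(centroids)
    posterior = {w: cluster_given_entry(embeddings[w], centroids) for w in words}
    # Cluster prior p(c), assuming every entry is equally likely.
    prior = [sum(posterior[w][c] for w in words) / len(words) for c in range(k)]
    # Pointwise mutual information between each entry and each cluster.
    pmi = {w: [math.log(posterior[w][c] / prior[c]) for c in range(k)] for w in words}
    # Assign each entry to its most probable cluster.
    members = {c: [] for c in range(k)}
    for w in words:
        members[max(range(k), key=lambda c: posterior[w][c])].append(w)
    # Nominal center: the entry with the highest mutual information for the cluster.
    centers = {c: max(words, key=lambda w: pmi[w][c]) for c in range(k)}
    return members, centers


members, centers = discover_concepts(EMBEDDINGS, init_words=("flight", "pizza"))
print(members)
print(centers)
```

On this toy data the travel words and the food words fall into separate clusters. Under the assumed pointwise mutual information score, the nominal center is the entry most specific to its cluster (here `airline` and `pasta`), which need not be the entry closest to the geometric centroid.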
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.