Methods and systems for performing end-to-end spoken language analysis
US11107462B1 · kind B1 · utility
Assignee
Inventors
Key dates
| Filing date | Oct 30, 2018 |
| Grant date | Aug 31, 2021 |
| Priority date | — |
| Expiry date | Feb 28, 2039 |
Classification
- Technology area (CPC G)Physics
- CPC primaryG10L15/22
- WIPO fieldComputer technology
- WIPO sectorElectrical engineering
Abstract
Exemplary embodiments relate to improvements in spoken language understanding (SLU) systems. Conventionally, SLU systems include an automatic speech recognition (ASR) component configured to receive an input of audio data and to generate a textual representation of the audio data. Conventional SLU systems also include a natural language understanding (NLU) component configured to receive a text-based transcript and perform language-based tasks such as domain classification, intent determination, and slot-filling. However, these two components are typically trained separately based on different metrics. In real-world situations, errors in the ASR component propagate to the NLU component, which degrades the performance of the overall system. Exemplary embodiments described herein perform SLU in an end-to-end manner that infers semantic meaning directly from audio features without an intermediate text representation. This may allow for more a more accurate translation performed in a more resource-efficient manner (particularly in terms of processing resources).
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.