Patent · US Active

Low-latency multi-speaker speech recognition

US11475898B2 · kind B2 · utility

Cited by	1

References	826

Claims	45

Family size	0

Assignee

Apple Inc. · US

Inventors

Masood Delfarah · Cupertino, US
Ossama Abdelhamid Mohamed Abdelhamid · Toronto, CA
Kyuyeon Hwang · Cupertino, US
Donald R. McAllaster · Shrewsbury, US
Sabato Marco Siniscalchi · Cupertino, US

Key dates

Filing date	Aug 7, 2019
Grant date	Oct 18, 2022
Priority date	—
Expiry date	Aug 2, 2040

Classification

Technology area (CPC G)	Physics
CPC primary	G10L21/0272
WIPO field	Computer technology
WIPO sector	Electrical engineering

Abstract

Systems and processes for operating an intelligent automated assistant are provided. In one example, a method includes receiving mixed speech data representing utterances of a target speaker and utterances of one or more interfering audio sources. The method further includes obtaining a target speaker representation, which represents speech characteristics of the target speaker; and determining, using a learning network, probability distributions of phonetic elements directly from the mixed speech data. The inputs of the learning network include the mixed speech data and the target speaker representation. An output of the learning network includes the probability distributions of phonetic elements. The method further includes generating text corresponding to the utterances of the target speaker based on the probability distributions of the phonetic elements; and providing a response to the target speaker based on the text corresponding to the utterances of the target speaker.
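The data flow in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the single linear layer standing in for the learning network, the random weights, the four-symbol phonetic inventory, and every function name here are assumptions made for illustration. The sketch shows only that the network takes two inputs (mixed speech frames and a target speaker representation) and emits one probability distribution over phonetic elements per frame, with no separate speech-separation step.

```python
import math
import random

# Hypothetical phonetic inventory; a real system would use a far larger set.
PHONES = ["sil", "AH", "B", "K"]


def softmax(scores):
    """Turn raw scores into a probability distribution."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def phonetic_posteriors(mixed_frames, speaker_repr, weights, bias):
    """Stand-in for the learning network (assumption: one linear layer).

    Each mixed-speech frame is concatenated with the target speaker
    representation, so the output is conditioned on the target speaker.
    """
    posteriors = []
    for frame in mixed_frames:
        x = list(frame) + list(speaker_repr)
        scores = [
            sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, bias)
        ]
        posteriors.append(softmax(scores))
    return posteriors


def greedy_decode(posteriors, phones=PHONES):
    """Pick the most probable phone per frame and collapse adjacent repeats."""
    decoded = []
    for dist in posteriors:
        best = phones[max(range(len(dist)), key=dist.__getitem__)]
        if not decoded or decoded[-1] != best:
            decoded.append(best)
    return decoded


def demo(num_frames=5, feat_dim=3, spk_dim=2, seed=0):
    """Run the sketch on random stand-in data (not real audio features)."""
    rng = random.Random(seed)
    frames = [[rng.uniform(-1, 1) for _ in range(feat_dim)]
              for _ in range(num_frames)]
    speaker = [rng.uniform(-1, 1) for _ in range(spk_dim)]
    weights = [[rng.uniform(-1, 1) for _ in range(feat_dim + spk_dim)]
               for _ in PHONES]
    bias = [0.0] * len(PHONES)
    post = phonetic_posteriors(frames, speaker, weights, bias)
    return post, greedy_decode(post)


if __name__ == "__main__":
    post, phones = demo()
    print(phones)
```

The greedy decoder stops at a phone sequence. Generating text from the distributions, as the abstract describes, would additionally need a lexicon and language model, which this sketch leaves out.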

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.