Patent · US Active

Convolutional neural network with phonetic attention for speaker verification

US11276410B2 · kind B2 · utility

Cited by	1
References	0
Claims	20
Family size	0

Assignee

MICROSOFT TECHNOLOGY LICENSING, LLC · US

Inventors

Yong Zhao · Beijing, CN
Tianyan Zhou · Bellevue, US
Jinyu Li · Beijing, CN
Yifan Gong · Sammamish, US
Jian Wu · Bellevue, US
Zhuo Chen · Markham, CA

Key dates

Filing date	Nov 13, 2019
Grant date	Mar 15, 2022
Priority date	—
Expiry date	Sep 7, 2040

Classification

Technology area (CPC G)	Physics
CPC primary	G10L17/14
WIPO field	Computer technology
WIPO sector	Electrical engineering

Abstract

Embodiments may include reception of a plurality of speech frames, determination of a multi-dimensional acoustic feature associated with each of the plurality of speech frames, determination of a plurality of multi-dimensional phonetic features, each of the plurality of multi-dimensional phonetic features determined based on a respective one of the plurality of speech frames, generation of a plurality of two-dimensional feature maps based on the phonetic features, input of the feature maps and the plurality of acoustic features to a convolutional neural network, the convolutional neural network to generate a plurality of speaker embeddings based on the plurality of feature maps and the plurality of acoustic features, aggregation of the plurality of speaker embeddings into a first speaker embedding based on respective weights determined for each of the plurality of speaker embeddings, and determination of a speaker associated with the plurality of speech frames based on the first speaker embedding.
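
The final two steps of the abstract (weighted aggregation of per-frame speaker embeddings into one embedding, then determination of a speaker) can be illustrated with a minimal sketch. The abstract does not say how the weights are computed or how the speaker is chosen, so everything below is an assumption for illustration only: softmax over hypothetical per-frame scores stands in for the weighting, cosine similarity against hypothetical enrolled embeddings stands in for the speaker decision, and the function names `aggregate_embeddings` and `identify_speaker` are invented here. The convolutional neural network that would produce the per-frame embeddings is not modeled.

```python
import numpy as np


def softmax(scores):
    """Numerically stable softmax over a 1-D array of scores."""
    shifted = scores - np.max(scores)
    exp = np.exp(shifted)
    return exp / exp.sum()


def aggregate_embeddings(embeddings, scores):
    """Combine per-frame embeddings (shape [T, D]) into one embedding (shape [D]).

    Assumption: weights are softmax(scores), one scalar score per frame.
    The patent abstract only says weights are determined per embedding.
    """
    weights = softmax(np.asarray(scores, dtype=float))
    return weights @ np.asarray(embeddings, dtype=float), weights


def cosine(a, b):
    """Cosine similarity between two 1-D vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


def identify_speaker(embedding, enrolled):
    """Return the enrolled speaker whose embedding is most similar.

    Assumption: cosine similarity is the decision rule; `enrolled` is a
    hypothetical dict mapping speaker name to a reference embedding.
    """
    return max(enrolled, key=lambda name: cosine(embedding, enrolled[name]))


# Toy data: three frame-level embeddings of dimension 2 and made-up scores.
frame_embeddings = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
frame_scores = np.array([2.0, 0.0, 1.0])

utterance_embedding, weights = aggregate_embeddings(frame_embeddings, frame_scores)

enrolled = {
    "alice": np.array([1.0, 1.0]),
    "bob": np.array([1.0, -1.0]),
}
print(identify_speaker(utterance_embedding, enrolled))
```

In this sketch, frames with higher scores contribute more to the utterance-level embedding, which is the general idea behind attention-style pooling. Whether the patented system uses this particular weighting or decision rule is not stated in the abstract.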

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.