Patent · US Active

Optimizing personal VAD for on-device speech recognition

US12347438B2 · kind B2 · utility

Cited by	0
References	1
Claims	20
Family size	0

Assignee

Google LLC · US

Inventors

Shaojin Ding · Mountain View, US
Rajeev Rikhye · Mountain View, US
Qiao Liang · Mountain View, US
Yanzhang He · Mountain View, US
Quan Wang · Hoboken, US
Arun Narayanan · Rochester Hills, US
Tom O'Malley · Mountain View, US
Ian C. McGraw · Menlo Park, US

Key dates

Filing date	Mar 17, 2023
Grant date	Jul 1, 2025
Priority date	—
Expiry date	Jan 12, 2044

Classification

Technology area (CPC G)	Physics
CPC primary	G10L17/18
WIPO field	Computer technology
WIPO sector	Electrical engineering

Abstract

A computer-implemented method includes receiving a sequence of acoustic frames corresponding to an utterance and generating a reference speaker embedding for the utterance. The method also includes receiving a target speaker embedding for a target speaker and generating feature-wise linear modulation (FiLM) parameters including a scaling vector and a shifting vector based on the target speaker embedding. The method also includes generating an affine transformation output that scales and shifts the reference speaker embedding based on the FiLM parameters. The method also includes generating a classification output indicating whether the utterance was spoken by the target speaker based on the affine transformation output.
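The data flow in the abstract can be illustrated with a minimal sketch. It is not the patented implementation. The abstract does not say how the FiLM parameters or the classification output are computed, so the single dense layers, the sigmoid head, the random weights, the embedding size, and every function and variable name below are assumptions made for illustration.

```python
import math
import random

def linear(weights, bias, x):
    """Dense layer on plain Python lists: returns W @ x + b."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]

def init_linear(rng, out_dim, in_dim):
    """Hypothetical random initialisation; a real system would learn these weights."""
    weights = [[rng.gauss(0.0, 0.1) for _ in range(in_dim)] for _ in range(out_dim)]
    bias = [0.0] * out_dim
    return weights, bias

def film_parameters(params, target_embedding):
    """Map the target speaker embedding to a scaling vector and a shifting vector.
    Using one dense layer per vector is an assumption, not taken from the patent."""
    gamma = linear(*params["scale"], target_embedding)
    beta = linear(*params["shift"], target_embedding)
    return gamma, beta

def film_modulate(reference_embedding, gamma, beta):
    """Affine transformation: element-wise scale, then shift, of the reference embedding."""
    return [g * r + b for g, r, b in zip(gamma, reference_embedding, beta)]

def classify(params, modulated):
    """Assumed classifier head: a single logit squashed by a sigmoid."""
    logit = linear(*params["head"], modulated)[0]
    return 1.0 / (1.0 + math.exp(-logit))

rng = random.Random(0)
embed_dim = 4  # illustrative size only
params = {
    "scale": init_linear(rng, embed_dim, embed_dim),
    "shift": init_linear(rng, embed_dim, embed_dim),
    "head": init_linear(rng, 1, embed_dim),
}

reference = [0.2, -0.1, 0.4, 0.3]  # stand-in for the utterance's reference speaker embedding
target = [0.1, 0.0, 0.5, 0.2]      # stand-in for the target speaker embedding

gamma, beta = film_parameters(params, target)
modulated = film_modulate(reference, gamma, beta)
score = classify(params, modulated)  # value in (0, 1); higher means "target speaker"
```

With untrained weights the score carries no meaning. The sketch only shows the order of operations: target embedding to FiLM parameters, FiLM parameters applied to the reference embedding, then a classification output.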

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.