Patent · US Active

Optimizing personal VAD for on-device speech recognition

US12347438B2 · kind B2 · utility

Cited by: 0
References: 1
Claims: 20
Family size: 0

Assignee

Inventors

Key dates

Filing date: Mar 17, 2023
Grant date: Jul 1, 2025
Priority date
Expiry date: Jan 12, 2044

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G10L17/18
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

A computer-implemented method includes receiving a sequence of acoustic frames corresponding to an utterance and generating a reference speaker embedding for the utterance. The method also includes receiving a target speaker embedding for a target speaker and generating feature-wise linear modulation (FiLM) parameters including a scaling vector and a shifting vector based on the target speaker embedding. The method also includes generating an affine transformation output that scales and shifts the reference speaker embedding based on the FiLM parameters. The method also includes generating a classification output indicating whether the utterance was spoken by the target speaker based on the affine transformation output.
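
The FiLM step described in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the patent: the function names, the use of a single dense layer to produce the scaling and shifting vectors, the sigmoid classifier, the embedding size, and the random untrained weights. It shows only the data flow: target speaker embedding to FiLM parameters, element-wise affine transformation of the reference speaker embedding, then a score for "spoken by the target speaker".

```python
import math
import random


def linear(weights, bias, x):
    """Dense layer on plain lists: returns W @ x + b."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def film_parameters(target_emb, w_gamma, b_gamma, w_beta, b_beta):
    """Map the target speaker embedding to a scaling vector (gamma)
    and a shifting vector (beta). The two dense layers are hypothetical."""
    gamma = linear(w_gamma, b_gamma, target_emb)
    beta = linear(w_beta, b_beta, target_emb)
    return gamma, beta


def film_modulate(reference_emb, gamma, beta):
    """Affine transformation: scale and shift each feature of the
    reference speaker embedding, i.e. gamma * r + beta element-wise."""
    return [g * r + b for g, r, b in zip(gamma, reference_emb, beta)]


def classify(modulated, w_out, b_out):
    """Assumed classifier: a logistic unit returning a score in (0, 1)."""
    logit = sum(w * m for w, m in zip(w_out, modulated)) + b_out
    return 1.0 / (1.0 + math.exp(-logit))


def random_matrix(rng, rows, cols):
    """Random weights standing in for trained parameters."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def demo(seed=0, dim=4):
    """Run the whole pipeline once on random, untrained values."""
    rng = random.Random(seed)
    reference_emb = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
    target_emb = [rng.uniform(-1.0, 1.0) for _ in range(dim)]

    gamma, beta = film_parameters(
        target_emb,
        random_matrix(rng, dim, dim), [1.0] * dim,  # scaling layer
        random_matrix(rng, dim, dim), [0.0] * dim,  # shifting layer
    )
    modulated = film_modulate(reference_emb, gamma, beta)
    w_out = [rng.uniform(-0.5, 0.5) for _ in range(dim)]
    return classify(modulated, w_out, 0.0)
```

Calling `demo()` returns one score between 0 and 1. Because the weights are untrained, the value carries no meaning beyond showing that the target speaker embedding conditions the reference embedding through a per-feature scale and shift.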

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.