Patent · US · Active

Contextualized streaming end-to-end speech recognition with trie-based deep biasing and shallow fusion

US12087306B1 · kind B1 · utility

Cited by	2
References	2
Claims	20
Family size	0

Assignee

Meta Platforms, Inc. · US

Inventors

Duc Hoang Le · Sunnyvale, US
FNU Mahaveer · Foster City, US
Gil Keren · Passau, DE
Christian Fuegen · Sunnyvale, US
Yatharth Saraf · San Francisco, US

Key dates

Filing date	Nov 24, 2021
Grant date	Sep 10, 2024
Priority date	—
Expiry date	Jun 30, 2042

Classification

Technology area (CPC G)	Physics
CPC primary	G10L15/183
WIPO field	Computer technology
WIPO sector	Electrical engineering

Abstract

In one embodiment, a method includes receiving a user's utterance comprising a word in a custom vocabulary list of the user, generating a previous token to represent a previous audio portion of the utterance, and generating a current token to represent a current audio portion of the utterance by generating a bias embedding by using the previous token to query a trie of wordpieces representing the custom vocabulary list, generating first probabilities of respective first candidate tokens likely uttered in the current audio portion based on the bias embedding and the current audio portion, generating second probabilities of respective second candidate tokens likely uttered after the previous token based on the previous token and the bias embedding, and generating the current token to represent the current audio portion of the utterance based on the first probabilities of the first candidate tokens and the second probabilities of the second candidate tokens.
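
The data flow in the abstract can be illustrated with a toy sketch. This is not the patented implementation. It assumes the "bias embedding" can be stood in for by a multi-hot vector marking which wordpieces may follow the previous token in the trie, it replaces both neural scorers with hand-written scores, and it assumes the two probability sets are combined by a weighted sum of log-probabilities. All names and values here (WordpieceTrie, bias_vector, pick_token, bias_boost, weight, VOCAB, ACOUSTIC, LM) are hypothetical and do not come from the patent.

```python
import math
from dataclasses import dataclass, field


@dataclass
class TrieNode:
    children: dict = field(default_factory=dict)  # wordpiece -> TrieNode
    is_end: bool = False  # True if a custom-vocabulary entry ends here


class WordpieceTrie:
    """Trie over wordpiece sequences from a user's custom vocabulary list."""

    def __init__(self, entries):
        self.root = TrieNode()
        for pieces in entries:
            node = self.root
            for piece in pieces:
                node = node.children.setdefault(piece, TrieNode())
            node.is_end = True

    def step(self, node, token):
        """Advance on `token`; if it does not continue the current path,
        restart matching from the root (assumed fallback policy)."""
        nxt = node.children.get(token)
        if nxt is not None:
            return nxt
        return self.root.children.get(token, self.root)


def bias_vector(node, vocab):
    """Stand-in for the bias embedding: 1.0 for each wordpiece that can
    follow the current trie node, 0.0 otherwise."""
    return [1.0 if tok in node.children else 0.0 for tok in vocab]


def log_softmax(scores):
    m = max(scores)
    lse = m + math.log(sum(math.exp(s - m) for s in scores))
    return [s - lse for s in scores]


def pick_token(acoustic_scores, lm_scores, bias, bias_boost=2.0, weight=0.5):
    """Return the index of the token with the best combined score."""
    # "First probabilities": current audio portion plus bias information.
    first = log_softmax([s + bias_boost * b for s, b in zip(acoustic_scores, bias)])
    # "Second probabilities": previous-token context plus bias information.
    second = log_softmax([s + bias_boost * b for s, b in zip(lm_scores, bias)])
    # Assumed combination rule: weighted sum of log-probabilities.
    fused = [a + weight * b for a, b in zip(first, second)]
    return max(range(len(fused)), key=fused.__getitem__)


# Toy vocabulary and made-up scores for the step after the token "_ka".
VOCAB = ["_call", "_ka", "ren", "rin", "_the"]
ACOUSTIC = [0.0, 0.0, 1.0, 1.5, 0.0]  # audio alone slightly prefers "rin"
LM = [0.0, 0.0, 0.5, 0.5, 0.0]        # context alone has no preference

if __name__ == "__main__":
    trie = WordpieceTrie([["_ka", "ren"]])  # custom word split as "_ka" + "ren"
    node = trie.step(trie.root, "_ka")      # query the trie with the previous token
    bias = bias_vector(node, VOCAB)
    print("biased:  ", VOCAB[pick_token(ACOUSTIC, LM, bias)])
    print("unbiased:", VOCAB[pick_token(ACOUSTIC, LM, [0.0] * len(VOCAB))])
```

In this toy setup the bias vector tips the decision toward the custom-vocabulary continuation "ren", while an all-zero bias vector leaves the acoustically preferred "rin" on top. A real system would learn the embedding and both scorers rather than use fixed boosts.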

Source: USPTO / EPO open patent data. Objective bibliographic data and citation counts.