Patent · US Active

Automatic collection of speaker name pronunciations

US9240181B2 · kind B2 · utility

3 Cited by

7 References

20 Claims

0 Family size

Assignee

Cisco Technology, Inc. · US

Inventors

Aparna Khare · San Jose, US
Neha Agrawal · Bengaluru, IN
Sachin Kajarekar · Cupertino, US
Matthias Paulik · San Jose, US

Key dates

Filing date	Aug 20, 2013
Grant date	Jan 19, 2016
Priority date	—
Expiry date	Mar 5, 2034

Classification

Technology area (CPC G)	Physics
CPC primary	G10L2015/025
WIPO field	Computer technology
WIPO sector	Electrical engineering

Abstract

An audio stream is segmented into a plurality of time segments using speaker segmentation and recognition (SSR), with each time segment corresponding to the speaker's name, producing an SSR transcript. The audio stream is transcribed into a plurality of word regions using automatic speech recognition (ASR), with each of the word regions having a measure of the confidence in the accuracy of the translation, producing an ASR transcript. Word regions with a relatively low confidence in the accuracy of the translation are identified. The low confidence regions are filtered using named entity recognition (NER) rules to identify low confidence regions that are likely names. The NER rules associate a region that is identified as a likely name with the name of the speaker corresponding to the current, the previous, or the next time segment. All of the likely name regions associated with that speaker's name are selected.
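
The pipeline in the abstract can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the `Segment` and `WordRegion` types, the `RULES` table, the 0.5 confidence threshold, and the sample data are all assumptions introduced here. The abstract says only that NER rules link a likely name region to the current, previous, or next speaker; the specific trigger phrases below are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Segment:
    """One SSR time segment, labeled with a speaker name (assumed type)."""
    start: float
    end: float
    speaker: str


@dataclass
class WordRegion:
    """One ASR word region with a confidence score (assumed type)."""
    start: float
    end: float
    text: str
    confidence: float


# Hypothetical NER rules: words directly preceding a region, mapped to a
# segment offset (0 = current speaker, -1 = previous, +1 = next).
RULES = {
    ("my", "name", "is"): 0,
    ("thank", "you"): -1,
    ("please", "welcome"): +1,
}


def segment_index(segments, t):
    """Return the index of the SSR segment containing time t, or None."""
    for i, seg in enumerate(segments):
        if seg.start <= t < seg.end:
            return i
    return None


def collect_name_regions(segments, regions, threshold=0.5):
    """Map each speaker name to the low-confidence regions linked to it."""
    by_speaker = defaultdict(list)
    for i, region in enumerate(regions):
        if region.confidence >= threshold:
            continue  # only low-confidence regions are name candidates
        cur = segment_index(segments, region.start)
        if cur is None:
            continue  # region falls outside every SSR segment
        for context, offset in RULES.items():
            preceding = tuple(
                r.text.lower() for r in regions[max(0, i - len(context)):i]
            )
            if preceding != context:
                continue
            target = cur + offset
            if 0 <= target < len(segments):
                by_speaker[segments[target].speaker].append(region)
            break  # first matching rule wins
    return dict(by_speaker)


# Made-up sample data: two speakers, eight word regions.
segments = [Segment(0.0, 5.0, "Alice Smith"), Segment(5.0, 10.0, "Bob Jones")]
regions = [
    WordRegion(0.0, 0.3, "my", 0.90),
    WordRegion(0.3, 0.6, "name", 0.90),
    WordRegion(0.6, 0.8, "is", 0.95),
    WordRegion(0.8, 1.5, "a lease", 0.20),  # low confidence, "my name is" rule
    WordRegion(4.0, 4.3, "the", 0.30),      # low confidence, no rule matches
    WordRegion(5.0, 5.3, "thank", 0.90),
    WordRegion(5.3, 5.5, "you", 0.90),
    WordRegion(5.5, 6.2, "alas", 0.25),     # low confidence, "thank you" rule
]
result = collect_name_regions(segments, regions)
```

In this sketch, both misrecognized regions end up grouped under "Alice Smith": one through the current-speaker rule and one through the previous-speaker rule. The grouped regions would then be the candidate audio spans for that speaker's name pronunciation.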

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.