Patent · US Active

Method and system of indexing speech data

US8831946B2 · kind B2 · utility

Cited by: 16
References: 24
Claims: 24
Family size: 0

Assignee

Inventor

Key dates

Filing date: Jul 23, 2007
Grant date: Sep 9, 2014
Priority date
Expiry date: Apr 21, 2030

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G10L15/26
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

A method and system of indexing speech data. The method includes indexing word transcripts including a timestamp for a word occurrence; and indexing sub-word transcripts including a timestamp for a sub-word occurrence. A timestamp in the index indicates the time and duration of occurrence of the word or sub-word in the speech data, and word and sub-word occurrences can be correlated using the timestamps. A method of searching speech transcripts is also provided in which a search query in the form of a phrase to be searched includes at least one in-vocabulary word and at least one out-of-vocabulary word. The method of searching includes extracting the search terms from the phrase, retrieving a list of occurrences of words for an in-vocabulary search term from an index of words having timestamps, retrieving a list of occurrences of sub-words for an out-of-vocabulary search term from an index of sub-words having timestamps, and merging the retrieved lists of occurrences of words and sub-words according to their timestamps.
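The indexing and search steps described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the names `Occurrence`, `TimestampedIndex`, `chain`, and `search_phrase`, the toy phone lexicon, the millisecond units, and the `max_gap_ms` adjacency rule are all assumptions made for illustration. The sketch keeps one timestamped index for words and one for sub-words, looks up in-vocabulary terms in the word index, expands out-of-vocabulary terms into sub-word sequences, and merges the results by timestamp.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Occurrence:
    doc_id: str
    start_ms: int      # time of occurrence in the speech data
    duration_ms: int   # duration of the occurrence

    @property
    def end_ms(self):
        return self.start_ms + self.duration_ms


class TimestampedIndex:
    """Maps a term (word or sub-word) to its timestamped occurrences."""

    def __init__(self):
        self._postings = defaultdict(list)

    def add(self, term, doc_id, start_ms, duration_ms):
        self._postings[term].append(Occurrence(doc_id, start_ms, duration_ms))

    def lookup(self, term):
        hits = self._postings.get(term, [])
        return sorted(hits, key=lambda o: (o.doc_id, o.start_ms))


def chain(occurrence_lists, max_gap_ms):
    """Merge per-position occurrence lists into spans that follow one
    another in time within the same document (assumed adjacency rule)."""
    if not occurrence_lists:
        return []
    spans = list(occurrence_lists[0])
    for following in occurrence_lists[1:]:
        extended = []
        for span in spans:
            for occ in following:
                gap = occ.start_ms - span.end_ms
                if occ.doc_id == span.doc_id and 0 <= gap <= max_gap_ms:
                    extended.append(Occurrence(
                        span.doc_id, span.start_ms, occ.end_ms - span.start_ms))
        spans = extended
    return spans


def search_phrase(phrase, word_index, subword_index, vocabulary, lexicon,
                  max_gap_ms=500):
    """Search a phrase that may mix in-vocabulary and out-of-vocabulary terms."""
    per_term = []
    for term in phrase.lower().split():
        if term in vocabulary:
            # In-vocabulary: read occurrences from the word index.
            per_term.append(word_index.lookup(term))
        else:
            # Out-of-vocabulary: expand to sub-words (hypothetical lexicon)
            # and find where they occur consecutively in the sub-word index.
            units = lexicon.get(term, [])
            unit_hits = [subword_index.lookup(u) for u in units]
            per_term.append(chain(unit_hits, max_gap_ms))
    # Merge word and sub-word occurrences according to their timestamps.
    return chain(per_term, max_gap_ms)


# Toy data: "call" was recognized as a word; "nokia" only as phones.
word_index = TimestampedIndex()
word_index.add("call", "d1", 1000, 300)

subword_index = TimestampedIndex()
for phone, start, dur in [("n", 1350, 80), ("ow", 1430, 120), ("k", 1550, 70),
                          ("iy", 1620, 100), ("ah", 1720, 90), ("n", 5000, 80)]:
    subword_index.add(phone, "d1", start, dur)

lexicon = {"nokia": ["n", "ow", "k", "iy", "ah"]}  # illustrative pronunciation
hits = search_phrase("call nokia", word_index, subword_index,
                     vocabulary={"call"}, lexicon=lexicon)
print(hits)
```

With this toy data the query matches once in document `d1`, with a span that starts at 1000 ms and ends at 1810 ms. The isolated `n` at 5000 ms is discarded because no further phones follow it. Integer milliseconds are used here only to avoid floating-point comparisons in the adjacency check.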

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.