Method and system of indexing speech data
US8831946B2 · kind B2 · utility
Assignee
Inventor
Key dates
| Filing date | Jul 23, 2007 |
| Grant date | Sep 9, 2014 |
| Priority date | — |
| Expiry date | Apr 21, 2030 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G10L15/26
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
A method and system of indexing speech data. The method includes indexing word transcripts including a timestamp for a word occurrence; and indexing sub-word transcripts including a timestamp for a sub-word occurrence. A timestamp in the index indicates the time and duration of occurrence of the word or sub-word in the speech data, and word and sub-word occurrences can be correlated using the timestamps. A method of searching speech transcripts is also provided in which a search query in the form of a phrase to be searched includes at least one in-vocabulary word and at least one out-of-vocabulary word. The method of searching includes extracting the search terms from the phrase, retrieving a list of occurrences of words for an in-vocabulary search term from an index of words having timestamps, retrieving a list of occurrences of sub-words for an out-of-vocabulary search term from an index of sub-words having timestamps, and merging the retrieved lists of occurrences of words and sub-words according to their timestamps.
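The indexing and search steps described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the names `Occurrence`, `TimestampedIndex`, `chain`, `search_phrase`, and the `to_subwords` callback are hypothetical, the sub-word units `"AA"` and `"BB"` are placeholders, and the merge rule (a term must start within `max_gap` seconds of the end of the previous term) is an assumption chosen only to show how timestamps let word and sub-word occurrences be correlated.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Occurrence:
    doc: str         # recording / transcript identifier
    start: float     # seconds from the start of the recording
    duration: float  # seconds

    @property
    def end(self) -> float:
        return self.start + self.duration


class TimestampedIndex:
    """Inverted index: term -> occurrences with time and duration."""

    def __init__(self):
        self._postings = defaultdict(list)

    def add(self, term, doc, start, duration):
        self._postings[term].append(Occurrence(doc, start, duration))

    def lookup(self, term):
        return sorted(self._postings.get(term, []),
                      key=lambda o: (o.doc, o.start))


def chain(left, right, max_gap):
    """Keep spans where a `right` item starts within max_gap of a `left` item's end."""
    merged = []
    for a in left:
        for b in right:
            if a.doc == b.doc and abs(b.start - a.end) <= max_gap:
                merged.append(Occurrence(a.doc, a.start, b.end - a.start))
    return merged


def chain_all(lists, max_gap):
    """Chain a sequence of occurrence lists left to right."""
    if not lists:
        return []
    merged = lists[0]
    for nxt in lists[1:]:
        merged = chain(merged, nxt, max_gap)
    return merged


def search_phrase(terms, word_index, subword_index, vocabulary,
                  to_subwords, max_gap=0.2):
    """Look up in-vocabulary terms in the word index and out-of-vocabulary
    terms in the sub-word index, then merge the lists by timestamp."""
    per_term = []
    for term in terms:
        if term in vocabulary:
            per_term.append(word_index.lookup(term))
        else:
            # Assumption: the caller supplies a term -> sub-word converter.
            units = [subword_index.lookup(u) for u in to_subwords(term)]
            per_term.append(chain_all(units, max_gap))
    return chain_all(per_term, max_gap)


# Toy data: one in-vocabulary word followed by two sub-word units.
word_index = TimestampedIndex()
subword_index = TimestampedIndex()
word_index.add("call", "rec1", 1.0, 0.5)
word_index.add("call", "rec2", 3.0, 0.5)
subword_index.add("AA", "rec1", 1.5, 0.25)
subword_index.add("BB", "rec1", 1.75, 0.25)
subword_index.add("BB", "rec1", 5.0, 0.25)  # isolated unit, should not match

hits = search_phrase(["call", "aabb"], word_index, subword_index,
                     vocabulary={"call"},
                     to_subwords=lambda term: ["AA", "BB"])
```

Because both indexes store the same kind of timestamped occurrence, the merge step does not need to know whether a hit came from the word transcript or the sub-word transcript. The nested loop in `chain` is quadratic and is kept only for readability; a real index would likely merge sorted posting lists.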
Source: USPTO / EPO open patent data. Objective bibliographic data and citation counts.