Patent · US Active

Metadata-based diarization of teleconferences

US12217760B2 · kind B2 · utility

Cited by: 0
References: 10
Claims: 35
Family size: 0

Assignee

Inventors

Key dates

Filing date: Jan 30, 2022
Grant date: Feb 4, 2025
Priority date
Expiry date: Dec 29, 2042

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G10L21/028
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

A method for audio processing includes receiving a recording of a teleconference among multiple participants over a network, including an audio stream containing speech uttered by the participants and information outside the audio stream. The method further includes processing the audio stream to identify speech segments interspersed with intervals of silence, extracting speaker identifications from the information outside the audio stream in the received recording, labeling a first set of the identified speech segments from the audio stream with the speaker identifications, extracting acoustic features from the speech segments in the first set, learning a correlation between the speaker identifications applied to the segments in the first set and the extracted acoustic features, and labeling a second set of the identified speech segments using the learned correlation, to indicate the participants who spoke during the speech segments in the second set.
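
The steps in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the abstract does not specify the silence detector, the acoustic features, the metadata format, or the learning method. The sketch assumes an energy threshold for segmentation, metadata given as (start, end, name) speaker events, precomputed feature vectors per segment, and a nearest-centroid rule standing in for the learned correlation. All names (Segment, find_speech_segments, label_from_metadata, learn_centroids, label_remaining, min_overlap) are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Segment:
    start: float                    # seconds
    end: float                      # seconds
    features: List[float]           # assumed precomputed acoustic feature vector
    speaker: Optional[str] = None   # filled in by the labeling steps


def find_speech_segments(energies, frame_s, threshold):
    """Group consecutive frames with energy above threshold into (start, end) spans."""
    spans, start = [], None
    for i, energy in enumerate(energies):
        if energy > threshold and start is None:
            start = i                                   # speech begins
        elif energy <= threshold and start is not None:
            spans.append((start * frame_s, i * frame_s))  # silence ends the span
            start = None
    if start is not None:                               # recording ends mid-speech
        spans.append((start * frame_s, len(energies) * frame_s))
    return spans


def label_from_metadata(segments, events, min_overlap=0.8):
    """First set: label a segment when one metadata event covers most of it."""
    for seg in segments:
        duration = seg.end - seg.start
        for ev_start, ev_end, name in events:
            overlap = min(seg.end, ev_end) - max(seg.start, ev_start)
            if duration > 0 and overlap / duration >= min_overlap:
                seg.speaker = name
                break


def learn_centroids(segments) -> Dict[str, List[float]]:
    """Average the feature vectors of the labeled segments, per speaker."""
    sums: Dict[str, List[float]] = {}
    counts: Dict[str, int] = {}
    for seg in segments:
        if seg.speaker is None:
            continue
        acc = sums.setdefault(seg.speaker, [0.0] * len(seg.features))
        for i, value in enumerate(seg.features):
            acc[i] += value
        counts[seg.speaker] = counts.get(seg.speaker, 0) + 1
    return {name: [v / counts[name] for v in acc] for name, acc in sums.items()}


def label_remaining(segments, centroids):
    """Second set: assign each unlabeled segment to the nearest speaker centroid."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    for seg in segments:
        if seg.speaker is None and centroids:
            seg.speaker = min(centroids, key=lambda n: sq_dist(seg.features, centroids[n]))


# Toy data: frame energies for segmentation, plus four segments with made-up features.
spans = find_speech_segments([0.0, 0.9, 0.8, 0.0, 0.0, 0.7, 0.7, 0.7, 0.0], 0.5, 0.1)

segments = [
    Segment(0.0, 2.0, [1.0, 0.1]),
    Segment(3.0, 5.0, [0.1, 1.0]),
    Segment(6.0, 8.0, [0.9, 0.2]),    # no metadata covers this segment
    Segment(9.0, 11.0, [0.2, 0.8]),   # no metadata covers this segment
]
events: List[Tuple[float, float, str]] = [(0.0, 2.5, "Alice"), (2.8, 5.2, "Bob")]

label_from_metadata(segments, events)
centroids = learn_centroids(segments)
label_remaining(segments, centroids)
speakers = [seg.speaker for seg in segments]
```

In this toy run the first two segments are labeled from the metadata events, and the last two are labeled by their distance to the per-speaker feature averages. A real system would replace the centroid rule with whatever model the claims describe and would compute features from the audio itself.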

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.