Speaker segmentation and clustering for video summarization
US10535371B2 · kind B2 · utility
Assignee
Inventors
Key dates
| Filing date | Sep 13, 2016 |
| Grant date | Jan 14, 2020 |
| Priority date | — |
| Expiry date | Feb 16, 2037 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G11B27/30
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
Techniques are provided for video summarization, based on speaker segmentation and clustering, to identify persons and scenes of interest. A methodology implementing the techniques according to an embodiment includes extracting audio content from a video stream and detecting one or more segments of the audio content that include the voice of a single speaker. The method also includes grouping the one or more detected segments into an audio cluster associated with the single speaker and providing a portion of the audio cluster to a user. The method further includes receiving an indication from the user that the single speaker is a person of interest. Segments of interest are then extracted from the video stream, where each segment of interest is associated with a scene that includes the person of interest. The extracted segments of interest are then combined into a summarization video.
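The selection and combination steps in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. It assumes that an upstream step has already detected the single-speaker audio segments and given each one a cluster label. The `Segment` type, the function names, and the `pad` parameter are hypothetical. Widening each speech segment by a fixed padding is a stand-in for scene detection, since the abstract does not say how scene boundaries are found.

```python
from collections import defaultdict
from typing import NamedTuple


class Segment(NamedTuple):
    """A single-speaker audio segment (hypothetical type for this sketch)."""
    start: float   # seconds from the start of the video
    end: float     # seconds from the start of the video
    speaker: str   # cluster label assumed to come from an upstream step


def group_by_speaker(segments):
    """Group detected segments into one audio cluster per speaker label."""
    clusters = defaultdict(list)
    for seg in segments:
        clusters[seg.speaker].append(seg)
    return dict(clusters)


def segments_of_interest(clusters, speaker, pad=1.0):
    """Return merged (start, end) intervals for the chosen speaker.

    Assumption: each speech segment is widened by `pad` seconds on both
    sides to approximate the surrounding scene; overlapping or touching
    intervals are then merged so the summary has no duplicated footage.
    """
    intervals = sorted(
        (max(0.0, seg.start - pad), seg.end + pad)
        for seg in clusters.get(speaker, [])
    )
    merged = []
    for start, end in intervals:
        if merged and start <= merged[-1][1]:
            # Overlaps or touches the previous interval: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def summary_duration(intervals):
    """Total length in seconds of the combined summarization video."""
    return sum(end - start for start, end in intervals)
```

For example, if speaker "A" has segments at 0-4 s, 5-8 s, and 20-25 s, and the user marks "A" as the person of interest, `segments_of_interest` with `pad=1.0` returns `[(0.0, 9.0), (19.0, 26.0)]`, a cut list of 16 seconds that a video tool could then concatenate.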
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.