Systems and methods for multimodal multilabel tagging of video
US10965999B2 · kind B2 · utility
Assignee
Inventors
Key dates
| Filing date | Mar 2, 2020 |
| Grant date | Mar 30, 2021 |
| Priority date | — |
| Expiry date | Mar 2, 2040 |
Classification
- Technology area (CPC H)Electricity
- CPC primaryH04N21/8456
- WIPO fieldAudio-visual technology
- WIPO sectorElectrical engineering
Abstract
Multimodal multilabel tagging of video content may include labeling the video content with topical tags that are identified based on extracted features from two or more modalities of the video content. The two or more modalities may include (i) a video modality for the object, images, and/or visual elements of the video content, (ii) a text modality for the speech, dialog, and/or text of the video content, and/or (iii) an audio modality for non-speech sounds and/or sound characteristics of the video content. Combinational multimodal multilabel tagging may include combining two or more features from the same or different modality in order to increase the contextual understanding of the features and generate contextually relevant tags. Video content may be labeled with global tags relating to overall topics of the video content, and different sets of local tags relating to topics at different segments of the video content.
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.