Patent · US Active

Automated generation and presentation of textual descriptions of video content

US10999566B1 · kind B1 · utility

Cited by	5

References	1

Claims	17

Family size	0

Assignee

AMAZON TECHNOLOGIES, INC. · US

Inventors

Hooman Mahyar · Los Angeles, US
Vimal Bhat · Redmond, US
Jatin Jain · Redmond, US
Udit Bhatia · Boston, US
Roya Hosseini · Bellevue, US

Key dates

Filing date	Sep 6, 2019
Grant date	May 4, 2021
Priority date	—
Expiry date	Sep 6, 2039

Classification

Technology area (CPC G)	Physics
CPC primary	G10L25/48
WIPO field	Computer technology
WIPO sector	Electrical engineering

Abstract

Systems, methods, and computer-readable media are disclosed for automated generation of textual descriptions of video content. Example methods may include determining, by one or more computer processors coupled to memory, a first segment of video content, the first segment including a first set of frames and first audio content; determining, using a first neural network, a first action that occurs in the first set of frames; and determining a first sound present in the first audio content. Some methods may include generating a vector representing the first action and the first sound, and generating, using a second neural network and the vector, a first textual description of the first segment, where the first textual description includes words that describe events of the first segment.
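
The flow described in the abstract (segment, action, sound, vector, textual description) can be illustrated with a minimal Python sketch. This is not the patented implementation: both neural networks are replaced by trivial stand-in functions, frames and audio are represented by pre-tagged strings, and every name here (`Segment`, `ACTIONS`, `SOUNDS`, `detect_action`, `detect_sound`, `build_vector`, `generate_description`, `describe`) is a hypothetical chosen for illustration. The one-hot vector layout is likewise an assumption; the abstract does not say how the vector is constructed.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List, Sequence

# Hypothetical label vocabularies; the abstract does not enumerate any.
ACTIONS = ["running", "talking", "driving"]
SOUNDS = ["footsteps", "speech", "engine"]


@dataclass
class Segment:
    """One segment of video content: a set of frames plus its audio."""
    frames: Sequence[str]  # per-frame action tags standing in for pixel data
    audio: Sequence[str]   # per-window sound tags standing in for samples


def most_common_known(tags: Sequence[str], vocabulary: List[str]) -> str:
    """Return the most frequent tag that belongs to the vocabulary."""
    counts = Counter(tag for tag in tags if tag in vocabulary)
    if not counts:
        raise ValueError("no known label found in segment")
    return counts.most_common(1)[0][0]


def detect_action(segment: Segment) -> str:
    """Stand-in for the first neural network (action recognition)."""
    return most_common_known(segment.frames, ACTIONS)


def detect_sound(segment: Segment) -> str:
    """Stand-in for the sound detection step."""
    return most_common_known(segment.audio, SOUNDS)


def build_vector(action: str, sound: str) -> List[float]:
    """Concatenate one-hot encodings of the action and the sound (assumed layout)."""
    action_part = [1.0 if a == action else 0.0 for a in ACTIONS]
    sound_part = [1.0 if s == sound else 0.0 for s in SOUNDS]
    return action_part + sound_part


def generate_description(vector: List[float]) -> str:
    """Stand-in for the second neural network (text generation)."""
    action_part = vector[:len(ACTIONS)]
    sound_part = vector[len(ACTIONS):]
    action = ACTIONS[action_part.index(max(action_part))]
    sound = SOUNDS[sound_part.index(max(sound_part))]
    return f"Someone is {action}; {sound} can be heard."


def describe(segment: Segment) -> str:
    """Run the steps in the order the abstract lists them."""
    vector = build_vector(detect_action(segment), detect_sound(segment))
    return generate_description(vector)
```

Calling `describe(Segment(frames=["running", "running", "talking"], audio=["footsteps", "footsteps"]))` yields a one-sentence description that names both the detected action and the detected sound. In a real system the two stand-ins would presumably be trained models operating on pixel and waveform data rather than tags.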

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.