Mixup image captioning
US11334769B2 · kind B2 · utility
Assignee
Inventors
Key dates
| Filing date | Jul 7, 2020 |
| Grant date | May 17, 2022 |
| Priority date | — |
| Expiry date | Oct 1, 2040 |
Classification
- Technology area (CPC G)Physics
- CPC primaryG06V30/1914
- WIPO fieldComputer technology
- WIPO sectorElectrical engineering
Abstract
In an approach to augmenting caption datasets, one or more computer processors sample a ratio lambda from a probability distribution based on a pair of datapoints contained in a dataset, wherein each datapoint in the pair of datapoints comprises an image and an associated caption; extend the dataset by generating one or more new datapoints based on the sampled ratio lambda for each pair of datapoints in the dataset, wherein the sampled ratio lambda incorporates an interpolation of features associated with the pair of datapoints into the generated one or more new datapoints; identify one or more objects contained within a subsequent image utilizing an image model trained utilizing the extended dataset; generate a subsequent caption for one or more identified objects contained within the subsequent image utilizing a language generating model trained utilizing the extended dataset.
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.