Automatically segmenting images based on natural language phrases
US10410351B2 · kind B2 · utility
Assignee
Inventors
Key dates
| Filing date | Aug 29, 2018 |
| Grant date | Sep 10, 2019 |
| Priority date | — |
| Expiry date | Aug 29, 2038 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G06T2207/20101
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
The invention is directed towards segmenting images based on natural language phrases. An image and an n-gram, including a sequence of tokens, are received. An encoding of image features and a sequence of token vectors are generated. A fully convolutional neural network identifies and encodes the image features. A word embedding model generates the token vectors. A recurrent neural network (RNN) iteratively updates a segmentation map based on combinations of the image feature encoding and the token vectors. The segmentation map identifies which pixels are included in an image region referenced by the n-gram. A segmented image is generated based on the segmentation map. The RNN may be a convolutional multimodal RNN. A separate RNN, such as a long short-term memory network, may iteratively update an encoding of semantic features based on the order of tokens. The first RNN may update the segmentation map based on the semantic feature encoding.
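The data flow described in the abstract can be illustrated with a small toy sketch. It is not the patented implementation: the function names (`encode_image`, `embed_tokens`, `segment`) are hypothetical, all weights are random and untrained, the fully convolutional network and the word embedding model are replaced by simple stand-ins, the convolutional multimodal RNN is reduced to a per-pixel recurrent update with shared weights (roughly a 1x1 kernel), and the separate semantic RNN (for example an LSTM) is omitted. It only shows how a per-pixel state might be updated once per token and then thresholded into a segmentation map.

```python
import math
import random


def encode_image(image, channels, seed=1):
    """Stand-in for the fully convolutional network (assumption):
    maps each pixel to a feature vector while keeping the spatial layout."""
    rng = random.Random(seed)
    weights = [rng.uniform(-1, 1) for _ in range(channels)]
    return [[[math.tanh(px * w) for w in weights] for px in row] for row in image]


def embed_tokens(tokens, dim, seed=0):
    """Stand-in for the word embedding model (assumption):
    one fixed random vector per distinct token."""
    rng = random.Random(seed)
    table = {}
    for tok in tokens:
        if tok not in table:
            table[tok] = [rng.uniform(-1, 1) for _ in range(dim)]
    return [table[tok] for tok in tokens]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def segment(image, tokens, channels=4, dim=4, seed=2):
    """Update a per-pixel state once per token, then threshold it into a
    binary segmentation map (1 = pixel belongs to the referenced region)."""
    features = encode_image(image, channels)
    token_vecs = embed_tokens(tokens, dim)

    # Untrained weights, shared across all pixels (hypothetical values).
    rng = random.Random(seed)
    w_feat = [rng.uniform(-1, 1) for _ in range(channels)]
    w_tok = [rng.uniform(-1, 1) for _ in range(dim)]
    w_state = rng.uniform(-1, 1)

    height, width = len(image), len(image[0])
    state = [[0.0] * width for _ in range(height)]

    # Tokens are consumed in order, so the final state depends on the sequence.
    for vec in token_vecs:
        tok_term = dot(w_tok, vec)
        state = [
            [
                math.tanh(w_state * state[y][x] + dot(w_feat, features[y][x]) + tok_term)
                for x in range(width)
            ]
            for y in range(height)
        ]

    # A positive state corresponds to a sigmoid output above 0.5.
    return [[1 if s > 0.0 else 0 for s in row] for row in state]


if __name__ == "__main__":
    toy_image = [[0.0, 0.5], [1.0, 0.2]]  # 2x2 grayscale image
    print(segment(toy_image, ["the", "red", "dog"]))
```

With untrained weights the resulting mask is meaningless. The sketch only fixes the shapes and the order of operations: image features and token vectors are computed once, and the map is refined token by token.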
Source: USPTO / EPO open patent data. Objective bibliographic data and citation counts.