Utilizing visual and textual aspects of images with recommendation systems
US12008331B2 · kind B2 · utility
Assignee
Inventors
Key dates
| Event | Date |
| --- | --- |
| Filing date | Dec 23, 2021 |
| Grant date | Jun 11, 2024 |
| Priority date | — |
| Expiry date | Jan 10, 2043 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G06N3/0464
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
Described herein are systems and methods for generating an embedding—a learned representation—for an image. The embedding for the image is derived to capture visual aspects, as well as textual aspects, of the image. An encoder-decoder is trained to generate the visual representation of the image. An optical character recognition (OCR) algorithm is used to identify text/words in the image. From these words, an embedding is derived by performing an average pooling operation on pre-trained embeddings that map to the identified words. Finally, the embedding representing the visual aspects of the image is combined with the embedding representing the textual aspects of the image to generate a final embedding for the image.
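The pooling and combination steps described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation. The lookup table `WORD_EMBEDDINGS`, the helper names `average_pool` and `combine`, the toy vectors, and the stand-in OCR and encoder outputs are all hypothetical. The abstract says only that the two embeddings are "combined", so concatenation is assumed here, as is returning a zero vector when no recognized word has a pre-trained embedding.

```python
from typing import Dict, List, Sequence

Vector = List[float]

# Hypothetical toy table standing in for pre-trained word embeddings.
WORD_EMBEDDINGS: Dict[str, Vector] = {
    "sale": [1.0, 0.0, 2.0],
    "shoes": [3.0, 2.0, 0.0],
}


def average_pool(words: Sequence[str], table: Dict[str, Vector], dim: int) -> Vector:
    """Average the embeddings of the words that appear in the table."""
    vectors = [table[w.lower()] for w in words if w.lower() in table]
    if not vectors:
        # Assumption: fall back to a zero vector when no word is known.
        return [0.0] * dim
    # zip(*vectors) walks the vectors one dimension at a time.
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def combine(visual: Vector, textual: Vector) -> Vector:
    """Assumption: 'combined' is modelled as simple concatenation."""
    return list(visual) + list(textual)


# Stand-ins for the OCR output and the encoder's visual embedding.
ocr_words = ["SALE", "shoes", "unknownword"]
visual_embedding = [0.5, -0.5]

text_embedding = average_pool(ocr_words, WORD_EMBEDDINGS, dim=3)
final_embedding = combine(visual_embedding, text_embedding)
```

In this sketch, words missing from the table are skipped rather than mapped to a placeholder vector, so they do not dilute the average. Other combination schemes, such as a weighted sum or a learned projection, would fit the abstract's wording equally well.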
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.