Patent · US Active

Utilizing visual and textual aspects of images with recommendation systems

US12008331B2 · kind B2 · utility

Cited by	0
References	2
Claims	20
Family size	0

Assignee

MICROSOFT TECHNOLOGY LICENSING, LLC · US

Inventors

Xun Luan · Santa Clara, US
Aman Gupta · Sunnyvale, US
Sirjan Kafle · San Diego, US
Ananth Sankar · Palo Alto, US
Di Wen · Lo Wu, CN
Saurabh Kataria · Rochester, US
Ying Xuan · Sunnyvale, US
Sakshi Verma · Austin, US
Bharat Kumar Jain · Hyderabad, IN
Xue Xia · Los Angeles, US
Bhargavkumar Kanubhai Patel · Sayla, IN
Vipin Gupta · Bengaluru, IN
Nikita Gupta · Mountain View, US

Key dates

Filing date	Dec 23, 2021
Grant date	Jun 11, 2024
Priority date	—
Expiry date	Jan 10, 2043

Classification

Technology area (CPC G)	Physics
CPC primary	G06N3/0464
WIPO field	Computer technology
WIPO sector	Electrical engineering

Abstract

Described herein are systems and methods for generating an embedding—a learned representation—for an image. The embedding for the image is derived to capture visual aspects, as well as textual aspects, of the image. An encoder-decoder is trained to generate the visual representation of the image. An optical character recognition (OCR) algorithm is used to identify text/words in the image. From these words, an embedding is derived by performing an average pooling operation on pre-trained embeddings that map to the identified words. Finally, the embedding representing the visual aspects of the image is combined with the embedding representing the textual aspects of the image to generate a final embedding for the image.
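The textual and combination steps described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the OCR output, the pre-trained word vectors, and the visual embedding are toy values standing in for a real OCR engine, a real embedding table, and the trained encoder. The function names `average_pool` and `combine` are hypothetical, and concatenation is only one assumed way to "combine" the two embeddings, since the abstract does not specify the operator.

```python
from typing import Dict, List, Sequence


def average_pool(words: Sequence[str],
                 word_vectors: Dict[str, List[float]],
                 dim: int) -> List[float]:
    """Average the pre-trained vectors of the words that have one.

    Words missing from the table are skipped; if none are found,
    a zero vector of length `dim` is returned (an assumed fallback).
    """
    known = [word_vectors[w.lower()] for w in words if w.lower() in word_vectors]
    if not known:
        return [0.0] * dim
    # zip(*known) iterates over the vectors dimension by dimension.
    return [sum(column) / len(known) for column in zip(*known)]


def combine(visual: Sequence[float], textual: Sequence[float]) -> List[float]:
    """Combine the two embeddings by concatenation (an assumption)."""
    return list(visual) + list(textual)


# Toy stand-ins, all hypothetical: a 2-dimensional word-embedding table,
# words an OCR step might return, and a 3-dimensional visual embedding
# that the trained encoder would normally produce.
WORD_VECTORS = {"hiring": [1.0, 0.0], "engineers": [0.0, 1.0]}
ocr_words = ["Hiring", "engineers", "now"]
visual_embedding = [0.25, 0.5, 0.75]

text_embedding = average_pool(ocr_words, WORD_VECTORS, dim=2)
final_embedding = combine(visual_embedding, text_embedding)
print(final_embedding)
```

In this sketch, the word "now" has no entry in the table and is ignored, so the textual embedding is the mean of the two known vectors. The final embedding has the visual dimensions followed by the textual ones.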

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.