Patent · US Active

Automated transformation of information from images to textual representations, and applications therefor

US12197412B2 · kind B2 · utility

Cited by	1

References	13

Claims	27

Family size	0

Assignee

Tungsten Automation Corporation · US

Inventors

Steve Thompson · Oceanside, US
Veronika Levdik · Podgorica, ME
Iurii Vymenets · Saint Petersburg, RU
Donghan Lee · Yongin-si, KR

Key dates

Filing date	Jul 3, 2024
Grant date	Jan 14, 2025
Priority date	—
Expiry date	Jul 3, 2044

Classification

Technology area (CPC G)	Physics
CPC primary	G06V30/414
WIPO field	Computer technology
WIPO sector	Electrical engineering

Abstract

Recent developments in machine learning (commonly coined “artificial intelligence” or “AI”) have vastly expanded applications for this technology, such as myriad “chat” agents adept at understanding natural human language. While state of the art generative models can parse text queries from a user and provide comprehensive, accurate responses (including generating images depicting desired content), current implementations struggle with understanding all information present in images of documents, especially images of business documents. In particular, generative models fail to understand structured and semi-structured information, e.g., as indicated by graphical information such as lines, geometric relationships (e.g., indicated by tables, graphs, figures, etc.), formatting, and other contextual information that human readers easily and implicitly understand. The disclosed inventive concepts transform structured and semi-structured information along with textual content into a textual representation that allows generative models to better understand textual content and non-textual structured information present in document images.
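
As a rough illustration of the idea in the abstract, and not the method claimed in the patent, the sketch below serializes positioned words from a document image into pipe-delimited text so that row and column relationships survive as plain text. Everything here is an assumption made for illustration: the `Word` type, the `words_to_text` function, the bounding-box fields, and the `row_tol` and `col_gap` thresholds are hypothetical, and a real system would also need to handle ruling lines, merged cells, and multi-line cells.

```python
from dataclasses import dataclass


@dataclass
class Word:
    """Hypothetical OCR output: one word plus a simplified bounding box."""
    text: str
    x0: float  # left edge
    x1: float  # right edge
    y: float   # top edge


def words_to_text(words, row_tol=5.0, col_gap=20.0):
    """Serialize positioned words into pipe-delimited rows.

    Words whose top edges lie within row_tol of the first word in a row
    share that row. Within a row, a horizontal gap wider than col_gap
    starts a new cell. Both thresholds are illustrative assumptions.
    """
    rows = []
    for word in sorted(words, key=lambda w: (w.y, w.x0)):
        if rows and abs(word.y - rows[-1][0].y) <= row_tol:
            rows[-1].append(word)
        else:
            rows.append([word])

    lines = []
    for row in rows:
        row.sort(key=lambda w: w.x0)  # left-to-right reading order
        cells = [[row[0]]]
        for prev, cur in zip(row, row[1:]):
            if cur.x0 - prev.x1 > col_gap:
                cells.append([cur])      # wide gap: next column
            else:
                cells[-1].append(cur)    # narrow gap: same cell
        lines.append(" | ".join(" ".join(w.text for w in cell) for cell in cells))
    return "\n".join(lines)


sample = [
    Word("Item", 10, 40, 100),
    Word("Qty", 200, 230, 100),
    Word("Blue", 10, 40, 120),
    Word("widget", 45, 90, 121),
    Word("3", 200, 210, 120),
]
table_text = words_to_text(sample)
print(table_text)
```

The resulting string keeps each table row on its own line with cells separated by `|`, a form that a text-only generative model can read without access to the original layout.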

Source: USPTO / EPO open patent data (bibliographic data and citation counts).