Patent · US Active

Machine learning based information extraction

US12333838B2 · kind B2 · utility

Cited by: 0
References: 0
Claims: 20
Family size: 0

Assignee

Inventors

Key dates

Filing date: Aug 17, 2022
Grant date: Jun 17, 2025
Priority date
Expiry date: Jan 16, 2044

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06V30/10
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

Computer-readable media, methods, and systems are disclosed for applying machine learning mechanisms to classify and validate documents based on expense rule sets and external data validation services. Document images associated with expenses are received in connection with a reimbursable event. For each received document image, data associated with the image is transmitted to an optical character recognition (OCR) image processor that can recognize contents and associated coordinates. OCR data is received and transmitted to a text tokenizer. Tokenized text corresponding to expense details is received, and the tokenized text and coordinates are sent to a text feature generator. Text feature vectors are received and transmitted to a document classifier, and a document classification is received. Document fields are extracted; based thereon, the document is validated and a corresponding reimbursement instruction is generated.
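
The sequence of stages named in the abstract (OCR output, tokenizer, feature generator, classifier, field extraction, validation, reimbursement instruction) can be illustrated with a toy sketch. This is not the patented implementation: every name below (OcrWord, KEYWORDS, process, the "lodging" and "transport" classes, the per-class limits) is hypothetical, the keyword-count classifier stands in for whatever machine learning model the patent uses, and the OCR step and external validation services are assumed to have run elsewhere. The sketch also assumes that y coordinates grow downward, so the amount lowest on the page is treated as the total.

```python
import re
from dataclasses import dataclass
from typing import Dict, List, Tuple

Token = Tuple[str, float, float]  # (text, x, y)


@dataclass
class OcrWord:
    """One recognized word plus its coordinates (hypothetical OCR output)."""
    text: str
    x: float
    y: float


def tokenize(words: List[OcrWord]) -> List[Token]:
    """Lowercase each non-empty OCR word and keep its coordinates."""
    return [(w.text.lower(), w.x, w.y) for w in words if w.text.strip()]


# Hypothetical keyword-to-class map standing in for learned features.
KEYWORDS = {"hotel": "lodging", "room": "lodging",
            "taxi": "transport", "fare": "transport"}


def featurize(tokens: List[Token]) -> Dict[str, int]:
    """Toy feature vector: keyword hit count per document class."""
    counts = {"lodging": 0, "transport": 0}
    for text, _x, _y in tokens:
        label = KEYWORDS.get(text)
        if label is not None:
            counts[label] += 1
    return counts


def classify(features: Dict[str, int]) -> str:
    """Stand-in classifier: the class with the most hits, else 'unknown'."""
    label, hits = max(features.items(), key=lambda kv: kv[1])
    return label if hits > 0 else "unknown"


AMOUNT_RE = re.compile(r"^\d+\.\d{2}$")


def extract_fields(tokens: List[Token]) -> Dict[str, float]:
    """Use the amount-like token with the largest y as the total (assumption)."""
    amounts = [(y, float(text)) for text, _x, y in tokens if AMOUNT_RE.match(text)]
    return {"total": max(amounts)[1]} if amounts else {}


def validate(doc_class: str, fields: Dict[str, float],
             limits: Dict[str, float]) -> bool:
    """Toy expense rule: known class, a total was found, total within limit."""
    return (doc_class in limits and "total" in fields
            and fields["total"] <= limits[doc_class])


def process(words: List[OcrWord], limits: Dict[str, float]) -> dict:
    """Run the stages in the order the abstract lists them."""
    tokens = tokenize(words)
    doc_class = classify(featurize(tokens))
    fields = extract_fields(tokens)
    ok = validate(doc_class, fields, limits)
    return {"doc_class": doc_class, "fields": fields,
            "action": "reimburse" if ok else "review"}
```

Calling process with the words of a hotel receipt and a limits mapping such as {"lodging": 200.0} returns a dictionary holding the predicted class, the extracted total, and either "reimburse" or "review". In this sketch the coordinates only influence field extraction; the abstract indicates that the real feature generator also receives them.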

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.