Document classification of files on the client side before upload
US11948383B2 · kind B2 · utility
Assignee
Inventors
Key dates
| Filing date | Apr 6, 2021 |
| Grant date | Apr 2, 2024 |
| Priority date | — |
| Expiry date | Nov 14, 2041 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G06N20/00
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
A method for classifying a document in real-time is disclosed. The method includes identifying one or more sections of the document likely to contain text based on a contrast between dark space and light space in an image of the document. Optical character recognition is performed within the identified sections of the document to identify a set of words within each identified section of the document. The sets of words are extracted from the identified sections of the document, and a subset of the sets of words is selected for classifying the document based on a preconfigured option. The document is then classified by inputting the selected subset of words into one or more machine learning models. The method includes transmitting the document and the determined classification of the document to an external server.
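The steps in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patented implementation. The row-projection heuristic for finding text bands, the `ocr` callable, the option names `first_section` and `all`, and the keyword-overlap scorer standing in for the machine learning models are all hypothetical choices made here. The final step, transmitting the document and its classification to an external server, is left out.

```python
from typing import Callable, Dict, List, Sequence, Set, Tuple

Image = Sequence[Sequence[int]]   # grayscale rows: 0 = black, 255 = white
Band = Tuple[int, int]            # (start_row, end_row_exclusive)


def find_text_bands(image: Image, dark_below: int = 128,
                    min_dark_fraction: float = 0.05) -> List[Band]:
    """Group consecutive rows whose share of dark pixels suggests text.

    Assumption: a simple horizontal projection profile stands in for the
    dark-space / light-space contrast test described in the abstract.
    """
    bands: List[Band] = []
    start = None
    for y, row in enumerate(image):
        dark = sum(1 for pixel in row if pixel < dark_below)
        is_text = len(row) > 0 and dark / len(row) >= min_dark_fraction
        if is_text and start is None:
            start = y                      # a new band begins
        elif not is_text and start is not None:
            bands.append((start, y))       # the current band ends
            start = None
    if start is not None:
        bands.append((start, len(image)))  # band runs to the last row
    return bands


def select_words(word_sets: List[List[str]], option: str) -> List[str]:
    """Pick a subset of the extracted word sets per a preconfigured option.

    The option names are hypothetical; the abstract does not list them.
    """
    if option == "first_section":
        chosen = word_sets[:1]
    elif option == "all":
        chosen = word_sets
    else:
        raise ValueError(f"unknown option: {option}")
    return [word.lower() for words in chosen for word in words]


def classify(words: List[str], keyword_model: Dict[str, Set[str]]) -> str:
    """Stand-in for the ML model: the label with most keyword hits wins."""
    scores = {label: sum(word in vocab for word in words)
              for label, vocab in keyword_model.items()}
    if not scores:
        return "unknown"
    best = max(scores, key=lambda label: scores[label])
    return best if scores[best] > 0 else "unknown"


def classify_document(image: Image,
                      ocr: Callable[[Image, Band], List[str]],
                      option: str,
                      keyword_model: Dict[str, Set[str]]) -> str:
    """Run region detection, OCR, word selection, and classification."""
    bands = find_text_bands(image)
    word_sets = [ocr(image, band) for band in bands]
    return classify(select_words(word_sets, option), keyword_model)


# Toy usage: a 6x4 "image" and a fake OCR lookup keyed by band.
WHITE, BLACK = 255, 0
demo_image = [
    [WHITE, WHITE, WHITE, WHITE],
    [BLACK, WHITE, BLACK, WHITE],
    [WHITE, WHITE, WHITE, WHITE],
    [WHITE, WHITE, WHITE, WHITE],
    [BLACK, BLACK, WHITE, WHITE],
    [BLACK, WHITE, WHITE, WHITE],
]
fake_ocr_results = {(1, 2): ["INVOICE", "Total"], (4, 6): ["Dear", "customer"]}
demo_model = {"invoice": {"invoice", "total", "due"},
              "letter": {"dear", "sincerely"}}


def fake_ocr(image: Image, band: Band) -> List[str]:
    # A real system would call an OCR engine on the cropped band here.
    return fake_ocr_results.get(band, [])


label = classify_document(demo_image, fake_ocr, "first_section", demo_model)
```

Running OCR only inside the detected bands, and classifying on a configured subset of the words, keeps the work small enough to run on the client before the upload, which is the point of the method's "real-time" framing.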
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.