Region adjacent subgraph isomorphism for layout clustering in document images
US11256760B1 · kind B1 · utility
Assignee
Inventors
Key dates
| Filing date | Sep 28, 2018 |
| Grant date | Feb 22, 2022 |
| Priority date | — |
| Expiry date | Sep 6, 2039 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G06F16/9024
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
A computer system and computerized method group together documents with similar image layouts. A document similarity metric based on locally connected subgraphs is employed. Region adjacency graphs are generated from word segments extracted from document images. Fuzzy attributed graph isomorphism is performed on subgraphs, checking node and edge attribute similarity. Document similarity is then calculated as a normalized score over the matching subgraphs of different documents. Unsupervised clustering of document layouts is performed to generate clusters of documents with similar structure.
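The pipeline in the abstract can be illustrated with a minimal Python sketch. It is not the patented method. The abstract does not specify the details, so everything concrete here is an assumption made for illustration: the `Word` box type, a k-nearest-neighbour adjacency rule, star-shaped local subgraphs, a fixed tolerance `tol` for fuzzy attribute matching, a greedy one-to-one matcher standing in for full subgraph isomorphism, and single-link threshold clustering as the unsupervised step.

```python
from dataclasses import dataclass
from itertools import combinations
from math import hypot


@dataclass(frozen=True)
class Word:
    """Hypothetical word segment: bounding box normalized to page size (0..1)."""
    x: float
    y: float
    w: float
    h: float


def star_subgraphs(words, k=2):
    """Local subgraphs of an (assumed) k-nearest-neighbour adjacency graph.

    Each subgraph is (center word, list of (dx, dy) offsets to its neighbours).
    """
    stars = []
    for i, c in enumerate(words):
        others = [o for j, o in enumerate(words) if j != i]
        others.sort(key=lambda o: hypot(o.x - c.x, o.y - c.y))
        stars.append((c, [(o.x - c.x, o.y - c.y) for o in others[:k]]))
    return stars


def fuzzy_match(a, b, tol=0.05):
    """True if node attributes and all edge offsets agree within tol."""
    (ca, ea), (cb, eb) = a, b
    node_a = (ca.x, ca.y, ca.w, ca.h)
    node_b = (cb.x, cb.y, cb.w, cb.h)
    if any(abs(p - q) > tol for p, q in zip(node_a, node_b)):
        return False
    unused = list(eb)
    for dx, dy in ea:  # greedy one-to-one pairing of edges
        hit = next((e for e in unused
                    if abs(e[0] - dx) <= tol and abs(e[1] - dy) <= tol), None)
        if hit is None:
            return False
        unused.remove(hit)
    return not unused


def layout_similarity(doc_a, doc_b, k=2, tol=0.05):
    """Matched subgraphs divided by the larger subgraph count (range 0..1)."""
    sa, sb = star_subgraphs(doc_a, k), star_subgraphs(doc_b, k)
    if not sa or not sb:
        return 0.0
    unused = list(range(len(sb)))
    matched = 0
    for s in sa:
        for idx in unused:
            if fuzzy_match(s, sb[idx], tol):
                unused.remove(idx)
                matched += 1
                break
    return matched / max(len(sa), len(sb))


def cluster_layouts(docs, threshold=0.6, **kw):
    """Single-link clustering: union documents whose similarity >= threshold."""
    parent = list(range(len(docs)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(docs)), 2):
        if layout_similarity(docs[i], docs[j], **kw) >= threshold:
            parent[find(j)] = find(i)
    groups = {}
    for i in range(len(docs)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Made-up example data: two near-identical invoice layouts and one letter.
invoice_a = [Word(0.10, 0.10, 0.20, 0.03), Word(0.70, 0.10, 0.20, 0.03),
             Word(0.10, 0.80, 0.30, 0.03)]
invoice_b = [Word(0.11, 0.10, 0.20, 0.03), Word(0.70, 0.11, 0.21, 0.03),
             Word(0.10, 0.79, 0.30, 0.03)]
letter = [Word(0.40, 0.05, 0.20, 0.03), Word(0.10, 0.30, 0.80, 0.03),
          Word(0.10, 0.35, 0.80, 0.03)]

print(cluster_layouts([invoice_a, invoice_b, letter]))
```

With this toy data the two invoice layouts end up in one cluster and the letter in another. A fuller implementation would likely use a real attributed subgraph matcher and a tuned clustering algorithm. The greedy pairing above is only a stand-in chosen to keep the example short.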
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.