Patent · US · Expired

Vision-based document segmentation

US7428700B2 · kind B2 · utility

Cited by	19
References	8
Claims	40
Family size	0

Assignee

Microsoft Corporation · US

Inventors

Ji-Rong Wen · Beijing, CN
Shipeng Yu · Sunnyvale, US
Deng Cai · Beijing, CN
Wei-Ying Ma · Redmond, US

Key dates

Filing date	Jul 28, 2003
Grant date	Sep 23, 2008
Priority date	—
Expiry date	Feb 2, 2024

Classification

Technology area (CPC G)	Physics
CPC primary	G06F40/143
WIPO field	Computer technology
WIPO sector	Electrical engineering

Abstract

Vision-based document segmentation identifies one or more portions of semantic content of a document. The one or more portions are identified by identifying a plurality of visual blocks in the document, and detecting one or more separators between the visual blocks of the plurality of visual blocks. A content structure for the document is constructed based at least in part on the plurality of visual blocks and the one or more separators, and the content structure identifies the one or more portions of semantic content of the document. The content structure obtained using the vision-based document segmentation can optionally be used during document retrieval.
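
The three steps named in the abstract (identify visual blocks, detect separators between them, construct a content structure) can be illustrated with a minimal sketch. This is not the patented method. It assumes blocks are already extracted, considers only vertical position, and treats the width of a blank gap as the sole measure of separator strength. The names Block, Node, find_separators, build_structure, and the sample page are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Block:
    """A visual block, reduced here to its vertical extent (assumption)."""
    name: str
    top: int
    bottom: int


@dataclass
class Node:
    """One node of the content structure: a group of blocks and its sub-groups."""
    blocks: list
    children: list = field(default_factory=list)


def find_separators(blocks):
    """Return (start, end) spans of blank space between vertically ordered blocks."""
    ordered = sorted(blocks, key=lambda b: b.top)
    separators = []
    reach = ordered[0].bottom  # lowest edge covered so far
    for b in ordered[1:]:
        if b.top > reach:
            separators.append((reach, b.top))
        reach = max(reach, b.bottom)
    return separators


def build_structure(blocks):
    """Recursively split the blocks at their widest separators."""
    node = Node(sorted(blocks, key=lambda b: b.top))
    separators = find_separators(node.blocks)
    if not separators:
        return node  # a single block or overlapping blocks: leaf node
    widest = max(end - start for start, end in separators)
    cuts = [s for s in separators if s[1] - s[0] == widest]
    groups = {}
    for b in node.blocks:
        # Group index = number of cuts that lie entirely above this block.
        index = sum(1 for _, end in cuts if end <= b.top)
        groups.setdefault(index, []).append(b)
    node.children = [build_structure(g) for g in groups.values()]
    return node


def outline(node, depth=0):
    """Render the content structure as indented lines of block names."""
    lines = ["  " * depth + ", ".join(b.name for b in node.blocks)]
    for child in node.children:
        lines.extend(outline(child, depth + 1))
    return lines


# Hypothetical page layout: four blocks with gaps of 2, 20, and 30 units.
page = [
    Block("header", 0, 10),
    Block("nav", 12, 20),
    Block("body", 40, 100),
    Block("footer", 130, 140),
]
root = build_structure(page)
print("\n".join(outline(root)))
```

In this sketch the widest gap splits the footer from the rest of the page first, then the next widest splits the body from the header and navigation, so wider separators end up higher in the resulting tree. A real implementation would also need horizontal separators and visual cues beyond whitespace, which the sketch deliberately omits.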

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.