Patent · US · Expired

Method and apparatus for identifying words described in a portable electronic document

US5832530A · kind A · utility

Cited by: 142
References: 27
Claims: 22
Family size: 0

Assignee

Inventors

Key dates

Filing date: Jun 27, 1997
Grant date: Nov 3, 1998
Priority date
Expiry date: Jun 27, 2017

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06V30/414
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

A method and apparatus for identifying words stored in a portable electronic document. A digital computation apparatus stores a page of a document including characters in text segments that have not been identified as words. A word identifying mechanism analyzes the text segments of the page and stores the text segments as text objects in a linked list. The word identifying mechanism identifies words from the text objects in the linked list by analyzing the text objects for word breaks and by analyzing gaps between text objects using position data associated with the text segments. The identified words are stored in a word list and are sorted if necessary. A method of the present invention receives a text segment from a page of a document having multiple text segments and associated position data, including x and y coordinates for each text segment. A text object is created for each text segment, and the text objects are entered into a linked list. Words are then identified from the linked list by analyzing the text objects for word breaks and by analyzing gaps between text objects using the associated position data. Words that are identified in the text objects are added to a word…
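The general idea in the abstract (text segments with x and y coordinates, text objects in a linked list, words found from word breaks and from gaps between objects) can be illustrated with a minimal sketch. This is not the patented implementation. The names TextObject, build_linked_list and identify_words, the width field, and the gap_threshold and line_tolerance values are all assumptions made for illustration; the abstract does not specify how gaps are measured or which thresholds apply, and the sorting step is omitted.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Tuple


@dataclass
class TextObject:
    """One text segment plus its position data (hypothetical layout)."""
    text: str
    x: float          # left edge of the segment
    y: float          # baseline of the segment
    width: float      # rendered width; assumed to be known per segment
    next: Optional["TextObject"] = None


def build_linked_list(
    segments: Iterable[Tuple[str, float, float, float]]
) -> Optional[TextObject]:
    """Create a text object per segment and chain them in input order."""
    head: Optional[TextObject] = None
    tail: Optional[TextObject] = None
    for text, x, y, width in segments:
        node = TextObject(text, x, y, width)
        if tail is None:
            head = node
        else:
            tail.next = node
        tail = node
    return head


def identify_words(
    head: Optional[TextObject],
    gap_threshold: float = 2.0,    # assumed value, not from the patent
    line_tolerance: float = 1.0,   # assumed value, not from the patent
) -> List[str]:
    """Walk the list, splitting on whitespace and on large positional gaps."""
    words: List[str] = []
    current = ""
    prev: Optional[TextObject] = None
    node = head
    while node is not None:
        if prev is not None:
            same_line = abs(node.y - prev.y) <= line_tolerance
            gap = node.x - (prev.x + prev.width)
            # A new line or a wide gap ends the word in progress.
            if (not same_line or gap > gap_threshold) and current:
                words.append(current)
                current = ""
        for ch in node.text:
            if ch.isspace():
                # A whitespace character inside a segment is a word break.
                if current:
                    words.append(current)
                    current = ""
            else:
                current += ch
        prev = node
        node = node.next
    if current:
        words.append(current)
    return words


segments = [
    ("Hel", 0.0, 0.0, 15.0),
    ("lo wor", 15.0, 0.0, 30.0),
    ("ld", 45.5, 0.0, 10.0),
    ("again", 70.0, 0.0, 25.0),
    ("next", 0.0, 12.0, 20.0),
]
word_list = identify_words(build_linked_list(segments))
print(word_list)
```

In this sketch a word such as "Hello" is reassembled from two adjacent segments because the gap between them is below the threshold, while a wide gap or a change in y coordinate starts a new word even when no whitespace character is present.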

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.