Patent · US · Expired

Extracting ordered list of words from documents comprising text and code fragments, without interpreting the code fragments

US6470362B1 · kind B1 · utility

Cited by: 6
References: 16
Claims: 21
Family size: 0

Assignee

Inventors

Key dates

Filing date: May 16, 1997
Grant date: Oct 22, 2002
Priority date
Expiry date: May 16, 2017

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06F40/284
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

A computer implemented method is applied to convert a formatted document or text to an ordered list of words. The formatted document is first partitioned into first and second data structures stored in a memory of a computer. The first data structure stores text fragments, and the second data structure stores code fragments of the formatted document. Adjacent text fragments are concatenated to form possible ordered word lists. Possible words are matched against a dictionary of representative words. A best ordered word list having the fewest number of words is selected from the possible ordered word lists.
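
The steps in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. It assumes the document has already been tokenized into ("text", ...) and ("code", ...) pairs, that each gap between adjacent text fragments is either a word break or no break, and that a candidate list counts only if every word is in the dictionary. The function names, the tagged-tuple input format, the brute-force enumeration, and the fallback are all hypothetical choices made for illustration.

```python
from itertools import product


def partition(fragments):
    """Split tagged fragments into two ordered lists: text and code."""
    text = [body for kind, body in fragments if kind == "text"]
    code = [body for kind, body in fragments if kind == "code"]
    return text, code


def candidate_word_lists(text_fragments):
    """Yield every word list formed by joining or splitting adjacent fragments.

    Each gap between neighbours is either "" (same word) or " " (word break),
    so n fragments give 2**(n - 1) candidates. Fine for a sketch, too slow
    for long documents.
    """
    if not text_fragments:
        yield []
        return
    gaps = len(text_fragments) - 1
    for choice in product(("", " "), repeat=gaps):
        joined = text_fragments[0]
        for separator, fragment in zip(choice, text_fragments[1:]):
            joined += separator + fragment
        yield joined.split()


def best_word_list(fragments, dictionary):
    """Pick the dictionary-valid candidate with the fewest words.

    The code fragments are stored but never interpreted. If no candidate is
    fully covered by the dictionary, fall back to one word per fragment
    (an assumption of this sketch, not something the abstract specifies).
    """
    text, _code = partition(fragments)
    valid = [
        words
        for words in candidate_word_lists(text)
        if all(word.lower() in dictionary for word in words)
    ]
    if not valid:
        return " ".join(text).split()
    return min(valid, key=len)


# Hypothetical input: the code strings are opaque placeholders.
fragments = [
    ("text", "T"), ("code", "<kern>"),
    ("text", "he"), ("code", "<move>"),
    ("text", "quick"), ("code", "<move>"),
    ("text", "fox"),
]
print(best_word_list(fragments, {"the", "he", "quick", "fox"}))
```

Preferring the shortest valid list means that when both "data" + "base" and "database" are dictionary words, the fragments are merged into the single word "database", which matches the abstract's criterion of selecting the list with the fewest words.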

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.