Patent · US · Expired

Automatic segmentation of continuous text using statistical approaches

US5806021A · kind A · utility

Cited by: 131
References: 9
Claims: 9
Family size: 0

Assignee

Inventors

Key dates

Filing date: Sep 4, 1996
Grant date: Sep 8, 1998
Priority date
Expiry date: Sep 4, 2016

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06F40/284
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

An automatic segmenter divides continuous text into segments rapidly, consistently, and with semantic accuracy. It uses two statistical methods. The first, called "forward-backward matching", is simple and fast but can occasionally produce errors in long phrases. The second, called "statistical stack search segmenter", uses statistical language models to produce more accurate segmentation, at the cost of twice the execution time of "forward-backward matching". Where speed is the main concern, "forward-backward matching" can be used; where highly accurate output is required, "statistical stack search segmenter" is the better choice.
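
The abstract names the two methods without giving their details, so the following Python sketch is only an illustration of the general idea, not the patented procedure. It assumes that "forward-backward matching" can be approximated by greedy longest-match passes from each end of the text against a word list, and that the "statistical stack search segmenter" can be approximated by a best-first search over partial segmentations scored with a unigram language model. The function names, the tie-break rule, the unknown-character penalty, and the toy vocabulary and probabilities are all hypothetical.

```python
import heapq
import math


def forward_match(text, vocab, max_len):
    """Greedy left-to-right longest match; unknown characters become single tokens."""
    out, i = [], 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in vocab:
                out.append(piece)
                i += size
                break
    return out


def backward_match(text, vocab, max_len):
    """Greedy right-to-left longest match, returned in reading order."""
    out, j = [], len(text)
    while j > 0:
        for size in range(min(max_len, j), 0, -1):
            piece = text[j - size:j]
            if size == 1 or piece in vocab:
                out.append(piece)
                j -= size
                break
    out.reverse()
    return out


def forward_backward_match(text, vocab, max_len):
    """Run both passes and pick one (assumed rule: fewer tokens, ties go backward)."""
    fwd = forward_match(text, vocab, max_len)
    bwd = backward_match(text, vocab, max_len)
    return fwd if len(fwd) < len(bwd) else bwd


def stack_search(text, logprob, max_len, unknown=-20.0):
    """Best-first search over partial segmentations under a unigram model.

    The heap plays the role of the stack: entries are
    (accumulated negative log-probability, position, tokens so far).
    """
    heap = [(0.0, 0, [])]
    finished_positions = set()
    while heap:
        cost, pos, tokens = heapq.heappop(heap)
        if pos == len(text):
            return tokens  # first complete hypothesis popped is the cheapest
        if pos in finished_positions:
            continue  # a cheaper hypothesis already reached this position
        finished_positions.add(pos)
        for size in range(1, min(max_len, len(text) - pos) + 1):
            piece = text[pos:pos + size]
            if piece in logprob:
                step = -logprob[piece]
            elif size == 1:
                step = -unknown  # assumed penalty for an out-of-vocabulary character
            else:
                continue
            heapq.heappush(heap, (cost + step, pos + size, tokens + [piece]))
    return []


# Toy unigram probabilities (made up for illustration only).
PROBS = {"the": 0.3, "table": 0.1, "down": 0.1, "there": 0.1,
         "theta": 0.01, "bled": 0.01, "own": 0.05}
LOGPROB = {word: math.log(p) for word, p in PROBS.items()}

if __name__ == "__main__":
    sample = "thetabledownthere"
    print(forward_match(sample, PROBS.keys(), 5))
    print(forward_backward_match(sample, PROBS.keys(), 5))
    print(stack_search(sample, LOGPROB, 5))
```

On this sample the forward pass alone greedily takes "theta" and then cannot recover, which is the kind of error in longer phrases that the abstract mentions. The search-based variant explores many more hypotheses, which is consistent with the abstract's trade-off between speed and accuracy, although the factor of two reported there is a property of the patented implementation and not of this sketch.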

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.