Automatic segmentation of continuous text using statistical approaches
US5806021A · kind A · utility
Assignee
Inventors
Key dates
| Filing date | Sep 4, 1996 |
| Grant date | Sep 8, 1998 |
| Priority date | — |
| Expiry date | Sep 4, 2016 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G06F40/284
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
An automatic segmenter for continuous text segments such text in a rapid, consistent and semantically accurate manner. Two statistical methods for segmentation of continuous text are used. The first method, called "forward-backward matching", is easy and fast but can produce occasional errors in long phrases. The second method, called "statistical stack search segmenter", uses statistical language models to generate more accurate segmentation output, at the expense of two times more execution time than the "forward-backward matching" method. In applications where speed is a major concern, "forward-backward matching" can be used, while in applications where highly accurate output is desired, "statistical stack search segmenter" is ideal.
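The abstract names "forward-backward matching" without spelling out the procedure. As a rough illustration only, the sketch below shows a generic dictionary-based maximum-matching segmenter that scans the text in both directions and keeps one of the two results. It is not taken from the patent's claims: the function names, the `lexicon` set, the `max_len` limit, and the "fewer tokens wins" tie-break are all assumptions made for this example.

```python
def forward_match(text, lexicon, max_len=4):
    """Greedy longest match, scanning left to right (illustrative only)."""
    tokens, i = [], 0
    while i < len(text):
        # Try the longest candidate first; a single character always matches.
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in lexicon:
                tokens.append(piece)
                i += size
                break
    return tokens


def backward_match(text, lexicon, max_len=4):
    """Greedy longest match, scanning right to left (illustrative only)."""
    tokens, j = [], len(text)
    while j > 0:
        for size in range(min(max_len, j), 0, -1):
            piece = text[j - size:j]
            if size == 1 or piece in lexicon:
                tokens.append(piece)
                j -= size
                break
    tokens.reverse()  # tokens were collected from the end of the text
    return tokens


def forward_backward_match(text, lexicon, max_len=4):
    """Run both scans and keep one result.

    Assumed heuristic (not from the patent): prefer the segmentation
    with fewer tokens, and fall back to the backward scan on a tie.
    """
    fwd = forward_match(text, lexicon, max_len)
    bwd = backward_match(text, lexicon, max_len)
    return fwd if len(fwd) < len(bwd) else bwd


# Hypothetical toy lexicon; real systems use a large word list.
lexicon = {"the", "them", "men"}
print(forward_match("themen", lexicon))           # ['them', 'e', 'n']
print(backward_match("themen", lexicon))          # ['the', 'men']
print(forward_backward_match("themen", lexicon))  # ['the', 'men']
```

In this toy case the forward scan greedily takes "them" and strands the remaining characters, while the backward scan recovers "the" and "men". Greedy matching of this kind has no notion of how likely a word sequence is, which is consistent with the abstract's remark that the faster method can make occasional errors that a language-model-based search avoids.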
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.