Contextual tagger utilizing deterministic finite state transducer
US5610812A · kind A · utility
Assignee
Inventors
Key dates
| Filing date | Jun 24, 1994 |
| Grant date | Mar 11, 1997 |
| Priority date | — |
| Expiry date | Jun 24, 2014 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G06F40/268
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
A system for assigning part-of-speech tags to English text includes an improved contextual tagger which utilizes a deterministic finite-state transducer to improve tagging speed, such that large documents can have their sentences accurately tagged as to parts of speech to permit fast grammar checking, spell checking, information retrieval, text indexing and optical character recognition. The subject system operates by first acquiring a set of rules by examining a training corpus of tagged text. Then, these rules are transformed into a deterministic finite-state transducer through the utilization of non-deterministic transducers, a composer and a determiniser. In order to tag an input sentence, the sentence is initially tagged by assigning each word in the sentence its most likely part-of-speech tag regardless of the surrounding words in the sentence. The deterministic finite-state transducer is then applied to the resulting sequence of part-of-speech tags using the surrounding words and obtains the final part-of-speech tags. The subject system requires an amount of time to compute the part-of-speech tags which is proportional to the number of words in the input sentence an…
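The two tagging stages described in the abstract can be illustrated with a minimal Python sketch. It is not the patented construction: the lexicon, the tag names (`np`, `vbn`, `vbd`, `bedz`, `nn`), the single contextual rule ("change `vbn` to `vbd` when the previous tag is `np`") and all function names are hypothetical, and the transducer is written by hand rather than derived from learned rules through a composer and determiniser. As a further simplification, it conditions only on neighbouring tags, not on surrounding words. What it does show is a deterministic transducer reading the initial tag sequence once, left to right, so the work grows in proportion to the number of words.

```python
from typing import Dict, List, Tuple

# Hypothetical toy lexicon: each word maps to its single most likely tag.
LEXICON: Dict[str, str] = {
    "Chapman": "np",
    "killed": "vbn",
    "John": "np",
    "Lennon": "np",
    "was": "bedz",
}
DEFAULT_TAG = "nn"  # assumed fallback for unknown words

# States of a hand-built deterministic transducer for one illustrative rule:
# "change vbn to vbd if the previous tag is np".
START, AFTER_NP = 0, 1


def initial_tags(words: List[str]) -> List[str]:
    """Stage 1: assign each word its most likely tag, ignoring context."""
    return [LEXICON.get(word, DEFAULT_TAG) for word in words]


def step(state: int, tag: str) -> Tuple[int, str]:
    """One deterministic transition: (state, input tag) -> (next state, output tag)."""
    out = "vbd" if (state == AFTER_NP and tag == "vbn") else tag
    next_state = AFTER_NP if out == "np" else START
    return next_state, out


def apply_transducer(tags: List[str]) -> List[str]:
    """Stage 2: a single left-to-right pass, one transition per tag."""
    state = START
    result: List[str] = []
    for tag in tags:
        state, out = step(state, tag)
        result.append(out)
    return result


def tag_sentence(words: List[str]) -> List[Tuple[str, str]]:
    """Run both stages and pair each word with its final tag."""
    return list(zip(words, apply_transducer(initial_tags(words))))


if __name__ == "__main__":
    print(tag_sentence(["Chapman", "killed", "John", "Lennon"]))
```

Because each input tag triggers exactly one transition, the pass over a sentence of n words performs n steps regardless of how many rules were folded into the transition function.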
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.