Patent · US Active

Search-based word segmentation method and device for language without word boundary tag

US8131539B2 · kind B2 · utility

Cited by: 9
References: 3
Claims: 19
Family size: 0

Assignee

Inventors

Key dates

Filing date: Mar 7, 2008
Grant date: Mar 6, 2012
Priority date
Expiry date: Jan 5, 2031

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06F40/53
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

The present invention discloses a search-based word segmentation method and device for a language without a word boundary tag. The inventive method includes the steps of: a. providing at least one search engine with a segment of a text, the text including at least one segment; b. searching for the segment through the at least one search engine, and returning search results; and c. selecting a word segmentation approach for the segment in accordance with at least part of the returned search results. The invention addresses the problem of word segmentation for a language without a word boundary tag, and thereby overcomes limitations of the prior art in flexibility, dependence on dictionary coverage, availability of training corpora, and handling of new words.
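
Steps a to c can be illustrated with a minimal Python sketch. It is not the patented implementation: the abstract does not say how the search results are used to select a segmentation, so majority voting over the fragments an engine highlights in its hits is an assumption made here for illustration. The names segment_by_search, SearchFn and mock_engine are hypothetical, and the canned results stand in for a real search engine.

```python
from collections import Counter
from typing import Callable, List, Sequence

# Hypothetical interface: a search function takes the query segment and returns,
# for each hit, the fragments of the query that the engine highlighted in it.
SearchFn = Callable[[str], List[List[str]]]


def segment_by_search(segment: str, engines: Sequence[SearchFn]) -> List[str]:
    """Pick a segmentation of `segment` by voting over search results.

    Majority voting is an assumption of this sketch, not taken from the patent.
    """
    votes: Counter = Counter()
    for search in engines:                     # step a: give the segment to each engine
        for fragments in search(segment):      # step b: read the returned results
            # Keep only hits whose highlighted fragments cover the whole segment.
            if "".join(fragments) == segment:
                votes[tuple(fragments)] += 1
    if not votes:
        return [segment]                       # no usable evidence: leave unsegmented
    best, _count = votes.most_common(1)[0]     # step c: select the most frequent split
    return list(best)


def mock_engine(query: str) -> List[List[str]]:
    """Stand-in for a real search engine; the results below are invented."""
    canned = {
        "我喜欢北京": [
            ["我", "喜欢", "北京"],
            ["我", "喜欢", "北京"],
            ["我喜", "欢", "北京"],
            ["北京"],                           # partial match, ignored by the vote
        ]
    }
    return canned.get(query, [])


if __name__ == "__main__":
    print(segment_by_search("我喜欢北京", [mock_engine]))
```

Because the evidence comes from search results rather than a fixed dictionary or a trained model, a sketch like this needs no lexicon of its own, which is the property the abstract emphasizes for new words.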

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.