Patent · US Active

Determining word boundary likelihoods in potentially incomplete text

US8364709B1 · kind B1 · utility

Cited by: 228
References: 2
Claims: 26
Family size: 0

Assignee

Inventors

Key dates

Filing date: Nov 22, 2010
Grant date: Jan 29, 2013
Priority date
Expiry date: Jul 1, 2031

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06F16/90335
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for determining word boundary likelihoods in potentially incomplete text. In one aspect, a method includes selecting query sequences from the query, each query sequence being at least a portion of a word n-gram, the word n-gram being a subsequence of up to n words selected from the second sequence of words of the query, and for each query sequence: determining one or more query sequence keys for the query sequence; determining at least one of a word boundary count and a non-word boundary count for each query sequence key, each word-boundary count and non-word boundary count being dependent on the context of the query sequence; and associating, in a data storage device, the at least one word boundary count and non-word boundary counts with each query sequence key.
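The steps named in the abstract (select query sequences from word n-grams, derive a key for each, count word-boundary and non-word-boundary occurrences by context, and store the counts per key) can be illustrated with a minimal Python sketch. This is not the patented implementation. It assumes, purely for illustration, that a query sequence is a character prefix of a word n-gram, that the key is the prefix string itself, and that "context" means whether the prefix is followed by a space or the end of the n-gram (word boundary) or by another character (non-word boundary). The names new_counts, word_ngrams, update_boundary_counts, and boundary_likelihood are hypothetical.

```python
from collections import defaultdict


def new_counts():
    """In-memory stand-in for the data storage device (assumption)."""
    return defaultdict(lambda: {"boundary": 0, "non_boundary": 0})


def word_ngrams(words, n):
    """Yield every contiguous subsequence of up to n words."""
    for start in range(len(words)):
        for length in range(1, n + 1):
            if start + length <= len(words):
                yield words[start:start + length]


def update_boundary_counts(query, counts, n=2):
    """Update per-key boundary counts from one logged query."""
    words = query.split()
    for gram in word_ngrams(words, n):
        text = " ".join(gram)
        for end in range(1, len(text) + 1):
            if text[end - 1] == " ":
                continue  # skip prefixes that end on the separator itself
            key = text[:end]  # assumption: the key is the raw prefix
            # Context check: what follows the prefix inside this n-gram?
            at_boundary = end == len(text) or text[end] == " "
            counts[key]["boundary" if at_boundary else "non_boundary"] += 1
    return counts


def boundary_likelihood(counts, key):
    """Fraction of observations in which the key ended at a word boundary."""
    entry = counts.get(key)
    if entry is None:
        return None  # key never observed
    total = entry["boundary"] + entry["non_boundary"]
    return entry["boundary"] / total if total else None


counts = new_counts()
for logged_query in ["new york", "news"]:
    update_boundary_counts(logged_query, counts)
print(boundary_likelihood(counts, "new"))
```

In this sketch overlapping n-grams each contribute their own observation, so a prefix such as "new" is counted once for the unigram and once for the bigram that contains it. A real system would likely normalize or deduplicate these counts; that choice is left open here.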

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.