Patent · US Active

Identifying parallel bilingual data over a network

US8249855B2 · kind B2 · utility

Cited by: 15
References: 11
Claims: 14
Family size: 0

Assignee

Inventors

Key dates

Filing date: Aug 7, 2006
Grant date: Aug 21, 2012
Priority date
Expiry date: Jan 14, 2027

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06F16/951
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

A set of candidate documents, each of which may be part of a bilingual, parallel set of documents, is identified. The set of documents illustratively includes textual material in a source language. It is then determined whether parallel text can be identified. For each document in the set, it is first determined whether the parallel text resides within the document itself. If not, the document is examined for links to other documents, and those linked documents are examined for bilingual parallelism with the selected document. If no parallel text is found among the linked documents, named entities are extracted from the document and translated into the target language. The translations are used to query search engines to retrieve the parallel correspondent of the selected document.
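
The three-stage fallback described in the abstract can be sketched as a simple cascade. This is an illustrative sketch only, not the patented implementation: the `Document` type, the `find_parallel` function, and every callable passed to it (`has_internal_parallel`, `fetch`, `is_parallel`, `extract_entities`, `translate`, `search`) are hypothetical names introduced here, and the abstract does not say how parallelism is actually verified.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable, List, Optional, Tuple


@dataclass
class Document:
    """Hypothetical container for a candidate document."""
    url: str
    text: str
    links: List[str] = field(default_factory=list)


def find_parallel(
    doc: Document,
    *,
    has_internal_parallel: Callable[[Document], bool],
    fetch: Callable[[str], Optional[Document]],
    is_parallel: Callable[[Document, Document], bool],
    extract_entities: Callable[[Document], Iterable[str]],
    translate: Callable[[str], Optional[str]],
    search: Callable[[List[str]], Iterable[Document]],
) -> Optional[Tuple[str, Document]]:
    """Return (stage_name, document) for the first stage that succeeds.

    All callables are assumptions supplied by the caller; they stand in
    for components the abstract names but does not specify.
    """
    # Stage 1: the parallel text may reside within the document itself.
    if has_internal_parallel(doc):
        return ("within_document", doc)

    # Stage 2: follow the document's links and test each linked document.
    for url in doc.links:
        linked = fetch(url)
        if linked is not None and is_parallel(doc, linked):
            return ("linked_document", linked)

    # Stage 3: translate named entities into the target language and use
    # the translations as search-engine queries.
    queries = [q for q in (translate(e) for e in extract_entities(doc)) if q]
    if queries:
        for candidate in search(queries):
            if is_parallel(doc, candidate):
                return ("search_result", candidate)

    # No parallel correspondent was found by any stage.
    return None
```

Passing the stages in as callables keeps the sketch independent of any particular crawler, translation resource, or search engine, none of which the abstract identifies.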

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.