Patent · US Active

Machine translation method for PDF file

US8108202B2 · kind B2 · utility

Cited by: 5
References: 5
Claims: 13
Family size: 0

Assignee

Inventors

Key dates

Filing date: May 29, 2008
Grant date: Jan 31, 2012
Priority date
Expiry date: Nov 7, 2030

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06F40/143
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

Disclosed is a machine translation method for a PDF file. A machine translation device extracts source language text and non-text content from the input source language PDF file through image transformation, and corrects the extracted text by using the source language text extracted from the file's text information. It restores any part of the extracted text that is contextually separated by non-text content, then generates a source language XML/HTML file by rearranging the extracted text and non-text content so that the result follows the contextual flow of the source language PDF file. The device separates the source language text from the tags of the source language XML/HTML file and generates target language text by using translation knowledge and a transformation engine specialized for the technical field of the source language PDF file. Finally, it inserts the translated target language text into the XML/HTML file in place of the source language text and transforms the resulting target language XML/HTML file into a target language PDF file for output.
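
The tag-separation and reinsertion steps described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the function names `translate_markup` and `toy_translate`, the regular-expression tag splitter, and the two-word glossary are all assumptions made for illustration, and the glossary lookup merely stands in for the domain-specific translation engine.

```python
import re
from typing import Callable

# Matches a single XML/HTML tag; the capture group makes re.split keep the tags.
TAG_PATTERN = re.compile(r"(<[^>]+>)")


def translate_markup(markup: str, translate: Callable[[str], str]) -> str:
    """Translate only the text between tags, leaving every tag untouched."""
    parts = TAG_PATTERN.split(markup)
    out = []
    for part in parts:
        # Tags and whitespace-only segments pass through unchanged.
        if TAG_PATTERN.fullmatch(part) or not part.strip():
            out.append(part)
        else:
            out.append(translate(part))
    return "".join(out)


def toy_translate(text: str) -> str:
    """Hypothetical stand-in for a translation engine: word-by-word lookup."""
    glossary = {"Hello": "Bonjour", "world": "monde"}
    # Splitting on a single space preserves leading and trailing spaces.
    return " ".join(glossary.get(word, word) for word in text.split(" "))


source = '<p class="body">Hello <b>world</b></p>'
result = translate_markup(source, toy_translate)
print(result)
```

Because the tags are passed through verbatim, the layout information carried by the markup survives translation, which is what allows the translated XML/HTML file to be converted back into a PDF with the original arrangement. A regular expression is adequate only for simple, well-formed markup; a real pipeline would more plausibly use a proper XML/HTML parser.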

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.