Patent · US Active

Searching multilingual documents based on document structure extraction

US10691734B2 · kind B2 · utility

Cited by: 2
References: 6
Claims: 13
Family size: 0

Assignee

Inventors

Key dates

Filing date: Nov 21, 2017
Grant date: Jun 23, 2020
Priority date
Expiry date: Aug 5, 2038

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06F40/58
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

An approach is provided for searching multilingual documents. Structure components are extracted from multilingual documents. Based on the extracted components, the documents are grouped into classifications including respective sets of documents expressed in different respective natural languages. A natural language in a query is detected. One of the documents is selected based on the document having content indicated by the query and the natural language of the document matching the detected natural language. Structure components of the selected document are extracted. Based on the extracted structure components of the selected document, one of the classifications is identified as including the selected document. Other document(s) in the classification are identified and presented as having content that matches the content of the selected document. The natural language(s) of the other document(s) are each different from the natural language of the selected document.
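
The flow described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. Everything named here is an assumption made for illustration: the `Document` fields, the use of a tuple of tags as the "structure components", exact-match grouping on that tuple, the stopword-overlap language detector, and the term-overlap content match. The abstract does not specify any of these details.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Document:
    doc_id: str
    language: str      # language code such as "en" or "de" (assumed field)
    structure: tuple   # hypothetical structure components, e.g. ("title", "steps")
    text: str


def extract_structure(doc):
    # Stand-in for structure extraction: simply returns the precomputed tuple.
    return doc.structure


def group_by_structure(docs):
    # Documents with identical structure components share one classification,
    # so a classification can hold documents in several languages.
    groups = defaultdict(list)
    for doc in docs:
        groups[extract_structure(doc)].append(doc)
    return groups


def detect_language(query, stopwords):
    # Toy detector: pick the language whose stopword set overlaps the query most.
    tokens = set(query.lower().split())
    return max(stopwords, key=lambda lang: len(tokens & stopwords[lang]))


def search(query, docs, stopwords):
    groups = group_by_structure(docs)
    lang = detect_language(query, stopwords)
    terms = set(query.lower().split())

    def overlap(doc):
        return len(terms & set(doc.text.lower().split()))

    # Select a document that matches the query content and the query language.
    candidates = [d for d in docs if d.language == lang and overlap(d) > 0]
    if not candidates:
        return None, []
    selected = max(candidates, key=overlap)

    # Identify the selected document's classification, then keep only the
    # members written in a different language.
    classification = groups[extract_structure(selected)]
    others = [d for d in classification if d.language != selected.language]
    return selected, others


DOCS = [
    Document("manual-en", "en", ("title", "steps", "table"),
             "how to install the printer driver"),
    Document("manual-de", "de", ("title", "steps", "table"),
             "wie man den druckertreiber installiert"),
    Document("faq-en", "en", ("title", "qa"), "the warranty questions"),
    Document("manual-fr", "fr", ("title", "steps", "table"),
             "comment installer le pilote"),
]
STOPWORDS = {
    "en": {"the", "to", "how"},
    "de": {"der", "den", "wie", "man"},
    "fr": {"le", "la", "comment"},
}

selected, others = search("how to install the driver", DOCS, STOPWORDS)
```

In this sketch the English query selects `manual-en`, and the German and French manuals are returned as counterparts because they share its structure tuple. A real system would likely need fuzzy structure matching rather than exact tuple equality, since translated documents rarely have identical layouts.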

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.