Patent · US Active

Privacy-preserving text language identification using homomorphic encryption

US9288039B1 · kind B1 · utility

Cited by	36
References	4
Claims	18
Family size	0

Assignee

Xerox Corporation · US

Inventors

Nicolas Monet · Seongnam-si, KR
Johan Clier · Meylan, FR

Key dates

Filing date	Dec 1, 2014
Grant date	Mar 15, 2016
Priority date	—
Expiry date	Dec 1, 2034

Classification

Technology area (CPC H)	Electricity
CPC primary	H04L9/008
WIPO field	Digital communication
WIPO sector	Electrical engineering

Abstract

A system and method for text language identification allow private information of a server and a client to be kept secret from each other. An encrypted score for each of a plurality of languages is received by the server from the client. The encrypted scores are generated by homomorphic addition of encrypted frequencies of n-grams in a list of n-grams extracted from text. The unencrypted list is not provided to the server. The encrypted frequencies of the n-grams in the list are extracted using encrypted resources which, for each of the plurality of languages, include an encrypted frequency for each of a set of n-grams. At the server, the encrypted scores are decrypted to generate unencrypted scores and information is provided to the client based on the unencrypted scores from which the client is able to identify a language for the text.
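
The flow described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: it assumes a toy Paillier cryptosystem as the additively homomorphic scheme (the abstract does not name one), fixed and insecure key primes, character bigrams, integer frequencies, and a server that replies with the top-scoring language label. All function names and the `MODELS` table are hypothetical.

```python
import math
import random

# Toy Paillier cryptosystem (additively homomorphic). Assumption: the fixed
# Mersenne primes below are for illustration only and are NOT secure.
P, Q = 2**31 - 1, 2**61 - 1

def keygen():
    n = P * Q
    phi = (P - 1) * (Q - 1)
    return n, (phi, pow(phi, -1, n))  # public key n, private key (phi, mu)

def encrypt(n, m):
    while True:
        r = random.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    n2 = n * n
    return pow(n + 1, m, n2) * pow(r, n, n2) % n2

def decrypt(n, priv, c):
    phi, mu = priv
    return (pow(c, phi, n * n) - 1) // n * mu % n

def he_add(n, c1, c2):
    # Multiplying ciphertexts adds the underlying plaintexts.
    return c1 * c2 % (n * n)

# Hypothetical per-language n-gram frequencies held privately by the server.
MODELS = {
    "en": {"th": 9, "he": 8, "in": 6, "er": 5},
    "fr": {"es": 9, "le": 8, "en": 7, "de": 6},
}

def server_encrypt_models(n, models):
    # Server side: publish one encrypted frequency per n-gram per language.
    return {lang: {g: encrypt(n, f) for g, f in table.items()}
            for lang, table in models.items()}

def client_scores(n, enc_models, text, size=2):
    # Client side: the n-gram list is extracted locally and never sent.
    grams = [text[i:i + size] for i in range(len(text) - size + 1)]
    scores = {}
    for lang, table in enc_models.items():
        total = encrypt(n, 0)  # fresh encryption of zero as the accumulator
        for g in grams:
            if g in table:  # assumption: n-grams outside the set are skipped
                total = he_add(n, total, table[g])
        scores[lang] = total
    return scores

def server_pick(n, priv, enc_scores):
    # Server side: decrypt the per-language scores and report the best one.
    plain = {lang: decrypt(n, priv, c) for lang, c in enc_scores.items()}
    return max(plain, key=plain.get), plain

if __name__ == "__main__":
    n, priv = keygen()
    enc_models = server_encrypt_models(n, MODELS)
    enc_scores = client_scores(n, enc_models, "the other")
    print(server_pick(n, priv, enc_scores))
```

In this sketch the server sees only the summed score per language, not the text or its n-gram list, and the client sees only ciphertexts of the frequencies. What exactly the server sends back in the patented method is not specified in the abstract, so returning a plain label here is a simplification.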

Source: USPTO / EPO open patent data (objective bibliographic data and citation counts).