Patent · US Active

System and method for modelling and profiling in multiple languages

US9026542B2 · kind B2 · utility

Cited by: 0
References: 1
Claims: 20
Family size: 0

Assignee

Inventors

Key dates

Filing date: Jul 23, 2010
Grant date: May 5, 2015
Priority date
Expiry date: Jun 6, 2031

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06F16/337
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

A system and method for generating feature vectors of documents in different languages are provided. The feature vectors provide scores associated with keywords defined in a base language for use by a profiler for generating or updating a user profile. The system and method use a plurality of keyword sets comprising: a base language keyword set comprising a plurality of base language keywords each associated with a respective identifier (ID); and a second language keyword set comprising a plurality of second language keywords each corresponding in meaning to a respective one of the base language keywords and associated with the ID of the corresponding base language keyword. One of a plurality of tokenizers is selected to parse a document based on the language of the document and to generate the feature vector using the keyword set of the corresponding language.
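
The flow described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation. The keyword lists, ID values, language codes, tokenizer functions, and the use of raw occurrence counts as scores are all hypothetical choices made for illustration. The sketch shows only the central idea: keywords in each language share the ID of the base language keyword, so documents in any supported language produce feature vectors over the same ID space.

```python
import re
from collections import Counter

# Hypothetical keyword sets. Each second-language keyword carries the ID
# of the base-language keyword it corresponds to in meaning.
BASE_KEYWORDS = {"music": 1, "travel": 2, "food": 3}              # base language (assumed English)
FRENCH_KEYWORDS = {"musique": 1, "voyage": 2, "nourriture": 3}    # second language (assumed French)

def tokenize_en(text):
    """Hypothetical English tokenizer: lowercase runs of ASCII letters."""
    return re.findall(r"[a-z]+", text.lower())

def tokenize_fr(text):
    """Hypothetical French tokenizer: lowercase runs of Unicode letters."""
    return re.findall(r"[^\W\d_]+", text.lower())

# Language code -> (tokenizer, keyword set for that language).
LANGUAGE_CONFIG = {
    "en": (tokenize_en, BASE_KEYWORDS),
    "fr": (tokenize_fr, FRENCH_KEYWORDS),
}

def feature_vector(text, language):
    """Return {keyword ID: score}; the score here is an occurrence count (an assumption)."""
    tokenizer, keywords = LANGUAGE_CONFIG[language]   # select tokenizer by document language
    scores = Counter()
    for token in tokenizer(text):
        keyword_id = keywords.get(token)
        if keyword_id is not None:
            scores[keyword_id] += 1
    return dict(scores)

english_vector = feature_vector("I love music and travel; music is life", "en")
french_vector = feature_vector("La musique et le voyage", "fr")
```

Because both vectors are keyed by base language IDs, a profiler could merge them into a single user profile without translating the documents themselves.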

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.