Patent · US Active

Applying a structured language model to information extraction

US8706491B2 · kind B2 · utility

Cited by	14

References	24

Claims	19

Family size	0

Assignee

Microsoft Corporation · US

Inventors

Ciprian Chelba · Palo Alto, US
Milind Mahajan · Redmond, US

Key dates

Filing date	Aug 24, 2010
Grant date	Apr 22, 2014
Priority date	—
Expiry date	Feb 6, 2031

Classification

Technology area (CPC G)	Physics
CPC primary	G10L15/22
WIPO field	Computer technology
WIPO sector	Electrical engineering

Abstract

One feature of the present invention uses the parsing capabilities of a structured language model in the information extraction process. During training, the structured language model is first initialized with syntactically annotated training data. The model is then trained by generating parses on semantically annotated training data enforcing annotated constituent boundaries. The syntactic labels in the parse trees generated by the parser are then replaced with joint syntactic and semantic labels. The model is then trained by generating parses on the semantically annotated training data enforcing the semantic tags or labels found in the training data. The trained model can then be used to extract information from test data using the parses generated by the model.
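
The two tree-level operations named in the abstract, enforcing annotated constituent boundaries and replacing syntactic labels with joint syntactic and semantic labels, can be illustrated with a minimal Python sketch. This is not the patented implementation: the `Node` class, the `SYN-SEM` label format, the `NONE` placeholder, the reading of "enforcing boundaries" as "no constituent may cross an annotated span", and the example sentence and tags are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

Span = Tuple[int, int]  # half-open word-index range [start, end)


@dataclass
class Node:
    """Hypothetical parse-tree node: a label over a word span."""
    label: str
    span: Span
    children: List["Node"] = field(default_factory=list)


def crosses(a: Span, b: Span) -> bool:
    """True if the spans overlap without one containing the other."""
    return a[0] < b[0] < a[1] < b[1] or b[0] < a[0] < b[1] < a[1]


def respects_boundaries(tree: Node, annotated: Dict[Span, str]) -> bool:
    """Assumed constraint: no constituent crosses an annotated span."""
    if any(crosses(tree.span, s) for s in annotated):
        return False
    return all(respects_boundaries(c, annotated) for c in tree.children)


def relabel(tree: Node, annotated: Dict[Span, str]) -> Node:
    """Replace each syntactic label with a joint 'SYN-SEM' label.

    Constituents without a semantic annotation get the placeholder 'NONE'.
    """
    sem = annotated.get(tree.span, "NONE")
    kids = [relabel(c, annotated) for c in tree.children]
    return Node(f"{tree.label}-{sem}", tree.span, kids)


def labels(tree: Node) -> List[str]:
    """Collect labels in pre-order, for inspection."""
    out = [tree.label]
    for c in tree.children:
        out.extend(labels(c))
    return out


# Illustrative sentence: "meet Bob at noon" (word indices 0..3).
tree = Node("S", (0, 4), [
    Node("VB", (0, 1)),
    Node("NP", (1, 2)),
    Node("PP", (2, 4), [Node("IN", (2, 3)), Node("NP", (3, 4))]),
])
annotated = {(0, 4): "Meeting", (1, 2): "Attendee", (3, 4): "Time"}

ok = respects_boundaries(tree, annotated)
joint = relabel(tree, annotated)
```

In this sketch a candidate parse is kept only if `respects_boundaries` returns `True`; `relabel` then produces the joint labels on which the second training pass would operate.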

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.