Patent · US Expired

Architecture of a framework for information extraction from natural language documents

US6553385B2 · kind B2 · utility

Cited by	109
References	12
Claims	13
Family size	0

Assignee

International Business Machines Corporation · US

Inventors

David E. Johnson · Cedar Park, US
Thomas Hampp-Bahnmueller · Stuttgart, DE

Key dates

Filing date	Sep 1, 1998
Grant date	Apr 22, 2003
Priority date	—
Expiry date	Sep 1, 2018

Classification

Technology area (CPC Y)	Emerging Cross-Sectional Technologies
CPC primary	Y10S707/99948
WIPO field	Computer technology
WIPO sector	Electrical engineering

Abstract

A framework for information extraction from natural language documents is application independent and provides a high degree of reusability. The framework integrates different Natural Language/Machine Learning techniques, such as parsing and classification. The architecture of the framework is integrated in an easy-to-use access layer. The framework performs general information extraction, classification/categorization of natural language documents, automated electronic data transmission (e.g., e-mail and facsimile) processing and routing, and plain parsing. Inside the framework, requests for information extraction are passed to the actual extractors. The framework can handle both pre- and post-processing of the application data, control the extractors, and enrich the information they extract. The framework can also suggest necessary actions the application should take on the data. To achieve the goal of easy integration and extension, the framework provides an integration (outside) application program interface (API) and an extractor (inside) API.
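
The two-API split described in the abstract can be illustrated with a minimal Python sketch. This is not the patented implementation: every name here (Extractor, KeywordClassifier, Framework, ExtractionResult, process, the routing table and the example address) is hypothetical and chosen only for illustration. It assumes that the extractor (inside) API is a single extract method, and that the integration (outside) API is a single process call that pre-processes the input, passes the request to each registered extractor, enriches the result, and suggests an action to the application.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field


@dataclass
class ExtractionResult:
    """Information handed back to the application (hypothetical shape)."""
    fields: dict = field(default_factory=dict)
    suggested_actions: list = field(default_factory=list)


class Extractor(ABC):
    """Extractor (inside) API: what the framework expects of each extractor."""

    @abstractmethod
    def extract(self, text: str) -> dict:
        ...


class KeywordClassifier(Extractor):
    """Toy classifier standing in for a real NL/ML technique."""

    def __init__(self, keywords_by_category: dict):
        self.keywords_by_category = keywords_by_category

    def extract(self, text: str) -> dict:
        words = set(text.lower().split())
        for category, keywords in self.keywords_by_category.items():
            if words & keywords:
                return {"category": category}
        return {"category": "unknown"}


class Framework:
    """Integration (outside) API: the single entry point an application calls."""

    def __init__(self, extractors, routes):
        self.extractors = list(extractors)
        self.routes = routes  # assumed mapping: category -> destination

    def process(self, document: str) -> ExtractionResult:
        # Pre-processing: normalize whitespace in the application data.
        text = " ".join(document.split())
        result = ExtractionResult()
        # Control: pass the extraction request to each registered extractor.
        for extractor in self.extractors:
            result.fields.update(extractor.extract(text))
        # Post-processing: enrich the extracted information.
        result.fields["length"] = len(text)
        # Suggest an action the application could take on the data.
        category = result.fields.get("category")
        if category in self.routes:
            result.suggested_actions.append(f"route to {self.routes[category]}")
        return result


framework = Framework(
    extractors=[KeywordClassifier({"billing": {"invoice", "refund"}})],
    routes={"billing": "accounts@example.com"},
)
result = framework.process("Please   send a refund for my last order")
print(result.fields["category"], result.suggested_actions)
```

Under these assumptions the application depends only on Framework.process, so a new extractor (for example a parser) could be added by subclassing Extractor without changing application code, which is the kind of reuse the abstract attributes to separating the outside and inside APIs.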

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.