Patent · US Expired

System, method and computer program product for performing unstructured information management and automatic text analysis, and providing multiple document views derived from different document tokenizations

US7139752B2 · kind B2 · utility

134Cited by
26References
27Claims
0Family size

Assignee

Inventors

Key dates

Filing dateMay 30, 2003
Grant dateNov 21, 2006
Priority date
Expiry dateJan 17, 2025

Classification

  • Technology area (CPC Y)Emerging Cross-Sectional Technologies
  • CPC primaryY10S707/99935
  • WIPO fieldComputer technology
  • WIPO sectorElectrical engineering

Abstract

Disclosed is a system architecture, components and a searching technique for an Unstructured Information Management System (UIMS). The UIMS may be provided as middleware for the effective management and interchange of unstructured information over a wide array of information sources. The architecture generally includes a search engine, data storage, analysis engines containing pipelined document annotators and various adapters. The searching technique makes use of a two-level searching technique. Also disclosed is system, method and computer program product to process document data. The method includes inputting a document and operating at least one text analysis engine that comprises a plurality of coupled annotators for tokenizing document data for identifying and annotating a particular type of semantic content. Operating the at least one text analysis engine generates a plurality of views of a document, where each of the plurality of views are derived from a different tokenization of the document. The method further includes storing the plurality of views in a common data structure associated with the document.

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.