System, method and computer program product for performing unstructured information management and automatic text analysis, and providing multiple document views derived from different document tokenizations
US7139752B2 · kind B2 · utility
Assignee
Inventors
Key dates
| Filing date | May 30, 2003 |
| Grant date | Nov 21, 2006 |
| Priority date | — |
| Expiry date | Jan 17, 2025 |
Classification
- Technology area (CPC Y)Emerging Cross-Sectional Technologies
- CPC primaryY10S707/99935
- WIPO fieldComputer technology
- WIPO sectorElectrical engineering
Abstract
Disclosed is a system architecture, components and a searching technique for an Unstructured Information Management System (UIMS). The UIMS may be provided as middleware for the effective management and interchange of unstructured information over a wide array of information sources. The architecture generally includes a search engine, data storage, analysis engines containing pipelined document annotators and various adapters. The searching technique makes use of a two-level searching technique. Also disclosed is system, method and computer program product to process document data. The method includes inputting a document and operating at least one text analysis engine that comprises a plurality of coupled annotators for tokenizing document data for identifying and annotating a particular type of semantic content. Operating the at least one text analysis engine generates a plurality of views of a document, where each of the plurality of views are derived from a different tokenization of the document. The method further includes storing the plurality of views in a common data structure associated with the document.
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.