Preserving conceptual distance within unstructured documents
US9424298B2 · kind B2 · utility
Assignee
Inventors
Key dates
| Filing date | Oct 7, 2014 |
| Grant date | Aug 23, 2016 |
| Priority date | — |
| Expiry date | Feb 18, 2035 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G06F40/40
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
A method, system and computer-usable medium are disclosed for preserving conceptual distance within unstructured documents by characterizing conceptual relationships. Natural language processing is applied to content in a plurality of documents to identify topics and subjects. Analytic analysis is then applied to the identified topics and subjects to identify concepts. The content in each of the plurality of documents is partitioned into a first structured hierarchy, preserving at least one structure inherent in each document. Access is then provided to the content through a first index utilizing the first structured hierarchy and through a second index utilizing a second structured hierarchy. The conceptual relationship criteria are based upon a directed graph with weights based upon a similarity and a distance based upon concepts.
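The two-index idea in the abstract can be illustrated with a minimal sketch. It is not the patented method: the toy corpus, the function names (`build_structural_index`, `structural_distance`, `build_concept_graph`), and the weight formula (concept overlap divided by one plus structural distance) are all assumptions made for illustration, and the natural language processing step is skipped by supplying the concepts of each passage directly.

```python
from collections import defaultdict
from itertools import permutations

# Hypothetical toy corpus. Each passage keeps its position in the document's
# own structure (document, section, paragraph) and a set of concepts that a
# real system would obtain from NLP and analytics.
PASSAGES = [
    {"id": "d1/s1/p1", "path": ("d1", "s1", "p1"), "concepts": {"index", "hierarchy"}},
    {"id": "d1/s1/p2", "path": ("d1", "s1", "p2"), "concepts": {"index", "graph"}},
    {"id": "d1/s2/p1", "path": ("d1", "s2", "p1"), "concepts": {"graph", "weight"}},
    {"id": "d2/s1/p1", "path": ("d2", "s1", "p1"), "concepts": {"weight", "similarity"}},
]


def build_structural_index(passages):
    """First index: a nested dict mirroring each document's inherent structure."""
    index = {}
    for p in passages:
        doc, section, para = p["path"]
        index.setdefault(doc, {}).setdefault(section, {})[para] = p["id"]
    return index


def structural_distance(path_a, path_b):
    """Levels between a passage and the deepest ancestor both paths share."""
    shared = 0
    for a, b in zip(path_a, path_b):
        if a != b:
            break
        shared += 1
    return len(path_a) - shared


def build_concept_graph(passages):
    """Second index: a directed graph keyed by passage id.

    Assumed weight: (share of the source's concepts found in the target)
    divided by (1 + structural distance). The overlap is measured from the
    source's side, so edge weights can differ by direction.
    """
    graph = defaultdict(dict)
    for src, dst in permutations(passages, 2):
        if not src["concepts"]:
            continue  # nothing to relate from an empty concept set
        overlap = len(src["concepts"] & dst["concepts"]) / len(src["concepts"])
        if overlap > 0:
            dist = structural_distance(src["path"], dst["path"])
            graph[src["id"]][dst["id"]] = overlap / (1 + dist)
    return dict(graph)


structural_index = build_structural_index(PASSAGES)
concept_graph = build_concept_graph(PASSAGES)
```

Here `structural_index["d1"]["s1"]["p2"]` resolves a passage by its place in the document, while `concept_graph["d1/s1/p1"]` lists conceptually related passages, with nearby passages weighted more heavily than distant ones that share the same concepts.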
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.