Handling queries in document systems using segment differential based document text-index modelling
US11157477B2 · kind B2 · utility
Assignee
Inventors
Key dates
| Filing date | Nov 28, 2018 |
| Grant date | Oct 26, 2021 |
| Priority date | — |
| Expiry date | Oct 22, 2039 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G06F16/219
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
A method, computer system, and computer program product for segment differential-based document text-index modeling are provided. The embodiment may include receiving, by a processor, a document with a valid document ID and version ID tuple. The embodiment may also include determining that the received document is a new version of a previously stored document and consequently multiplexing versions of the document into a single indexed document. The embodiment may further include segmenting the received document and building a token vector. The embodiment may also include calculating a difference between the received new version of the document and the previously stored document using information obtained from the segmentation. The embodiment may further include, in response to the calculated difference being below a pre-configured threshold value, discarding the received new version.
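The flow described in the abstract can be illustrated with a minimal Python sketch. This is not the patented implementation: the abstract does not specify how segments are formed, how the token vector is built, or how the difference is measured. The sketch assumes paragraph-level segments, a bag-of-words token vector, a set-based segment difference ratio, and integer version IDs. All names (`segment`, `token_vector`, `difference`, `VersionedIndex`) are hypothetical.

```python
import re
from collections import Counter


def segment(text):
    """Split text into segments. Assumption: a segment is a paragraph
    separated by one or more blank lines."""
    return [s.strip() for s in re.split(r"\n\s*\n", text) if s.strip()]


def token_vector(segments):
    """Build a bag-of-words token count over all segments (an assumed
    stand-in for the abstract's 'token vector')."""
    counts = Counter()
    for seg in segments:
        counts.update(re.findall(r"\w+", seg.lower()))
    return counts


def difference(old_segments, new_segments):
    """Assumed metric: share of distinct segments that appear in only one
    of the two versions (0.0 = identical, 1.0 = nothing in common)."""
    old_set, new_set = set(old_segments), set(new_segments)
    union = old_set | new_set
    if not union:
        return 0.0
    return len(old_set ^ new_set) / len(union)


class VersionedIndex:
    """Keeps all versions of a document under one entry per document ID,
    loosely mirroring 'multiplexing versions into a single indexed document'."""

    def __init__(self, threshold=0.5):
        self.threshold = threshold  # pre-configured difference threshold
        self.docs = {}              # doc_id -> {version_id: {"segments", "tokens"}}

    def receive(self, doc_id, version_id, text):
        """Return True if the version was indexed, False if it was discarded."""
        if not doc_id or not isinstance(version_id, int):
            raise ValueError("a valid (document ID, version ID) tuple is required")
        segments = segment(text)
        versions = self.docs.setdefault(doc_id, {})
        if versions:
            # Compare against the most recent stored version.
            latest = versions[max(versions)]["segments"]
            if difference(latest, segments) < self.threshold:
                return False  # change too small: discard the new version
        versions[version_id] = {
            "segments": segments,
            "tokens": token_vector(segments),
        }
        return True


if __name__ == "__main__":
    index = VersionedIndex(threshold=0.5)
    v1 = "Intro text.\n\nSection one.\n\nSection two.\n\nClosing."
    v2 = "Intro text.\n\nSection one.\n\nSection two.\n\nClosing!"
    v3 = "Intro text.\n\nNew A.\n\nNew B.\n\nNew C."
    print(index.receive("doc1", 1, v1))  # True: first version is always stored
    print(index.receive("doc1", 2, v2))  # False: difference 0.4 is below 0.5
    print(index.receive("doc1", 3, v3))  # True: difference is about 0.86
```

In this sketch a discarded version leaves the index untouched, so the next incoming version is still compared against the last version that was actually stored. A real system would likely use a finer-grained difference than exact segment equality, since here a one-character edit marks the whole segment as changed.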
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.