System and method for multithreaded text indexing for next generation multi-core architectures
US8661037B2 · kind B2 · utility
Assignee
Inventors
Key dates
| Filing date | Apr 9, 2010 |
| Grant date | Feb 25, 2014 |
| Priority date | — |
| Expiry date | Mar 19, 2031 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G06F16/325
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
A system and method for indexing documents in a data storage system includes generating a single document hash table in storage memory for a single document using an index construction in a multithreaded and scalable configuration, wherein multiple threads are each assigned work to reduce synchronization between threads. Generating the single document hash table includes partitioning the single document and indexing strings of the partitioned portions to create a minor hash table for each document sub-part; generating a document level hash table from the minor hash tables; updating a stream level hash table that maps every string to a global identifier; and generating a term reordered array from the document level hash table.
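The four steps named in the abstract can be illustrated with a minimal Python sketch. This is not the patented implementation. The abstract does not specify tokenization, what each hash table stores, or how the term reordered array is ordered, so the sketch assumes whitespace tokens, tables that map a term to its count, and an array of (global identifier, count) pairs sorted by identifier. The names `partition`, `minor_table`, and `index_document` are hypothetical.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def partition(tokens, parts):
    """Split a token list into at most `parts` contiguous chunks."""
    size = max(1, -(-len(tokens) // parts))  # ceiling division
    return [tokens[i:i + size] for i in range(0, len(tokens), size)]


def minor_table(chunk):
    """Minor hash table for one document sub-part: term -> count (assumed)."""
    return Counter(chunk)


def index_document(text, stream_table, parts=4):
    """Index one document and update the shared stream-level table."""
    tokens = text.lower().split()  # assumption: whitespace tokenization
    chunks = partition(tokens, parts)

    # Each worker builds a private table for its own chunk, so the
    # workers never write to shared state and need no locks.
    with ThreadPoolExecutor(max_workers=parts) as pool:
        minors = list(pool.map(minor_table, chunks))

    # Document-level hash table: merge of all minor tables.
    doc_table = Counter()
    for minor in minors:
        doc_table.update(minor)

    # Stream-level hash table: each unseen term gets the next global id.
    for term in doc_table:
        stream_table.setdefault(term, len(stream_table))

    # Term reordered array (assumed form): pairs sorted by global id.
    return sorted((stream_table[t], c) for t, c in doc_table.items())


stream = {}
first = index_document("the cat saw the dog", stream, parts=2)
second = index_document("a dog", stream, parts=2)
```

Only the merge and the stream-level update touch shared data, and here they run on the calling thread after the workers finish, which is one simple way to keep synchronization low. In CPython the global interpreter lock prevents these threads from running the counting in parallel, so the thread pool illustrates the work assignment only, not a speed-up.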
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.