Patent · US Active

System and method for multithreaded text indexing for next generation multi-core architectures

US8661037B2 · kind B2 · utility

Cited by: 3
References: 6
Claims: 22
Family size: 0

Assignee

Inventors

Key dates

Filing date: Apr 9, 2010
Grant date: Feb 25, 2014
Priority date
Expiry date: Mar 19, 2031

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06F16/325
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

A system and method for indexing documents in a data storage system includes generating a single document hash table in storage memory for a single document using an index construction in a multithreaded and scalable configuration wherein multiple threads are each assigned work to reduce synchronization between threads. Generating the single document hash table includes partitioning the single document and indexing strings of partitioned portions of the single document to create a minor hash table for each document sub-part; generating a document level hash table from the minor hash tables; updating a stream level hash table for the strings, which maps every string to a global identifier; and generating a term reordered array from the document level hash table.
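
The four steps named in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. The names `partition`, `build_minor_table`, `StreamTable`, and `index_document` are hypothetical, whitespace tokenization stands in for string extraction, and the "term reordered array" is assumed here to be a list of (global identifier, count) pairs sorted by identifier, which is only one plausible reading of the abstract.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
import threading


def partition(tokens, parts):
    """Split a token list into at most `parts` contiguous sub-parts."""
    size = max(1, -(-len(tokens) // parts))  # ceiling division
    return [tokens[i:i + size] for i in range(0, len(tokens), size)]


def build_minor_table(sub_part):
    """Index one sub-part independently: string -> count, no shared state."""
    return Counter(sub_part)


class StreamTable:
    """Stream-level table mapping every string to a global identifier."""

    def __init__(self):
        self._ids = {}
        self._lock = threading.Lock()

    def global_id(self, term):
        # Existing terms keep their identifier; new terms get the next integer.
        with self._lock:
            return self._ids.setdefault(term, len(self._ids))


def index_document(text, stream, workers=4):
    tokens = text.lower().split()  # assumption: whitespace tokenization

    # Step 1: each thread builds a minor hash table for its own sub-part,
    # so no locking is needed while the sub-parts are being indexed.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        minor_tables = list(pool.map(build_minor_table, partition(tokens, workers)))

    # Step 2: merge the minor tables into a document-level hash table.
    doc_table = Counter()
    for minor in minor_tables:
        doc_table.update(minor)

    # Steps 3 and 4: update the stream-level table, then emit the document's
    # terms as (global id, count) pairs ordered by global id (assumed layout).
    reordered = sorted((stream.global_id(t), c) for t, c in doc_table.items())
    return doc_table, reordered
```

In this sketch the only shared structure is the stream-level table, which is touched once per distinct term after the per-thread work has finished. That mirrors the abstract's stated goal of reducing synchronization between threads, although the patent's actual work assignment may differ.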

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.