Patent · US Active

System, method, and computer-readable medium for dynamic detection and management of data skew in parallel join operations

US8510280B2 · kind B2 · utility

Cited by: 3
References: 2
Claims: 26
Family size: 0

Assignee

Inventors

Key dates

Filing date: Jun 30, 2009
Grant date: Aug 13, 2013
Priority date
Expiry date: Nov 6, 2030

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06F16/2456
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

A system, method, and computer-readable medium for dynamic detection and management of data skew in parallel join operations are provided. Rows allocated to processing modules involved in a join operation are redistributed among the processing modules by a hash redistribution of the join attributes. Receipt by a processing module of an excessive number of redistributed rows having a skewed value on the join attribute is detected by a processing module which notifies other processing modules of the skewed value. Processing modules then terminate redistribution of rows having a join attribute value matching the skewed value and either store such rows locally or duplicate the rows. The processing module that has received an excessive number of redistributed rows removes rows having a skewed value of the join attribute from a redistribution spool allocated thereto and duplicates the rows to each of the processing modules. The join operation is completed by performing a local join at each processing module and merging the results of the local join operations.
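
The flow described in the abstract can be sketched as a small single-process simulation. This is an illustrative sketch only, not the patented implementation. The names `skew_aware_join`, `home_module`, and `threshold` are hypothetical, as is the CRC32 hash. It also assumes that the larger, skewed table (R) keeps its skewed rows local while the smaller table (S) is duplicated. A shared `skewed` set stands in for the notification sent between processing modules.

```python
import zlib
from collections import Counter
from itertools import zip_longest


def home_module(key, n):
    """Hypothetical hash redistribution: map a join key to one of n modules."""
    return zlib.crc32(str(key).encode()) % n


def skew_aware_join(r_parts, s_parts, threshold):
    """Join R and S on row[0]; r_parts[i] / s_parts[i] are the rows held by module i.

    Assumption: R is the large, skewed table and S is the small one.
    """
    n = len(r_parts)
    r_spool = [[] for _ in range(n)]   # R rows each module will join (redistributed or kept local)
    s_redis = [[] for _ in range(n)]   # S rows received by hash redistribution
    s_dup = [[] for _ in range(n)]     # S rows duplicated to every module
    received = [Counter() for _ in range(n)]  # per-module count of redistributed R rows by key
    skewed = set()                     # stands in for the "skewed value" notification

    def duplicate(s_row):
        for m in range(n):
            s_dup[m].append(s_row)

    def handle_r(src, row):
        key = row[0]
        if key in skewed:              # redistribution terminated for this value: keep locally
            r_spool[src].append(row)
            return
        dst = home_module(key, n)
        r_spool[dst].append(row)
        received[dst][key] += 1
        if received[dst][key] > threshold:
            skewed.add(key)            # dst detects the skew and notifies the other modules
            moved = [s for s in s_redis[dst] if s[0] == key]
            s_redis[dst] = [s for s in s_redis[dst] if s[0] != key]
            for s in moved:            # pull matching S rows out of the spool and duplicate them
                duplicate(s)

    def handle_s(src, row):
        if row[0] in skewed:
            duplicate(row)
        else:
            s_redis[home_module(row[0], n)].append(row)

    # Interleave S and R rows so that some S rows arrive before skew is detected.
    for src in range(n):
        for r_row, s_row in zip_longest(r_parts[src], s_parts[src]):
            if s_row is not None:
                handle_s(src, s_row)
            if r_row is not None:
                handle_r(src, r_row)

    # Local join at each module, then merge the local results.
    results = []
    for m in range(n):
        s_by_key = {}
        for s in s_redis[m] + s_dup[m]:
            s_by_key.setdefault(s[0], []).append(s)
        for r in r_spool[m]:
            for s in s_by_key.get(r[0], []):
                results.append((r[0], r[1], s[1]))
    return results, skewed
```

In this sketch every S row carrying a skewed value ends up on every module, while each R row lives on exactly one module, so the merged local joins produce each matching pair exactly once. After detection, the remaining skewed R rows stay where they already are and no longer pile up on a single module.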

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.