Patent · US Active

Distributed database job data skew detection

US10713250B2 · kind B2 · utility

Cited by: 0
References: 4
Claims: 20
Family size: 0

Assignee

Inventors

Key dates

Filing date: Nov 13, 2015
Grant date: Jul 14, 2020
Priority date
Expiry date: Aug 6, 2036

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06F16/3346
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

A system and method for identifying whether data skew is causing delays in a map phase and/or a reduce phase of a query of a distributed database. The system and method identify the values of various metrics relating to a database query. These metrics include map phase and reduce phase durations and various related metrics. The system and method gather statistics of multiple queries to determine correlation levels between the metrics and the map phase and reduce phase durations. Based on the statistics, the system and method determine whether one or both of the map and reduce phases for a query/response are taking longer than expected. If the durations are longer than expected, the system identifies the delay as caused by data skew and informs the originator of the query.
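
The flow described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation. It assumes a single hypothetical metric (input_bytes), a simple least-squares line as the model of expected duration, and arbitrary thresholds (min_corr, tolerance). All field names, function names, and sample figures below are invented for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0  # a constant series carries no correlation signal
    return cov / sqrt(vx * vy)


def fit_line(xs, ys):
    """Least-squares (slope, intercept) for ys ~ slope * xs + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    vx = sum((x - mx) ** 2 for x in xs)
    if vx == 0:
        return 0.0, my  # fall back to the mean duration
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / vx
    return slope, my - slope * mx


def detect_skew(history, query, metric="input_bytes",
                phases=("map_secs", "reduce_secs"),
                min_corr=0.8, tolerance=1.5):
    """Return the phases of `query` that ran longer than history predicts.

    Assumed rule: a phase is flagged when its duration exceeds
    `tolerance` times the duration predicted from `metric`.
    """
    flagged = []
    xs = [q[metric] for q in history]
    for phase in phases:
        ys = [q[phase] for q in history]
        if abs(pearson(xs, ys)) < min_corr:
            continue  # metric is too weakly correlated to predict this phase
        slope, intercept = fit_line(xs, ys)
        expected = slope * query[metric] + intercept
        if query[phase] > tolerance * expected:
            flagged.append(phase)
    return flagged


# Hypothetical statistics gathered from earlier queries.
history = [
    {"input_bytes": 100, "map_secs": 10, "reduce_secs": 5},
    {"input_bytes": 200, "map_secs": 20, "reduce_secs": 10},
    {"input_bytes": 300, "map_secs": 30, "reduce_secs": 15},
    {"input_bytes": 400, "map_secs": 40, "reduce_secs": 20},
]
# A new query whose reduce phase is far slower than its input size suggests.
query = {"input_bytes": 250, "map_secs": 26, "reduce_secs": 40}

for phase in detect_skew(history, query):
    print(f"Possible data skew: {phase} took longer than expected")
```

In this sketch the correlation check decides whether a metric is trusted as a predictor at all, and the tolerance factor stands in for "longer than expected". A real system would likely combine several metrics and report the result to the query originator through its own channel rather than printing it.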

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.