Patent · US Active

Design-time information based on run-time artifacts in transient cloud-based distributed computing clusters

US10635700B2 · kind B2 · utility

Cited by: 4
References: 5
Claims: 28
Family size: 0

Assignee

Inventors

Key dates

Filing date: Apr 2, 2018
Grant date: Apr 28, 2020
Priority date
Expiry date: Jun 26, 2038

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06N5/022
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

Transient computing clusters can be temporarily provisioned in cloud-based infrastructure to run data processing tasks. Such tasks may be run by services operating in the clusters that consume and produce data including operational metadata. Techniques are introduced for tracking data lineage across multiple clusters, including transient computing clusters, based on the operational metadata. In some embodiments, operational metadata is extracted from the transient computing clusters and aggregated at a metadata system for analysis. Based on the analysis of the metadata, operations can be summarized at a cluster level even if the transient computing cluster no longer exists. Further relationships between workflows, such as dependencies or redundancies, can be identified and utilized to optimize the provisioning of computing clusters and tasks performed by the computing clusters.
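
The idea in the abstract can be illustrated with a minimal sketch. It is not the patented method or any vendor's API; it only shows how job-level operational metadata, once collected in one place, could be rolled up to cluster level and scanned for workflow dependencies. The record schema (OpRecord), the function names, and the sample dataset names are all hypothetical and chosen for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class OpRecord:
    """One hypothetical operational-metadata record emitted by a job."""
    cluster_id: str       # cluster that ran the job (may no longer exist)
    workflow: str         # workflow the job belongs to
    inputs: frozenset     # names of datasets the job read
    outputs: frozenset    # names of datasets the job wrote


def summarize_by_cluster(records):
    """Collapse job-level records into per-cluster inputs and outputs."""
    summary = defaultdict(lambda: {"inputs": set(), "outputs": set()})
    for r in records:
        summary[r.cluster_id]["inputs"] |= r.inputs
        summary[r.cluster_id]["outputs"] |= r.outputs
    # Datasets produced inside a cluster are intermediates, not external
    # inputs, so drop them from that cluster's input set.
    for s in summary.values():
        s["inputs"] -= s["outputs"]
    return dict(summary)


def workflow_dependencies(records):
    """Return (producer, consumer) pairs: consumer reads what producer wrote."""
    producers = defaultdict(set)
    for r in records:
        for dataset in r.outputs:
            producers[dataset].add(r.workflow)
    deps = set()
    for r in records:
        for dataset in r.inputs:
            for producer in producers.get(dataset, ()):
                if producer != r.workflow:
                    deps.add((producer, r.workflow))
    return deps


# Assumed sample metadata, aggregated after the clusters were torn down.
records = [
    OpRecord("c1", "ingest", frozenset({"raw_logs"}), frozenset({"clean_logs"})),
    OpRecord("c1", "ingest", frozenset({"clean_logs"}), frozenset({"sessions"})),
    OpRecord("c2", "report", frozenset({"sessions"}), frozenset({"daily_report"})),
]

print(summarize_by_cluster(records))
print(workflow_dependencies(records))
```

Because the summary is computed from the aggregated records alone, it remains available after clusters c1 and c2 have been deprovisioned. The dependency pairs could then inform scheduling, for example by running dependent workflows on the same cluster.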

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.