Patent · US Active

Predicting deep learning scaling

US11593655B2 · kind B2 · utility

Cited by	0
References	0
Claims	20
Family size	0

Assignee

BAIDU USA LLC · US

Inventors

Joel Hestness · Mountain View, US
Gregory Diamos · San Jose, US
Hee Woo Jun · Sunnyvale, US
Sharan Narang · San Francisco, US
Newsha Ardalani · Santa Clara, US
Md Mostofa Ali Patwary · Gilroy, US
Yanqi Zhou · San Jose, US

Key dates

Filing date	Nov 30, 2018
Grant date	Feb 28, 2023
Priority date	—
Expiry date	Sep 24, 2041

Classification

Technology area (CPC G)	Physics
CPC primary	G06N3/045
WIPO field	Computer technology
WIPO sector	Electrical engineering

Abstract

As deep learning application domains grow, a deeper understanding of the relationships between training set size, computational scale, and model accuracy improvements is extremely beneficial. Presented herein is a large-scale empirical study of error and model size growth as training sets grow. Embodiments of a methodology for this measurement are introduced herein, as well as embodiments for predicting other metrics, such as compute-related metrics. It is shown herein that a power law may be used to represent deep model relationships, such as that between error and training data size. It is also shown that model size scales sublinearly with data size. These scaling relationships have significant implications for deep learning research, practice, and systems. They can assist with model debugging, setting accuracy targets, and decisions about data set growth. They can also guide computing system design and underscore the importance of continued computational scaling.
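
The power-law relationship described in the abstract can be illustrated with a minimal sketch. It assumes the form error ≈ alpha * m^beta, where m is the training set size, and fits alpha and beta by ordinary least squares in log-log space. The function names (`fit_power_law`, `predict`), the fitting procedure, and all numbers below are hypothetical and synthetic. They are not taken from the patent and do not represent its claimed method or its measured results.

```python
import numpy as np

def fit_power_law(sizes, values):
    """Fit values ~= alpha * sizes**beta by least squares in log-log space.

    Hypothetical helper for illustration; not the patent's procedure.
    """
    log_x = np.log(np.asarray(sizes, dtype=float))
    log_y = np.log(np.asarray(values, dtype=float))
    # A power law is a straight line in log-log space:
    # log(y) = beta * log(x) + log(alpha)
    beta, log_alpha = np.polyfit(log_x, log_y, deg=1)
    return float(np.exp(log_alpha)), float(beta)

def predict(alpha, beta, size):
    """Evaluate the fitted power law at a given training set size."""
    return alpha * size ** beta

# Synthetic data generated from known exponents (assumed values, illustration only).
train_sizes = np.array([1e4, 1e5, 1e6, 1e7])
errors = 5.0 * train_sizes ** -0.3        # error falls as data grows
model_sizes = 200.0 * train_sizes ** 0.7  # model size grows sublinearly (exponent < 1)

err_alpha, err_beta = fit_power_law(train_sizes, errors)
size_alpha, size_beta = fit_power_law(train_sizes, model_sizes)

# Extrapolate both curves to a larger, not yet collected data set.
projected_error = predict(err_alpha, err_beta, 1e8)
projected_model_size = predict(size_alpha, size_beta, 1e8)
```

An exponent between 0 and 1 for model size corresponds to the sublinear scaling the abstract mentions, and a negative exponent for error corresponds to error decreasing as the training set grows. On real measurements the fit would be noisy, and a power law may hold only over part of the data-size range, so extrapolations like the ones above should be treated with caution.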

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.