Hardware-software co-design for accelerating deep learning inference
US11443173B2 · kind B2 · utility
Assignee
Inventors
Key dates
| Filing date | Apr 24, 2019 |
| Grant date | Sep 13, 2022 |
| Priority date | — |
| Expiry date | Jul 14, 2041 |
Classification
- Technology area (CPC G)Physics
- CPC primaryG06N3/0464
- WIPO fieldComputer technology
- WIPO sectorElectrical engineering
Abstract
Embodiments disclose an artificial intelligence chip and a convolutional neural network applied to the artificial intelligence chip comprising a processor, at least one parallel computing unit, and a pooling computation unit. The method comprises: dividing a convolution task into convolution subtasks and corresponding pooling subtasks; executing convolution subtasks at different parallel computing units, and performing convolution, batch normalization, and non-linear computing operation in a same parallel computing unit; sending an execution result of each parallel computing unit from executing the convolution subtask to the pooling computation unit for executing the corresponding pooling subtask; merging executing results of the pooling computation unit from performing pooling operations on the executing results outputted by respective convolution subtasks to obtain an execution result of the convolution task. This can reduce data transport, such that operations of the convolutional neural network may be accomplished with lower power consumption and less time in an edge device.
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.