Illusion of large on-chip memory by networked computing chips for neural network inference

Publication Date

1-1-2021

Document Type

Article

Publication Title

Nature Electronics

Volume

4

Issue

1

DOI

10.1038/s41928-020-00515-3

First Page

71

Last Page

80

Abstract

Hardware for deep neural network (DNN) inference often suffers from insufficient on-chip memory, thus requiring accesses to separate memory-only chips. Such off-chip memory accesses incur considerable costs in terms of energy and execution time. Fitting entire DNNs in on-chip memory is challenging, in particular because of the physical size of the technology. Here, we report a DNN inference system—termed Illusion—that consists of networked computing chips, each of which contains a certain minimal amount of local on-chip memory and mechanisms for quick wakeup and shutdown. An eight-chip Illusion hardware system achieves energy and execution times within 3.5% and 2.5%, respectively, of an ideal single chip with no off-chip memory. Illusion is flexible and configurable, achieving near-ideal energy and execution times for a wide variety of DNN types and sizes. Our approach is tailored for on-chip non-volatile memory with resilience to permanent write failures, but is applicable to several memory technologies. Detailed simulations also show that our hardware results could be scaled to 64-chip Illusion systems.
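The core idea of distributing a DNN so that each chip's slice of the network fits entirely in its local on-chip memory can be illustrated with a toy partitioning sketch. This is not the paper's mapping algorithm—merely a hedged, simplified greedy assignment of consecutive layers to chips under a hypothetical per-chip memory budget:

```python
# Illustrative sketch only (NOT the Illusion mapping algorithm from the
# paper): greedily assign consecutive DNN layers to networked chips so
# that each chip's weight footprint fits within its local on-chip memory,
# avoiding off-chip memory accesses during inference.

def partition_layers(layer_sizes_kb, chip_capacity_kb):
    """Split a list of per-layer weight sizes (KB) into contiguous
    groups, each fitting within one chip's memory budget (KB)."""
    chips, current, used = [], [], 0
    for size in layer_sizes_kb:
        if size > chip_capacity_kb:
            raise ValueError("a single layer exceeds one chip's memory")
        if used + size > chip_capacity_kb:
            chips.append(current)       # close out the current chip
            current, used = [], 0
        current.append(size)
        used += size
    if current:
        chips.append(current)
    return chips

# Toy example: hypothetical layer weight footprints and a 256 KB budget.
layers = [100, 120, 90, 200, 60, 150]
mapping = partition_layers(layers, 256)
print(len(mapping))  # → 4 chips needed for this toy network
print(mapping)       # → [[100, 120], [90], [200], [60, 150]]
```

In this simplified view, the chips would execute their layer groups in pipeline order, with the quick wakeup/shutdown mechanisms described in the abstract keeping idle chips from burning energy.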

Funding Number

A1892b0026

Funding Sponsor

Defense Advanced Research Projects Agency

Department

Electrical Engineering
