Enhancing DNN Computational Efficiency via Decomposition and Approximation

Ori Schweitzer, Uri Weiser, Freddy Gabbay

Research output: Contribution to journal › Article › peer-review

Abstract

The increasing computational demands of emerging deep neural networks (DNNs) are fueled by their extensive computation intensity across various tasks, placing significant strain on resources. This paper introduces DART, an adaptive microarchitecture that enhances the area, power, and energy efficiency of DNN accelerators through approximated computations and decomposition, while preserving accuracy. DART improves DNN efficiency by leveraging adaptive resource allocation and simultaneous multi-threading (SMT). It exploits two prominent attributes of DNNs: resiliency and sparsity, at both the magnitude and bit level. Our microarchitecture decomposes the Multiply-and-Accumulate (MAC) unit into fine-grained elementary computational resources. Additionally, DART employs an approximate representation that leverages dynamic and flexible allocation of the decomposed computational resources through SMT, thereby enhancing resource utilization and optimizing power consumption. We further improve efficiency by introducing a new temporal SMT (tSMT) technique, which processes computations from temporally adjacent threads by expanding the computational time window for resource allocation. Our simulation analysis, using a systolic array accelerator as a case study, indicates that DART achieves more than a 30% reduction in area and power, with accuracy degradation of less than 1% on state-of-the-art DNNs for vision and natural language processing (NLP) tasks, compared to conventional processing elements (PEs) using 8-bit integer MAC units.
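The abstract does not disclose implementation details of DART's decomposition or approximation schemes. As a purely illustrative sketch (the function names, the bit-slice decomposition, and the top-k truncation scheme are all assumptions, not the paper's method), a MAC decomposed into shift-and-add slices that skips zero bit-slices (bit-level sparsity) and optionally keeps only the most significant partial products (approximation) might look like:

```python
def decomposed_mac(acc, a, b, precision=8):
    """Exact MAC decomposed into elementary shift-and-add slices.

    Each set bit of b contributes one partial product (a << i);
    zero bit-slices are skipped, exploiting bit-level sparsity.
    """
    if a == 0 or b == 0:              # magnitude sparsity: skip the whole MAC
        return acc
    for i in range(precision):
        if (b >> i) & 1:              # allocate a resource only for set bits
            acc += a << i
    return acc


def approx_mac(acc, a, b, keep_terms=2, precision=8):
    """Approximate MAC: keep only the `keep_terms` most significant
    partial products, trading a small numeric error for fewer operations.
    (Hypothetical truncation scheme, for illustration only.)
    """
    kept = 0
    for i in reversed(range(precision)):
        if (b >> i) & 1:
            acc += a << i
            kept += 1
            if kept == keep_terms:    # drop low-order slices
                break
    return acc
```

For example, `decomposed_mac(0, 3, 5)` performs two slice additions (bits 0 and 2 of 5) and returns the exact product 15, while `approx_mac(0, 3, 11, keep_terms=2)` returns 30 instead of the exact 33, illustrating the accuracy-for-resources trade-off the abstract describes.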

Original language: English
Journal: IEEE Access
State: Accepted/In press - 2024

Keywords

  • Approximate Computing
  • Computer Architectures
  • Deep Neural Networks
  • Machine Learning Accelerators

ASJC Scopus subject areas

  • General Computer Science
  • General Materials Science
  • General Engineering
