
Energy-Efficient Hardware Design for Machine Learning with In-Memory Computing

  • Additional Information
    • Subject:
      2024
    • Collection:
      Columbia University: Academic Commons
    • Abstract:
      Recently, machine learning and deep neural networks (DNNs) have attracted significant attention because they achieve human-like performance in various tasks, such as image classification, recommendation, and natural language processing. As tasks grow more complex, larger and deeper networks are built to obtain high accuracy, which challenges existing hardware to deliver fast and energy-efficient DNN computation due to the memory wall problem. First, traditional hardware spends a significant amount of energy moving data between memory and ALU units. Second, traditional memory blocks support only row-by-row access, which limits computation speed and energy efficiency. In-memory computing (IMC) is a promising approach to solving these problems in DNN computation: it combines memory blocks with computation units to enable high computation throughput and low energy consumption. At the macro level, both digital and analog-mixed-signal (AMS) IMC macros achieve high performance in multiply-and-accumulate (MAC) computation; AMS designs offer high energy efficiency and high compute density, while digital designs offer PVT robustness and technology scalability. At the architecture level, specialized hardware accelerators that integrate these IMC macros outperform traditional hardware accelerators in end-to-end DNN inference. Beyond IMC, other approaches also reduce energy consumption. For example, sparsity-aware training reduces arithmetic energy by introducing more zeros into the weights and zero-gating the corresponding multiplications and/or additions (a minimal sketch of zero-gating follows this record). Weight and activation compression reduces off-chip memory access energy. This thesis presents new circuit and architecture designs for efficient DNN inference with in-memory computing architectures. First, it presents two SRAM-based analog-mixed-signal IMC macros. One is a macro with custom 10T1C cells for binary/ternary MAC operation. The other one, MACC-SRAM, is a ...
    • Relation:
      https://doi.org/10.7916/6b0f-aq66
    • Identifier:
      10.7916/6b0f-aq66
    • Electronic Access:
      https://doi.org/10.7916/6b0f-aq66
    • Identifier:
      edsbas.70953927
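
A minimal sketch of the zero-gating idea mentioned in the abstract, written in Python. It is illustrative only and not taken from the thesis; the function name, weights, and activations are hypothetical. It models zero-gating at the arithmetic level: products with zero weights are skipped entirely, which in hardware corresponds to gating the multiplier and accumulator so they do not switch and consume dynamic energy.

    # Illustrative sketch (not from the thesis): a zero-gated
    # multiply-and-accumulate (MAC). Names and data are hypothetical.
    def zero_gated_mac(weights, activations):
        acc = 0
        for w, a in zip(weights, activations):
            if w == 0:        # zero-gate: skip multiply and add for zero weights
                continue
            acc += w * a      # multiply-and-accumulate for nonzero weights
        return acc

    # Example: a sparse weight vector avoids three of the five multiplies.
    weights = [0, 2, 0, -1, 0]
    activations = [5, 3, 7, 4, 1]
    print(zero_gated_mac(weights, activations))  # 2*3 + (-1)*4 = 2

The more zeros sparsity-aware training places in the weights, the more iterations take the gated path, which is why such training directly lowers arithmetic energy on hardware that implements this skip.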