Summary:
Researchers in the UCLA Department of Electrical and Computer Engineering have developed a data-converter-free in-memory computing circuit that greatly reduces data-processing latency.
Background:
Low-latency, low-power computing circuitry is in high demand for artificial intelligence (AI) and machine learning (ML) applications. Compute-in-memory (CIM), in which computational steps are performed directly within memory arrays, can reduce both energy consumption and processing latency by minimizing data movement between memory and other computer components. Today's analog CIM methods promise large reductions in power consumption, but they are not flexible enough for most ML workloads and rely on analog-to-digital converters, chip components that are expensive to design and that consume significant circuit area and power. Digital CIM methods offer more programming flexibility, but their bulky adder trees sacrifice much of the energy and speed benefit that CIM promises. New CIM systems are therefore urgently needed for next-generation computing, especially for applications on edge devices that demand very low-latency processing.
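For context, the workload at the heart of these tradeoffs is the multiply-and-accumulate (MAC) kernel that dominates ML inference. A minimal sketch follows; the dimensions and data are illustrative assumptions, not details of any particular CIM design.

```python
import numpy as np

# The dominant ML kernel that CIM accelerates: a matrix-vector product,
# i.e., many multiply-and-accumulate (MAC) operations y[i] = sum_j W[i,j]*x[j].
# In a conventional architecture, every element of W must travel from memory
# to the compute units; CIM instead performs the MACs where W is stored.
# (Dimensions and data here are illustrative assumptions.)
rng = np.random.default_rng(0)
W = rng.random((256, 256))   # weights resident in the memory array
x = rng.random(256)          # input activations
y = W @ x                    # 65,536 MACs' worth of data movement avoided by CIM
```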
Innovation:
Researchers led by Dr. Sudhakar Pamarti in the UCLA Department of Electrical and Computer Engineering have developed an approach that enables stochastic computing in memory (SCIM) for rapid in-situ data processing. Stochastic computing (SC) represents a number as the fraction of 1s in a random binary stream. Operands stored in memory in fixed-point format are converted into the SC format on the fly by stochastic number converters (SNCs) embedded within the rows of a high-density memory array. The SC streams representing the multiply-and-accumulate (MAC) outputs are computed directly within the memory array, without analog-to-digital converters or bulky digital adder trees, and are converted back to fixed-point as needed. Compared with standard fixed-point computational units, the compact SC MAC circuitry and the embedded SNCs enable far more effective parallelization, with attendant reductions in data movement and latency. The researchers have demonstrated the utility of this design in silicon, achieving a 25x improvement in the processing speed of camera data. Overall, this technology is a significant innovation in computer chip design with myriad applications in high-speed computation.
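To make the arithmetic concrete, below is a minimal software sketch of unipolar stochastic computing, in which multiplication reduces to a bitwise AND of independent bitstreams. The stream length, encoding, and host-side accumulation are illustrative assumptions, not details of the UCLA circuit, which performs the equivalent operations within the memory array itself.

```python
import numpy as np

rng = np.random.default_rng(0)

def to_stream(x, length):
    """Encode a value x in [0, 1] as a unipolar stochastic bitstream:
    each bit is 1 with probability x, so the fraction of 1s ~ x."""
    return (rng.random(length) < x).astype(np.uint8)

def from_stream(bits):
    """Decode a stream back to a value: the fraction of 1s."""
    return bits.mean()

# Unipolar SC multiplication is a bitwise AND of two independent streams:
# P(a_i & b_i = 1) = P(a_i = 1) * P(b_i = 1) = a * b.
N = 4096
a, b = 0.6, 0.5
prod = from_stream(to_stream(a, N) & to_stream(b, N))
print(prod)  # ~0.30, within sampling error

# A stochastic MAC: multiply each weight/activation pair with AND, then
# accumulate. Summing the decoded products here is an illustrative
# stand-in for the in-array accumulation described above.
w = np.array([0.2, 0.7, 0.4])
x = np.array([0.9, 0.1, 0.5])
mac = sum(from_stream(to_stream(wi, N) & to_stream(xi, N)) for wi, xi in zip(w, x))
print(mac, w @ x)  # stochastic estimate vs. exact dot product (~0.45)
```

Because each product needs only a single AND gate rather than a full fixed-point multiplier, many such units can be packed alongside the memory rows, which is what makes the massive parallelization described above practical.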
Potential Applications:
• High-speed computation/sensors
• Radar and real-time imaging
• Autonomous vehicle data processing
• Real-time financial modeling
Advantages:
• Reduced latency
• Reduced power consumption
• Compact footprint
Development-To-Date:
A prototype chip has been fabricated and validated on object-tracking computations.
Reference:
UCLA Case No. 2024-121