TY - JOUR

T1 - A 40-nm CMOS Multifunctional Computing-in-Memory (CIM) Using Single-Ended Disturb-Free 7T 1-Kb SRAM

AU - Wang, Chua Chin

AU - Tolentino, Lean Karlo S.

AU - Huang, Chia Yi

AU - Yeh, Chia Hung

N1 - Funding Information:
This work was supported in part by the Ministry of Science and Technology, Taiwan, under Grant MOST 108-2218-E-110-002 and Grant MOST 109-2218-E-110-007.
Publisher Copyright:
© 1993-2012 IEEE.

PY - 2021/12/1

Y1 - 2021/12/1

N2 - This investigation proposes a computing-in-memory (CIM) design to circumvent the von Neumann bottleneck, which limits computation throughput in artificial intelligence (AI) applications. The proposed CIM performs multiple operations, such as single-instruction basic Boolean operations, addition, and signed-number multiplication, and supports multiple functions, such as normal mode and retention mode for the built-in self-test (BIST). Its 2T-Switch requires only two transistors for the static random-access memory (SRAM) array; thus, the arithmetic unit can be selected easily and the area overhead is minimized. Its ripple carry adder and multiplier (RCAM) unit, based on single-ended disturb-free 7T 1-Kb SRAM, was developed using full swing-gate diffusion input (FS-GDI) technology, which provides full voltage swing resolution, low power consumption, and a lower chip area cost. Its Auto-Switching Write Back Circuit automatically restores the results of addition and multiplication operations to the assigned memory address. The CIM is implemented in the TSMC 40-nm CMOS process, with a core area of $432.81 \times 510.265\,\mu\text{m}^{2}$. Among the related works, the proposed CIM supports the largest number of operations and functions.

AB - This investigation proposes a computing-in-memory (CIM) design to circumvent the von Neumann bottleneck, which limits computation throughput in artificial intelligence (AI) applications. The proposed CIM performs multiple operations, such as single-instruction basic Boolean operations, addition, and signed-number multiplication, and supports multiple functions, such as normal mode and retention mode for the built-in self-test (BIST). Its 2T-Switch requires only two transistors for the static random-access memory (SRAM) array; thus, the arithmetic unit can be selected easily and the area overhead is minimized. Its ripple carry adder and multiplier (RCAM) unit, based on single-ended disturb-free 7T 1-Kb SRAM, was developed using full swing-gate diffusion input (FS-GDI) technology, which provides full voltage swing resolution, low power consumption, and a lower chip area cost. Its Auto-Switching Write Back Circuit automatically restores the results of addition and multiplication operations to the assigned memory address. The CIM is implemented in the TSMC 40-nm CMOS process, with a core area of $432.81 \times 510.265\,\mu\text{m}^{2}$. Among the related works, the proposed CIM supports the largest number of operations and functions.

KW - Computing-in-memory (CIM)

KW - disturb-free

KW - full swing-gate diffusion input (FS-GDI)

KW - static random-access memory (SRAM)

KW - von Neumann bottleneck

UR - http://www.scopus.com/inward/record.url?scp=85117352121&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85117352121&partnerID=8YFLogxK

U2 - 10.1109/TVLSI.2021.3115970

DO - 10.1109/TVLSI.2021.3115970

M3 - Article

AN - SCOPUS:85117352121

SN - 1063-8210

VL - 29

SP - 2172

EP - 2185

JO - IEEE Transactions on Very Large Scale Integration (VLSI) Systems

JF - IEEE Transactions on Very Large Scale Integration (VLSI) Systems

IS - 12

ER -