This paper presents a computation in memory (CIM) architecture and circuit design featured with single command to execute addition, signed multiplication, and multi-function to resolve poor computation throughput caused by von Neumann bottleneck. The proposed CIM takes advantage of 2T-Switch circuit which needs only 2 switches to select the required computation units such that the area on silicon is reduced. RCAM (ripple carry adder and multiply) unit realized with full swing gate diffusion input (FS-GDI) in a single-ended disturb-free 7T SRAM further reduces the power consumption and active circuit area. Auto-switching write-back circuit consisting of BL auto-switching circuit, Data switching circuit, and WL auto-switching circuit facilitates the automatic restore of addition and multiplication to designated memory addresses. The proposed CIM is realized using 40-nm CMOS process to demonstrated 12.18/28.19 fJ/bit normalized write/read energy at 100 MHz system clock rate.