TY - JOUR
T1 - 7 TOPS/W Cellular Neural Network Processor Core for Intelligent Internet-of-Things
AU - Villemur, Martin
AU - Julian, Pedro
AU - Figliolia, Tomas
AU - Andreou, Andreas G.
N1 - DBLP License: DBLP's bibliographic metadata records provided through http://dblp.org/ are distributed under a Creative Commons CC0 1.0 Universal Public Domain Dedication. Although the bibliographic metadata records are provided consistent with CC0 1.0 Dedication, the content described by the metadata records is not. Content may be subject to copyright, rights of privacy, rights of publicity and other restrictions.
PY - 2020/7/1
Y1 - 2020/7/1
N2 - We discuss the architecture, implementation and testing of a simplicial Cellular Neural Network (CNN) vector processor core aimed at vision-oriented intelligent Internet-of-Things (IoT) devices. The architecture comprises a linear array of 64 processing elements (PEs), each connected to a 4-neighbor clique operating on 8-bit input and state data. A 3-bit simplicial parameter allows multilevel function approximation and extends the functionality over previously reported chips. Input data vectors are stored in two 64 × 64 × 8-bit data caches. The chip is synthesized from a custom-designed ultra-low-voltage CMOS library and fabricated in a 55 nm CMOS technology. Dynamic voltage/frequency scaling enables operation at supply voltages between 0.5 and 1.2 V, allowing a tradeoff between speed and power. The fabricated chip achieves an overall performance of 7.05 TOPS/W at 732 fps, with a dynamic energy efficiency of 12.2 fJ per operation (OP) at 1.2 V.
AB - We discuss the architecture, implementation and testing of a simplicial Cellular Neural Network (CNN) vector processor core aimed at vision-oriented intelligent Internet-of-Things (IoT) devices. The architecture comprises a linear array of 64 processing elements (PEs), each connected to a 4-neighbor clique operating on 8-bit input and state data. A 3-bit simplicial parameter allows multilevel function approximation and extends the functionality over previously reported chips. Input data vectors are stored in two 64 × 64 × 8-bit data caches. The chip is synthesized from a custom-designed ultra-low-voltage CMOS library and fabricated in a 55 nm CMOS technology. Dynamic voltage/frequency scaling enables operation at supply voltages between 0.5 and 1.2 V, allowing a tradeoff between speed and power. The fabricated chip achieves an overall performance of 7.05 TOPS/W at 732 fps, with a dynamic energy efficiency of 12.2 fJ per operation (OP) at 1.2 V.
KW - Sensors
KW - Arrays
KW - Cellular neural networks
KW - Vector processors
KW - Image edge detection
UR - https://ieeexplore.ieee.org/document/8790983/
U2 - 10.1109/TCSII.2019.2933723
DO - 10.1109/TCSII.2019.2933723
M3 - Article
SN - 1558-3791
VL - 67
SP - 1324
EP - 1328
JO - IEEE Transactions on Circuits and Systems II: Express Briefs
JF - IEEE Transactions on Circuits and Systems II: Express Briefs
IS - 7
M1 - 7
ER -