Latency Optimization for HLS Based on Co-evolutionary Algorithm
ZHANG Shi (张仕); CAI Rui (蔡蕊); YU Xiaofei (余晓菲); YAN Xuanhui (严宣辉); HUANG Xi (黄晞); JIANG Jianmin (蒋建民)
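Abstract:
High-level synthesis (HLS) lets algorithms described in a high-level programming language be implemented quickly as hardware circuits, which greatly improves the efficiency of integrated circuit design. However, HLS offers so many configuration options that the number of available optimization configurations for a program grows exponentially with program size. To address this, an HLS latency optimization method based on co-evolutionary computation is proposed. The method takes the Top-Function and the specific requirements of the surrounding application as input and derives the basic configurable elements through program analysis. Starting from randomly generated configurations as the initial population, it applies co-evolution to produce the configuration with minimum latency. The experimental section analyzes in detail an experiment based on 179ART, in which the proposed method reduces latency by more than 50%. Finally, three further groups of experiments demonstrate the effectiveness of the method.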
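To make the search procedure in the abstract concrete, the following is a minimal sketch of a cooperative co-evolutionary search over HLS configurations. It is an illustration under stated assumptions, not the authors' implementation: the element names (`unroll_i`, `unroll_j`, `partition_a`), their candidate values, the `toy_latency` model, and the collaboration scheme (best representatives plus random partners) are all hypothetical. In a real flow, the fitness function would invoke the HLS tool and read the reported latency.

```python
import random

# Hypothetical configurable elements and their candidate directive values.
CHOICES = {
    "unroll_i": [1, 2, 4, 8],
    "unroll_j": [1, 2, 4, 8],
    "partition_a": [1, 2, 4],
}

def toy_latency(cfg):
    """Made-up latency model; a real flow would run HLS and read the report."""
    ports = 2 * cfg["partition_a"]          # memory ports limit useful unrolling
    eff_i = min(cfg["unroll_i"], ports)
    eff_j = min(cfg["unroll_j"], ports)
    return 1024 // eff_i + 512 // eff_j + 4 * cfg["partition_a"]

def coevolve(choices, fitness, pop_size=6, generations=20, seed=0):
    rng = random.Random(seed)
    # One randomly initialised subpopulation per configurable element.
    pops = {k: [rng.choice(vals) for _ in range(pop_size)]
            for k, vals in choices.items()}
    best = {k: pops[k][0] for k in choices}   # current representatives
    best_fit = fitness(best)
    for _ in range(generations):
        for k in choices:
            scored = []
            for v in pops[k]:
                # Collaborate with the best representatives and with random partners.
                with_best = {**best, k: v}
                with_rand = {o: rng.choice(pops[o]) for o in choices}
                with_rand[k] = v
                cand = min((with_best, with_rand), key=fitness)
                f = fitness(cand)
                scored.append((f, v))
                if f < best_fit:
                    best, best_fit = cand, f
            scored.sort(key=lambda fv: fv[0])
            # Keep the better half; refill by re-sampling (a crude mutation).
            keep = [v for _, v in scored[: pop_size // 2]]
            pops[k] = keep + [rng.choice(choices[k])
                              for _ in range(pop_size - len(keep))]
    return best, best_fit

if __name__ == "__main__":
    best, latency = coevolve(CHOICES, toy_latency)
    print(best, latency)
```

Each element evolves in its own subpopulation, so the search never enumerates the full product of choices, which is the quantity that grows exponentially with program size. The sketch offers no guarantee of reaching the global minimum; the random-partner evaluation is included only to reduce the chance of stalling when elements interact, as unrolling and array partitioning do in this toy model.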
Keywords: latency optimization; FPGA; HLS; co-evolutionary algorithm; optimization
Foundation: National Natural Science Foundation of China (61772004)