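Yunquan Zhang (张云泉)  Research Professor

Research interests:

Department: High Performance Computer Research Center

Supervisor category: Doctoral supervisor, Computer Software and Theory

Contact: zyq@ict.ac.cn

Personal homepage:

Biography: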

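  Yunquan Zhang, Ph.D., is a research professor and doctoral supervisor at the Institute of Computing Technology, Chinese Academy of Sciences (CAS), and director of the Parallel Software Laboratory. He serves as a member of the National Committee of the Chinese People's Political Consultative Conference (CPPCC), Fellow of the China Computer Federation (CCF), secretary-general of the CCF Technical Committee on High Performance Computing, chair of the ACM China High Performance Computing Expert Committee, executive council member of the China Software Industry Association, executive chairman of the China Intelligent Computing Industry Alliance, and deputy secretary-general of the China Society for Industrial and Applied Mathematics. He previously served as director of the National Supercomputing Center in Jinan, executive council member of CCF, member of the expert review panel of the 14th Information Sciences Department of the National Natural Science Foundation of China, and chair of the CCF YOCSEF Academic Committee. His main research interests are high performance computing, parallel algorithms and software, and parallel computing models. He has published more than 200 papers in domestic and international venues, including CCF A conferences such as PPoPP (the first papers from China to appear at PPoPP in two consecutive years) and SC (the first time a group in China had two papers in the same year), and class A journals such as Proceedings of the IEEE, TPDS, and TACO. He has published three monographs and eight translated books, and holds 4 patents and 8 software copyrights.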

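  He is the founder and publisher of the China HPC TOP100 list, proposed and systematically articulated the concept of the computing power economy (算力经济), and founded the PAC, CPC, and ACM China IPCC competitions. He has served more than one hundred times as program committee member or co-chair of international conferences. He is associate editor-in-chief of the CAS journal Frontiers of Data and Computing (《数据与计算发展前沿》) and an editorial board member of the top-tier journal Journal of Image and Graphics (《中国图象图形学报》) and of the CCF journals Computer Science (《计算机科学》), Computer Engineering & Science (《计算机工程与科学》), and Journal of Frontiers of Computer Science and Technology (《计算机科学与探索》).

  He is a member of the Guangming Daily Science Popularization Expert Committee, senior adviser to the China National GeneBank, member of the Qinghai Province Big Data and Cloud Computing Advisory Expert Committee, member of the Guizhou Province Agricultural Big Data Expert Committee, member of the Expert Committee of the Inner Mongolia Global Think Tank Big Data Development Center, member of the Zhengzhou Smart City Expert Committee, member of the Lüliang City Big Data Expert Advisory Committee and expert adviser on government transformation, and executive director of the Computing Power Expert Advisory Committee of Fuzhou City, Jiangxi Province. Within the Central Committee of the Jiusan Society, he is a member of the Science and Technology Special Committee, the Science Popularization Working Committee, and the Working Committee for Promoting Technological Innovation.

  He is a reviewer for leading international journals including IEEE TC, ACM TACO, IEEE TPDS, JPDC, Parallel Computing, Concurrency and Computation: Practice and Experience, and SCIENCE CHINA. He has served as program committee member for more than 50 international conferences, including ICPADS’08, ICS’10, EuroPar’11, FGC’11, IPDPS’11, CGC’11, SC’11, ICPADS’12, CCGrid’12, EuroPar’12, FGC’12, IPDPS’13, CGO’13, CCGrid’14, and IPDPS’14; as program committee co-chair of IEEE CSE 2010, IEEE HPCC 2013, FCST 2015, NPC 2015, HPC China 2016, and others; as co-chair of the ISC 2011 HPC in Asia Workshop; as ISC’12 Steering Committee Member; and as general co-chair of ScalCom 2015.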

  

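  September 1991 to July 1995: Department of Computer Science and Technology, Beijing Institute of Technology, major in Computer Applications; Bachelor of Engineering.

  September 1995 to July 2000: Institute of Software, Chinese Academy of Sciences, combined master's and doctoral program in Computer Software and Theory; Doctor of Engineering.

  July 2000 to December 2001: Research and Development Center for Parallel Software, Institute of Software, CAS; parallel algorithms and parallel software; assistant researcher.

  January 2002 to March 2007: Research and Development Center for Parallel Software, Institute of Software, CAS; parallel algorithms and parallel software; associate researcher.

  May 2003 to August 2013: Laboratory of Parallel Computing, Institute of Software, CAS; parallel algorithms and parallel software; executive deputy director.

  April 2007 to August 2013: Laboratory of Parallel Computing, Institute of Software, CAS; parallel algorithms and parallel software; research professor.

  June 2007 to August 2013: Laboratory of Parallel Computing, Institute of Software, CAS; parallel algorithms and parallel software; doctoral supervisor.

  August 2010 to August 2013: director of the "APU Software Joint Research and Development Center" of the Institute of Software, CAS, and AMD.

  May 2011 to August 2013: Chinese-side director of the "PPCT Joint Laboratory" (a Joint Lab for Parallel Processing and Computing Techniques Research) of the Institute of Software, CAS, and the Mathematics and Computer Science Division (MCS) of Argonne National Laboratory, USA.

  August 2013 to present: State Key Laboratory of Computer Architecture, Institute of Computing Technology, CAS; research professor and doctoral supervisor.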

Selected publications:

[1] Kun Li, Liang Yuan, Yunquan Zhang, Gongwei Chen. An Accurate and Efficient Large-scale Regression Method through Best Friend Clustering. IEEE TPDS 2022. (CCF A)

[2] Hang Cao, Liang Yuan, He Zhang, Yunquan Zhang, Baodong Wu, Kun Li, Shigang Li, Yongjun Xu, Minghua Zhang, Pengqi Lu, Junmin Xiao. AGCM-3DLF: Accelerating Atmospheric General Circulation Model via 3D Parallelization and Leap-Format. IEEE TPDS 2023. (CCF A)

[3] Honghui Shang, Li Shen, Yi Fan, Zhiqian Xu, Chu Guo, Jie Liu, Wenhao Zhou, Huan Ma, Rongfen Lin, Yuling Yang, Fang Li, Zhuoya Wang, Yunquan Zhang, and Zhenyu Li. Large-Scale Simulation of Quantum Computational Chemistry on a New Sunway Supercomputer. SC 2022. (CCF A)

[4] Honghui Shang, Xin Chen, Xingyu Gao, Rongfen Lin, Lifang Wang, Fang Li, Qian Xiao, Lei Xu, Qiang Sun, Leilei Zhu, Fei Wang, Yunquan Zhang, and Haifeng Song. TensorKMC: Kinetic Monte Carlo Simulation of 50 Trillion Atoms Driven by Deep Learning on a New Generation of Sunway Supercomputer. SC 2021. (CCF A)

[5] Honghui Shang, Fang Li, Yunquan Zhang, Ying Liu, Libo Zhang, Mingchuan Wu, Yangjun Wu, Di Wei, Huimin Cui, Xin Liu, Fei Wang, Yuxi Ye, Yingxiang Gao, Shuang Ni, Xin Chen, and Dexun Chen. Accelerating all-electron ab initio simulation of Raman spectra for biological systems. SC 2021. (CCF A)

[6] Honghui Shang, Fang Li, Yunquan Zhang, Libo Zhang, You Fu, Yingxiang Gao, Yangjun Wu, Xiaohui Duan, Rongfen Lin, Xin Liu, Ying Liu, and Dexun Chen. Extreme-scale ab initio quantum Raman spectra simulations on the leadership HPC system in China. SC 2021. (CCF A)

[7] Liang Yuan, Hang Cao, Yunquan Zhang, Kun Li, Pengqi Lu, Yue Yue. Temporal Vectorization for Stencils. SC 2021. (CCF A)

[8] Kun Li, Liang Yuan, Yunquan Zhang, Yue Yue. Reducing Redundancy in Data Organization and Arithmetic Calculation for Stencil Computations. SC 2021. (CCF A)

[9] Mingchuan Wu, Yangjun Wu, Honghui Shang, Ying Liu, Huimin Cui, Fang Li, Xiaohui Duan, Yunquan Zhang, and Xiaobing Feng. Scaling Poisson Solvers on Many Cores via MMEwald. IEEE TPDS 2021. (CCF A)

[10] Kun Li, Liang Yuan, Yunquan Zhang, Yue Yue, Hang Cao. An Efficient Vectorization Scheme for Stencil Computation. IPDPS 2022. (CCF B)

[11] Zhihao Li, Haipeng Jia, Yunquan Zhang, Tun Chen, Liang Yuan, Luning Cao, and Xiao Wang. Automatic Generation of High-Performance FFT Kernels on Arm and x86 CPUs. IEEE TPDS 2020. (CCF A)

[12] Hang Cao, Liang Yuan, He Zhang, Baodong Wu, Shigang Li, Pengqi Lu, Yunquan Zhang, Yongjun Xu, and Minghua Zhang. A Highly Efficient Dynamical Core of Atmospheric General Circulation Model based on Leap-Format. IPDPS 2020 (CCF B)

[13] Honghui Shang, Lei Xu, Baodong Wu, Xinming Qin, Yunquan Zhang, Jinlong Yang. The dynamic parallel distribution algorithm for hybrid density-functional calculations in HONPAS package. Comput. Phys. Commun. 2020. (SCI, IF 3.9)

[14] Kun Li, Shigang Li, Shan Huang, Yifeng Chen, Yunquan Zhang. FastNBL: fast neighbor lists establishment for molecular dynamics simulation based on bitwise operations. J. Supercomput. 76(7): 5501-5520 (2020). (SCI)

[15] Daning Cheng, Shigang Li, Yunquan Zhang. WP-SGD: Weighted parallel SGD for distributed unbalanced-workload training system. J. Parallel Distributed Comput. 145: 202-216 (2020). (SCI, IF 2.296)

[16] Xinming Qin, Honghui Shang, Lei Xu, Wei Hu, Jinlong Yang, Shigang Li, Yunquan Zhang. The static parallel distribution algorithms for hybrid density-functional calculations in HONPAS package. Int. J. High Perform. Comput. Appl. 34(2) 2020. (SCI, IF 2.3)

[17] Zhihao Li, Haipeng Jia, Yunquan Zhang, Tun Chen, Liang Yuan, Luning Cao, and Xiao Wang. AutoFFT: A template-based FFT codes auto-generation framework for ARM and x86 CPUs. In Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, SC ’19, pages 25:1–25:15, New York, NY, USA, 2019. ACM.

[18] Kun Li, Honghui Shang, Yunquan Zhang, Shigang Li, Baodong Wu, Dong Wang, Libo Zhang, Fang Li, Dexun Chen, and Zhiqiang Wei. Openkmc: A kmc design for hundred-billion-atom simulation using millions of cores on sunway taihulight. In Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, SC ’19, pages 68:1–68:16, New York, NY, USA, 2019. ACM.

[19] Liang Yuan, Chen Ding, Wesley Smith, Peter Denning, and Yunquan Zhang. A relational theory of locality. ACM Trans. Archit. Code Optim., 16(3):33:1–33:26, August 2019.

[20] Liang Yuan, Shan Huang, Yunquan Zhang, and Hang Cao. Tessellating star stencils. In Proceedings of the 48th International Conference on Parallel Processing, ICPP 2019, pages 43:1–43:10, New York, NY, USA, 2019. ACM.

[21] Zhihao Li, Haipeng Jia, Yunquan Zhang, Shice Liu, Shigang Li, Xiao Wang, and Hao Zhang. Efficient parallel optimizations of a high-performance SIFT on GPUs. Journal of Parallel and Distributed Computing, 124:78–91, 2019.

[22] Kun Li, Shigang Li, Shan Huang, et al. FastNBL: fast neighbor lists establishment for molecular dynamics simulation based on bitwise operations[J]. The Journal of Supercomputing, 2019: 1-20.

[23] Xinming Qin, Honghui Shang, Lei Xu, Wei Hu, Jinlong Yang, Shigang Li, and Yunquan Zhang. The static parallel distribution algorithms for hybrid density-functional calculations in HONPAS package. Int. J. High Perform. Comput. (2019)

[24] S. Li, Y. Zhang, and T. Hoefler. Cache-oblivious mpi all-to-all communications based on morton order. IEEE TPDS, 29(3):542–555, March 2018. (CCF A)

[25] Zhihao Li, Haipeng Jia, Yunquan Zhang, Shice Liu, Shigang Li, Xiao Wang, and Hao Zhang. Efficient parallel optimizations of a high-performance SIFT on GPUs. JPDC. (CCF B)

[26] Shigang Li, Baodong Wu, Yunquan Zhang, Xianmeng Wang, Jianjiang Li, Changjun Hu, Jue Wang, Yangde Feng, and Ningming Nie. Massively scaling the metal microscopic damage simulation on sunway taihulight supercomputer, ICPP 2018, (CCF B)

[27] Junmin Xiao, Shigang Li, Baodong Wu, He Zhang, Kun Li, Erlin Yao, Yunquan Zhang, and Guangming Tan. Communication-avoiding for dynamical core of atmospheric general circulation model. ICPP 2018. (CCF B)

[28] Y. Zhang and T. Cao and S. Li and X. Tian and L. Yuan and H. Jia and A. V. Vasilakos. Parallel Processing Systems for Big Data: A Survey. Proceedings of the IEEE. 2016,PP(99):1-23

[29] Yunquan Zhang, Shigang Li, Shengen Yan, Huiyang Zhou: A Cross-Platform SpMV Framework on Many-Core Architectures. TACO 13(4): 33:1-33:25 (2016)

[30] Wang, Qian and Zhang, Xianyi and Zhang, Yunquan and Yi, Qing. AUGEM: Automatically Generate High Performance Dense Linear Algebra Kernels on x86 CPUs. Proceedings of SC13: International Conference for High Performance Computing, Networking, Storage and Analysis. 2013, 1-12

[31] Liang Yuan, Yunquan Zhang, Peng Guo, Shan Huang. Tessellating Stencils. SC 2017, Colorado Convention Center, November 12-17, 2017.

[32] Yan, Shengen and Li, Chao and Zhang, Yunquan and Zhou, Huiyang. yaSpMV: Yet Another SpMV Framework on GPUs. Proceedings of the 19th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming. 2014,107-118

[33] Yan, Shengen and Long, Guoping and Zhang, Yunquan. StreamScan: Fast Scan Algorithms for GPUs Without Global Barrier Synchronization. Proceedings of the 18th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming. 2013,229--238

[34] Shigang Li and Yunquan Zhang and Torsten Hoefler. Cache-Oblivious MPI All-to-All Communications on Many-Core Architectures. PPoPP’17 (poster). 2016.

[35] Lama, Palden and Li, Yan and Aji, Ashwin M. and Balaji, Pavan and Dinan, James and Xiao, Shucai and Zhang, Yunquan and Feng, Wu-chun and Thakur, Rajeev and Zhou, Xiaobo. pVOCL: Power-Aware Dynamic Placement and Migration in Virtualized GPU Environments. Distributed Computing Systems (ICDCS), 2013 IEEE 33rd International Conference on. 2013,145-154

[36] Xiangzheng Sun, Yunquan Zhang, Ting Wang, Xianyi Zhang, Liang Yuan, Li Rao: Optimizing SpMV for Diagonal Sparse Matrices on GPU. ICPP 2011: 492-501.

[37] Liang Yuan, Chen Ding, Daniel Štefankovič, Yunquan Zhang: Modeling the Locality in Graph Traversals. ICPP 2012: 138-147

[38] Mengran Fan, Haipeng Jia, Yunquan Zhang, Xiaojing An, Ting Cao: Optimizing Image Sharpening Algorithm on GPU. ICPP 2015: 230-239.

[39] Baodong Wu, Shigang Li, Yunquan Zhang, Ningming Nie: Hybrid-optimization strategy for the communication of large-scale Kinetic Monte Carlo simulation. Computer Physics Communications 211: 113-123 (2017)

[40] Shigang Li, Changjun Hu, Junchao Zhang, Yunquan Zhang: Automatic tuning of sparse matrix-vector multiplication on multicore clusters. SCIENCE CHINA Information Sciences 58(9): 1-14 (2015).

[41] Yan Li, Yunquan Zhang, Haipeng Jia, Guoping Long, Ke Wang: Automatic FFT Performance Tuning on OpenCL GPUs. ICPADS 2011: 228-235.

[42] Xiangzheng Sun, Yunquan Zhang, Ting Wang, Guoping Long, Xianyi Zhang, Yan Li: CRSD: Application Specific Auto-tuning of SpMV for Diagonal Sparse Matrices. Euro-Par (2) 2011: 316-327.

[43] Xianyi Zhang, Qian Wang, Yunquan Zhang, Model-driven Level 3 BLAS Performance Optimization on Loongson 3A Processor, ICPADS 2012, Singapore.

[44] Liang Yuan, Yunquan Zhang: A Locality-based Performance Model for Load-and-Compute Style Computation. CLUSTER 2012: 566-571

[45] Haipeng Jia, Yunquan Zhang, Guoping Long, Jianliang Xu, Shengen Yan, Yan Li: GPURoofline: A Model for Guiding Performance Optimizations on GPUs. Euro-Par 2012: 920-932

[46] Zhang, Yunquan. Perspectives of China's HPC system development: a view from the 2009 China HPC TOP100 list. Frontiers of Computer Science in China. 2010,4(4):437--444

[47] Zhang, Yunquan and Chen, Guoliang and Sun, Guangzhong and Miao, Qiankun. Models of parallel computation: a survey and classification. Frontiers of Computer Science in China. 2007,1(2):156--165

[48] Zhang, Yun-Quan. DRAM(h): A parallel computation model for high performance numerical computing. Chinese Journal of Computers. 2003,26(12):1660--1670

[49] Zhang, Yun-Quan. Memory Complexity Analysis on Numerical Programs. Chinese Journal of Computers. 2000,23(4):363–373

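Research projects:

[1] National Natural Science Foundation of China (NSFC) Key Program: Scalable parallel algorithms and optimization methods for million-scale heterogeneous many-core systems, targeting climate and turbulence simulation. Principal investigator.

[2] NSFC General Program: Research on parallel computing models and performance-adaptive optimization for many-core architectures. Principal investigator.

[3] NSFC General Program: Research on parallel computing models and an adaptive algorithm tuning framework for many-core architectures. Principal investigator.

[4] Beijing Natural Science Foundation-Haidian Original Innovation Joint Fund, key research topic: Research on key methods and technologies of GPU virtualization for deep learning. Principal investigator.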


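Awards and honors:

[1] 1998: CAS Science and Technology Progress Award, Second Prize

[2] 2000: CAS President's Scholarship, Excellence Award

[3] 2000: National Science and Technology Progress Award, Second Prize

[4] 2016: CCF Science and Technology Award, Second Prize

[5] 2017: Inaugural CCF Qingzhu Award (青竹奖)

[6] 2017: CAS Science and Education Achievement Award, First Prize

[7] 2017: CAS Outstanding Science and Technology Achievement Award

[8] 2019: National Science and Technology Progress Award, Second Prize

[9] 2021: ACM Gordon Bell Prize nomination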