Paving the Way for RDMA Towards High-Performance Data Computing
Time: Friday, June 8, 2018, 10:00 AM
Venue: Room 948, Institute of Computing Technology
Speaker: Dr. Xiaoyi Lu, The Ohio State University
Abstract: The increasing demands of high-performance data computing and communication have been driving network speeds from 1 Gb/s to 100 Gb/s or higher. Traditional Sockets-based TCP/IP protocols can no longer keep up with these performance demands. Consequently, the advanced capabilities of RDMA (Remote Direct Memory Access) enabled networks are paving the way for novel high-performance communication and I/O protocols in data centers, blurring the boundary between local and remote data access. However, fully utilizing RDMA-capable networks in end applications remains challenging. In this talk, I will first examine the challenges in designing RDMA-based communication and I/O protocols over high-speed networks (e.g., InfiniBand, RoCE). Then, I will discuss how we co-design different components with RDMA in a broad range of systems from the areas of HPC Cloud (MPI-on-Cloud), Big Data Analytics (Hadoop/Spark/Memcached), and Deep Learning (TensorFlow) to overcome these challenges. In-depth case studies will show how RDMA-based designs can benefit not only performance but also other aspects such as scalability, fault tolerance, and availability in these systems.
Speaker bio: Dr. Xiaoyi Lu is a Research Scientist in the Department of Computer Science and Engineering at The Ohio State University, USA. His current research interests include high-performance interconnects and protocols, Big Data Analytics, Parallel Computing Models, Virtualization, Cloud Computing, and Deep Learning frameworks. He has published more than 100 papers in major international conferences, workshops, and journals, with multiple Best (Student) Paper Awards or Nominations. He has delivered more than 100 invited talks, tutorials, and presentations worldwide. Dr. Lu currently leads the research and development of RDMA-based accelerations for Apache Hadoop, Spark, HBase, and Memcached, and of the OSU HiBD micro-benchmarks, which are publicly available from http://hibd.cse.ohio-state.edu. These libraries are currently used by more than 285 organizations in 34 countries. He also leads the research and development of the MVAPICH2-Virt (high-performance and scalable MPI for HPC cloud) project. More details about Dr. Lu are available at http://www.cse.ohio-state.edu/~luxi.