A paper has been accepted at HPDC 2019: UMR-EC: A Unified and Multi-Rail Erasure Coding Library for High-Performance Distributed Storage Systems. This year, HPDC accepted only 22 papers out of 106 submissions; 11 of the accepted papers went through shepherding, while this paper was accepted directly. The first author, Haiyang Shi, is one of my Ph.D. students. Congratulations to Haiyang and the other co-authors!
Paper Info
[HPDC'19] UMR-EC: A Unified and Multi-Rail Erasure Coding Library for High-Performance Distributed Storage Systems
Haiyang Shi, Xiaoyi Lu, Dipti Shankar, and Dhabaleswar K. Panda.
In Proceedings of the 28th ACM International Symposium on High-Performance Parallel and Distributed Computing (HPDC), 2019. (Acceptance Rate: 20.7%, 22/106)
Abstract
Distributed storage systems typically need data to be stored redundantly to
guarantee data durability and reliability. While the conventional approach
towards this objective is to store multiple replicas, today’s unprecedented
data growth rates encourage modern distributed storage systems to employ
Erasure Coding (EC) techniques, which can achieve better storage efficiency.
Various hardware-based EC schemes have been proposed in the community to
leverage the advanced compute capabilities of modern data center and cloud
environments. Currently, there is no unified and easy way for distributed
storage systems to fully exploit multiple devices such as CPUs, GPUs, and
network devices (i.e., multi-rail support) to perform EC operations in
parallel, which leads to under-utilization of the available compute
power. In this paper, we first introduce an analytical model to analyze the
design scope of efficient EC schemes in distributed storage systems. Guided by
the performance model, we propose UMR-EC, a Unified and Multi-Rail Erasure
Coding library that can fully exploit heterogeneous EC coders. Our proposed
interface is complemented by asynchronous semantics with an optimized
metadata-free scheme and EC rate-aware task scheduling that can enable a
highly-efficient I/O pipeline. To show the benefits and effectiveness of
UMR-EC, we re-design HDFS 3.x write/read pipelines based on the guidelines
observed in the proposed performance model. Our performance evaluations show
that our proposed designs can outperform the write performance of replication
schemes and the default HDFS EC coder by 3.7x - 6.1x and 2.4x - 3.3x,
respectively, and can improve the performance of read with failure recoveries
up to 5.1x compared with the default HDFS EC coder. Compared with the fastest
available CPU coder (i.e., ISA-L), our proposed designs have an improvement of
up to 66.0% and 19.4% for write and read with failure recoveries, respectively.
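For readers unfamiliar with erasure coding, the sketch below illustrates the basic encode/recover pattern in C using a single-parity XOR code: k data blocks are protected by one parity block, so any single lost block can be rebuilt from the survivors. This is only a toy illustration, not the UMR-EC API; UMR-EC targets general EC codes and schedules the coding work across CPU, GPU, and network-based coders.

```c
/*
 * Toy illustration of erasure coding (NOT the UMR-EC API):
 * K data blocks protected by one XOR parity block, so any
 * single lost block can be reconstructed from the survivors.
 */
#include <stdio.h>
#include <string.h>

#define K          4   /* number of data blocks      */
#define BLOCK_SIZE 8   /* bytes per block (toy size) */

/* parity[i] = XOR of byte i of every data block */
static void encode_parity(unsigned char data[K][BLOCK_SIZE],
                          unsigned char parity[BLOCK_SIZE])
{
    memset(parity, 0, BLOCK_SIZE);
    for (int b = 0; b < K; b++)
        for (int i = 0; i < BLOCK_SIZE; i++)
            parity[i] ^= data[b][i];
}

/* Rebuild one lost data block by XOR-ing parity with the survivors. */
static void recover_block(unsigned char data[K][BLOCK_SIZE],
                          const unsigned char parity[BLOCK_SIZE],
                          int lost, unsigned char out[BLOCK_SIZE])
{
    memcpy(out, parity, BLOCK_SIZE);
    for (int b = 0; b < K; b++)
        if (b != lost)
            for (int i = 0; i < BLOCK_SIZE; i++)
                out[i] ^= data[b][i];
}

int main(void)
{
    unsigned char data[K][BLOCK_SIZE] = {
        "block00", "block11", "block22", "block33"
    };
    unsigned char parity[BLOCK_SIZE], rebuilt[BLOCK_SIZE];

    encode_parity(data, parity);
    recover_block(data, parity, 2, rebuilt);   /* pretend block 2 is lost */
    printf("recovered: %s\n", (const char *)rebuilt);  /* prints "block22" */
    return 0;
}
```

Production systems use general Reed-Solomon-style codes (e.g., via ISA-L) that tolerate multiple failures; the point of UMR-EC is to expose such coders behind one unified, asynchronous interface and drive them in parallel over multiple devices.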
A paper has been accepted at IPDPS 2019: C-GDR: High-Performance Container-aware GPUDirect MPI Communication Schemes on RDMA Networks. Congratulations to all the authors!
Paper Info
[IPDPS'19] C-GDR: High-Performance Container-aware GPUDirect MPI Communication Schemes on RDMA Networks
Jie Zhang, Xiaoyi Lu, Ching-Hsiang Chu, and Dhabaleswar K. Panda.
In Proceedings of the 33rd IEEE International Parallel and Distributed Processing Symposium (IPDPS), 2019.
Abstract
In recent years, GPU-based platforms have achieved significant success for
parallel applications. In addition to highly optimized computation kernels on
GPUs, the cost of data movement on GPU clusters plays a critical role in
delivering high performance for end applications. Many recent studies have
proposed optimizations for GPU- or CUDA-aware communication runtimes, and
these designs have been widely adopted in emerging GPU-based applications.
These studies mainly focus on improving communication performance in native
environments, i.e., physical machines; however, GPU-based communication
schemes in cloud environments are not yet well studied. This
paper first investigates the performance characteristics of state-of-the-art
GPU-based communication schemes on both native and container-based
environments, which shows a significant need for high-performance
container-aware communication schemes in GPU-enabled runtimes to deliver
near-native performance for end applications on clouds. Next, we propose the
C-GDR approach to design high-performance Container-aware GPUDirect
communication schemes on RDMA networks. C-GDR allows communication runtimes to
successfully detect process locality, GPU residency, NUMA and architecture
information, and communication patterns to enable intelligent and dynamic
selection of the best communication and data movement schemes on GPU-enabled
clouds. We have integrated C-GDR with the MVAPICH2 library. Our evaluations
show that MVAPICH2 with C-GDR has clear performance benefits on container-based
cloud environments, compared to default MVAPICH2-GDR and Open MPI. For
instance, our proposed C-GDR can outperform default MVAPICH2-GDR schemes by up
to 66% on micro-benchmarks and up to 26% on HPC applications over a
container-based environment.
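As background for the abstract, the sketch below shows the CUDA-aware MPI usage model that C-GDR accelerates, not C-GDR's internal design: with a GPUDirect-capable MPI library such as MVAPICH2-GDR, device pointers returned by cudaMalloc() can be passed directly to MPI calls, and the runtime decides how to move the data (e.g., GPUDirect RDMA or pipelined staging through host memory). The message size and ranks used here are arbitrary.

```c
/*
 * Minimal CUDA-aware MPI sketch (usage model only, not C-GDR itself):
 * device buffers are passed directly to MPI_Send/MPI_Recv and the
 * GPUDirect-capable runtime handles the data movement.
 * Build with the library's MPI wrapper, e.g.: mpicc gdr_pingpong.c -lcudart
 */
#include <mpi.h>
#include <cuda_runtime.h>
#include <stdio.h>

#define NBYTES (1 << 20)   /* 1 MiB message */

int main(int argc, char **argv)
{
    int rank;
    void *gpu_buf = NULL;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    cudaSetDevice(0);                  /* one GPU per rank for simplicity */
    cudaMalloc(&gpu_buf, NBYTES);      /* device memory, not host memory  */

    if (rank == 0) {
        cudaMemset(gpu_buf, 1, NBYTES);
        /* Device pointer handed directly to MPI; no explicit
         * cudaMemcpy to a host staging buffer is needed. */
        MPI_Send(gpu_buf, NBYTES, MPI_BYTE, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        MPI_Recv(gpu_buf, NBYTES, MPI_BYTE, 0, 0, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);
        printf("rank 1 received %d bytes into GPU memory\n", NBYTES);
    }

    cudaFree(gpu_buf);
    MPI_Finalize();
    return 0;
}
```

In a container-based deployment the application code above runs unchanged; C-GDR's contribution is detecting process locality, GPU residency, and NUMA/architecture information inside containers so the runtime can still select the fastest communication path.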
Xiaoyi will serve as a General Co-Chair for Bench 2019.
Please submit your papers to Bench 2019. Call for Papers
Xiaoyi will serve as a TPC Co-Chair for The 5th IEEE International Workshop on High-Performance Big Data and Cloud Computing (HPBDC 2019).
Please submit your papers to HPBDC 2019. Call for Papers
Xiaoyi will serve as a TPC member for the following conferences in 2019!
A collaborative grant for research on large-scale hybrid memory systems has been funded by NSF. I am the PI on the OSU side.
Congratulations to our great team!
Thanks a lot to NSF for the support!

Grant Information: SPX: Collaborative Research: Memory Fabric: Data Management for Large-scale Hybrid Memory Systems
I am starting my new faculty position this month! I look forward to working with self-motivated students who are interested in doing systems research.
More importantly, we are happy to announce that the PADSYS Lab starts its great journey at The Ohio State University (OSU)!
(Courtesy: Celebration by Nick Youngson CC BY-SA 3.0 Alpha Stock Images)