Publications

  • AntMan: Dynamic Scaling on GPU Clusters for Deep Learning
    Wencong Xiao, Shiru Ren, Yong Li, Yang Zhang, Pengyang Hou, Zhi Li, Yihui Feng, Wei Lin, Yangqing Jia
    The 14th USENIX Symposium on Operating Systems Design and Implementation (OSDI ’20)
    [To appear]

  • Distributed Graph Computation Meets Machine Learning
    Wencong Xiao, Jilong Xue, Youshan Miao, Zhen Li, Cheng Chen, Ming Wu, Wei Li, Lidong Zhou
    IEEE Transactions on Parallel and Distributed Systems (TPDS)
    [pdf]

  • An Empirical Study on Program Failures of Deep Learning Jobs
    Ru Zhang, Wencong Xiao, Hongyu Zhang, Yu Liu, Haoxiang Lin, Mao Yang
    The 42nd International Conference on Software Engineering (ICSE 2020, Distinguished Paper Award!)
    [pdf]

  • PRmalloc: Leveraging Predictability for Deep Learning Memory Allocation
    Wencong Xiao, Shiru Ren, Tongxuan Liu, Yong Li
    Workshop on AI Systems at SOSP 2019
    [pdf][poster]

  • AliGraph: An Industrial Graph Neural Network Platform
    Kun Zhao, Wencong Xiao, Baole Ai, Wenting Shen, Xiaolin Zhang, Yong Li, Wei Lin
    Workshop on AI Systems at SOSP 2019
    [pdf][poster]

  • Analysis of Large-Scale Multi-Tenant GPU Clusters for DNN Training Workloads
    Myeongjae Jeon, Shivaram Venkataraman, Amar Phanishayee, Junjie Qian, Wencong Xiao, Fan Yang
    2019 USENIX Annual Technical Conference (ATC ’19)
    [pdf][slides][trace]

  • SeerNet: Predicting Convolutional Neural Network Feature-Map Sparsity through Low-Bit Quantization
    Shijie Cao, Lingxiao Ma, Wencong Xiao, Chen Zhang, Yunxin Liu, Lintao Zhang, Lanshun Nie, Zhi Yang
    IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR ’19)
    [pdf]

  • Efficient and Effective Sparse LSTM on FPGA with Bank-Balanced Sparsity
    Shijie Cao, Chen Zhang, Zhuliang Yao, Wencong Xiao, Lanshun Nie, Dechen Zhan, Yunxin Liu, Ming Wu, Lintao Zhang
    27th ACM/SIGDA International Symposium on Field-Programmable Gate Arrays (FPGA ’19)
    [pdf][slides]

  • Balanced Sparsity for Efficient DNN Inference on GPU
    Zhuliang Yao, Shijie Cao, Wencong Xiao, Chen Zhang, Lanshun Nie
    33rd AAAI Conference on Artificial Intelligence (AAAI ’19)
    [pdf][poster]

  • Scheduling CPU for GPU-based Deep Learning Jobs
    Wencong Xiao, Zhenhua Han, Hanyu Zhao, Xuan Peng, Quanlu Zhang, Fan Yang, Lidong Zhou
    ACM Symposium on Cloud Computing 2018 (SoCC ’18 poster)
    [pdf][poster]

  • Gandiva: Introspective Cluster Scheduling for Deep Learning
    Wencong Xiao, Romil Bhardwaj, Ramachandran Ramjee, Muthian Sivathanu, Nipun Kwatra, Zhenhua Han, Pratyush Patel, Xuan Peng, Hanyu Zhao, Quanlu Zhang, Fan Yang, Lidong Zhou
    The 13th USENIX Symposium on Operating Systems Design and Implementation (OSDI ’18)
    [pdf][slides][poster]

  • BeamRaster: A Practical Fast Massive MU-MIMO System with Pre-computed Precoders
    Meng Meng, Wencong Xiao, Tong He, Yuechen Tao, Kun Tan, Jiansong Zhang, Wenjie Wang
    IEEE Transactions on Mobile Computing (TMC)
    [pdf]

  • Multi-tenant GPU Clusters for Deep Learning Workloads: Analysis and Implications
    Myeongjae Jeon, Shivaram Venkataraman, Amar Phanishayee, Junjie Qian, Wencong Xiao, Fan Yang
    Microsoft Research Technical Report (MSR-TR-2018-13)
    [pdf]

  • Optimization Mapping for Deep Learning
    Wencong Xiao, Cheng Chen, Youshan Miao, Jilong Xue, Ming Wu
    The 26th ACM Symposium on Operating Systems Principles AI Systems Workshop (SOSP ’17 AISys)
    [pdf][poster]

  • All You Need to Know about Scheduling Deep Learning Jobs
    Wencong Xiao, Fan Yang, Lidong Zhou
    The 26th ACM Symposium on Operating Systems Principles Student Research Competition (SOSP ’17 SRC)
    [pdf][poster]

  • KV-Direct: High-Performance In-Memory Key-Value Store with Programmable NIC
    Bojie Li, Zhenyuan Ruan, Wencong Xiao, Yuanwei Lu, Yongqiang Xiong, Andrew Putnam, Enhong Chen, Lintao Zhang
    The 26th ACM Symposium on Operating Systems Principles (SOSP ’17)
    [pdf]

  • Memory Efficient Loss Recovery for Hardware-based Transport in Datacenter
    Yuanwei Lu, Guo Chen, Zhenyuan Ruan, Wencong Xiao, Bojie Li, Jiansong Zhang, Yongqiang Xiong, Peng Cheng, Enhong Chen
    The 1st Asia-Pacific Workshop on Networking (APNet ’17)
    [pdf]

  • TuX²: Distributed Graph Computation for Machine Learning
    Wencong Xiao, Jilong Xue, Youshan Miao, Cheng Chen, Zhen Li, Ming Wu, Wei Li, Lidong Zhou
    The 14th USENIX Symposium on Networked Systems Design and Implementation (NSDI ’17)
    [pdf][slides]

  • GRAM: Scaling Graph Computation to the Trillions
    Ming Wu, Fan Yang, Jilong Xue, Wencong Xiao, Youshan Miao, Lan Wei, Haoxiang Lin, Yafei Dai, Lidong Zhou
    ACM Symposium on Cloud Computing 2015 (SoCC ’15)
    [pdf]