Sai Zhang

Research Scientist, Meta Reality Labs

 Incoming Assistant Professor, New York University 

About Me

I am a research scientist at Meta Reality Labs. Previously, I was a postdoctoral fellow at Harvard University, where I also received my PhD in Computer Science. Before coming to Harvard, I received my bachelor's and master's degrees in Electrical Engineering and Statistics from the University of Toronto.


My Chinese name is 张赛骞.


I am incredibly excited to announce that I will be joining New York University in Fall 2024 as a tenure-track Assistant Professor of Electrical Engineering and Computer Science! I am looking for highly motivated PhD students and visiting students/researchers. Interested candidates are strongly encouraged to contact me by email and to include a resume and transcripts.

Research Overview

Overall, my research interest lies in algorithm/hardware codesign for efficient deep neural network (DNN) implementation.

Application & Algorithm:

  • Efficient DNN computing, pruning, quantization, NAS

  • Recent interest: parameter-efficient fine-tuning for LLMs, efficient self-supervised learning, privacy for AI

Hardware architecture:

  • Domain-specific accelerators for compute-intensive AI applications, new computing paradigms for DNNs

  • Recent interest: AI accelerators for on-device transfer learning and contrastive learning


  • Multi-agent reinforcement learning and its application

  • Federated learning

  • AI compiler


[4/2024] Check out our latest survey on Parameter-Efficient Fine-Tuning (PEFT) for Large Models. This work was done with my talented intern students. From algorithm design to hardware efficiency and system implementation, this comprehensive survey covers multiple aspects of current PEFT research.

[3/2024] One paper with my intern student Wenshuo was accepted to NAACL'24!

[2/2024] Two papers accepted to ISQED'24.

[11/2023] Serving as a TPC member for DAC'24 and ISQED'24.

[10/2023] Our paper "Co-Designing AI Models and DRAMs for On-Device Training" is accepted by HPCA 2024!

[9/2023] Our paper on efficient reinforcement learning, which I co-authored with my high school mentee, Gavin An, has been accepted for publication in JEI.

[6/2023] I gave two talks on DNN hardware and algorithm codesign at Tsinghua University and Peking University.

[5/2023] Our paper "Co-Designing AI Models and DRAMs for On-Device Training" has been posted to arXiv. This paper proposes an algorithm/hardware codesign solution for efficient on-chip transfer learning that completely eliminates off-chip DRAM traffic during training.

[3/2023] My high school mentee, Gavin An, has successfully finished his AI project on efficient reinforcement learning. A paper has been submitted to JEI.

[9/2022] Started working at Meta!

[7/2022] Our paper "Hyperspherical Federated Learning" is accepted by ECCV 2022!

[7/2022] I gave an invited talk at AI times on multi-agent reinforcement learning and its applications.

[6/2022] I gave an invited talk at the IEEE Dallas Circuits and Systems Conference (DCAS) 2022.

[4/2022] I started my AI mentorship at Veritas AI!

[2/2022] I started my postdoc study at Harvard!

[12/2021] I successfully defended my PhD!

[11/2021] Our paper "Learning Advanced Client Selection Strategy for Federated Learning" is accepted by AAAI 2022!

[10/2021] Our paper "FAST: DNN Training Under Variable Precision Block Floating Point with Stochastic Rounding" is accepted by IEEE HPCA 2022!

[06/2021] Finished my internship at Microsoft. Such a great place for research!

[03/2021] I started my virtual internship at Microsoft Research, Redmond.

[03/2021] I gave a guest lecture on DNN accelerator design in Harvard course ES201, hosted by Prof. Demba Ba.

[01/2021] One paper got accepted by IEEE International Symposium on Circuits & Systems (ISCAS), 2021.

[12/2020] I presented (virtually) our work "Succinct and Robust Multi-Agent Communication With Temporal Message Control" at NeurIPS 2020.

[11/2020] I presented (virtually) our work "Term Quantization: Furthering Quantization at Run Time" at SC 2020.

[11/2020] Our paper "Training for Multi-resolution Inference Using Reusable Quantization Terms" is accepted by ACM ASPLOS 2021!

[09/2020] Our paper "Succinct and Robust Multi-Agent Communication With Temporal Message Control" is accepted by NeurIPS 2020!

[08/2020] I presented our work "Adaptive Distributed Convolutional Neural Network Inference at the Network Edge with ADCNN" at ICPP 2020.

[06/2020] Our paper "Term Revealing: Furthering Quantization at Run Time on Quantized DNNs" is accepted by ACM/IEEE SC 2020!

[05/2020] Our paper "Adaptive Distributed Convolutional Neural Network Inference at the Network Edge with ADCNN" is accepted by ACM ICPP 2020!

[02/2020] One paper is accepted by IEEE Symposium on Security and Privacy (S&P) Deep Learning and Security workshop, 2020.