sAI Zhang

 Assistant Professor 

Tandon School of Engineering and Courant Institute

New York University 

 

 

About Me

I am an Assistant Professor of Electrical Engineering and Computer Science at New York University. Previously, I worked as a senior research scientist at Meta Reality Labs. I received my PhD in Computer Science from Harvard University. Before coming to Harvard, I received my bachelor's and master's degrees in Electrical Engineering and Statistics from the University of Toronto.

 

My Chinese name is 张赛骞.

 

I am looking for highly motivated PhD students and visiting students/researchers. Interested candidates are strongly encouraged to contact me by email and to include a resume and transcripts.

Research Overview

I am a researcher whose work lies on the boundary between deep learning and hardware system design. My goal is to develop efficient algorithms and hardware for compute-intensive ML applications. I am also passionate about designing optimized AI algorithms and efficient hardware implementations for AR/VR computing.

Application & Algorithm:

  • Efficient DNN computing, pruning, quantization, NAS

  • Recent interest: Efficient LM KV caching, LM quantization, AI Privacy for AR/VR

Hardware Architecture:

  • Domain-specific accelerators for compute-intensive AI applications, new computation paradigms for DNNs

  • Recent interest: AI accelerators for efficient AR/VR

Others:

  • Multi-agent reinforcement learning and its applications

  • AI compiler

News

[9/2024] One paper got accepted at ASPDAC'25!

[8/2024] Serving as a session chair at ISLPED'24.

[7/2024] One paper got accepted at MLCAD'24!

[7/2024] I am serving as a PC member for HPCA'25.

[6/2024] I delivered a talk on Efficient LLM and Accelerator Design to Andes Technology.

[6/2024] One paper got accepted at ICPP'24!

[5/2024] Our paper on AR/VR system simulation got accepted at ACM TODAES! 

[5/2024] Two papers got accepted at ISLPED'24!

[4/2024] Check out our latest survey on Parameter Efficient Fine-tuning (PEFT) for Large Models. This work was done with my talented intern students. From algorithm design to hardware efficiency and system implementation, this comprehensive survey covers multiple aspects of current PEFT research.

[3/2024] One paper with my intern student Wenshuo got accepted at NAACL'24!

[2/2024] Two papers accepted at ISQED'24.

[11/2023] Serving as a TPC member for DAC'24 and ISQED'24.

[10/2023] Our paper "Co-Designing AI Models and DRAMs for On-Device Training" is accepted by HPCA 2024!

[9/2023] Our paper on efficient reinforcement learning, which I co-authored with my high school mentee, Gavin An, has been accepted for publication in JEI.

[6/2023] I gave two talks on DNN hardware and algorithm codesign at Tsinghua University and Peking University.

[5/2023] Our paper "Co-Designing AI Models and DRAMs for On-Device Training" has been submitted to arXiv. This paper proposes an algorithm/hardware codesign solution for efficient on-chip transfer learning that completely eliminates off-chip DRAM traffic during training.

[3/2023] My high school mentee, Gavin An, has successfully finished his AI project on efficient reinforcement learning. The resulting paper got accepted in JEI!

[9/2022] Started working at Meta!

[7/2022] Our paper "Hyperspherical Federated Learning" is accepted by ECCV 2022!

[7/2022] I gave an invited talk at AI times on multi-agent reinforcement learning and its applications.

[6/2022] I gave an invited talk at the IEEE Dallas Circuits and Systems Conference (DCAS), 2022.

[4/2022] I started my AI mentorship at Veritas AI!

[2/2022] I started my postdoc at Harvard!

[12/2021] I successfully defended my PhD!

[11/2021] Our paper "Learning Advanced Client Selection Strategy for Federated Learning" is accepted by AAAI 2022!

[10/2021] Our paper "FAST: DNN Training Under Variable Precision Block Floating Point with Stochastic Rounding" is accepted by IEEE HPCA 2022!

[06/2021] Finished my internship at Microsoft. Such a great place for research!

[03/2021] I started my virtual internship at Microsoft Research, Redmond.

[03/2021] I gave a guest lecture on DNN accelerator design in Harvard course ES201, hosted by Prof. Demba Ba.

[01/2021] One paper got accepted by the IEEE International Symposium on Circuits & Systems (ISCAS), 2021.

[12/2020] I presented (virtually) our work "Succinct and Robust Multi-Agent Communication With Temporal Message Control" at NeurIPS 2020.

[11/2020] I presented (virtually) our work "Term quantization: furthering quantization at run time" at SC 2020.

[11/2020] Our paper "Training for Multi-resolution Inference Using Reusable Quantization Terms" is accepted by ACM ASPLOS 2021!

[09/2020] Our paper "Succinct and Robust Multi-Agent Communication With Temporal Message Control" is accepted by NeurIPS 2020!

[08/2020] I presented our work "Adaptive Distributed Convolutional Neural Network Inference at the Network Edge with ADCNN" at ICPP 2020.

[06/2020] Our paper "Term Revealing: Furthering Quantization at Run Time on Quantized DNNs" is accepted by ACM/IEEE SC 2020!

[05/2020] Our paper "Adaptive Distributed Convolutional Neural Network Inference at the Network Edge with ADCNN" is accepted by ACM ICPP 2020!

[02/2020] One paper is accepted by the IEEE Symposium on Security and Privacy (S&P) Deep Learning and Security workshop, 2020.