This course focuses on recent advances in the design of efficient neural networks, specifically on how to create and optimize AI models for improved performance, scalability, and resource efficiency. Students will explore key model compression techniques such as pruning, quantization, and distillation for CNNs, RNNs, Transformers, and LLMs, all aimed at reducing computational complexity and memory usage while maintaining accuracy. The course also covers efficient training and inference methods, including distributed computing, parallelism, and low-precision computation, which are crucial for deploying AI on resource-limited devices such as smartphones and edge computing systems. Finally, students will study advanced hardware architectures for AI systems, AI compilers, and hardware accelerators, gaining insight into how neural network computations are implemented on these specialized systems.
Lecture Time: Wednesday 5:00-7:30pm EST (Zoom)
Lecture Location: Room 275, NYU Global Center for Academic and Spiritual Life (GCASL)
Readings: Course slides and papers
Suggested readings: Goodfellow, Ian, Yoshua Bengio, and Aaron Courville. "Deep Learning." MIT Press (2016). https://www.deeplearningbook.org/
Evaluation Breakdown:
Assignments (30%): three in total, each worth 10%
In-course quiz (10%)
In-course presentation (10%)
Midterm (25%)
Final project (25%)
Proposal (1 page) 5%
Final presentation 15%
Final report 10%
Late Submission Policy:
Contact:
When you send an email, please start the subject line with one of three categories to indicate the type of question: [Algorithm], [System], or [Others].
For other inquiries, personal matters, or emergencies, you can email me at sai.zhang@nyu.edu

Date | Topic | Logistics |
Jan 21 | Lecture 0: Course Introduction | |
Jan 28 | Lecture 2: Convolutional Neural Networks | |
Feb 4 | Lecture 3: Intro to Transformers and Large Models | |
Feb 11 | Lecture 4: Neural Network Pruning | Assignment 1 is posted on Brightspace |
Feb 18 | Lecture 5: Neural Network Quantization | |
Feb 25 | Lecture 6: Distillation, Low Rank Decomposition and Neural Architecture Search | Assignment 1 due; Assignment 2 is posted on Brightspace |
Mar 4 | Lecture 7: Efficient Algorithm for Large Model | |
Mar 11 | Lecture 8: Efficient Neural Network Training | Assignment 2 due; Assignment 3 is posted on Brightspace |
Mar 18 | Spring Break | Project proposal due |
Mar 25 | In-class midterm exam | |
Apr 1 | Lecture 9: Distributed Machine Learning for Training and Inference | Assignment 3 due |
Apr 8 | Lecture 10: Machine Learning Compiler and System | |
Apr 15 | Lecture 11: AI Accelerator Introduction and CNN Accelerators | |
Apr 22 | Lecture 12: Guest Lecture and Transformer & NN Training Accelerators | |
Apr 29 | Lecture 13: Guest Lecture and Efficient AR/VR Computing | |
May 13 | Final Presentation | |