This course focuses on recent advances in the design of efficient neural networks: how to create and optimize AI models for better performance, scalability, and resource efficiency. Students will explore key techniques such as model compression, pruning, quantization, and distillation for CNNs, RNNs, Transformers, and LLMs, all aimed at reducing computational complexity and memory usage while preserving accuracy. The course also covers efficient training and inference methods, including distributed computing, parallelism, and low-precision computation, which are crucial for deploying AI on resource-limited devices such as smartphones and edge computing systems. Finally, students will study hardware architectures for AI systems, AI compilers, and hardware accelerators, gaining insight into how neural network computations are implemented on these specialized platforms.
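To give a flavor of the compression techniques listed above, the sketch below illustrates magnitude-based weight pruning and uniform symmetric 8-bit quantization on a random weight matrix. This is a minimal NumPy illustration of the general ideas, not course-provided code; all names and parameter choices (the 90% sparsity target, the int8 range) are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(64, 64)).astype(np.float32)  # a dense weight matrix

# Magnitude pruning: zero out the 90% of weights with smallest |w|.
sparsity = 0.9  # illustrative target, not a course-mandated value
threshold = np.quantile(np.abs(W), sparsity)
W_pruned = np.where(np.abs(W) >= threshold, W, 0.0).astype(np.float32)

# Uniform symmetric 8-bit quantization: map floats to int8 via one scale.
scale = np.abs(W).max() / 127.0
W_int8 = np.clip(np.round(W / scale), -127, 127).astype(np.int8)
W_dequant = W_int8.astype(np.float32) * scale  # dequantize to measure error

print(f"sparsity achieved: {(W_pruned == 0).mean():.2%}")
print(f"quantization MSE:  {np.mean((W - W_dequant) ** 2):.6f}")
```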
Lecture Time: Friday 5:00-7:30pm EST (Zoom)
Lecture Location: Jacobs Hall, 6 Metrotech, Room 475
Readings: Course slides and papers
Suggested readings: Goodfellow, Ian, Yoshua Bengio, and Aaron Courville. Deep Learning. MIT Press, 2016. https://www.deeplearningbook.org/
Evaluation Breakdown:
Assignments (30%): three in total, each worth 10%
In-course quiz (15%)
In-course presentation (5%)
Midterm (25%)
Final project (25%)
Proposal (1 page): 5%
Final presentation: 10%
Final report: 10%
Late Submission Policy:
In-course Presentation Instructions
Contact:
When you email, please begin the subject line with one of three category tags to indicate the type of question: [Algorithm], [System], or [Others].
For other inquiries, personal matters, or emergencies, you can email me at sai.zhang@nyu.edu.
Date | Topic | Logistics |
--- | --- | --- |
Jan 23 | Lecture 1: Course Introduction, DNN Basics (Recording) | |
Jan 30 | Lecture 2: Convolutional Neural Networks | |
Feb 6 | Lecture 3: Intro to Transformers and Large Models | |
Feb 13 | Lecture 4: Neural Network Pruning | Assignment 1 out |
Feb 20 | Lecture 5: Neural Network Quantization | |
Feb 27 | Lecture 6: Distillation, Low-Rank Decomposition, and Neural Architecture Search | Assignment 1 due; Assignment 2 out |
Mar 6 | Lecture 7: Efficient Algorithms for Large Models | |
Mar 13 | Lecture 8: Efficient Neural Network Training | Assignment 2 due; Assignment 3 out |
Mar 20 | Spring Break | Project proposal due |
Mar 27 | In-class midterm exam | |
Apr 3 | Lecture 9: Distributed Machine Learning for Training and Inference, ML Compiler | Assignment 3 due |
Apr 10 | Lecture 10: CNN Dataflow & Hardware Accelerators | |
Apr 17 | Lecture 11: Transformer & LLM Accelerators | |
Apr 24 | Lecture 12: Hardware Accelerators for DNN Training | |
May 1 | Lecture 13: New Computation Paradigms | |
May 8 | Final Presentation | |