CSE8803 DLT: Deep Learning for Text Data (2024 Fall)

Logistics

Instructor: Chao Zhang
Teaching Assistants: Yinghao Li (yinghaoli@gatech.edu); Haorui Wang (hwang984@gatech.edu)
Piazza: https://piazza.com/gatech/fall2024/cse8803dlt
Office Hours:
- Instructor Office Hour: Tue 12:15-1PM, Open area outside Klaus 1447
- TA Office Hour: Thu 12:15-1PM, Klaus 3121

Learning Objective

This course will introduce state-of-the-art machine learning techniques for mainstay problems in text data analysis, with particular emphasis on deep learning methods and large language models that have recently achieved enormous success. Students will learn about trending problems in this field, key methods for solving these problems, and their advantages and disadvantages. Students are also expected to read, present, and discuss research papers, as well as conduct a research-oriented course project.

By the end of this course, students should be able to formulate the text analysis problems at hand, choose appropriate models for them, and even come up with innovative solutions to open research problems in this field. The course will be useful for students who want to solve practical problems involving text data, and also for those who want to do cutting-edge research in text mining, natural language processing, artificial intelligence, and NLP-driven interdisciplinary areas.

Prerequisites for this course: (1) students should be familiar with machine learning and have taken a relevant course before (e.g., CX4240, CSE6740, CS4641); (2) students should be comfortable reading research papers and giving presentations; (3) students should have solid programming skills, as the course project can be programming-intensive.

Schedule

Date	Topic	Presentation	Due
08/20/2024	Course Overview		Piazza Signup; Paper Pickup
08/22/2024	Machine Learning Review
08/27/2024	Embedding & Representation Learning		Paper Presentation Signup Open Aug 27
08/29/2024	Project Guideline and Examples		Paper Presentation Signup Close
09/03/2024	Module 1: Transformers
09/05/2024	Attention & Transformer	P1 and P2	HW1 Out
09/10/2024	Mixture-of-Experts	P3 and P4
09/12/2024	Fast Attention	P5 and P6
09/17/2024	Module 2: Language Model Pre-Training		HW1 Due
09/19/2024	Encoder-Only & Encoder-Decoder (BERT, T5)	P7 and P8
09/24/2024	Decoder-Only (GPT-3, LLaMA)	P9 and P10
09/26/2024	Scalable Training	P11 and P12	HW2 Out
10/01/2024	Scalable Inference	P13 and P14
10/03/2024	Module 3: LLM Instruction Fine-Tuning
10/08/2024	Prompting Techniques	P15 and P16
10/10/2024	Project Checkpoint		HW2 Due
10/15/2024	No Class (Fall Break)
10/17/2024	Instruction Fine-Tuning	P17 and P18
10/22/2024	Efficient Fine-Tuning	P19 and P20	HW3 Out
10/24/2024	Module 4: LLM Alignment
10/29/2024	Reward Modeling	P21 and P22
10/31/2024	RLHF Algorithms	P23 and P24
11/05/2024	RL from AI Feedback	P25 and P26	Project Pre-Signup Open
11/07/2024	Module 5: Multimodal LLM & LLM Agent		HW3 Due
11/12/2024	Multimodal LLMs	P27 and P28	Project Signup Open Nov 11
11/14/2024	LLM Agents	P29 and P30
11/19/2024	Project Presentations
11/21/2024	Project Presentations
11/26/2024	Project Presentations
11/28/2024	No Class
12/03/2024	No Class
12/08/2024			Project Report Due

Disclaimer: The instructor reserves the right to modify the planned schedule and grading policy as needed during the course.

Grading

Homework (30%)

There will be three assignments. Each one will test your understanding of the taught methods or the presented papers.
Late policy: Assignments are due at 11:59 PM on the due date. You are allowed 2 total late days without penalty for the entire semester. Once those days are used, late submissions are penalized as follows:
- Homework is worth full credit before the due time.
- It is worth 75% credit for the next 24 hours.
- It is worth 50% credit for the 24 hours after that.
- It is worth zero credit after that.
Follow the Georgia Tech Academic Honor Code.
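The late policy above can be read as a simple lookup on the number of penalized 24-hour periods. The following is only an illustrative sketch, not an official grading tool. The function name `homework_credit` and its parameters are hypothetical, and it assumes that each free late day covers one full 24-hour period and that free late days are applied before any penalty; the instructor's actual accounting may differ.

```python
import math


def homework_credit(hours_late: float, free_late_days_left: int = 0) -> float:
    """Return the credit multiplier (0.0 to 1.0) for one submission.

    hours_late: hours past the 11:59 PM deadline (<= 0 means on time).
    free_late_days_left: unused penalty-free late days (assumption: each
        one covers a full 24-hour period and is applied automatically).
    """
    if hours_late <= 0:
        return 1.0  # submitted before the due time: full credit

    # Any started 24-hour period counts as a whole late day (assumption).
    days_late = math.ceil(hours_late / 24)

    # Free late days absorb late periods before any penalty applies.
    penalized_days = max(0, days_late - free_late_days_left)

    if penalized_days == 0:
        return 1.0
    if penalized_days == 1:
        return 0.75  # first penalized 24 hours
    if penalized_days == 2:
        return 0.50  # second penalized 24 hours
    return 0.0       # anything later receives zero credit
```

For example, under these assumptions a submission 30 hours late with no free late days remaining would receive 50% credit, while the same submission with both free late days still available would receive full credit. The sketch does not track how many free late days a submission consumes; that bookkeeping is left out.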

Paper Presentation (25%)

We have 30 papers to study, and you will need to pick one paper from the list to present. The paper list and presentation sign-up sheet are available here (sign-up opens on Aug 27 at 3 PM ET).
Each presentation is 20 minutes, plus 10 minutes for Q&A and discussion. Each presentation can be done by up to three presenters.
You need to post your slides by 9pm EST the night before your presentation.
The presentation will be graded by the instructor according to the following criteria: quality of slides, clarity of presentation, and handling of questions. Your presentation should cover at least the following aspects: 1) What is the problem and background? 2) What are the main challenges of the problem? 3) How does the proposed method work? 4) What are the experimental results and observations?
If you miss your presentation, you will receive zero credit.
Useful tips for presentation:
- Presentation Tips by Jeff Radel
- Oral Presentation Advice by Mark D. Hill

Project (45%)

You need to complete a project on deep learning for text data. Your project needs to be clear about 1) the problem you are attempting to solve; 2) a survey of existing literature on the problem; 3) the technical method you propose to solve the problem; 4) the results and conclusions you obtain.

Each project needs to be completed in a team of 2-5 people. Team members need to clearly state their individual contributions in the project report.
You will need to do the following:
- Presentation (20%): group project presentation
- Final report (25%): a complete and final project report
The presentation schedule is available here (sign-up opens Nov 16 at 3 PM).
Here are some project guidelines and resources you may find useful.

More Resources

Speech and Language Processing, by Dan Jurafsky and James H. Martin
Deep Learning for NLP
Deep Learning, by Ian Goodfellow, Yoshua Bengio, and Aaron Courville
Dive into Deep Learning, by Aston Zhang, Zack C. Lipton, Mu Li, and Alex Smola

Other resources, such as deep learning toolboxes and datasets, will be provided throughout the course.