Chao Zhang

Assistant Professor
School of Computational Science and Engineering
College of Computing
Georgia Institute of Technology

Office: CODA 1309
Address: 756 W Peachtree St NW, Atlanta, GA 30308
Email: chaozhang@gatech.edu

Research

My research focuses on developing machine learning and data-driven models to address practical and challenging problems in science and engineering. I am particularly interested in the following topics:

  • Large Language Models – Studying how to use language as a general interface and computational medium for solving different tasks.
  • Learning from Weak Supervision – Teaching machines to learn from incomplete and limited data.
  • Uncertainty Quantification and Decision Making – Developing machine learning models that can handle and account for uncertainties to make informed decisions.
  • Spatiotemporal Dynamics and Design – Using machine learning to simulate and forecast spatiotemporal dynamics (e.g., molecular simulation) and optimize and design spatiotemporal systems.

On the application side, I am passionate about interdisciplinary research and enjoy developing data-driven solutions to accelerate scientific discovery through close collaboration with domain experts. The techniques I develop are motivated by applications in materials science, biomedical science, transportation, and public health.

Acknowledgment: My work has been generously supported by research funding and gifts from NSF (IIS CAREER-2144338, IIS-2106961, IIS-2008334), ONR MURI, Kolon, Home Depot, and Adobe. My work has also been recognized by an NSF CAREER Award, a Facebook Faculty Award, an Amazon AWS Machine Learning Research Award, a Google Faculty Research Award, a Kolon Faculty Fellowship, an ACM SIGKDD Dissertation Runner-up Award, and several paper awards from IMWUT (UbiComp), ECML/PKDD, and ML4H.

Projects

Below are the main research projects in my group and some recent representative works:

Publications

(* denotes equal contribution)

2023

  • Local Boosting for Weakly-Supervised Learning
    Rongzhi Zhang, Yue Yu, Jiaming Shen, Xiquan Cui, Chao Zhang
    ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), 2023
  • DyGen: Fine-Tuning Language Models with Noisy Labels by Dynamics-Enhanced Generative Modeling
    Yuchen Zhuang, Yue Yu, Lingkai Kong, Xiang Chen, Chao Zhang
    ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), 2023
  • When Rigidity Hurts: Soft Consistency Regularization for Probabilistic Hierarchical Time Series Forecasting
    Harshavardhan Kamarthi, Lingkai Kong, Alexander Rodríguez, Chao Zhang, B. Aditya Prakash
    ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), 2023
  • Cold-start Data Selection for Better Few-shot Fine-tuning of Pretrained Language Models
    Yue Yu, Rongzhi Zhang, Ran Xu, Jieyu Zhang, Jiaming Shen and Chao Zhang
    Annual Meeting of the Association for Computational Linguistics (ACL), 2023
  • Zero-Shot Text Classification by Training Data Creation with Progressive Dense Retrieval
    Yue Yu, Yuchen Zhuang, Rongzhi Zhang, Yu Meng, Jiaming Shen and Chao Zhang
    Findings of Annual Meeting of the Association for Computational Linguistics (ACL), 2023
  • Graph Reasoning for Question Answering with Triplet Retrieval
    Shiyang Li, Yifan Gao, Haoming Jiang, Qingyu Yin, Zheng Li, Xifeng Yan, Chao Zhang and Bing Yin
    Findings of Annual Meeting of the Association for Computational Linguistics (ACL), 2023
  • Context-Aware Query Rewriting for Improving Users' Search Experience on E-commerce Websites
    Simiao Zuo, Qingyu Yin, Haoming Jiang, Shaohui Xi, Bing Yin, Chao Zhang, Tuo Zhao
    Annual Meeting of the Association for Computational Linguistics (ACL), 2023
  • Extracting Shopping Interest-Related Product Types from the Web
    Yinghao Li, Colin Lockard, Prashant Shiralkar and Chao Zhang
    Findings of Annual Meeting of the Association for Computational Linguistics (ACL), 2023
  • Autoregressive Diffusion Model for Graph Generation
    Lingkai Kong, Jiaming Cui, Haotian Sun, Yuchen Zhuang, B. Aditya Prakash, Chao Zhang
    International Conference on Machine Learning (ICML), 2023
  • SMURF-THP: Score Matching-based UnceRtainty quantiFication for Transformer Hawkes Process
    Zichong Li, Yanbo Xu, Simiao Zuo, Haoming Jiang, Chao Zhang, Tuo Zhao, Hongyuan Zha
    International Conference on Machine Learning (ICML), 2023
  • Unsupervised Event Chain Mining from Multiple Documents
    Yizhu Jiao, Ming Zhong, Jiaming Shen, Yunyi Zhang, Chao Zhang and Jiawei Han
    The Web Conference (WWW), 2023
  • Mutually-paced Knowledge Distillation for Cross-lingual Temporal Knowledge Graph Reasoning
    Ruijie Wang, Zheng Li, Jingfeng Yang, Tianyu Cao, Chao Zhang, Bing Yin, Tarek Abdelzaher
    The Web Conference (WWW), 2023
  • Neighborhood-regularized Self-Training for Learning with Few Labels
    Ran Xu, Yue Yu, Hejie Cui, Xuan Kan, Yanqiao Zhu, Joyce C. Ho, Chao Zhang and Carl Yang
    AAAI Conference on Artificial Intelligence (AAAI), 2023
  • A General-Purpose Material Property Data Extraction Pipeline from Large Polymer Corpora Using Natural Language Processing
    Pranav Shetty, Arunkumar Chitteth Rajan, Christopher Kuenneth, Sonakshi Gupta, Lakshmi Prerana Panchumarti, Lauren Holm, Chao Zhang, Rampi Ramprasad
    npj Computational Materials 9(52), 2023

Awards

  • 2022 ML4H Outstanding Paper Award
  • 2022 NSF CAREER Award
  • 2021 Facebook Faculty Research Award
  • 2021 Kolon Faculty Fellowship
  • 2020 Amazon AWS Machine Learning Research Award
  • 2020 Google Faculty Research Award
  • 2019 ACM SIGKDD Dissertation Award Runner-up
  • 2018 ACM IMWUT Distinguished Paper Award
  • 2015 ECML/PKDD Best Student Paper Runner-up Award
  • 2013 Chiang Chen Overseas Graduate Fellowship

Software

  • SDE-Net: Efficient uncertainty estimation for deep neural networks
  • CHMM: BERT-conditional hidden Markov model for multi-source weakly-supervised learning
  • COSINE: Language model fine-tuning with weak supervision
  • BOND: Distantly-supervised named entity recognition
  • STEAM: Automatic taxonomy expansion
  • TaxoGen: Unsupervised topic taxonomy construction from text corpus
  • WestClass: Weakly-supervised text classification
  • GeoBurst: Unsupervised spatiotemporal event detection

Teaching

Students

Prospective students: I am always looking for strong and motivated students to join our group. If you are interested in working with me, you can either email me or fill out this form.

Current:

  • Rui Feng: Ph.D. Student in CS
  • Lingkai Kong: Ph.D. Student in CSE
  • Yinghao Li: Ph.D. Student in ML
  • Haorui Wang: Ph.D. Student in CSE
  • Kuan Wang: Ph.D. Student in CSE
  • Yue Yu: Ph.D. Student in CSE
  • Rongzhi Zhang: Ph.D. Student in ML
  • Yuchen Zhuang: Ph.D. Student in ML
  • Binghong Chen: Ph.D. Student in CSE (co-advised with Prof. Le Song)
  • Pranav Shetty: Ph.D. Student in ML (JP Morgan AI Ph.D. Fellowship, co-advised with Prof. Rampi Ramprasad)
  • Vidit Jain: M.S. Student in CS
  • Mukund Rungta: M.S. Student in CS
  • Junyang Zhang: M.S. Student in CS
  • Haotian Sun: M.S. Student in CS

Alumni:

  • Yanbo Xu: Ph.D., Graduated in 2023 (First Employment: Microsoft Research)
  • Piyush Patil: M.S. in CS
  • Mengyang Liu: M.S. in CSE
  • Isaac Rehg: M.S. in CS
  • Wendi Ren: M.S. in CSE
  • Ruijia Wang: M.S. in CSE
  • Yi Rong: Visiting Ph.D. Student