Jiawei Liu

I’m a Ph.D. candidate (2021-Present) in Programming Languages, Formal Methods, and Software Engineering at the University of Illinois Urbana-Champaign, advised by Lingming Zhang. My research aims to improve software quality and developer productivity.

📬 Shortest path to find me: jiawei6@illinois.edu

Papers

ISSTA’25 / Productively Deploying Emerging Models on Emerging Platforms: A Top-Down Approach for Testing and Debugging

Siyuan Feng*, Jiawei Liu*, Ruihang Lai, Charlie F. Ruan, Yong Yu, Lingming Zhang, Tianqi Chen

Proceedings of the 34th ACM SIGSOFT International Symposium on Software Testing and Analysis. 2025 (To Appear)

@inproceedings{feng2025productively,
  title = {Productively Deploying Emerging Models on Emerging Platforms: A Top-Down Approach for Testing and Debugging},
  author = {Feng, Siyuan and Liu, Jiawei and Lai, Ruihang and Ruan, Charlie F. and Yu, Yong and Zhang, Lingming and Chen, Tianqi},
  booktitle = {Proceedings of the 34th ACM SIGSOFT International Symposium on Software Testing and Analysis},
  year = {2025},
}

ICLR’25 Oral / BigCodeBench: Benchmarking Code Generation with Diverse Function Calls and Complex Instructions

Terry Yue Zhuo, Vu Minh Chien, Jenny Chim, Han Hu, Wenhao Yu, Ratnadira Widyasari, Imam Nur Bani Yusuf, Haolan Zhan, Junda He, Indraneil Paul and 23 more authors

The Thirteenth International Conference on Learning Representations. 2025

@inproceedings{zhuo2025bigcodebench,
  title = {BigCodeBench: Benchmarking Code Generation with Diverse Function Calls and Complex Instructions},
  author = {Zhuo, Terry Yue and Chien, Vu Minh and Chim, Jenny and Hu, Han and Yu, Wenhao and Widyasari, Ratnadira and Yusuf, Imam Nur Bani and Zhan, Haolan and He, Junda and Paul, Indraneil and Brunner, Simon and Gong, Chen and Hoang, James and Zebaze, Armel Randy and Hong, Xiaoheng and Li, Wen-Ding and Kaddour, Jean and Xu, Ming and Zhang, Zhihan and Yadav, Prateek and Jain, Naman and Gu, Alex and Cheng, Zhoujun and Liu, Jiawei and Liu, Qian and Wang, Zijian and Lo, David and Hui, Binyuan and Muennighoff, Niklas and Fried, Daniel and Du, Xiaoning and de Vries, Harm and von Werra, Leandro},
  booktitle = {The Thirteenth International Conference on Learning Representations},
  year = {2025},
  url = {https://openreview.net/forum?id=YrycTjllL0},
}

NeurIPS’24 / SelfCodeAlign: Self-Alignment for Code Generation

Yuxiang Wei, Federico Cassano, Jiawei Liu, Yifeng Ding, Naman Jain, Zachary Mueller, Harm de Vries, Leandro von Werra, Arjun Guha, Lingming Zhang

The Thirty-eighth Annual Conference on Neural Information Processing Systems. 2024

@inproceedings{wei2024selfcodealign,
  title = {SelfCodeAlign: Self-Alignment for Code Generation},
  author = {Wei, Yuxiang and Cassano, Federico and Liu, Jiawei and Ding, Yifeng and Jain, Naman and Mueller, Zachary and de Vries, Harm and von Werra, Leandro and Guha, Arjun and Zhang, Lingming},
  booktitle = {The Thirty-eighth Annual Conference on Neural Information Processing Systems},
  year = {2024},
  url = {https://openreview.net/forum?id=xXRnUU7xTL},
}

Pre-print / Learning Code Preference via Synthetic Evolution

Jiawei Liu, Thanh Nguyen, Mingyue Shang, Hantian Ding, Xiaopeng Li, Yu Yu, Varun Kumar, Zijian Wang

arXiv preprint arXiv:2410.03837. 2024

@article{liu2024learning,
  title = {Learning Code Preference via Synthetic Evolution},
  author = {Liu, Jiawei and Nguyen, Thanh and Shang, Mingyue and Ding, Hantian and Li, Xiaopeng and Yu, Yu and Kumar, Varun and Wang, Zijian},
  journal = {arXiv preprint arXiv:2410.03837},
  year = {2024},
}

COLM’24 / Evaluating Language Models for Efficient Code Generation

Jiawei Liu, Songrun Xie, Junhao Wang, Yuxiang Wei, Yifeng Ding, Lingming Zhang

First Conference on Language Modeling. 2024

@inproceedings{liu2024evaluating,
  title = {Evaluating Language Models for Efficient Code Generation},
  author = {Liu, Jiawei and Xie, Songrun and Wang, Junhao and Wei, Yuxiang and Ding, Yifeng and Zhang, Lingming},
  booktitle = {First Conference on Language Modeling},
  year = {2024},
  url = {https://openreview.net/forum?id=IBCBMeAhmC},
}

OOPSLA’24 / WhiteFox: White-Box Compiler Fuzzing Empowered by Large Language Models

Chenyuan Yang, Yinlin Deng, Runyu Lu, Jiayi Yao, Jiawei Liu, Reyhaneh Jabbarvand, Lingming Zhang

Proc. ACM Program. Lang. 8 (OOPSLA2). Oct 2024

@article{yang2023white,
  author = {Yang, Chenyuan and Deng, Yinlin and Lu, Runyu and Yao, Jiayi and Liu, Jiawei and Jabbarvand, Reyhaneh and Zhang, Lingming},
  title = {WhiteFox: White-Box Compiler Fuzzing Empowered by Large Language Models},
  year = {2024},
  issue_date = {October 2024},
  publisher = {Association for Computing Machinery},
  address = {New York, NY, USA},
  volume = {8},
  number = {OOPSLA2},
  url = {https://doi.org/10.1145/3689736},
  doi = {10.1145/3689736},
  journal = {Proc. ACM Program. Lang.},
  month = oct,
  articleno = {296},
  numpages = {27},
}

ICML’24 / Magicoder: Empowering Code Generation with OSS-Instruct

Yuxiang Wei, Zhe Wang, Jiawei Liu, Yifeng Ding, Lingming Zhang

Forty-first International Conference on Machine Learning. 2024

Adopted by Meta Llama 3.1, Google CodeGemma, and IBM Granite

@inproceedings{wei2023magic,
  title = {Magicoder: Empowering Code Generation with {OSS}-Instruct},
  author = {Wei, Yuxiang and Wang, Zhe and Liu, Jiawei and Ding, Yifeng and Zhang, Lingming},
  booktitle = {Forty-first International Conference on Machine Learning},
  year = {2024},
  url = {https://openreview.net/forum?id=XUeoOBid3x},
}

ACL’24 / XFT: Unlocking the Power of Code Instruction Tuning by Simply Merging Upcycled Mixture-of-Experts

Yifeng Ding, Jiawei Liu, Yuxiang Wei, Lingming Zhang

Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Aug 2024

@inproceedings{ding2024xft,
  title = {{XFT}: Unlocking the Power of Code Instruction Tuning by Simply Merging Upcycled Mixture-of-Experts},
  author = {Ding, Yifeng and Liu, Jiawei and Wei, Yuxiang and Zhang, Lingming},
  editor = {Ku, Lun-Wei and Martins, Andre and Srikumar, Vivek},
  booktitle = {Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)},
  month = aug,
  year = {2024},
  address = {Bangkok, Thailand},
  publisher = {Association for Computational Linguistics},
  url = {https://aclanthology.org/2024.acl-long.699},
  doi = {10.18653/v1/2024.acl-long.699},
  pages = {12941--12955},
}

Pre-print / StarCoder 2 and The Stack v2: The Next Generation

Anton Lozhkov, Raymond Li, Loubna Ben Allal, Federico Cassano, Joel Lamy-Poirier, Nouamane Tazi, Ao Tang, Dmytro Pykhtar, Jiawei Liu, Yuxiang Wei and 56 more authors

arXiv preprint arXiv:2402.19173. 2024

@article{Lozhkov2024StarCoder2A,
  title = {StarCoder 2 and The Stack v2: The Next Generation},
  author = {Lozhkov, Anton and Li, Raymond and Allal, Loubna Ben and Cassano, Federico and Lamy-Poirier, Joel and Tazi, Nouamane and Tang, Ao and Pykhtar, Dmytro and Liu, Jiawei and Wei, Yuxiang and Liu, Tianyang and Tian, Max and Kocetkov, Denis and Zucker, Arthur and Belkada, Younes and Wang, Zijian and Liu, Qian and Abulkhanov, Dmitry and Paul, Indraneil and Li, Zhuang and Li, Wen-Ding and Risdal, Megan L. and Li, Jia and Zhu, Jian and Zhuo, Terry Yue and Zheltonozhskii, Evgenii and Dade, Nii Osae Osae and Yu, Wenhao and Krauss, Lucas and Jain, Naman and Su, Yixuan and He, Xuanli and Dey, Manan and Abati, Edoardo and Chai, Yekun and Muennighoff, Niklas and Tang, Xiangru and Oblokulov, Muhtasham and Akiki, Christopher and Marone, Marc and Mou, Chenghao and Mishra, Mayank and Gu, Alexander and Hui, Binyuan and Dao, Tri and Zebaze, Armel and Dehaene, Olivier and Patry, Nicolas and Xu, Canwen and McAuley, Julian and Hu, Han and Scholak, Torsten and Paquet, S{\'e}bastien and Robinson, Jennifer and Anderson, Carolyn Jane and Chapados, Nicolas and Patwary, Mostofa and Tajbakhsh, Nima and Jernite, Yacine and Ferrandis, Carlos Mu{\~n}oz and Zhang, Lingming and Hughes, Sean and Wolf, Thomas and Guha, Arjun and von Werra, Leandro and de Vries, Harm},
  journal = {arXiv preprint arXiv:2402.19173},
  year = {2024},
}

NeurIPS’23 / Is Your Code Generated by ChatGPT Really Correct? Rigorous Evaluation of Large Language Models for Code Generation

Jiawei Liu*, Chunqiu Steven Xia*, Yuyao Wang, Lingming Zhang

Thirty-seventh Conference on Neural Information Processing Systems. 2023

Over 700k Hugging Face downloads; integrated across industry

@inproceedings{liu2023is,
  title = {Is Your Code Generated by Chat{GPT} Really Correct? Rigorous Evaluation of Large Language Models for Code Generation},
  author = {Liu, Jiawei and Xia, Chunqiu Steven and Wang, Yuyao and Zhang, Lingming},
  booktitle = {Thirty-seventh Conference on Neural Information Processing Systems},
  year = {2023},
  url = {https://openreview.net/forum?id=1qvx610Cu7},
}

FSE’23 / NeuRI: Diversifying DNN Generation via Inductive Rule Inference

Jiawei Liu, Jinjun Peng, Yuyao Wang, Lingming Zhang

Proceedings of the 31st ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering. 2023

🏆 ACM SIGSOFT Distinguished Paper Award

@inproceedings{liu2023neuri,
  title = {NeuRI: Diversifying DNN Generation via Inductive Rule Inference},
  author = {Liu, Jiawei and Peng, Jinjun and Wang, Yuyao and Zhang, Lingming},
  year = {2023},
  isbn = {9798400703270},
  publisher = {Association for Computing Machinery},
  address = {New York, NY, USA},
  url = {https://doi.org/10.1145/3611643.3616337},
  doi = {10.1145/3611643.3616337},
  booktitle = {Proceedings of the 31st ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering},
  pages = {657--669},
  numpages = {13},
  location = {San Francisco, CA, USA},
  series = {ESEC/FSE 2023},
}

ASPLOS’23 / NNSmith: Generating Diverse and Valid Test Cases for Deep Learning Compilers

Jiawei Liu*, Jinkun Lin*, Fabian Ruffy, Cheng Tan, Jinyang Li, Aurojit Panda, Lingming Zhang

Proceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 2. 2023

🏆 Distinguished Artifact Award

@inproceedings{liu2023nnsmith,
  title = {NNSmith: Generating Diverse and Valid Test Cases for Deep Learning Compilers},
  author = {Liu, Jiawei and Lin, Jinkun and Ruffy, Fabian and Tan, Cheng and Li, Jinyang and Panda, Aurojit and Zhang, Lingming},
  year = {2023},
  isbn = {9781450399166},
  publisher = {Association for Computing Machinery},
  address = {New York, NY, USA},
  url = {https://doi.org/10.1145/3575693.3575707},
  doi = {10.1145/3575693.3575707},
  booktitle = {Proceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 2},
  pages = {530--543},
  numpages = {14},
  keywords = {Deep Learning Compilers, Compiler Testing, Fuzzing},
  location = {Vancouver, BC, Canada},
  series = {ASPLOS 2023},
}

OOPSLA’22 / Coverage-guided tensor compiler fuzzing with joint IR-pass mutation

Jiawei Liu, Yuxiang Wei, Sen Yang, Yinlin Deng, Lingming Zhang

Proceedings of the ACM on Programming Languages 6 (OOPSLA1). Apr 2022

@article{liu2022coverage,
  title = {Coverage-guided tensor compiler fuzzing with joint IR-pass mutation},
  author = {Liu, Jiawei and Wei, Yuxiang and Yang, Sen and Deng, Yinlin and Zhang, Lingming},
  journal = {Proceedings of the ACM on Programming Languages},
  volume = {6},
  number = {OOPSLA1},
  pages = {1--26},
  year = {2022},
  publisher = {Association for Computing Machinery},
  address = {New York, NY, USA},
  url = {https://doi.org/10.1145/3527317},
  doi = {10.1145/3527317},
  month = apr,
  articleno = {73},
}

MM’21 OSC / Fast and Flexible Human Pose Estimation with HyperPose

Yixiao Guo*, Jiawei Liu*, Guo Li*, Luo Mai, Hao Dong

Proceedings of the 29th ACM International Conference on Multimedia. 2021

@inproceedings{guo2021fast,
  author = {Guo, Yixiao and Liu, Jiawei and Li, Guo and Mai, Luo and Dong, Hao},
  title = {Fast and Flexible Human Pose Estimation with HyperPose},
  year = {2021},
  isbn = {9781450386517},
  publisher = {Association for Computing Machinery},
  address = {New York, NY, USA},
  url = {https://doi.org/10.1145/3474085.3478325},
  doi = {10.1145/3474085.3478325},
  booktitle = {Proceedings of the 29th ACM International Conference on Multimedia},
  pages = {3763--3766},
  numpages = {4},
  keywords = {computer vision, high-performance computing, pose estimation},
  location = {Virtual Event, China},
  series = {MM '21},
}

Awards & Honors

Illinois Innovation Award ($20K) 2025

Amazon AICE Ph.D. Fellowship ($70K) 2025

Machine Learning and Systems Rising Stars 2024

Warren W. Yee Memorial Fellowship, University of Illinois 2024

ACM SIGSOFT Distinguished Paper Award (FSE'23) 2023

Distinguished Artifact Award (ASPLOS'23) 2023

Invited Talks

NLP+SE Seminar, UT Austin: Smelling the Quality of LLM-generated Code Mar 2025

Programming Systems, Uber: Evaluating LLMs for Correct & Efficient Code Generation Sep 2024

ARiSE Lab, Columbia University: Simplify the Making of Great Software in the ML Era Apr 2024

Snowflake GenAI: Rigorous Evaluation of LLMs for Code (Slides) Feb 2024

AST Lab, ETH Zürich: Generating Test-Cases for ML Compilers (Slides) Jan 2024

GAI4SE, NC State University: LLMs for Software Testing (Guest Lecture) Nov 2023

Apache TVM Conference: Automating DL Compiler Bug Finding with NNSmith Mar 2023

SAMPL, University of Washington: Coverage-Guided Tensor Compiler Fuzzing (Slides) May 2022

Service

Organizing: LLM4Code@ICSE'{24,25} (Publicity Chair)

Program Committee/Reviewer: ASE'24, TSE, TOSEM, NeurIPS'24, ICLR'25

Artifact Evaluation Committee: PLDI'23, OSDI'22, ATC'22

Teaching

CS 427 by Darko Marinov Spring 2025