Jiawei Liu

I’m a Ph.D. candidate (2021-Present) in Programming Languages, Formal Methods, and Software Engineering at the University of Illinois Urbana-Champaign, advised by Lingming Zhang. My research aims to improve software quality and developer productivity.

📬 Shortest path to find me: jiawei6@illinois.edu

Papers

  1. ISSTA’25 / Productively Deploying Emerging Models on Emerging Platforms: A Top-Down Approach for Testing and Debugging
    Proceedings of the 34th ACM SIGSOFT International Symposium on Software Testing and Analysis. 2025 (To Appear)
  2. ICLR’25 Oral / BigCodeBench: Benchmarking Code Generation with Diverse Function Calls and Complex Instructions
    Terry Yue Zhuo, Vu Minh Chien, Jenny Chim, Han Hu, Wenhao Yu, Ratnadira Widyasari, Imam Nur Bani Yusuf, Haolan Zhan, Junda He, Indraneil Paul and 23 more authors
    The Thirteenth International Conference on Learning Representations. 2025
  3. NeurIPS’24 / SelfCodeAlign: Self-Alignment for Code Generation
    Yuxiang Wei, Federico Cassano, Jiawei Liu, Yifeng Ding, Naman Jain, Zachary Mueller, Harm Vries, Leandro Von Werra, Arjun Guha, Lingming Zhang
    The Thirty-eighth Annual Conference on Neural Information Processing Systems. 2024
  4. Pre-print / Learning Code Preference via Synthetic Evolution
    arXiv preprint arXiv:2410.03837. 2024
  5. COLM’24 / Evaluating Language Models for Efficient Code Generation
    Jiawei Liu, Songrun Xie, Junhao Wang, Yuxiang Wei, Yifeng Ding, Lingming Zhang
    First Conference on Language Modeling. 2024
  6. OOPSLA’24 / WhiteFox: White-Box Compiler Fuzzing Empowered by Large Language Models
    Proc. ACM Program. Lang. 8 (OOPSLA2). Oct 2024
  7. ICML’24 / Magicoder: Empowering Code Generation with OSS-Instruct
    Forty-first International Conference on Machine Learning. Oct 2024
    Adopted by Meta Llama 3.1, Google CodeGemma, and IBM Granite
  8. ACL’24 / XFT: Unlocking the Power of Code Instruction Tuning by Simply Merging Upcycled Mixture-of-Experts
    Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Aug 2024
  9. Pre-print / StarCoder 2 and The Stack v2: The Next Generation
    Anton Lozhkov, Raymond Li, Loubna Ben Allal, Federico Cassano, Joel Lamy-Poirier, Nouamane Tazi, Ao Tang, Dmytro Pykhtar, Jiawei Liu, Yuxiang Wei and 56 more authors
    arXiv preprint arXiv:2402.19173. Aug 2024
  10. NeurIPS’23 / Is Your Code Generated by ChatGPT Really Correct? Rigorous Evaluation of Large Language Models for Code Generation
    Thirty-seventh Conference on Neural Information Processing Systems. Aug 2023
    Over 700k HuggingFace downloads; integrated by various industries
  11. FSE’23 / NeuRI: Diversifying DNN Generation via Inductive Rule Inference
    Proceedings of the 31st ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering. Aug 2023
    🏆 ACM SIGSOFT Distinguished Paper Award
  12. ASPLOS’23 / NNSmith: Generating Diverse and Valid Test Cases for Deep Learning Compilers
    Proceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 2. Aug 2023
    🏆 Distinguished Artifact Award
  13. OOPSLA’22 / Coverage-guided tensor compiler fuzzing with joint IR-pass mutation
    Proceedings of the ACM on Programming Languages 6 (OOPSLA1). Apr 2022
  14. MM’21 OSC / Fast and Flexible Human Pose Estimation with HyperPose
    Yixiao Guo*, Jiawei Liu*, Guo Li*, Luo Mai, Hao Dong
    Proceedings of the 29th ACM International Conference on Multimedia. Apr 2021

Invited Talks

NLP+SE Seminar, UT Austin: Smelling the Quality of LLM-generated Code Mar 2025

Programming Systems, Uber: Evaluating LLMs for Correct & Efficient Code Generation Sep 2024

ARiSE Lab, Columbia University: Simplify the Making of Great Software in the ML Era Apr 2024

Snowflake GenAI: Rigorous Evaluation of LLMs for Code (Slides) Feb 2024

AST Lab, ETH Zürich: Generating Test-Cases for ML Compilers (Slides) Jan 2024

GAI4SE, NC State University: LLMs for Software Testing (Guest Lecture) Nov 2023

Apache TVM Conference: Automating DL Compiler Bug Finding with NNSmith Mar 2023

SAMPL, University of Washington: Coverage-Guided Tensor Compiler Fuzzing (Slides) May 2022

Service

Organizing: LLM4Code@ICSE'{24,25} (Publicity Chair)

Program Committee/Reviewer: ASE'24, TSE, TOSEM, NeurIPS'24, ICLR'25

Artifact Evaluation Committee: PLDI'23, OSDI'22, ATC'22

Teaching

CS 427 by Darko Marinov Spring 2025