About me

I am a second-year Ph.D. student in Computer Science at the Institute of Computing Technology, Chinese Academy of Sciences, advised by Prof. Shenghua Liu and Prof. Yiwei Wang. I am a member of the CAS Key Laboratory of AI Security. My research interests lie in the general area of trustworthy large language models.

I completed a research internship with the MSRA GenAI group and subsequently interned with the Qwen Team, where I worked on pretraining foundation models and served as a core contributor to the latest Qwen backbone model.

I am currently visiting the Language Technologies Institute (LTI) at Carnegie Mellon University (CMU) for a one-year research stay under the supervision of Prof. Chenyan Xiong.

🔥 News

  • 2025-08: Four papers accepted to EMNLP 2025
  • 2025-05: Five papers accepted to ACL 2025
  • 2025-05: One paper accepted to ICML 2025
  • 2025-01: Three papers accepted to ICLR 2025, WWW 2025, and NAACL 2025
  • 2024-11: One paper accepted to COLING 2025
  • 2024-09: Two papers accepted to EMNLP 2024
  • 2024-05: One paper accepted to ACL 2024

📚 Publications

You can also find my publications on my Google Scholar profile. * marks the corresponding author.

2025

  • Yutong Wang, Pengliang Ji, Kaixin Li, Baolong Bi, Tao Feng, Guillaume Sartoretti. Beyond Policy Optimization: A Data Curation Flywheel for Sparse-Reward Long-Horizon Planning [arxiv] [paper]

  • Hongcheng Gao, Zihao Huang, Lin Xu, Jingyi Tang, Xinhao Li, Yue Liu, Haoyang Li, Taihang Hu, Minhua Lin, Xinlong Yang, Ge Wu, Baolong Bi, Hongyu Chen, Wentao Zhang. Pixels, Patterns, but No Poetry: To See The World like Humans [arxiv] [paper] [page]

  • Lingrui Mei, Jiayu Yao, Yuyao Ge, Yiwei Wang, Baolong Bi, Yujun Cai, Jiazhi Liu, Mingyu Li, Zhong-Zhi Li, Duzhen Zhang, Chenlin Zhou, Jiayi Mao, Tianze Xia, Jiafeng Guo, Shenghua Liu. A Survey of Context Engineering for Large Language Models [arxiv] [paper] [github]

  • Baolong Bi, Shenghua Liu, Xingzhang Ren, Dayiheng Liu, Junyang Lin, Yiwei Wang, Lingrui Mei, Junfeng Fang, Jiafeng Guo, Xueqi Cheng. RefineX: Learning to Refine Pre-training Data at Scale from Expert-Guided Programs [arxiv] [paper] [code]

  • Juan Chen, Baolong Bi*, Wei Zhang, Jingyan Sui, Xiaofei Zhu, Yuanzhuo Wang, Lingrui Mei, Shenghua Liu. Rethinking All Evidence: Enhancing Trustworthy Retrieval-Augmented Generation via Conflict-Driven Summarization [arxiv] [paper]

  • Jiayu Yao, Shenghua Liu, Yiwei Wang, Lingrui Mei, Baolong Bi, Yuyao Ge, Zhecheng Li, Xueqi Cheng. Who is in the Spotlight: The Hidden Bias Undermining Multimodal Retrieval-Augmented Generation [arxiv] [paper]

  • Zehao Li, Hao Jiang, Yujun Cai, Jianing Chen, Baolong Bi, Shuqin Gao, Honglong Zhao, Yiwei Wang, Tianlu Mao, Zhaoqi Wang. STDR: Spatio-Temporal Decoupling for Real-Time Dynamic Scene Rendering. [arxiv] [paper]

  • Zhong-Zhi Li, Duzhen Zhang, Ming-Liang Zhang, Jiaxin Zhang, Zengyan Liu, Yuxuan Yao, Haotian Xu, Junhao Zheng, Pei-Jie Wang, Xiuyi Chen, Yingying Zhang, Fei Yin, Jiahua Dong, Zhiwei Li, Baolong Bi, Lingrui Mei, Junfeng Fang, Zhijiang Guo, Le Song, Cheng-Lin Liu. From System 1 to System 2: A Survey of Reasoning Large Language Models. [arxiv] [paper] [github]

  • Cheng Wang, Yue Liu, Baolong Bi, Duzhen Zhang, Zhongzhi Li, Junfeng Fang. Safety in Large Reasoning Models: A Survey. [arxiv] [paper]

  • Lingrui Mei, Shenghua Liu, Yiwei Wang, Baolong Bi, Yuyao Ge, Jun Wan, Yurong Wu, Xueqi Cheng. a1: Steep Test-time Scaling Law via Environment Augmented Generation. [arxiv] [paper]

  • Yilong Xu, Jinhua Gao, Xiaoming Yu, Yuanhai Xue, Baolong Bi, Huawei Shen, Xueqi Cheng. Training a Utility-based Retriever Through Shared Context Attribution for Retrieval-Augmented Language Models. [arxiv] [paper]

  • Yue Liu, Jiaying Wu, Yufei He, Hongcheng Gao, Hongyu Chen, Baolong Bi, Jiaheng Zhang, Zhiqi Huang, Bryan Hooi. Efficient Inference for Large Reasoning Models: A Survey. [arxiv] [paper]

  • Hongcheng Gao, Jiashu Qu, Jingyi Tang, Baolong Bi, Yue Liu, Hongyu Chen, Li Liang, Li Su, Qingming Huang. Exploring Hallucination of Large Multimodal Models in Video Understanding: Benchmark, Analysis and Mitigation. [arxiv] [paper] [huggingface] [github]

  • Yuyao Ge, Shenghua Liu, Yiwei Wang, Lingrui Mei, Lizhe Chen, Baolong Bi, Xueqi Cheng. Innate Reasoning is Not Enough: In-Context Learning Enhances Reasoning Large Language Models with Less Overthinking. [arxiv] [paper]

  • Baolong Bi, Shenghua Liu, Yiwei Wang, Yilong Xu, Junfeng Fang, Lingrui Mei, Xueqi Cheng. Parameters vs. Context: Fine-Grained Control of Knowledge Reliance in Language Models. [arxiv] [paper] [github]

  • Shiyu Ni, Keping Bi, Jiafeng Guo, Lulu Yu, Baolong Bi, Xueqi Cheng. Towards Fully Exploiting LLM Internal States to Enhance Knowledge Boundary Perception. [arxiv] [paper]

  • Zherui Li, Houcheng Jiang, Hao Chen, Baolong Bi, Zhenhong Zhou, Fei Sun, Junfeng Fang, Xiang Wang. Reinforced Lifelong Editing for Language Models. [arxiv] [paper]

  • Tianyu Zhang, Junfeng Fang, Houcheng Jiang, Baolong Bi, Xiang Wang, Xiangnan He. Explainable and Efficient Editing for Large Language Models. [arxiv] [paper]

2024

  • Baolong Bi, Shaohan Huang, Yiwei Wang, Tianchi Yang, Zihan Zhang, Haizhen Huang, Lingrui Mei, Junfeng Fang, Zehao Li, Furu Wei. Context-DPO: Aligning Language Models for Context-Faithfulness. [page] [arxiv] [paper]

  • Zehao Li, Wenwei Han, Yujun Cai, Hao Jiang, Baolong Bi, Shuqin Gao, Honglong Zhao, Zhaoqi Wang. GradiSeg: Gradient-Guided Gaussian Segmentation with Enhanced 3D Boundary Precision. [arxiv] [paper]

  • Yuyao Ge, Shenghua Liu, Baolong Bi, Yiwei Wang, Lingrui Mei, Wenjie Feng, Lizhe Chen, Xueqi Cheng. Can Graph Descriptive Order Affect Solving Graph Problems with LLMs? [arxiv] [paper]

  • Lingrui Mei, Shenghua Liu, Yiwei Wang, Baolong Bi, Ruibin Yuan, Xueqi Cheng. HiddenGuard: Fine-Grained Safe Generation with Specialized Representation Router. [arxiv] [paper]

  • Baolong Bi, Shenghua Liu, Yiwei Wang, Lingrui Mei, Hongcheng Gao, Junfeng Fang, Xueqi Cheng. StruEdit: Structured Outputs Enable the Fast and Accurate Knowledge Editing for Large Language Models. [arxiv] [paper]

  • Yilong Xu, Jinhua Gao, Xiaoming Yu, Baolong Bi, Huawei Shen, Xueqi Cheng. ALiiCE: Evaluating Positional Fine-grained Citation Generation. [arxiv] [paper]

  • Baolong Bi, Shenghua Liu, Yiwei Wang, Lingrui Mei, Hongcheng Gao, Yilong Xu, Xueqi Cheng. Adaptive Token Biaser: Knowledge Editing via Biasing Key Entities. [arxiv] [paper]

  • Lingrui Mei, Shenghua Liu, Yiwei Wang, Baolong Bi, Jiayi Mao, Xueqi Cheng. “Not Aligned” is Not “Malicious”: Being Careful about Hallucinations of Large Language Models’ Jailbreak. [arxiv] [paper]

  • Baolong Bi, Shenghua Liu, Lingrui Mei, Yiwei Wang, Pengliang Ji, Xueqi Cheng. Decoding by Contrasting Knowledge: Enhancing LLMs’ Confidence on Edited Facts. [page] [arxiv] [paper]

  • Baolong Bi, Shenghua Liu, Yiwei Wang, Lingrui Mei, Xueqi Cheng. Is Factuality Enhancement a Free Lunch For LLMs? Better Factuality Can Lead to Worse Context-Faithfulness. [arxiv] [paper]

  • Baolong Bi, Shenghua Liu, Yiwei Wang, Lingrui Mei, Xueqi Cheng. LPNL: Scalable Link Prediction with Large Language Models. [arxiv] [paper]

  • Lingrui Mei, Shenghua Liu, Yiwei Wang, Baolong Bi, Xueqi Cheng. SLANG: New Concept Comprehension of Large Language Models. [arxiv] [paper]

📝 Services

  • Reviewer for
    • International Conference on Learning Representations (ICLR) 2025
    • Annual Meeting of the Association for Computational Linguistics (ACL) 2024, 2025 (ARR 2024 December; ARR 2025 February, May)
    • Conference on Language Modeling (COLM) 2025
    • Conference on Neural Information Processing Systems (NeurIPS) 2025
    • NeurIPS Datasets and Benchmarks Track 2025
    • ACM International Conference on Information and Knowledge Management (CIKM) 2025

🎓 Education

  • Institute of Computing Technology, Chinese Academy of Sciences
    Ph.D. in Computer Science (2023 - present)
    Advisors: Prof. Shenghua Liu and Prof. Yiwei Wang

  • Chongqing University
    B.E. in Computer Science (Excellent) (2019 - 2023)