Xiaobo Wang

Xiaobo Wang | 王晓博

PhD Student

University of Science and Technology of China

Research Intern @ BIGAI

Research Interests

  • Reward Modeling
  • LLM Alignment
  • Continual Learning

About Me

I am a second-year PhD student at the University of Science and Technology of China and currently a research intern at the Beijing Institute for General Artificial Intelligence (BIGAI), working under the guidance of Professor Qi Liu and Researcher Zilong Zheng. I hold a bachelor's degree from Beihang University.

Open to collaboration. My research centers on reward modeling, LLM alignment, and continual learning. Feel free to reach out if you are interested in these topics.

Research Overview

"Toward language models that keep improving from their own experience — through better reward signals, robust alignment, and continual adaptation."

Reward Modeling & Alignment

Designing reliable reward models and preference-optimization methods that stay calibrated and aligned as the policy evolves beyond static training data (e.g., UAPO, SAVE).

Continual Learning & Memory

Enabling models to acquire, retain, and update knowledge over time through knowledge editing and ever-improving memory systems (e.g., ICE, RAM).

Latest News

Publications

* indicates equal contribution.

Preprints

Academic Service

Conference Reviewer

Program Committee Member

ACL, EMNLP, NeurIPS

Education

Ph.D. Student

2024.09 - Present

University of Science and Technology of China

B.Eng.

2020.09 - 2024.06

Beihang University

Experience

Research Intern

2025.07 - Present

Beijing Institute for General Artificial Intelligence (BIGAI), Beijing, China

Research Intern

2023.10 - 2024.01

ByteDance, Beijing, China

Contact

Location

Beijing, China