Yu Yang

About Me

I am a third-year Ph.D. student at Duke University, advised by Prof. Pan Xu. My research focuses on reinforcement learning (RL), with an emphasis on off-dynamics RL, RL-driven applications in healthcare, and the incorporation of RL techniques into foundation models. I am also exploring the use of RL to improve large language models (LLMs), particularly their reasoning capabilities and alignment with humans.

Publications

Pre-trained Language Models Improve the Few-shot Prompt Ability of Decision Transformer.

Yu Yang, Pan Xu.

Transactions on Machine Learning Research (TMLR), 2025.

PDF

An Interactive Framework for Generating Clinical Data with Human Feedback.

Yu Yang*, Jiafeng Song*, Zhishuai Liu, Henry P Foote, Rishikesan Kamaleswaran, Pan Xu. (* Equal contribution)

IEEE-EMBS International Conference on Biomedical and Health Informatics (BHI), 2025.

PDF

D2Fed: Federated Semi-Supervised Learning With Dual-Role Additive Local Training and Dual-Perspective Global Aggregation.

Jingxin Mao, Yu Yang, Zhiwei Wei, Yanlong Bi, Rongqing Zhang.

IEEE Transactions on Neural Networks and Learning Systems (TNNLS), 2025.

PDF

Optimal Batched Best Arm Identification.

Tianyuan Jin, Yu Yang, Jing Tang, Xiaokui Xiao, Pan Xu.

Conference on Neural Information Processing Systems (NeurIPS), 2024.

PDF

More Efficient Randomized Exploration for Reinforcement Learning via Approximate Sampling.

Haque Ishfaq, Yixin Tan, Yu Yang, Qingfeng Lan, Jianfeng Lu, A Rupam Mahmood, Doina Precup, Pan Xu.

Reinforcement Learning Conference (RLC), 2024.

PDF

On the Stability of Expressive Positional Encodings for Graphs.

Yinan Huang, William Lu, Joshua Robinson, Yu Yang, Muhan Zhang, Stefanie Jegelka, Pan Li.

International Conference on Learning Representations (ICLR), 2024.

PDF

Generative Character Inpainting Guided by Structural Information.

Haolong Li, Zizheng Zhong, Wei Guan, Chenghao Du, Yu Yang, Yuxiang Wei, Chen Ye.

The Visual Computer, 2021.

PDF

Preprints

Diffusion Posterior Sampling for Nonlinear Contextual Bandits.

Weixin Wang*, Yu Yang*, Pan Xu. (* Equal contribution)

Under Review.

How to Provably Improve Return Conditioned Supervised Learning?

Zhishuai Liu, Yu Yang, Ruhan Wang, Pan Xu, Dongruo Zhou.

NeurIPS 2025 Workshop on ARLET.

PDF

MOBODY: Model-Based Off-Dynamics Offline Reinforcement Learning.

Yihong Guo, Yu Yang, Pan Xu, Anqi Liu.

NeurIPS 2025 Workshop on ARLET.

PDF

Return Augmented Decision Transformer for Off-Dynamics Reinforcement Learning.

Ruhan Wang*, Yu Yang*, Zhishuai Liu, Dongruo Zhou, Pan Xu. (* Equal contribution)

NeurIPS 2025 Workshop on Reliable ML.

PDF

Services

Conference Reviewers