Talks and presentations

A Quick Overview of Reinforcement Learning (RL)

April 18, 2026

Talk, Fudan University (internal seminar), Shanghai, China

Abstract
This seminar provides the theoretical prerequisites for understanding modern reinforcement learning techniques for aligning Large Language Models (LLMs), such as GRPO and DAPO. Rather than focusing on the heavy engineering pipelines of RLHF, the talk builds a rigorous, continuous mathematical narrative.