
Abstract
In this talk, I will present our recent work on bilevel reinforcement learning (BRL), a powerful framework used in tasks such as reinforcement learning from human feedback (RLHF), inverse reinforcement learning, and AI alignment. While previous research has largely focused on low-dimensional or tabular settings, our work emphasizes practical implementation and scalability to continuous state-action environments. We propose a first-order, Hessian-free BRL algorithm that is computationally efficient and thus well suited to high-dimensional applications.
Through experiments, we demonstrate that our method outperforms state-of-the-art BRL algorithms in benchmark environments. Unlike existing methods, which are limited by second-order computational bottlenecks or restrictive assumptions such as discrete state/action spaces, our approach scales effectively and achieves superior performance with fewer samples.
As a direction for future work, we plan to integrate diffusion models into the BRL framework to enhance robustness and data efficiency. These models, known for their generative capabilities, offer exciting potential for improving policy optimization in bilevel settings.