Department Seminars and Colloquia
A major trajectory in the development of statistical learning has been the expansion of the mathematical spaces underlying observed data, from numbers to vectors, functions, and beyond. This expansion has fostered significant theoretical and computational breakthroughs. One notable direction involves analyzing sets, where each set is itself an object of interest for inference. This perspective accommodates the intrinsic and non-ignorable heterogeneity inherent in data-generating processes. Among the various theoretical frameworks for analyzing sets, a principled approach is to view a set as an empirical measure. In this talk, I revisit the concept of the median, a robust alternative to the mean as a centroid, and introduce a novel extension of this concept within the space of probability measures under the framework of optimal transport. I will present theoretical results and a generic computational pipeline that leverages existing algorithmic developments in the field, with examples. I will also explore the potential benefits of this approach for scalable inference and scientific discovery.
Reinforcement learning (RL) has become one of the most central problems in machine learning, with remarkable success in recommendation systems, robotics, and superhuman-level game play. Yet the existing literature focuses predominantly on (almost) fully observable environments, overlooking the complexities of real-world scenarios where crucial information remains hidden.
In this talk, we consider reinforcement learning in partially observable systems through the proposed framework of the Latent Markov Decision Process (LMDP). In an LMDP, an MDP is randomly drawn from a set of possible MDPs at the beginning of the interaction, but the context (the latent factors identifying the chosen MDP) is not revealed to the agent. This opacity poses new challenges for decision-making, particularly in scenarios such as recommendation systems without sensitive user data, or medical treatments for undiagnosed illnesses. Despite the significant relevance of LMDPs to real-world problems, existing theories rely on restrictive separation assumptions, which are unrealistic in practical applications. We present a series of new results addressing this gap: from leveraging higher-order information to develop sample-efficient RL algorithms, to establishing lower bounds and improved results under more realistic assumptions within Latent MDPs.
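
The LMDP interaction protocol described above can be illustrated with a minimal sketch. It assumes a small tabular setting with a fixed horizon. The class name `LatentMDP`, its `reset`/`step` methods, and the parameters `weights` and `horizon` are hypothetical names chosen for illustration, not part of the speaker's work. The sketch only shows how the hidden context is drawn once per episode and never exposed to the agent; it does not implement any of the learning algorithms mentioned in the abstract.

```python
import random


class LatentMDP:
    """Toy latent MDP: one tabular MDP is drawn per episode and kept hidden.

    All names here are illustrative assumptions, not an established API.
    """

    def __init__(self, mdps, weights, horizon, seed=0):
        # mdps: list of (transition, reward) pairs, where
        #   transition[s][a] is a list of next-state probabilities and
        #   reward[s][a] is a float.
        self.mdps = mdps
        self.weights = weights  # mixing weights over the candidate MDPs
        self.horizon = horizon  # fixed episode length
        self.rng = random.Random(seed)

    def reset(self):
        # Draw the latent context once, at the start of the interaction.
        # It is stored privately and never returned to the agent.
        indices = range(len(self.mdps))
        self._context = self.rng.choices(indices, weights=self.weights)[0]
        self._state = 0
        self._t = 0
        return self._state

    def step(self, action):
        # Dynamics and rewards come from the hidden MDP chosen in reset().
        transition, reward = self.mdps[self._context]
        r = reward[self._state][action]
        probs = transition[self._state][action]
        self._state = self.rng.choices(range(len(probs)), weights=probs)[0]
        self._t += 1
        done = self._t >= self.horizon
        return self._state, r, done


# Two MDPs with identical dynamics (action a leads to state a) but
# opposite rewards, so the observed states alone do not reveal the context.
transition = [[[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]]]
reward_a = [[1.0, 0.0], [1.0, 0.0]]  # action 0 is rewarded
reward_b = [[0.0, 1.0], [0.0, 1.0]]  # action 1 is rewarded

env = LatentMDP(
    mdps=[(transition, reward_a), (transition, reward_b)],
    weights=[0.5, 0.5],
    horizon=3,
)
```

In this toy instance the agent can only infer the context from the rewards it receives, which is the kind of hidden-information difficulty the abstract refers to.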
