Department Seminars and Colloquia
In nonstationary bandit learning problems, the decision-maker must continually gather information and adapt their action selection as the latent state of the environment evolves. In each time period, some latent optimal action maximizes expected reward under the environment state. We view the optimal action sequence as a stochastic process, and take an information-theoretic approach to analyze attainable performance. We bound per-period regret in terms of the entropy rate of the optimal action process. The bound applies to a wide array of problems studied in the literature and reflects the problem’s information structure through its information-ratio.
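As a rough illustration of the quantity this abstract refers to, the sketch below computes the entropy rate of an optimal action process under a toy assumption that is not taken from the talk: there are two actions, and the identity of the optimal action follows a stationary two-state Markov chain that switches with a fixed probability each period. The names `switch_prob`, `entropy_rate`, and `binary_entropy` are hypothetical and chosen only for this example.

```python
import math


def binary_entropy(p: float) -> float:
    """Shannon entropy, in bits, of a Bernoulli(p) random variable."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


def entropy_rate(transition, stationary):
    """Entropy rate, in bits per period, of a stationary Markov chain.

    transition[i][j] is the probability of moving from state i to state j,
    and stationary[i] is the stationary probability of state i.
    """
    rate = 0.0
    for i, row in enumerate(transition):
        # Entropy of the next state given that the current state is i.
        row_entropy = -sum(q * math.log2(q) for q in row if q > 0)
        rate += stationary[i] * row_entropy
    return rate


# Assumed toy model: the optimal action flips with probability 0.1 per period.
switch_prob = 0.1
P = [[1 - switch_prob, switch_prob],
     [switch_prob, 1 - switch_prob]]
pi = [0.5, 0.5]  # stationary distribution of this symmetric chain

rate = entropy_rate(P, pi)
print(f"entropy rate: {rate:.4f} bits per period")
```

For this symmetric chain the entropy rate equals the binary entropy of the switching probability, so an optimal action that changes rarely has a low entropy rate and one that changes unpredictably has a high one.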
In the past decade, machine learning methods (MLMs) for solving partial differential equations (PDEs) have gained significant attention as a novel numerical approach. Indeed, a large number of research projects have emerged that apply MLMs to applications ranging from geophysics to biophysics. This surge in interest stems from the ability of MLMs to rapidly predict solutions for complex physical systems, even those involving multi-physics phenomena, uncertainty, and real-world data assimilation. This trend has led many to view MLMs optimistically as a potential game-changer in PDE solving. However, despite this optimism, there are still significant challenges to overcome. These include limitations relative to conventional numerical approaches, a lack of thorough analytical understanding of their accuracy, and potentially long training times. In this talk, I will first assess the current state of MLMs for solving PDEs. I will then explore what roles MLMs should play to become a conventional numerical scheme.
The size and complexity of recent deep learning models continue to increase exponentially, imposing substantial hardware overhead for training those models. Unlike inference, neural network training is very sensitive to computation errors; hence, training processors must support high-precision computation to avoid a large performance drop, which severely limits their processing efficiency. This talk will introduce a comprehensive design approach to arrive at an optimal training processor design. More specifically, the talk will discuss in more depth how to make important design decisions for training processors, including (i) hardware-friendly training algorithms, (ii) optimal data formats, and (iii) processor architecture for high precision and utilization.
Scientific knowledge, written in the form of differential equations, plays a vital role in various deep learning fields. In this talk, I will present a graph neural network (GNN) design based on reaction-diffusion equations, which addresses the notorious oversmoothing problem of GNNs. Since the self-attention of Transformers can also be viewed as a special case of graph processing, I will present how we can enhance Transformers in a similar way. I will also introduce a spatiotemporal forecasting model based on neural controlled differential equations (NCDEs). NCDEs were designed to process irregular time series in a continuous manner; for spatiotemporal processing, they need to be combined with a spatial processing module, i.e., a GNN. I will show how this can be done.
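The oversmoothing problem mentioned in this abstract can be seen in a toy setting. The sketch below is not the speaker's model: it assumes a four-node path graph, scalar node features, an explicit Euler step, and a hypothetical reaction term that pulls each node back toward its initial feature. It only shows that pure diffusion drives all node features to the same value, while adding a reaction term can keep them distinct.

```python
def neighbor_mean(x, adj):
    """Average of each node's neighbor features (random-walk diffusion)."""
    return [sum(x[j] for j in adj[i]) / len(adj[i]) for i in range(len(x))]


def step(x, x0, adj, dt=0.5, reaction_weight=0.0):
    """One explicit Euler step of a toy reaction-diffusion update.

    Diffusion term: neighbor mean minus the node's own feature.
    Reaction term (an assumption for this example): x0 minus the current
    feature, scaled by reaction_weight.
    """
    mean = neighbor_mean(x, adj)
    return [
        xi + dt * ((mi - xi) + reaction_weight * (x0i - xi))
        for xi, mi, x0i in zip(x, mean, x0)
    ]


def run(x0, adj, steps, reaction_weight):
    """Apply the update repeatedly, starting from the initial features."""
    x = list(x0)
    for _ in range(steps):
        x = step(x, x0, adj, reaction_weight=reaction_weight)
    return x


def spread(x):
    """Gap between the largest and smallest node feature."""
    return max(x) - min(x)


# Path graph 0 - 1 - 2 - 3 with two groups of nodes.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
x0 = [0.0, 0.0, 1.0, 1.0]

diffused = run(x0, adj, steps=200, reaction_weight=0.0)
reacted = run(x0, adj, steps=200, reaction_weight=1.0)

print("diffusion only:    ", [round(v, 3) for v in diffused])
print("reaction-diffusion:", [round(v, 3) for v in reacted])
```

With diffusion alone the spread between node features collapses toward zero, which is the oversmoothing effect. With the assumed reaction term the two groups of nodes remain separated after many steps.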
This talk presents mathematical modeling, numerical analysis, and simulation using the finite element method in the field of electromagnetics at various scales, from analyzing quantum mechanical effects to calculating the scattering of electromagnetic waves in free space. First, we discuss and analyze the Schrödinger-Poisson system of a quantum transport model to calculate electron states in three-dimensional heterostructures. Second, the electromagnetic vector wave scattering problem is solved to analyze the field characteristics in the presence of a stealth platform. This talk also introduces several challenging issues in these applications and proposes their solutions through mathematical analysis.
The Optimal Transport (OT) problem seeks a transport map that bridges two distributions while minimizing a specified cost function. OT theory has been widely utilized in generative modeling. Initially, the OT-based Wasserstein metric served as a measure for assessing the distance between data and generated distributions. More recently, the OT transport map, connecting data and prior distributions, has emerged as a new approach for generative models. In this talk, we will introduce generative models based on Optimal Transport. Specifically, we will present our work on a generative model utilizing Unbalanced Optimal Transport. We will also discuss our subsequent efforts to address the challenges associated with this approach.
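To make the notions of a transport map and the Wasserstein metric concrete, here is a minimal sketch for the simplest case only. It assumes one-dimensional samples, two empirical distributions with the same number of equally weighted points, and a cost of the form |x - y|^p with p >= 1. In that setting, matching the sorted samples is an optimal coupling. The function names are hypothetical, and the unbalanced OT models discussed in the talk are not covered.

```python
def wasserstein_1d(xs, ys, p=2):
    """p-Wasserstein distance between two equal-size 1D empirical samples.

    In one dimension with cost |x - y|**p (p >= 1), pairing the i-th
    smallest source point with the i-th smallest target point is optimal.
    """
    if len(xs) != len(ys):
        raise ValueError("this sketch assumes equally many samples")
    pairs = zip(sorted(xs), sorted(ys))
    mean_cost = sum(abs(a - b) ** p for a, b in pairs) / len(xs)
    return mean_cost ** (1.0 / p)


def transport_map_1d(xs, ys):
    """Return, for each source sample, the target sample it is sent to."""
    if len(xs) != len(ys):
        raise ValueError("this sketch assumes equally many samples")
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    ys_sorted = sorted(ys)
    mapped = [None] * len(xs)
    for rank, i in enumerate(order):
        # The source point with the given rank goes to the target point
        # with the same rank (monotone rearrangement).
        mapped[i] = ys_sorted[rank]
    return mapped


source = [0.0, 1.0, 2.0]
target = [5.0, 3.0, 4.0]

print("W1:", wasserstein_1d(source, target, p=1))
print("map:", transport_map_1d(source, target))
```

In this example every source point is shifted by 3, so the map sends 0, 1, and 2 to 3, 4, and 5 respectively. Higher-dimensional problems need a different solver, since sorting no longer defines the optimal coupling.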
