Department Seminars & Colloquia
In this talk, we discuss the Neural Tangent Kernel (NTK). The NTK is closely related to the dynamics of a neural network during training via gradient flow (or gradient descent). However, since the NTK is random at initialization and varies during training, the dynamics of the neural network are quite delicate to understand. In relation to this issue, we introduce an interesting result: in the infinite-width limit, the NTK converges to a deterministic kernel at initialization and remains constant during training. We provide a brief proof of the result for the simplest case.
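For orientation, here are the standard definitions behind the objects mentioned above (notation chosen here, not quoted from the talk): for a network f(x; θ) trained by gradient flow on a loss L over data x_1, …, x_n, the NTK and the induced training dynamics read

\[
\Theta_\theta(x, x') \;=\; \nabla_\theta f(x;\theta)^{\top}\, \nabla_\theta f(x';\theta),
\qquad
\frac{d}{dt} f(x;\theta_t) \;=\; -\sum_{i=1}^{n} \Theta_{\theta_t}(x, x_i)\,
\frac{\partial L}{\partial f(x_i;\theta_t)} .
\]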
A presentation given over three sessions on September 14, October 4, and October 5; this session mainly covers a review of the September 14 material.
A presentation given over three sessions: September 14, October 4, and October 5.
In this talk, I will explain the setting of online convex optimization and the definitions of regret and constraint violation. I will then introduce various algorithms and their theoretical guarantees under different assumptions. The connection with some topics in machine learning, such as stochastic gradient descent, multi-armed bandits, and reinforcement learning, will also be briefly discussed.
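Under one common convention (others exist), for convex losses f_t revealed online, decisions x_t from a convex set X, and a constraint g(x) ≤ 0, the two quantities mentioned above are

\[
\mathrm{Regret}_T \;=\; \sum_{t=1}^{T} f_t(x_t) \;-\; \min_{x \in \mathcal{X}} \sum_{t=1}^{T} f_t(x),
\qquad
\mathrm{Violation}_T \;=\; \sum_{t=1}^{T} \big[g(x_t)\big]_{+} .
\]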
Industrial Engineering & Management Building (E2-1), Seminar Room 2216
ACM Seminars
Namwoo Kang (KAIST)
Generative AI-based Product Design and Development
"어떻게 하면 더 좋은 제품을 더 빠르게 개발할 수 있을까?"라는 문제는 모든 제조업이 안고 있는 숙제입니다. 최근 DX를 통해 많은 데이터들이 디지털화되고, AI의 급격한 발전을 통해 제품개발프로세스를 혁신하려는 시도가 일어나고 있습니다. 과거의 시뮬레이션 기반 설계에서 AI 기반 설계로의 패러다임 전환을 통해 제품개발 기간을 단축함과 동시에 제품의 품질을 향상시킬 수 있습니다. 본 세미나는 딥러닝을 통해 제품 설계안을 생성/탐색/예측/최적화/추천할 수 있는 생성형 AI 기반의 설계 프로세스(Deep Generative Design)를 소개하고, 모빌리티를 비롯한 제조 산업에 적용된 다양한 사례들을 소개합니다.
In physics, Bohr’s correspondence principle asserts that the theory of quantum mechanics can be reduced to that of classical mechanics in the limit of large quantum numbers. This rather vague statement can be formulated explicitly in various ways. In this talk, focusing on an analytic point of view, we discuss the correspondence between basic inequalities and that between measures. Then, as an application, we present the convergence from quantum to kinetic white dwarfs.
Room B332, IBS (Institute for Basic Science)
Discrete Mathematics
Domagoj Bradač (ETH Zürich)
Effective bounds for induced size-Ramsey numbers of cycles
The k-color induced size-Ramsey number of a graph H is the smallest number of edges a (host) graph G can have such that for any k-coloring of its edges, there exists a monochromatic copy of H which is an induced subgraph of G. In 1995, in their seminal paper, Haxell, Kohayakawa and Łuczak showed that for cycles these numbers are linear for any constant number of colours, i.e., for some C=C(k), there is a graph with at most Cn edges such that any k-edge-coloring of it contains a monochromatic induced cycle of length n. The value of C comes from the use of the sparse regularity lemma and has a tower-type dependence on k. In this work, we obtain nearly optimal bounds for the required value of C. Joint work with Nemanja Draganić and Benny Sudakov.
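In symbols, writing the k-color induced size-Ramsey number of the cycle C_n as \hat r^{\,k}_{\mathrm{ind}}(C_n) (notation assumed here for compactness), the Haxell-Kohayakawa-Łuczak result quoted above is

\[
\hat r^{\,k}_{\mathrm{ind}}(C_n) \;\le\; C(k)\, n ,
\]

and the present work concerns how small the constant C(k) can be taken.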
With the success of deep learning technologies in many scientific and engineering applications, neural network approximation methods have emerged as an active research area in numerical partial differential equations. However, the new approximation methods still need further validation of their accuracy, stability, and efficiency before they can serve as alternatives to classical approximation methods. In this talk, we first introduce neural network approximation methods for partial differential equations, where a neural network function is introduced to approximate the PDE (Partial Differential Equation) solution and its parameters are then optimized to minimize the cost function derived from the differential equation. We then present the behavior of the approximation error and the optimization error in the neural network approximate solution. To reduce the approximation error, a neural network function with a larger number of parameters is often employed, but when optimizing such a large number of parameters the optimization error usually pollutes the solution accuracy. In addition, gradient-based parameter optimization usually requires computing the cost function gradient over a tremendous number of epochs, which makes a neural network solution very expensive to obtain. To deal with these problems in the neural network approximation, a partitioned neural network function can be formed to approximate the PDE solution, where localized neural network functions are combined to form the global neural network solution. The parameters in each local neural network function are then optimized to minimize the corresponding cost function. To enhance the parameter training efficiency further, iterative algorithms for the partitioned neural network function can be developed. We finally discuss the potential of this new approach to enhance the accuracy, stability, and efficiency of the neural network solution by utilizing classical domain decomposition algorithms and their convergence theory. Some interesting numerical results are presented to show the performance of the partitioned neural network approximation and the iterative algorithms.
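As a rough sketch of the cost-function construction described above (not the speaker's code; the model PDE, network size, and optimizer settings are placeholders chosen for illustration), a residual loss for the 1D Poisson problem -u''(x) = f(x) on (0, 1) with u(0) = u(1) = 0 could look like this:

import torch

# Small fully connected network approximating the PDE solution u(x).
model = torch.nn.Sequential(
    torch.nn.Linear(1, 32), torch.nn.Tanh(),
    torch.nn.Linear(32, 32), torch.nn.Tanh(),
    torch.nn.Linear(32, 1),
)

def pde_loss(model, n_interior=128):
    # Collocation points in the interior of the domain (0, 1).
    x = torch.rand(n_interior, 1, requires_grad=True)
    u = model(x)
    # First and second derivatives of u via automatic differentiation.
    du = torch.autograd.grad(u, x, torch.ones_like(u), create_graph=True)[0]
    d2u = torch.autograd.grad(du, x, torch.ones_like(du), create_graph=True)[0]
    f = (torch.pi ** 2) * torch.sin(torch.pi * x)   # placeholder right-hand side
    residual = -d2u - f                              # residual of -u'' = f
    # Boundary penalty enforcing u(0) = u(1) = 0.
    xb = torch.tensor([[0.0], [1.0]])
    return (residual ** 2).mean() + (model(xb) ** 2).mean()

optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
for epoch in range(2000):   # gradient-based optimization over many epochs
    optimizer.zero_grad()
    loss = pde_loss(model)
    loss.backward()
    optimizer.step()

The partitioned approach discussed in the talk would replace the single global network by several localized networks, each trained on its own subdomain cost function.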
In this lecture, we aim to delve deep into the emerging landscape of 'Foundation Models'. Distinct from traditional deep learning models, Foundation Models have ushered in a new paradigm, characterized by their vast scale, versatility, and transformative potential. We will uncover the key differences between these models and their predecessors, delving into the intricate mechanisms through which they are trained and the profound impact they are manifesting across various sectors. Furthermore, the talk will shed light on the invaluable role of mathematics in understanding, optimizing, and innovating upon these models. We will explore the symbiotic relationship between Foundation Models and mathematical principles, elucidating how the latter not only underpins their functioning but also paves the way for future advancements.
The Gauss-Bonnet theorem implies that the two dimensional torus does not have nonnegative Gauss curvature unless it is flat, and that the two dimensional sphere does not admit a metric which has Gaussian curvature bounded below by one and metric bounded below by the standard round metric.
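For reference, the Gauss-Bonnet theorem for a closed surface M states

\[
\int_M K \, dA \;=\; 2\pi\,\chi(M),
\]

so on the torus (\chi = 0) a metric with K \ge 0 forces K \equiv 0, i.e., the metric is flat.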
Gromov proposed a series of conjectures on generalizing the Gauss-Bonnet theorem in his four lectures. I will report my work with Gaoming Wang (now Tsinghua) on Gromov's dihedral rigidity conjecture in hyperbolic 3-space and scalar curvature comparison of rotationally symmetric convex bodies with some simple singularities.
We consider the problem of graph matching, or learning vertex correspondence, between two correlated stochastic block models (SBMs). The graph matching problem arises in various fields, including computer vision, natural language processing, and bioinformatics; in particular, matching graphs with inherent community structure is relevant to the de-anonymization of correlated social networks. Compared to the correlated Erdős-Rényi (ER) model, where various efficient algorithms have been developed and a few have been proven to achieve exact matching with constant edge correlation, no low-order polynomial-time algorithm has been known to achieve exact matching for correlated SBMs with constant correlation. In this work, we propose an efficient algorithm for matching graphs with community structure, based on comparing partition trees rooted at each vertex, extending the idea of Mao et al. (2021) to graphs with communities. The partition tree divides the large neighborhood of each vertex into disjoint subsets using their edge statistics to different communities. Our algorithm is the first low-order polynomial-time algorithm achieving exact matching between two correlated SBMs with high probability in dense graphs.
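As a toy illustration of the "edge statistics to different communities" idea only (a drastic, one-level simplification with made-up names, not the partition-tree algorithm of the paper), one can bucket the neighbors of a vertex by how many edges each sends into every community:

import numpy as np

def neighborhood_signature_partition(adj, v, community):
    """Bucket the neighbors of vertex v by how many edges each sends into
    every (estimated) community -- a one-level, toy analogue of a partition
    tree built from edge statistics."""
    neighbors = np.flatnonzero(adj[v])
    buckets = {}
    for u in neighbors:
        signature = tuple(int(adj[u, community == c].sum())
                          for c in np.unique(community))
        buckets.setdefault(signature, []).append(int(u))
    return buckets

# Toy usage: a small graph with two planted communities.
rng = np.random.default_rng(0)
community = np.repeat([0, 1], 10)                      # 20 vertices, 2 blocks
p = np.where(community[:, None] == community[None, :], 0.6, 0.1)
adj = (rng.random((20, 20)) < p).astype(int)
adj = np.triu(adj, 1); adj = adj + adj.T               # symmetric, no self-loops
print(neighborhood_signature_partition(adj, v=0, community=community))

The actual algorithm refines such partitions recursively and compares the resulting trees across the two correlated graphs to recover the vertex correspondence.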
In this talk, I will introduce twistor theory, which connects complex geometry, Riemannian geometry, and algebraic geometry by producing a complex manifold, called the twistor space, from a quaternionic Kähler manifold. First, I will explain why quaternionic Kähler manifolds have to be studied in view of holonomy theory in Riemannian geometry, and how twistor theory enables us to use algebraic geometry in studying their geometry. Next, based on the realization of homogeneous twistor spaces as adjoint varieties, I will present a description of the compactified spaces of conics in adjoint varieties, which is motivated by twistor theory.
In this talk, we will primarily discuss the theoretical analysis of knowledge distillation-based federated learning algorithms. Before we explore the main topics, we will introduce the basic concepts of federated learning and knowledge distillation. Subsequently, we will present a nonparametric view of knowledge distillation-based federated learning algorithms and introduce a generalization analysis of these algorithms based on the theory of regularized kernel regression methods.
The Stefan problem is a free boundary problem describing the interface between water and ice. It has both PDE and probabilistic aspects. We discuss an approach to this problem based on optimal transport theory. This approach is related to the Skorokhod problem, a classical problem in probability regarding Brownian motion.
In the analysis of singularities, uniqueness of limits often arises as an important question: that is, whether the geometry depends on the scales one takes to approach the singularity. In his seminal work, Simon demonstrated that Lojasiewicz inequalities, originally known in real algebraic geometry in finite dimensions, can be applied to show uniqueness of limits in geometric analysis in infinite dimensional settings. We will discuss some instances of this very successful technique and its applications.
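One prototypical form of the Lojasiewicz(-Simon) gradient inequality, with constant, exponent, and norm depending on the setting, is

\[
\big|E(u) - E(u_\infty)\big|^{\,1-\theta} \;\le\; C\,\big\|\nabla E(u)\big\|,
\qquad \theta \in (0, \tfrac{1}{2}],
\]

valid for u near a critical point u_\infty of the energy E; this is the estimate behind the uniqueness-of-limits arguments mentioned above.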
While deep neural networks (DNNs) have been widely used in numerous applications over the past few decades, their underlying theoretical mechanisms remain incompletely understood. In this presentation, we propose a geometrical and topological approach to understand how deep ReLU networks work on classification tasks. Specifically, we provide lower and upper bounds of neural network widths based on the geometrical and topological features of the given data manifold. We also prove that irrespective of whether the mean square error (MSE) loss or binary cross entropy (BCE) loss is employed, the loss landscape has no local minimum.