2025.08.20
2025 WADSC Conference

The Well-Aging Data Science Convergence Research Center at Chung-Ang University held an academic conference on June 13, 2025.


Speakers:

  • Professor Hayoung Shin, Soongsil University

  • Professor Jisoo Kim, Seoul National University

Date & Time: June 13, 2025, 10:00 – 11:40 AM
Location: Building 310, Room 413


Hayoung Shin, Department of Information Statistics and Actuarial Science, Soongsil University

Statistics with the boundary at infinity on Hadamard spaces

All data exist in some space and possess geometric properties. Thus, geometry can be exploited to perform statistical analyses. Traditional statistics mostly deals with data that lie in linear spaces with no curvature, possessing Euclidean geometry. However, many interesting modern data sets, from diverse fields such as computer vision, natural language processing, computational biology, and healthcare, lie in curved spaces; that is, they are non-Euclidean. An especially useful class of such spaces is that of Hadamard spaces, also called spaces of global non-positive curvature (as opposed to the zero curvature of Euclidean spaces). Examples of such spaces include the spaces of symmetric positive definite matrices and, in particular, hyperbolic spaces, which have been receiving great interest from statisticians and machine learning researchers as natural homes for hierarchical data. Hadamard spaces possess a very useful property called the boundary at infinity, which can be used to define directions and thus to generalize various Euclidean statistical methods to Hadamard spaces. This talk will present one such example, defining quantiles on Hadamard spaces using the boundary at infinity. It will cover some theoretical properties and uses, and will demonstrate applications to various real data sets, including single-cell RNA sequencing data and embryological data.
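As a minimal editorial sketch (not code from the talk, and not the quantile construction via the boundary at infinity), the snippet below shows what computing in one Hadamard space can look like in practice: geodesic distances and a Fréchet (intrinsic) mean in the hyperboloid model of hyperbolic space. All function names here (`lift`, `dist`, `frechet_mean`, and so on) are illustrative assumptions, not an established API.

```python
import numpy as np

# Editorial sketch: the hyperboloid (Lorentz) model of hyperbolic space H^n,
# one example of a Hadamard space. Points x in R^{n+1} satisfy <x, x>_L = -1,
# x[0] > 0, where <x, y>_L = -x0*y0 + x1*y1 + ... + xn*yn.

def minkowski(x, y):
    """Lorentzian (Minkowski) inner product."""
    return -x[..., 0] * y[..., 0] + np.sum(x[..., 1:] * y[..., 1:], axis=-1)

def lift(u):
    """Lift Euclidean coordinates u in R^n onto the hyperboloid in R^{n+1}."""
    u = np.atleast_2d(u)
    x0 = np.sqrt(1.0 + np.sum(u ** 2, axis=-1, keepdims=True))
    return np.concatenate([x0, u], axis=-1)

def dist(x, y):
    """Geodesic distance d(x, y) = arccosh(-<x, y>_L)."""
    return np.arccosh(np.clip(-minkowski(x, y), 1.0, None))

def log_map(x, y):
    """Tangent vector at x pointing toward y, with length d(x, y)."""
    v = y + minkowski(x, y)[..., None] * x          # project y onto the tangent space at x
    norm = np.sqrt(np.clip(minkowski(v, v), 1e-15, None))
    return (dist(x, y) / norm)[..., None] * v

def exp_map(x, v):
    """Follow the geodesic from x in direction v for length ||v||_L."""
    norm = np.sqrt(np.clip(minkowski(v, v), 1e-15, None))
    return np.cosh(norm)[..., None] * x + np.sinh(norm)[..., None] * (v / norm[..., None])

def frechet_mean(X, iters=50):
    """Frechet (Karcher) mean: iterate m <- exp_m(average of log_m(x_i))."""
    m = X[0]
    for _ in range(iters):
        m = exp_map(m, log_map(m, X).mean(axis=0))
    return m

# Toy usage: lift three 2-D points onto H^2, measure a distance, average them.
X = lift(np.array([[0.1, 0.2], [-0.3, 0.1], [0.0, -0.4]]))
print("geodesic distance:", dist(X[0], X[1]))
print("Frechet mean:", frechet_mean(X))
```

The Fréchet mean is the natural analogue of the Euclidean average: it minimizes the sum of squared geodesic distances, and on Hadamard spaces it is unique, which is one reason these spaces are so convenient for statistics.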


Jisoo Kim, Department of Statistics, Seoul National University

Statistical Estimation of Topological Data Analysis and Its Applications to Machine Learning


This presentation introduces the fundamental concepts of Topological Data Analysis (TDA) and methods for applying TDA to machine learning. Broadly speaking, TDA is a methodology for extracting and analyzing topological features from data. Its primary technique is Persistent Homology, which observes data across multiple scales and identifies persistent topological structures.
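To make the multi-scale idea concrete, here is a minimal editorial sketch (not code from the talk) of the 0-dimensional part of persistent homology for a Vietoris-Rips filtration on a point cloud: every point starts its own connected component at scale 0, and a component dies when a growing edge merges it into another. A union-find pass over the sorted pairwise distances, essentially Kruskal's algorithm, recovers these birth-death pairs exactly. Function names such as `h0_persistence` are illustrative only.

```python
import numpy as np
from itertools import combinations

# Editorial sketch: 0-dimensional persistent homology of a Vietoris-Rips
# filtration. Each point is a component born at scale 0; a component dies at
# the distance where an edge first connects it to another component.

def h0_persistence(points):
    """Return (birth, death) pairs for H0 of the Rips filtration of `points`."""
    n = len(points)
    # All candidate edges, sorted by the scale at which they appear.
    edges = sorted(
        (np.linalg.norm(points[i] - points[j]), i, j)
        for i, j in combinations(range(n), 2)
    )
    parent = list(range(n))

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]   # path compression
            a = parent[a]
        return a

    diagram = []
    for d, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:                        # this edge merges two components
            parent[ri] = rj
            diagram.append((0.0, d))        # a component born at 0 dies at scale d
    diagram.append((0.0, np.inf))           # one component never dies
    return diagram

# Toy usage: two well-separated clusters give one long-lived finite H0 bar,
# i.e. a persistent signal that the data has two pieces.
rng = np.random.default_rng(0)
cloud = np.vstack([rng.normal(0, 0.1, (20, 2)), rng.normal(3, 0.1, (20, 2))])
bars = h0_persistence(cloud)
print(sorted(bars, key=lambda b: b[1] - b[0], reverse=True)[:3])
```

Long bars in the resulting diagram are the persistent structures the abstract refers to; in the toy example, the single long finite bar reflects the two clusters in the data.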

TDA not only conveys scientific information about the data but also provides additional features that are useful for learning tasks, and it has proven particularly effective in machine learning applications.

The first part of the talk will focus on the statistical estimation of TDA. Because of randomness in data distributions, TDA outputs can include errors, which can be quantified statistically. After introducing the concept of persistent homology, the talk will discuss methods to quantify uncertainty due to data randomness via confidence sets, and approaches to selecting meaningful topological features. The second part of the talk will present two approaches to applying TDA in machine learning.
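Before turning to those two approaches, the sketch below gives a deliberately simplified editorial illustration of the uncertainty idea from the first part; it is not the confidence-set methodology of the talk. It bootstraps the point cloud and forms a percentile interval for a single topological summary, the largest finite H0 persistence, which for a Rips filtration equals the longest edge of the minimum spanning tree. The helper names (`largest_h0_persistence`, `bootstrap_ci`) and all tuning choices are assumptions made for illustration.

```python
import numpy as np

# Editorial sketch: quantify how much a topological summary varies under
# resampling. The summary is the largest finite H0 persistence of the Rips
# filtration, i.e. the longest edge of the minimum spanning tree (MST).

def largest_h0_persistence(points):
    """Longest MST edge = largest finite H0 death in the Rips filtration."""
    n = len(points)
    D = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    in_tree = np.zeros(n, dtype=bool)
    in_tree[0] = True
    best = D[0].copy()                      # cheapest connection of each node to the tree
    longest = 0.0
    for _ in range(n - 1):                  # Prim's algorithm
        j = int(np.argmin(np.where(in_tree, np.inf, best)))
        longest = max(longest, best[j])
        in_tree[j] = True
        best = np.minimum(best, D[j])
    return longest

def bootstrap_ci(points, stat, n_boot=200, alpha=0.05, seed=1):
    """Percentile bootstrap interval for a statistic of a point cloud."""
    rng = np.random.default_rng(seed)
    n = len(points)
    values = [stat(points[rng.integers(0, n, size=n)]) for _ in range(n_boot)]
    return tuple(np.quantile(values, [alpha / 2, 1 - alpha / 2]))

# Toy usage: two separated clusters; the interval stays well above the
# within-cluster scale, so the "two components" feature is not just noise.
rng = np.random.default_rng(0)
cloud = np.vstack([rng.normal(0, 0.1, (25, 2)), rng.normal(3, 0.1, (25, 2))])
print("estimate:", largest_h0_persistence(cloud))
print("95% bootstrap CI:", bootstrap_ci(cloud, largest_h0_persistence))
```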

  1. Featurization: Transforming complex mathematical structures in persistent homology into Euclidean vectors or functions for use in machine learning (a minimal sketch appears below).

  2. Evaluation: Using topological features to evaluate the quality of data or models.

Through real-world case studies, the presentation will highlight the potential of TDA in advancing machine learning research.
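As an editorial sketch of approach 1 above (featurization), not the speakers' implementation, the snippet below turns a persistence diagram, a multiset of (birth, death) pairs, into a fixed-length Euclidean vector via a simplified, hand-rolled version of the widely used persistence-image idea: each finite point is mapped to (birth, persistence), smoothed with a Gaussian bump weighted by its persistence, and the resulting surface is sampled on a grid. The grid resolution, bandwidth, and weighting are arbitrary choices made here for illustration.

```python
import numpy as np

# Editorial sketch of featurization: a persistence diagram becomes a
# fixed-length vector that any standard learner (SVM, random forest,
# neural net, ...) can consume.

def persistence_image(diagram, resolution=16, sigma=0.1, span=1.0):
    """Flattened resolution x resolution image of a persistence diagram."""
    grid = np.linspace(0.0, span, resolution)
    bb, pp = np.meshgrid(grid, grid)                 # birth and persistence axes
    image = np.zeros_like(bb)
    for birth, death in diagram:
        if not np.isfinite(death):
            continue                                 # drop the essential (infinite) bar
        pers = death - birth
        bump = np.exp(-((bb - birth) ** 2 + (pp - pers) ** 2) / (2 * sigma ** 2))
        image += pers * bump                         # weight each bump by its persistence
    return image.ravel()                             # fixed-length feature vector

# Toy usage: a diagram with one highly persistent feature versus a diagram of
# short-lived "noise" bars map to clearly different vectors.
signal = [(0.0, 0.9), (0.0, 0.05), (0.0, np.inf)]
noise = [(0.0, 0.04), (0.0, 0.06), (0.0, np.inf)]
v1, v2 = persistence_image(signal), persistence_image(noise)
print(v1.shape, np.linalg.norm(v1 - v2))
```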