Topological Data Analysis Oxford

Wednesday, April 28, 2021

9:00-9:50 CDT

Curvature sets over persistence diagrams

Speaker: Facundo Memoli [Ohio State University]

We study an invariant of compact metric spaces which combines the notion of curvature sets introduced by Gromov in the 1980s together with the notion of Vietoris-Rips persistent homology. For given integers k≥0 and n≥1 these invariants arise by considering the degree k Vietoris-Rips persistence diagrams of all subsets of a given metric space with cardinality at most n. We call these invariants [n,k]-persistence sets. We argue that computing these invariants could be significantly easier than computing the usual Vietoris-Rips persistence diagrams. We establish stability results for these invariants and we also precisely characterize some of them in the case of spheres with geodesic and Euclidean distances. We also identify a rich family of metric graphs for which the [4,1]-persistence sets fully recover their homotopy type. Along the way we prove some useful properties of Vietoris-Rips persistence diagrams.
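As a rough illustration of why small subsets are cheap to handle, consider n=4 and k=1. A four-point metric space has at most one degree-1 Vietoris-Rips bar: it is born once all four edges of some 4-cycle are present and dies when the shorter of the two remaining "diagonals" enters. The Python sketch below uses only this elementary observation to approximate a [4,1]-persistence set by random sampling. The function names, the sampling strategy and the example points are assumptions made for illustration and are not taken from the talk.

```python
import math
import random


def h1_bar_of_four_points(pts, dist=math.dist):
    """Return (birth, death) of the degree-1 Vietoris-Rips bar of four points, or None."""
    a, b, c, d = pts
    # The six pairwise distances split into two "diagonals" and a 4-cycle in three ways.
    splits = [
        ((a, c), (b, d), [(a, b), (b, c), (c, d), (d, a)]),
        ((a, b), (c, d), [(a, c), (c, b), (b, d), (d, a)]),
        ((a, d), (b, c), [(a, b), (b, d), (d, c), (c, a)]),
    ]
    for diag1, diag2, cycle in splits:
        birth = max(dist(p, q) for p, q in cycle)   # last cycle edge to appear
        death = min(dist(*diag1), dist(*diag2))     # first diagonal to appear
        if birth < death:
            return (birth, death)
    return None  # no 4-cycle survives for a positive length of time


def sampled_persistence_set(points, samples=1000, seed=0):
    """Collect the bars of randomly sampled 4-point subsets (hypothetical helper).

    Subsets with fewer than four points carry no degree-1 bar, so only
    4-point subsets are sampled here.
    """
    rng = random.Random(seed)
    bars = set()
    for _ in range(samples):
        bar = h1_bar_of_four_points(rng.sample(points, 4))
        if bar is not None:
            bars.add(bar)
    return bars


# Unit square: sides of length 1, diagonals of length sqrt(2).
square = [(0, 0), (1, 0), (1, 1), (0, 1)]
print(h1_bar_of_four_points(square))
```

Each sampled subset costs a constant amount of work, and the samples are independent of one another, so the loop parallelises trivially.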

10:15-10:45 CDT

Intrinsic Persistent Homology via density-based metric learning

Speaker: Ximena Fernández [Swansea University]

Typically, persistence diagrams computed from a sample depend strongly on the distance associated to the data. When the point cloud is a sample of a Riemannian manifold embedded in a Euclidean space, an estimator of the intrinsic distance is relevant to obtain persistence diagrams from data that capture its intrinsic geometry. In this talk, we consider a computable estimator of a Riemannian metric, known as the Fermat distance, that accounts for both the geometry of the manifold and the density that produces the sample. We prove that the metric space defined by the sample endowed with this estimator [known as sample Fermat distance] converges a.s. in the sense of Gromov-Hausdorff to the manifold itself endowed with the [population] Fermat distance. This result is applied to obtain sample persistence diagrams that converge towards an intrinsic persistence diagram. We show that this approach outperforms more standard methods based on the Euclidean norm, with theoretical results and computational experiments [1].

[1] E. Borghini, X. Fernández, P. Groisman, G. Mindlin. ‘Intrinsic persistent homology via density-based metric learning’. arXiv:2012.07621 [2020]
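The abstract does not spell out the estimator. A common formulation, assumed here, takes the sample Fermat distance between two sample points to be the cost of the cheapest path through the sample, where a hop of Euclidean length r costs r^p for a parameter p ≥ 1. The Python sketch below computes this with the Floyd-Warshall algorithm; the function name, the default p=2 and the omission of any normalising constant are assumptions made for illustration.

```python
import math


def sample_fermat_distance(points, p=2.0):
    """All-pairs sample Fermat distance (unnormalised sketch).

    A hop of Euclidean length r costs r**p; the distance between two sample
    points is the cheapest total cost over all paths through the sample.
    """
    n = len(points)
    d = [[math.dist(points[i], points[j]) ** p for j in range(n)] for i in range(n)]
    # Floyd-Warshall on the complete graph over the sample: O(n^3) time.
    for k in range(n):
        for i in range(n):
            for j in range(n):
                via_k = d[i][k] + d[k][j]
                if via_k < d[i][j]:
                    d[i][j] = via_k
    return d


# Three collinear points: with p=2 the two short hops (cost 1 + 1)
# are cheaper than the direct hop (cost 4).
dist = sample_fermat_distance([(0, 0), (1, 0), (2, 0)], p=2.0)
print(dist[0][2])
```

For p > 1 long hops are penalised, so cheapest paths prefer to pass through densely sampled regions. The resulting matrix can be handed to any persistent homology routine that accepts a precomputed distance matrix.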

11:00-11:30 CDT

Approximate and discrete vector bundles

Speaker: Luis Scoccola [Michigan State University]

Synchronization problems, such as the problem of reconstructing a 3D shape from a set of 2D projections, can often be modeled by principal bundles. Similarly, the application of local PCA to a point cloud concentrated around a manifold approximates the tangent bundle of the manifold. In the first case, the characteristic classes of the bundle provide obstructions to global synchronization, while, in the second case, they provide topological information of the manifold beyond its homology, and give obstructions to dimensionality reduction. I will describe joint work with Jose Perea in which we propose notions of approximate and discrete vector bundle, study the extent to which they determine true vector bundles, and give algorithms for the stable and consistent computation of low-dimensional characteristic classes directly from these combinatorial representations.

13:00-13:50 CDT

Topological Data Analysis of Database Representations for Information Retrieval

Speaker: Anthea Monod [Imperial College]

Appropriately representing elements in a database so that queries may be accurately matched is a central task in information retrieval. This has recently been achieved by embedding the graphical structure of the database into a manifold so that the hierarchy is preserved. Persistent homology provides a rigorous characterization for the database topology in terms of both its hierarchy and connectivity structure. We compute persistent homology on a variety of datasets and show that some commonly used embeddings fail to preserve the connectivity. Moreover, we show that embeddings which successfully retain the database topology coincide in persistent homology. We introduce the dilation-invariant bottleneck distance to capture this effect, which addresses metric distortion on manifolds. We use it to show that distances between topology-preserving embeddings of databases are small.

14:15-14:45 CDT

Predicting Survival Outcomes using Topological Shape Features of AI-reconstructed Medical Images

Speaker: Chul Moon [Southern Methodist University]

Tumor shape and size have been used as important markers for cancer diagnosis and treatment. This paper proposes a topological feature computed by persistent homology to characterize tumor progression from digital pathology and radiology images and examines its effect on the time-to-event data. The proposed topological features are invariant to scale-preserving transformations and can summarize various tumor shape patterns. The topological features are represented in functional space and used as functional predictors in a functional Cox proportional hazards model. The proposed model enables interpretable inference about the association between topological shape features and survival risks. Two case studies are conducted using lung cancer pathology and brain tumor radiology images. The results show that the topological features predict survival prognosis after adjusting for clinical variables, and the predicted high-risk groups have significantly worse survival outcomes than the low-risk groups [p-values

Thursday, April 29, 2021

9:15-9:45 CDT

Back to Basics – Topology of Simplicial Complexes for Business Optimisations

Speaker: Marc Lange [Elbformat Consulting]

When our topology ancestors were playing with simplicial complexes to devise models of RP^2 or common triangulations of surfaces, they could not have envisioned how ubiquitous and large graph structures are nowadays. Starting with reminders on elementary collapses, clique complexes and a touch of NP-completeness for maximal cliques, I will illustrate the relevant business examples we have found in our Data Science practice as elbformat consulting in Hamburg, Germany. Cases will include Structural Website Optimisations, User Funnel Analyses, Process Mining insights and a Recommendation Engine Prototype.

10:15-10:45 CDT

Learning with Approximate or Distributed Topology

Speaker: Alexander Wagner [Duke University]

The computational cost of calculating the persistence diagram for a large input inhibits its use in a deep learning framework. The fragility of the persistence diagram to outliers and the instability of critical cells present further challenges. In this talk, I will present two distinct approaches to address these concerns. In the first approach, by replacing the original filtration with a stochastically downsampled filtration on a smaller complex, one can obtain results in topological optimization tasks that are empirically more robust and much faster to compute than their vanilla counterparts. In the second approach, we work with the set of persistence diagrams of subsets of a fixed size rather than with the diagram of the complete point cloud. The benefits of this distributed approach are a greater degree of practical robustness to outliers, faster computation due to parallelizability and scaling of the persistence algorithm, and an inverse stability theory. After outlining these benefits, I will describe a dimensionality reduction pipeline using distributed persistence. This is joint work with Elchanan Solomon and Paul Bendich.

11:00-11:30 CDT

Geometric and Topological Fingerprints for Periodic Crystals

Speaker: Teresa Heiss [Institute of Science and Technology Austria]

The following application has motivated us to develop new Computational Geometry and Topology methods, involving Brillouin zones and periodic k-fold persistent homology: We model crystals by [infinite] periodic point sets, i.e. by the union of several translates of a lattice, where every point represents an atom. Two periodic point sets are equivalent if there is a rigid transformation from one to the other. A periodic point set can be represented by a finite cutout such that copying this cutout infinitely often in all directions yields the periodic point set. The fact that these cutouts are not unique creates problems when working with them. Therefore, materials scientists would like to work with a complete, continuous invariant instead. In this talk, I will present two continuous invariants that are at least generically complete. The first is the density fingerprint, which computes the probability that a random ball of radius r contains exactly k points of the periodic point set, for all positive integers k and positive reals r. The second is the persistence fingerprint, which is the sequence of order k persistence diagrams, newly defined for infinite periodic point sets, for all positive integers k. Joint work with Herbert Edelsbrunner, Alexey Garber, Vitaliy Kurlin, Georg Osang, Janos Pach, Morteza Saghafian, Phil Smith, Mathijs Wintraecken.

13:00-13:50 CDT

Sampling smooth manifolds using ellipsoids

Speaker: Sara Kalisnik [Bentley University]

A common problem in data science is to determine properties of a space from a sample. For instance, under certain assumptions a subspace of a Euclidean space may be homotopy equivalent to the union of balls around sample points, which is in turn homotopy equivalent to the Čech complex of the sample. This enables us to determine the unknown space up to homotopy type, in particular giving us the homology of the space. A seminal result by Niyogi, Smale and Weinberger states that if a sample of a closed smooth submanifold of a Euclidean space is dense enough [relative to the reach of the manifold], there exists an interval of radii for which the union of closed balls around sample points deformation retracts to the manifold. A tangent space is a good local approximation of a manifold, so we can expect that an object elongated in the tangent direction will approximate the manifold better than a ball. We present the result that the union of ellipsoids of suitable size around sample points deformation retracts to the manifold while requiring much smaller density than in the case of a union of balls. The proof requires new techniques because, unlike the case of balls, the normal projection of a union of ellipsoids is in general not a deformation retraction.

14:15-14:45 CDT

Simplicial Neural Networks

Speaker: Stefania Ebli [Ecole Polytechnique Fédérale de Lausanne]

In this talk I will present simplicial neural networks [SNNs], a generalization of graph neural networks to data that live on a class of topological spaces called simplicial complexes. These are natural multi-dimensional extensions of graphs that encode not only pairwise relationships but also higher-order interactions between vertices, allowing us to consider richer data, including vector fields and n-fold collaboration networks. We define an appropriate notion of convolution that we leverage to construct the desired convolutional neural networks. We test the SNNs on the task of imputing missing data on coauthorship complexes. This is joint work with M. Defferrard and G. Spreemann.

15:00-15:30 CDT

Topological Sholl Descriptors for Neuronal Clustering and Classification

Speaker: Sadok Kallel [American University of Sharjah]

Variations in neuronal morphology among cell classes, brain regions, and animal species are thought to underlie known heterogeneities in neuronal function. Thus, accurate quantitative descriptions and classification of large sets of neurons are important for functional characterization. However, unbiased computational methods to classify groups of neurons are currently scarce. We introduce an unbiased method to study neuronal morphologies. We develop mathematical descriptors that assign to each neuron an invariant depending on distance from the soma and taking values in real numbers or other suitable metric spaces [including a metric space of persistence diagrams]. Such descriptors can include tortuosity, branching pattern, “energy”, wiring, TMD, etc. Using detection and metric learning algorithms, we can then provide efficient clustering and classification schemes for neurons. This is joint work with Reem Khalil, Ahmad Farhat and Pawel Dlotko.

Friday, April 30, 2021

9:00-9:50 CDT

Compatibility and Optimization for Quiver Representations

Speaker: Vidit Nanda [University of Oxford]

Many interesting objects across pure and applied mathematics [including persistence modules, cellular sheaves and connection matrices] are most naturally viewed as linear algebraic data parametrized by a finite space. In this talk, I will describe a practical framework for dimensionality reduction and linear optimization over a wide class of such objects.

10:15-10:45 CDT

The amplitude of an abelian category: Measures in persistence theory

Speaker: Barbara Giunti [Technische Universität Graz]

The use of persistent homology in applications is justified by the validity of certain stability results. At the core of such results is a notion of distance between the invariants that one associates to data sets. While such distances are well-understood in the one-parameter case, the situation for multiparameter persistence modules is more challenging, since there is no generalisation of the barcode. Here we introduce a general framework to study stability questions in multiparameter persistence. We first introduce the [outer] amplitude, a functional on abelian categories that mimics the properties of an outer measure in measure theory, then study different ways to associate distances to such functionals. Our framework is very comprehensive, as many different invariants that have been introduced in the literature are examples of outer amplitudes, and similarly, we show that many known distances for multiparameter persistence are distances from outer amplitudes. Finally, we provide new stability results using our framework.

11:00-11:30 CDT

Sliding windows persistence of quasiperiodic functions

Speaker: Hitesh Gakhar [University of Oklahoma]

Sliding window embeddings were originally used in the study of dynamical systems to reconstruct the topology of underlying attractors from generic observation functions. In 2015, a technique for recurrence detection in time series data using sliding window embeddings of periodic functions and persistent homology was developed. We study a closely related class of functions, namely quasiperiodic functions, whose constitutive frequencies are incommensurate harmonics. The sliding window embeddings of such functions are dense in high-dimensional tori, where the dimension depends on the number of incommensurate harmonics. In this talk, we will present results pertaining to the structure of sliding window embeddings and their persistent homology, along with a brief discussion on how to choose the embedding parameters.
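For readers unfamiliar with the construction, the sliding window embedding of a function f with dimension parameter d and delay τ is usually written SW f(t) = (f(t), f(t+τ), …, f(t+dτ)). The Python sketch below builds the resulting point cloud for a sample quasiperiodic function. The function names, the choice f(t) = cos(t) + cos(√3 t) and the parameter values are assumptions made for illustration, not the speaker's.

```python
import math


def sliding_window(f, t, d, tau):
    """Return the (d+1)-dimensional vector (f(t), f(t + tau), ..., f(t + d*tau))."""
    return [f(t + i * tau) for i in range(d + 1)]


def sliding_window_point_cloud(f, times, d, tau):
    """Sliding window vectors of f at each time in `times`."""
    return [sliding_window(f, t, d, tau) for t in times]


def quasiperiodic(t):
    # Two frequencies, 1 and sqrt(3), whose ratio is irrational.
    return math.cos(t) + math.cos(math.sqrt(3) * t)


times = [0.05 * i for i in range(2000)]
cloud = sliding_window_point_cloud(quasiperiodic, times, d=4, tau=0.7)
print(len(cloud), len(cloud[0]))
```

The point cloud can then be passed to a Vietoris-Rips persistence routine. How d and τ should be chosen is the question the talk addresses, so the values above are placeholders only.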

13:00-13:50 CDT

What are left and right endpoints for multiparameter persistence?

Speaker: Ezra Miller [Duke University]

Fundamental to applications of ordinary persistent homology in one parameter is the reconstruction of a module from the perfect matching between left endpoints and right endpoints of its barcode. Do these concepts have analogues in multiple parameters? The answer is largely yes: endpoints can be defined, and the module can be reconstructed from them, though the correspondence is not a perfect matching but rather a more arbitrary linear map. The algebra needed for these developments will be covered from scratch, followed by a view toward how they might be used for computational purposes.

14:15-14:45 CDT

Identifying analogous topological features across multiple systems

Speaker: Iris Yoon [University of Delaware]

We present a new method for comparing topological features using dissimilarity matrices obtained from observing activity in distinct complex systems. Our method uses the Dowker complex of a cross-dissimilarity matrix to identify all possible ways a common feature could be represented by the barcodes of activity within the individual systems. This method can be used to study both how distinct systems respond to the same stimuli and how behavior in one system drives behavior in another. Motivated by questions in neuroscience, our framework will allow researchers to investigate open problems such as how neural systems code for complex stimuli and how such coding structures propagate and evolve through different neural systems without direct reference to external correlates. The same tools can also be applied more generally to explore two-dimensional persistence and to identify which topological features are preserved after dimensionality reduction. This is joint work with Chad Giusti [University of Delaware] and Robert Ghrist [University of Pennsylvania].

15:00-15:30 CDT

Recent Advances in Topology-Based Graph Classification

Speaker: Bastian Rieck [ETH Zurich]

Topological data analysis has emerged as an effective tool in machine learning, supporting the analysis of neural networks, but also driving the development of novel algorithms that incorporate topological characteristics. As a problem class, graph classification is of particular interest here, since graphs are inherently amenable to a topological description in terms of their connected components and cycles. This talk will briefly summarise recent advances in topology-based graph classification, focussing equally on ‘shallow’ and ‘deep’ approaches. Starting from an intuitive description of persistent homology, we will discuss how to incorporate topological features into the Weisfeiler–Lehman colour refinement scheme, thus obtaining a simple feature-based graph classification algorithm. We will then build a bridge to graph neural networks and demonstrate a topological variant of ‘readout’ functions, which can be learned end-to-end. Care has been taken to make the talk accessible to an audience that may not have been exposed to machine learning or topological data analysis.
