Less is more? How Optimal Transport can help with compressive learning
Abstract: Large-scale machine learning today faces a number of fundamental computational challenges, triggered by the high dimensionality of modern data and the increasing availability of very large training collections. These data can also be of a complex nature, such as those described by graphs, which are integral to many application areas. In this talk I will present some solutions to these problems. I will introduce the Compressive Statistical Learning (CSL) theory, a general framework for resource-efficient large-scale learning in which the training data is summarized in a single small vector (called a sketch) that captures the information relevant to the learning task. We will see how Optimal Transport (OT) can help us establish statistical guarantees for this type of learning problem. I will also show how OT allows us to obtain efficient representations of structured data, thanks to the Gromov-Wasserstein distance. Finally, I will address concrete learning tasks on graphs, such as online graph subspace estimation and tracking, graph partitioning, clustering and completion.
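To make the notion of a sketch concrete, here is a minimal illustrative example, assuming the sketch is built from averaged random Fourier features (one common construction in the compressive learning literature; the dimensions and frequency scale below are arbitrary assumptions, not the talk's actual settings):

```python
import numpy as np

# Toy dataset: n samples in d dimensions.
rng = np.random.default_rng(0)
n, d, m = 1000, 2, 64

X = rng.normal(size=(n, d))       # training data (stand-in for a real collection)
W = rng.normal(size=(d, m))       # m random frequency vectors (hypothetical scale)

# Sketch: the empirical average of the random Fourier features
# exp(i * x^T w) over all n samples. Its size m is independent of n,
# so the whole dataset is compressed into a single small vector.
sketch = np.exp(1j * (X @ W)).mean(axis=0)

print(sketch.shape)   # m-dimensional complex vector
```

The key property is that the sketch's size depends only on `m` (chosen for the task), not on the number of samples `n`, so it can be computed in one streaming pass and then used in place of the full data for learning.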