Title: Learning from Unreliable Labels via Crowdsourcing

Georgios B. Giannakis, University of Minnesota

Abstract: Crowdsourcing, as the name suggests, harnesses the information provided by crowds of human annotators to perform learning tasks, with applications such as word tagging in natural language processing, crowdsensing, and ChatGPT. Even though crowdsourcing can be efficient and relatively inexpensive, combining the noisy, scarce, and potentially adversarial responses provided by multiple annotators of unknown expertise can be challenging, especially in unsupervised setups, where no ground-truth data is available.

Focusing on the classification task, the first part of this talk will touch upon models and algorithms for label fusion along with their performance. Approaches to data-aware crowdsourcing will also be discussed, and links will be outlined with deep, self-supervised, and meta-learning. Aiming to robustify crowdsourced classification against adversarial attacks, the last part will cover spectrum-based algorithms to flag and mitigate the effect of spammers. If time allows, means of dealing with dependent annotators will be discussed briefly.
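As a point of reference for what label fusion means, the simplest baseline is majority voting over the annotators' responses. The sketch below is only that baseline, not one of the models or algorithms covered in the talk; the function name `majority_vote`, the use of `None` for a missing response, and the tie-breaking rule are assumptions made for illustration.

```python
from collections import Counter


def majority_vote(responses):
    """Fuse noisy labels by majority vote (illustrative baseline only).

    responses: one list per data item holding the labels given by each
    annotator; None marks an annotator who did not label that item.
    Ties are broken toward the smallest label (an arbitrary assumption).
    """
    fused = []
    for item_labels in responses:
        votes = Counter(label for label in item_labels if label is not None)
        if not votes:
            # No annotator responded, so no label can be inferred.
            fused.append(None)
            continue
        top_count = max(votes.values())
        fused.append(min(l for l, c in votes.items() if c == top_count))
    return fused


# Three annotators, three items; the second annotator skipped item 2.
print(majority_vote([[1, 1, 0], [0, None, 0], [1, 0, 1]]))
```

Majority voting weights every annotator equally, which is precisely what breaks down when expertise is unknown or some annotators are spammers.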

Bio: Georgios B. Giannakis received his Diploma in Electrical Engr. (EE) from the Ntl. Tech. U. of Athens, Greece, in 1981. From 1982 to 1986 he was with the U. of Southern California (USC), where he received his MSc. in EE, 1983, MSc. in Mathematics, 1986, and Ph.D. in EE, 1986. He was with the U. of Virginia from 1987 to 1998, and since 1999 he has been with the U. of Minnesota (UMN), where he held an Endowed Chair of Telecommunications, served as director of the Digital Technology Center from 2008 to 2021, and since 2016 has held a UMN Presidential Chair in ECE.

His interests span the areas of statistical learning, communications, and networking – subjects on which he has published more than 495 journal papers, 805 conference papers, 26 book chapters, two edited books and two research monographs. Current research focuses on Data Science with applications to IoT, and power networks with renewables. He is the (co-)inventor of 36 issued patents, and the (co-)recipient of 10 best journal paper awards from the IEEE Signal Processing (SP) and Communications Societies, including the G. Marconi Prize. He received the IEEE-SPS Norbert Wiener Society Award (2019); EURASIP’s A. Papoulis Society Award (2020); Technical Achievement Awards from the IEEE-SPS (2000) and from EURASIP (2005); the IEEE ComSoc Education Award (2019); and the IEEE Fourier Technical Field Award (2015). He is a member of the Academia Europaea and Greece’s Academy of Athens, a Fellow of the National Academy of Inventors, the European Academy of Sciences, the UK’s Royal Academy of Engineering, and EURASIP, and a Life Fellow of the IEEE. He has served the IEEE in several posts, including that of a Distinguished Lecturer for the IEEE-SPS.

Title: Provable Plug-and-Play Diffusion Posterior Sampling for Inverse Problems

Yuejie Chi, Carnegie Mellon University

Abstract: Diffusion models, which convert noise into new data instances by learning to reverse a Markov diffusion process, have become a cornerstone in generative AI. While their practical power has now been widely recognized, the theoretical underpinnings remain far from mature. We first develop a suite of non-asymptotic theory for understanding the data generation process of diffusion models in discrete time for both deterministic and stochastic samplers, highlighting fast convergence under mild data assumptions. Motivated by this theory, we then advocate diffusion models as an expressive data prior in solving ill-posed inverse problems, and introduce a plug-and-play method (DPnP) to perform posterior sampling. DPnP alternately calls two samplers: a proximal consistency sampler based solely on the forward model, and a denoising diffusion sampler based solely on the score functions of the data prior. Performance guarantees and numerical examples will be presented to illustrate the promise of DPnP.
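To make the alternation between a forward-model step and a prior step concrete, here is a toy sketch, not the DPnP algorithm itself. It assumes a scalar unknown with a standard Gaussian prior and the observation y = x + noise, so both steps reduce to closed-form Gaussian draws; in particular, the prior step is an exact Gaussian draw standing in for a denoising diffusion sampler. The coupling parameter `eta`, the auxiliary variable `z`, and the function name are all hypothetical choices made for this illustration.

```python
import numpy as np


def alternating_sampler(y, sigma=1.0, eta=0.3, n_iters=20000, burn_in=1000, seed=0):
    """Toy alternating sampler for x ~ N(0, 1) observed as y = x + N(0, sigma^2).

    Illustration only: an auxiliary variable z is tied to x by a Gaussian
    coupling of width eta (an assumed parameter), and the loop alternates
    two closed-form Gaussian draws.
    """
    rng = np.random.default_rng(seed)
    x = 0.0
    samples = []
    for t in range(n_iters):
        # Consistency step: uses only the forward model (y, sigma) and the
        # current x. Draws z from p(z | x, y).
        prec_z = 1.0 / sigma**2 + 1.0 / eta**2
        mean_z = (y / sigma**2 + x / eta**2) / prec_z
        z = rng.normal(mean_z, np.sqrt(1.0 / prec_z))

        # Prior step: uses only the prior N(0, 1) and the current z.
        # Draws x from p(x | z), i.e. a Gaussian denoising of z.
        prec_x = 1.0 + 1.0 / eta**2
        mean_x = (z / eta**2) / prec_x
        x = rng.normal(mean_x, np.sqrt(1.0 / prec_x))

        if t >= burn_in:
            samples.append(x)
    return np.array(samples)


samples = alternating_sampler(y=2.0)
print(samples.mean())
```

In this toy model the chain targets the posterior under an inflated noise variance sigma^2 + eta^2, so its mean is y / (1 + sigma^2 + eta^2) rather than the exact posterior mean y / (1 + sigma^2); shrinking `eta` reduces that gap at the cost of slower mixing.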

Bio: Dr. Yuejie Chi is the Sense of Wonder Group Endowed Professor of Electrical and Computer Engineering in AI Systems at Carnegie Mellon University, with courtesy appointments in the Machine Learning department and CyLab. She received her Ph.D. and M.A. from Princeton University, and B. Eng. (Hon.) from Tsinghua University, all in Electrical Engineering. Her research interests lie in the theoretical and algorithmic foundations of data science, signal processing, machine learning and inverse problems, with applications in sensing, imaging, decision making, and AI systems. Among others, Dr. Chi received the Presidential Early Career Award for Scientists and Engineers (PECASE), SIAM Activity Group on Imaging Science Best Paper Prize, IEEE Signal Processing Society Young Author Best Paper Award, and the inaugural IEEE Signal Processing Society Early Career Technical Achievement Award for contributions to high-dimensional structured signal processing. She is an IEEE Fellow (Class of 2023) for contributions to statistical signal processing with low-dimensional structures.

Title: Out-of-Distribution Detection via Multiple Testing

Venu Veeravalli, University of Illinois at Urbana-Champaign

Abstract: Out-of-Distribution (OOD) detection in machine learning refers to the problem of detecting whether the machine learning model’s output can be trusted at inference time. This problem has been described qualitatively in the literature, and a number of ad hoc tests for OOD detection have been proposed. In this talk we outline a principled approach to the OOD detection problem, by first defining the problem through a hypothesis test that includes both the input distribution and the learning algorithm. Our definition provides insights for the construction of good tests for OOD detection. We then propose a multiple-testing-inspired procedure to systematically combine any number of different OOD test statistics using conformal p-values. Our approach allows us to provide strong guarantees on the probability of incorrectly classifying an in-distribution sample as OOD. In our experiments, we find that the tests proposed in prior work perform well in specific settings, but not uniformly well across different types of OOD instances. In contrast, our proposed method, which combines multiple test statistics, performs uniformly well across different datasets, neural networks and OOD instances.
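For readers unfamiliar with conformal p-values, the following minimal sketch shows how one can be computed from held-out in-distribution scores, and how several can be combined. The combination rule here is a plain Bonferroni correction, used as a simple stand-in and not the procedure proposed in the talk. The function names, the convention that larger scores look more OOD, and the default `alpha` are assumptions for illustration.

```python
import numpy as np


def conformal_p_value(cal_scores, test_score):
    """Conformal p-value of a test score against in-distribution calibration scores.

    Assumes larger scores look more out-of-distribution.
    """
    cal_scores = np.asarray(cal_scores, dtype=float)
    n_at_least = int(np.sum(cal_scores >= test_score))
    return (1 + n_at_least) / (len(cal_scores) + 1)


def flag_ood(cal_scores_per_stat, test_scores, alpha=0.1):
    """Combine K test statistics with a Bonferroni rule (illustrative stand-in).

    Flags the sample as OOD if any of the K conformal p-values is at most
    alpha / K. Returns the decision and the list of p-values.
    """
    p_values = [
        conformal_p_value(cal, score)
        for cal, score in zip(cal_scores_per_stat, test_scores)
    ]
    threshold = alpha / len(p_values)
    return min(p_values) <= threshold, p_values


# Two hypothetical statistics, each calibrated on scores 1, 2, ..., 99.
cal = np.arange(1, 100)
print(flag_ood([cal, cal], [100, 50]))  # first statistic is extreme
print(flag_ood([cal, cal], [50, 50]))   # neither statistic is extreme
```

With exchangeable calibration and test scores, each conformal p-value is valid under the in-distribution hypothesis, so the Bonferroni rule keeps the probability of flagging an in-distribution sample at or below `alpha`, though it can be conservative compared with more refined multiple-testing procedures.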

Bio: Prof. Veeravalli received the Ph.D. degree in Electrical Engineering from the University of Illinois at Urbana-Champaign in 1992. He is currently the Henry Magnuski Professor in the Department of Electrical and Computer Engineering (ECE) at the University of Illinois at Urbana-Champaign, where he also holds appointments with the Coordinated Science Laboratory (CSL), the Department of Statistics, and the Discovery Partners Institute. He was on the faculty of the School of ECE at Cornell University before he joined Illinois in 2000. He served as a program director for communications research at the U.S. National Science Foundation in Arlington, VA during 2003-2005. His research interests span the theoretical areas of statistical inference, machine learning, and information theory, with applications to data science, wireless communications, and sensor networks. He is currently the Editor-in-Chief of the IEEE Transactions on Information Theory. He is a Fellow of the IEEE and a Fellow of the Institute of Mathematical Statistics (IMS). Among the awards he has received for research and teaching are the IEEE Browder J. Thompson Best Paper Award, the U.S. Presidential Early Career Award for Scientists and Engineers (PECASE), the Abraham Wald Prize in Sequential Analysis (twice), and the Fulbright-Nokia Chair in Information and Communication Technologies.