Title: Learning from Unreliable Labels via Crowdsourcing

Georgios B. Giannakis, University of Minnesota

Abstract: Crowdsourcing, as the name suggests, harnesses the information provided by crowds of human annotators to perform learning tasks, such as word tagging in natural language processing, crowdsensing, and the human-feedback training of models such as ChatGPT, among others. Even though crowdsourcing can be efficient and relatively inexpensive, combining the noisy, scarce, and potentially adversarial responses provided by multiple annotators of unknown expertise can be challenging, especially in unsupervised setups, where no ground-truth data is available.

Focusing on the classification task, the first part of this talk will touch upon models and algorithms for label fusion along with their performance. Approaches will also be discussed for data-aware crowdsourcing, and links will be outlined with deep-, self-supervised, and meta-learning. Aiming to robustify crowdsourced classification against adversarial attacks, the last part will cover spectrum-based algorithms to flag spammers and mitigate their effect. If time allows, means of dealing with dependent annotators will be discussed briefly.
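As background for readers unfamiliar with unsupervised label fusion, the sketch below illustrates one classical baseline in this area: the Dawid-Skene model fitted with expectation-maximization, which jointly estimates fused labels and per-annotator confusion matrices from redundant labels alone. It is a generic illustration under standard assumptions, not the specific algorithms of the talk; the input format (an annotators-by-items matrix with -1 marking missing responses) is an assumption made here.

# Minimal, generic sketch of unsupervised label fusion in the spirit of the
# Dawid-Skene model, estimated with EM. Illustration only, not the algorithms
# discussed in the talk. Input: an (annotators x items) matrix of labels in
# {0, ..., K-1}, with -1 marking missing responses.
import numpy as np

def dawid_skene(labels, num_classes, num_iters=50):
    M, N = labels.shape                          # M annotators, N items
    # Initialize the posterior over true labels by a smoothed majority vote.
    q = np.full((N, num_classes), 1e-6)
    for j in range(N):
        for m in range(M):
            if labels[m, j] >= 0:
                q[j, labels[m, j]] += 1.0
    q /= q.sum(axis=1, keepdims=True)

    for _ in range(num_iters):
        # M-step: class priors and per-annotator confusion matrices.
        prior = q.mean(axis=0)
        conf = np.full((M, num_classes, num_classes), 1e-6)
        for m in range(M):
            for j in range(N):
                if labels[m, j] >= 0:
                    conf[m, :, labels[m, j]] += q[j]
            conf[m] /= conf[m].sum(axis=1, keepdims=True)

        # E-step: posterior of each item's true label given all responses.
        log_q = np.tile(np.log(prior), (N, 1))
        for m in range(M):
            for j in range(N):
                if labels[m, j] >= 0:
                    log_q[j] += np.log(conf[m, :, labels[m, j]])
        log_q -= log_q.max(axis=1, keepdims=True)
        q = np.exp(log_q)
        q /= q.sum(axis=1, keepdims=True)

    return q.argmax(axis=1), conf                # fused labels, estimated expertise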

BIO: Georgios B. Giannakis received his Diploma in Electrical Engr. (EE) from the Ntl. Tech. U. of Athens, Greece, in 1981. From 1982 to 1986 he was with the U. of Southern California (USC), where he received his M.Sc. in EE (1983), M.Sc. in Mathematics (1986), and Ph.D. in EE (1986). He was with the U. of Virginia from 1987 to 1998, and since 1999 he has been with the U. of Minnesota (UMN), where he has held an Endowed Chair of Telecommunications, served as director of the Digital Technology Center from 2008 to 2021, and has held a UMN Presidential Chair in ECE since 2016.

His interests span the areas of statistical learning, communications, and networking – subjects on which he has published more than 495 journal papers, 805 conference papers, 26 book chapters, two edited books, and two research monographs. His current research focuses on data science with applications to IoT and power networks with renewables. He is the (co-)inventor of 36 issued patents and the (co-)recipient of 10 best journal paper awards from the IEEE Signal Processing (SP) and Communications Societies, including the G. Marconi Prize. He received the IEEE-SPS Norbert Wiener Society Award (2019); EURASIP’s A. Papoulis Society Award (2020); Technical Achievement Awards from the IEEE-SPS (2000) and from EURASIP (2005); the IEEE ComSoc Education Award (2019); and the IEEE Fourier Technical Field Award (2015). He is a member of the Academia Europaea and Greece’s Academy of Athens, a Fellow of the National Academy of Inventors, the European Academy of Sciences, and the UK’s Royal Academy of Engineering, and a Life Fellow of the IEEE and EURASIP. He has served the IEEE in several posts, including that of Distinguished Lecturer for the IEEE-SPS.

Title: Score-based Diffusion Models: Data Generation and Inverse Problems

Yuejie Chi, Carnegie Mellon University

Abstract: Diffusion models, which convert noise into new data instances by learning to reverse a Markov diffusion process, have become a cornerstone of generative AI. While their practical power is now widely recognized, the theoretical underpinnings remain far from mature. We first develop a suite of non-asymptotic theory toward understanding the data generation process of diffusion models in discrete time for both deterministic and stochastic samplers, highlighting fast convergence under mild data assumptions. Motivated by this theory, we then advocate diffusion models as an expressive data prior for solving ill-posed inverse problems, and introduce a plug-and-play method (DPnP) to perform posterior sampling. DPnP alternately calls two samplers: a proximal consistency sampler based solely on the forward model, and a denoising diffusion sampler based solely on the score functions of the data prior. Performance guarantees and numerical examples will be presented to illustrate the promise of DPnP.
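To make the alternation concrete, the following schematic sketch mimics the structure described above: one Langevin-type step enforcing consistency with the forward measurement model, followed by one score-driven denoising step. It is an illustration under stated assumptions, not the DPnP algorithm itself; forward_op, score_fn, the step sizes, and the time schedule are hypothetical placeholders.

# Schematic sketch of alternating data-consistency and score-based denoising
# steps for an inverse problem y ≈ A(x). Placeholders (forward_op, score_fn,
# step sizes, schedule) are hypothetical; this is not the authors' DPnP code.
import numpy as np

def proximal_consistency_step(x, y, forward_op, step=0.1, sigma=0.5):
    # One stochastic step toward data consistency:
    # x <- x - step * grad_x ||y - A(x)||^2 / (2 sigma^2) + noise
    grad = forward_op.adjoint(forward_op(x) - y) / sigma**2
    return x - step * grad + np.sqrt(2 * step) * np.random.randn(*x.shape)

def denoising_diffusion_step(x, score_fn, t, dt):
    # One Langevin-type denoising step using the learned score of the data prior.
    return x + dt * score_fn(x, t) + np.sqrt(2 * dt) * np.random.randn(*x.shape)

def alternating_sampler(y, forward_op, score_fn, x0, num_rounds=100, dt=1e-2):
    x = x0
    for k in range(num_rounds):
        t = 1.0 - k / num_rounds              # illustrative time schedule
        x = proximal_consistency_step(x, y, forward_op)
        x = denoising_diffusion_step(x, score_fn, t, dt)
    return x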

Bio: Dr. Yuejie Chi is the Sense of Wonder Group Endowed Professor of Electrical and Computer Engineering in AI Systems at Carnegie Mellon University, with courtesy appointments in the Machine Learning department and CyLab. She received her Ph.D. and M.A. from Princeton University, and B. Eng. (Hon.) from Tsinghua University, all in Electrical Engineering. Her research interests lie in the theoretical and algorithmic foundations of data science, signal processing, machine learning and inverse problems, with applications in sensing, imaging, decision making, and AI systems. Among others, Dr. Chi received the Presidential Early Career Award for Scientists and Engineers (PECASE), SIAM Activity Group on Imaging Science Best Paper Prize, IEEE Signal Processing Society Young Author Best Paper Award, and the inaugural IEEE Signal Processing Society Early Career Technical Achievement Award for contributions to high-dimensional structured signal processing. She is an IEEE Fellow (Class of 2023) for contributions to statistical signal processing with low-dimensional structures.

Title: Out-of-Distribution Detection via Multiple Testing

Venu Veeravalli, University of Illinois at Urbana-Champaign

Abstract: Out-of-Distribution (OOD) detection in machine learning refers to the problem of detecting whether the machine learning model’s output can be trusted at inference time. This problem has been described qualitatively in the literature, and a number of ad hoc tests for OOD detection have been proposed. In this talk we outline a principled approach to the OOD detection problem, by first defining the problem through a hypothesis test that includes both the input distribution and the learning algorithm. Our definition provides insights for the construction of good tests for OOD detection. We then propose a multiple testing inspired procedure to systematically combine any number of different OOD test statistics using conformal p-values. Our approach allows us to provide strong guarantees on the probability of incorrectly classifying an in-distribution sample as OOD. In our experiments, we find that the tests proposed in prior work perform well in specific settings, but not uniformly well across different types of OOD instances. In contrast, our proposed method that combines multiple test statistics performs uniformly well across different datasets, neural networks and OOD instances.
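As a rough illustration of the combination idea, the sketch below converts each OOD score into a conformal p-value using held-out in-distribution calibration scores and then applies a Benjamini-Hochberg style multiple-testing rule. The particular scores, calibration split, and combination rule are assumptions made here for illustration and need not match the exact procedure presented in the talk.

# Generic sketch: combine several OOD scores via conformal p-values and a
# multiple-testing rule. Illustrative assumptions, not the talk's exact method.
import numpy as np

def conformal_p_value(score, calibration_scores):
    # p-value: fraction of in-distribution calibration scores at least as
    # extreme as the test score (here, larger score = more OOD-like).
    n = len(calibration_scores)
    return (1 + np.sum(calibration_scores >= score)) / (n + 1)

def ood_flag(test_scores, calibration_scores_per_stat, alpha=0.05):
    # test_scores: one value per OOD statistic for a single input.
    # calibration_scores_per_stat: one array of in-distribution scores per
    # statistic, computed on held-out calibration data.
    pvals = np.array([conformal_p_value(s, cal)
                      for s, cal in zip(test_scores, calibration_scores_per_stat)])
    # Benjamini-Hochberg style combination of the K p-values.
    K = len(pvals)
    sorted_p = np.sort(pvals)
    thresholds = alpha * np.arange(1, K + 1) / K
    return bool(np.any(sorted_p <= thresholds))   # True -> declare OOD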

Bio: Prof. Veeravalli received the Ph.D. degree in Electrical Engineering from the University of Illinois at Urbana-Champaign in 1992. He is currently the Henry Magnuski Professor in the Department of Electrical and Computer Engineering (ECE) at the University of Illinois at Urbana-Champaign, where he also holds appointments with the Coordinated Science Laboratory (CSL), the Department of Statistics, and the Discovery Partners Institute. He was on the faculty of the School of ECE at Cornell University before he joined Illinois in 2000. He served as a program director for communications research at the U.S. National Science Foundation in Arlington, VA during 2003-2005. His research interests span the theoretical areas of statistical inference, machine learning, and information theory, with applications to data science, wireless communications, and sensor networks. He is currently the Editor-in-Chief of the IEEE Transactions on Information Theory. He is a Fellow of the IEEE and a Fellow of the Institute of Mathematical Statistics (IMS). Among the awards he has received for research and teaching are the IEEE Browder J. Thompson Best Paper Award, the U.S. Presidential Early Career Award for Scientists and Engineers (PECASE), the Abraham Wald Prize in Sequential Analysis (twice), and the Fulbright-Nokia Chair in Information and Communication Technologies.

Title: A Revisit of Nine Challenges in Artificial Intelligence and Wireless Communications

Wen Tong, Huawei Wireless

Abstract: In this talk, we revisit the early discussions of nine challenges in AI and wireless communications for 6G. Challenge 1: Computing Crisis of LLM; Challenge 2: Gradient Disappearance in DL; Challenge 3: Memory of the Deep Learning Neural Networks; Challenge 4: Dependence of DL on Big Data; Challenge 5: Dynamic Accretionary and Meta-learning; Challenge 6: Wireless Data aided Collective Learning; Challenge 7: Wireless Communications enabled Federated Learning; Challenge 8: Foundation of Semantic Communications; Challenge 9: Structure of Semantic Communication Systems. Over the past two years, the rapid development of GPT and the phenomenal success of LLMs have provided some answers to these nine challenges, which stemmed from the pre-ChatGPT era. Progress in transformer technology has shed new light and offered fresh insights into these nine challenges.

Bio: Wen Tong (Fellow, IEEE) received the B.S. degree from the Department of Radio Engineering, Nanjing Institute of Technology, Nanjing, China, in 1984, and the M.Sc. and Ph.D. degrees in electrical engineering from Concordia University, Montreal, QC, Canada, in 1986 and 1993, respectively. He joined the Wireless Technology Labs, Bell Northern Research, Canada, in 1995. In 2011, he was appointed Head of the Communications Technologies Labs, Huawei. He also spearheads and leads Huawei’s 5G wireless technologies research and development. Prior to joining Huawei in 2009, he was a Nortel Fellow and the Head of the Network Technology Labs, Nortel. He is currently a Huawei Fellow and the CTO of Huawei Wireless, and he heads Huawei Wireless Research. He pioneered fundamental technologies from 1G to 5G wireless, with more than 500 awarded U.S. patents. He was a recipient of the IEEE Communications Society Industry Innovation Award for “the leadership and contributions in development of 3G and 4G wireless systems” in 2014, and the IEEE Communications Society Distinguished Industry Leader Award for “pioneering technical contributions and leadership in the mobile communications industry and innovation in 5G mobile communications technology” in 2018. He is a fellow of the Canadian Academy of Engineering and a fellow of the Royal Society of Canada. He also serves on the Board of Directors of the Wi-Fi Alliance.

Title: Uncertainty Quantification for Detecting Hallucinations in Large Language Models

András György, Google DeepMind

Abstract: Detecting hallucinated answers is an important task for ensuring the factuality of large language models (LLMs). When no external information is available, hallucinations are often identified based on the uncertainty of the predictions. However, uncertainty can be epistemic or aleatoric: the former comes from a lack of knowledge about the ground truth (such as about facts or the language), the latter from irreducible randomness (such as multiple possible answers), and only epistemic uncertainty is related to hallucinations. In this talk I will overview some methods for estimating uncertainty in general (in particular, taking into account that the same thing can be expressed in multiple ways in natural language), and present some new results on uncertainty estimation with theoretical guarantees, with special consideration given to estimating whether the epistemic uncertainty is large. The latter approach allows detecting hallucinations for both single- and multi-answer queries, in contrast to many standard uncertainty quantification strategies, which are unable to detect hallucinations in the multi-answer case.
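As background, one common sampling-based heuristic is sketched below: draw several answers, group those that express the same meaning, and compute the entropy over meaning-level groups so that rephrasings are not counted as disagreement. This is a generic illustration rather than the estimators with theoretical guarantees mentioned in the abstract; sample_answer and are_equivalent are hypothetical helpers supplied by the caller.

# Generic sketch of sampling-based uncertainty estimation for an LLM answer.
# Semantically equivalent answers are grouped before computing entropy, so that
# variation in phrasing alone is not mistaken for uncertainty. Illustration
# only; sample_answer and are_equivalent are hypothetical helpers.
import math

def semantic_uncertainty(prompt, sample_answer, are_equivalent, num_samples=20):
    answers = [sample_answer(prompt) for _ in range(num_samples)]

    # Greedy clustering of answers that express the same meaning.
    clusters = []
    for a in answers:
        for c in clusters:
            if are_equivalent(a, c[0]):
                c.append(a)
                break
        else:
            clusters.append([a])

    # Entropy over meaning-level clusters: high values suggest the model does
    # not "know" the answer (a candidate hallucination signal), while
    # multi-answer queries with several valid answers call for a finer
    # epistemic/aleatoric separation, as the abstract emphasizes.
    probs = [len(c) / num_samples for c in clusters]
    return -sum(p * math.log(p) for p in probs)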

Bio: András György is a Senior Staff Research Scientist at Google DeepMind, London, UK. He received his Ph.D. from the Budapest University of Technology and Economics, Hungary. He held research positions at the Institute for Computer Science and Control (SZTAKI), Hungary, leading the Machine Learning Research Group, and at the University of Alberta, Canada. He was also a faculty member at the Department of Electrical and Electronic Engineering, Imperial College London, UK. His research interests include machine learning, statistical learning theory, online learning, optimization and, more recently, large language models. Among others, Dr. György received a best paper award at the 7th IEEE Global Conference on Signal and Information Processing (GLOBALSIP2019) in 2019, a best paper runner-up award at the 34th Annual Conference on Learning Theory (COLT 2021), the Gyula Farkas prize of the János Bolyai Mathematical Society in 2001, and the Academic Golden Ring of the President of the Republic of Hungary in 2003.

Title: The Performance of Meaning: Breathing Life into Text

Andrew Breen, Amazon

Abstract: Generative AI-based technologies such as Alexa, Amazon Q, Claude v3, GPT-4o, and Gemini v1.5 are revolutionizing how we interact with machines through advances in LLMs and Foundation Models. These AI systems, trained on massive amounts of text data, can now hold conversations, translate languages, write creatively, and answer questions in a natural way. Initially, such interactions were text-based, but increasingly, these systems are offering spoken language interfaces. Text, though powerful and natural, lacks the nuance and ease of spoken language, which has the power to move us deeply, conveying unspoken thoughts, desires, and passions. It is the very essence of human interaction, from everyday conversations to artistic expression. When combined with other modalities, spoken language offers an unparalleled medium for communication and creativity. A machine capable of replicating this is the holy grail of human-machine communication. By unlocking the full expressive power of spoken language, LLMs are fundamentally transforming how we interact with machines. Virtual assistants such as Alexa have pioneered natural spoken language interactions with machines since 2014. Such systems showcase direct interaction with the real world, grounding knowledge in reality and taking actions (e.g., managing shopping lists or controlling smart home appliances), but remain limited in the complexity of human-machine discourse they can support. The advent of LLMs has broken through that “glass ceiling”, promising truly human-like discourse.

This talk will explore the history of speech generation, its transformation from a niche field to an everyday technology. We’ll delve into the groundbreaking impact of neural speech generation in 2016, which revolutionized what was thought possible just two years prior, and show how LLMs are poised to offer equally astonishing breakthroughs. The talk concludes by offering a glimpse into the exciting future of human-machine interaction powered by spoken language.

Bio: Andrew Breen has a B.Sc. Hons in Physics with Computing Physics from University College Swansea, an M.Sc. (Eng.) by research from Liverpool University, and a Ph.D. in Speech Science from University College London. He is a long-standing member of the Institution of Engineering and Technology (MIET). He was awarded the IEE J. Langham Thomson premium in 1993, and has received business awards from BT, MCI and Nuance. Andrew has been an industrial representative on two European funded projects. He was a founder of SSW (Speech Synthesis Workshop), and has been on the organising committee for InterSpeech. Andrew worked for a number of years at BT Labs, initially on Automatic Speech Recognition (ASR), and then led teams on Text-To-Speech (TTS), avatars, and multi-modal distributed systems. While at BT he invented the Laureate TTS system. In 1999 he joined the University of East Anglia as a Senior Lecturer, but two years later joined Nuance as founder of their TTS organisation. After Nuance’s acquisition by ScanSoft, he took on a number of roles, including Head of TTS research and languages, Head of TTS research and product, Director of embedded TTS for automotive, and Director of TTS Research and Product Development in India and China. He has extensive experience of managing remote teams, having led teams across the US, Europe, India and China. In 2017 he joined Amazon as head of TTS research, managing the team which produced Amazon’s first neural TTS system. He is currently research director of speech and audio generation in Amazon’s AGI organisation.