Wednesday, June 12th
Location: Room G115, ground floor of Maxwell-Dworkin Building, 33 Oxford Street, Cambridge, MA 02138
8:00-9:15 Breakfast
9:15-10:15 Keynote: Adam Kalai
Title: When Calibration Goes Awry: Hallucination in Language Models
Abstract: We show that calibration, which is naturally encouraged by the loss minimized during pre-training of language models, leads to certain types of hallucinations. Moreover, the rate of hallucinations depends on the domain, via the classic Good-Turing estimator. Interestingly, this estimate is large for domains such as paper references, which have been a notorious source of hallucinations. The analysis also suggests methods for mitigating hallucinations.
This is joint work with Santosh Vempala and was done while the speaker was at MSR.
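(For context on the estimator named above, a minimal illustration, not taken from the talk: the classic Good-Turing estimate of the unseen probability mass is the fraction of observations that occur exactly once, so domains whose facts mostly appear a single time in the training data, such as paper references, receive a large estimate. The toy Python sketch below uses invented data purely for illustration.)

```python
from collections import Counter

def good_turing_missing_mass(observations):
    """Classic Good-Turing estimate of the probability mass of
    unseen items: the fraction of observations that are singletons."""
    counts = Counter(observations)
    singletons = sum(1 for c in counts.values() if c == 1)
    return singletons / len(observations)

# Hypothetical toy "training data" for two domains.
references = ["ref1", "ref2", "ref3", "ref4", "ref5", "ref5"]   # mostly one-off facts
capitals = ["paris", "paris", "paris", "rome", "rome", "rome"]  # heavily repeated facts

print(good_turing_missing_mass(references))  # 4/6 ~ 0.67: large unseen mass
print(good_turing_missing_mass(capitals))    # 0/6 = 0.00: small unseen mass
```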
Bio: Adam Tauman Kalai is a Member of Technical Staff and Research Scientist at OpenAI working on AI Safety and Ethics. Previously, he was a Senior Principal Researcher at Microsoft Research New England. He has worked in multiple fields, including Algorithms, Fairness, Machine Learning Theory, Game Theory, and Crowdsourcing. He received his BA from Harvard and PhD from Carnegie Mellon University. He has also served as an Assistant Professor at Georgia Tech and the Toyota Technological Institute at Chicago, and is a member of the science team of the whale-translation Project CETI. His honors include the Majulook prize, best paper awards, an NSF CAREER award, and an Alfred P. Sloan fellowship.
10:15-10:45 Coffee break
10:45-12:00 Session 1
Session chair: Guy Rothblum
Complexity-Theoretic Implications of Multicalibration. abstract, arXiv
Sílvia Casacuberta, Cynthia Dwork and Salil Vadhan
Loss Minimization Yields Multicalibration for Large Neural Networks. arXiv
Jarosław Błasiok, Parikshit Gopalan, Lunjia Hu, Adam Tauman Kalai and Preetum Nakkiran
A Unifying Perspective on Multicalibration: Game Dynamics for Multi-Objective Learning. arXiv
Nika Haghtalab, Michael Jordan and Eric Zhao
Stability and Multigroup Fairness in Ranking with Uncertain Predictions. abstract, arXiv
Siddartha Devic, Aleksandra Korolova, David Kempe and Vatsal Sharan
Multigroup Robustness. abstract, arXiv
Lunjia Hu, Charlotte Peale and Judy Hanwen Shen
12:00-2:00 Lunch (on your own)
2:00-3:15 Session 2
Session chair: Eliad Tsfadia
Local Lipschitz Filters for Bounded-Range Functions with Applications to Arbitrary Real-Valued Functions. arXiv
Jane Lange, Ephraim Linder, Sofya Raskhodnikova and Arsen Vasilyan
Counting Distinct Elements in the Turnstile Model with Differential Privacy under Continual Observation. arXiv
Palak Jain, Iden Kalemaj, Sofya Raskhodnikova, Satchit Sivakumar and Adam Smith
Differential Privacy on Trust Graphs. Full version
Badih Ghazi, Ravi Kumar, Pasin Manurangsi and Serena Wang
Time-Aware Projections: Truly Node-Private Graph Statistics under Continual Observation. arXiv
Palak Jain, Adam Smith and Connor Wagaman
Insufficient Statistics Perturbation: Stable Estimators for Private Least Squares. arXiv
Gavin Brown, Jonathan Hayase, Sam Hopkins, Weihao Kong, Xiyang Liu, Sewoong Oh, Juan Carlos Perdomo and Adam Smith
3:15-3:45 Coffee break
3:45-5:00 Session 3
Session chair: Lee Cohen
Can Copyright be Reduced to Privacy? Full version
Niva Elkin-Koren, Uri Hacohen, Roi Livni and Shay Moran
Score Design for Multi-Criteria Incentivization. Full version
Anmol Kabra, Mina Karzand, Tosca Lechner, Nati Srebro and Serena Wang
The Relative Value of Prediction in Algorithmic Decision Making. arXiv
Juan Perdomo
When are Two Lists Better than One?: Benefits and Harms in Joint Decision-Making. arXiv
Kate Donahue, Sreenivas Gollapudi and Kostas Kollias
Interpolating Item and User Fairness in Multi-Sided Recommendations. abstract, arXiv
Qinyi Chen, Jason Cheuk Nam Liang, Negin Golrezaei and Djallel Bouneffouf
Thursday, June 13th
Location: Room G115, ground floor of Maxwell-Dworkin Building, 33 Oxford Street, Cambridge, MA 02138
8:00-9:15 Breakfast
9:15-10:15 Keynote: Florian Tramèr
Title: Stealing a Generative AI’s Secrets (Responsibly)
Abstract: Companies that develop generative AI tools, such as ChatGPT, keep most development and deployment details secret. We typically don’t know what the underlying model looks like (or how big it is), what it was trained on, or what safety measures are applied. In this talk, I’ll show how we reverse-engineered such secrets from various production systems. I’ll conclude with a discussion of responsible disclosure practices in today’s AI world, and how we might improve them.
Bio: Florian Tramèr is an assistant professor of computer science at ETH Zurich. Before that, he was a PhD student at Stanford University and a visiting researcher at Google Brain. His research interests lie in Computer Security, Cryptography, and Machine Learning Security. In his current work, he studies the worst-case behavior of Deep Learning systems from an adversarial perspective, to understand and mitigate long-term threats to the safety and privacy of users. He is a recipient of the 2021 Rising Star in Adversarial ML award. Together with collaborators, his research was selected as runner-up for the 2022 Caspar Bowden Award and as a winner of a USENIX Security 2023 Distinguished Paper Award.
10:15-10:45 Coffee break
10:45-12:00 Session 4
Session chair: Jonathan Ullman
Online Algorithms with Limited Data Retention. Full version, arXiv
Nicole Immorlica, Brendan Lucier, Markus Mobius and James Siderius
Leave-One-Out Distinguishability in Machine Learning. arXiv
Jiayuan Ye, Anastasia Borovykh, Soufiane Hayou and Reza Shokri
Low-Cost High-Power Membership Inference Attacks by Boosting Relativity. abstract, arXiv
Sajjad Zarifzadeh, Philippe Liu and Reza Shokri
Synthetic Census Data Generation via Multidimensional Multiset Sum. arXiv
Cynthia Dwork, Kristjan Greenewald and Manish Raghavan
Having your Privacy Cake and Eating it Too: Platform-supported Auditing of Social Media Algorithms for Public Interest. Full version
Basileal Imana, Aleksandra Korolova and John Heidemann
12:00-2:00 Lunch (on your own)
2:00-3:00 Session 5
Session chair: Guy Rothblum
Balanced Filtering via Disclosure-Controlled Proxies. Full version
Siqi Deng, Emily Diana, Michael Kearns and Aaron Roth
Drawing Competitive Districts in Redistricting. Full version
Gabriel Chuang, Oussama Hanguir and Clifford Stein
Distribution-Specific Auditing For Subgroup Fairness. Full version
Daniel Hsu, Jizhou Huang and Brendan Juba
3:00-3:30 Coffee break
3:30-4:30 Poster session (Maxwell-Dworkin lobby)
Friday, June 14th
Location: Northwest Building, B103, 52 Oxford Street, Cambridge, MA 02138
***Note that Friday’s talks are at a different location***
8:00-9:15 Breakfast
9:15-10:15 Keynote: Finale Doshi-Velez
Title: Opportunities from Involving People in Machine Learning Validation
Abstract: Oftentimes, real situations are sufficiently complex that a machine learning model cannot be validated by computational means alone. For example, the many ways in which electronic health records are deficient are hard to formalize and quantify, which in turn makes it difficult to validate the quality of a model built from them. In this talk, I will discuss two ways in which human feedback can be used as part of a more holistic approach to model validation. The first is direct human inspection of the model. This creates the need to build small but still highly performant models, which is an interesting computational challenge in itself. I will share how we were able to create models with only a few discrete states whose performance is similar to that of much more complex models. Second, people can validate part of the data. As an example, I will discuss how we can improve our confidence in a model quality estimate by having a person label influential data points as typical or atypical. Through these examples of human-assisted validation, I hope to expose areas for theoretical work designed to improve validation in human+model settings, rather than in model-only settings.
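(A rough sketch of the second idea, not code from the talk; the scoring scheme and all names below are assumptions made for illustration: rank evaluation examples by how much the overall quality estimate shifts when each one is left out, then route the top-ranked examples to a person for a typical/atypical label.)

```python
import numpy as np

def most_influential_points(per_example_scores, k=3):
    """Rank evaluation points by the leave-one-out shift they induce in
    the mean quality estimate; return the indices of the k largest shifts.
    These are natural candidates for human 'typical vs. atypical' review."""
    scores = np.asarray(per_example_scores, dtype=float)
    n, mean = len(scores), scores.mean()
    loo_means = (n * mean - scores) / (n - 1)  # mean with point i removed
    influence = np.abs(loo_means - mean)       # shift caused by removing i
    return np.argsort(influence)[::-1][:k]

# Example: per-example losses on a validation set, with two anomalies.
rng = np.random.default_rng(0)
losses = np.concatenate([rng.normal(0.3, 0.05, 98), [2.5, 3.1]])
print(most_influential_points(losses))  # flags indices 98 and 99 for review
```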
Bio: Finale Doshi-Velez is a Gordon McKay Professor in Computer Science at the Harvard Paulson School of Engineering and Applied Sciences. She completed her MSc from the University of Cambridge as a Marshall Scholar, her PhD from MIT, and her postdoc at Harvard Medical School. Her interests lie at the intersection of machine learning, healthcare, and interpretability.
10:15-10:45 Coffee break
10:45-12:00 Session 6
Session chair: Ran Canetti
Modeling Diversity Dynamics in Time-Evolving Collaboration Networks. Full version
Christopher Archer and Gireeja Ranade
Content Moderation and the Formation of Online Communities: A Theoretical Framework. abstract, arXiv
Cynthia Dwork, Chris Hays, Jon Kleinberg and Manish Raghavan
Equilibria, Efficiency, and Inequality in Network Formation for Hiring and Opportunity. abstract, arXiv
Cynthia Dwork, Chris Hays, Jon Kleinberg and Manish Raghavan
Markovian Search with Socially Aware Constraints. abstract, Full version
Mohammad Reza Aminian, Vahideh Manshadi and Rad Niazadeh
Dynamic Matching with Post-allocation Service and its Application to Refugee Resettlement. abstract, Full version
Kirk Bansak, Soonbong Lee, Vahideh Manshadi, Rad Niazadeh and Elisabeth Paulson
12:00-2:00 Lunch (on your own)
2:00-3:15 Session 7
Session chair: Sofya Raskhodnikova
DPZero: Private Fine-Tuning of Language Models without Backpropagation. arXiv
Liang Zhang, Bingcong Li, Kiran Koshy Thekumparampil, Sewoong Oh and Niao He
Shifted Interpolation for Differential Privacy. abstract, arXiv
Jinho Bok, Weijie Su and Jason Altschuler
Differentially Private Optimization with Sparse Gradients. arXiv
Badih Ghazi, Cristóbal Guzmán, Pritish Kamath, Ravi Kumar and Pasin Manurangsi
Attaxonomy: Unpacking Differential Privacy Guarantees Against Practical Adversaries. arXiv
Rachel Cummings, Shlomi Hod, Jayshree Sarathy and Marika Swanberg
Incentivized Collaboration in Active Learning. Full version (coming soon)
Lee Cohen and Han Shao
3:15-3:45 Coffee break
3:45-5:00 Session 8
Session chair: Guy Rothblum
Adaptive Data Analysis in a Balanced Adversarial Model. arXiv
Kobbi Nissim, Uri Stemmer and Eliad Tsfadia
Effects of Privacy-Inducing Noise on Welfare and Influence of Referendum Systems. Full version
Suat Evren and Praneeth Vepakomma
Privacy Can Arise Endogenously in an Economic System with Learning Agents. Full version
Nivasini Ananthakrishnan, Tiffany Ding, Mariel Werner, Sai Praneeth Karimireddy and Michael Jordan
Supply-Side Equilibria in Recommender Systems. arXiv
Meena Jagadeesan, Nikhil Garg and Jacob Steinhardt
Contract Design With Safety Inspections. abstract, arXiv
Alireza Fallah and Michael Jordan