How Does Access Impact Risk? Assessing AI Foundation Model Risk Along a Gradient of Access
Zoë Brammer, along with contributors from the AI Foundation Model Access Working Group
SUMMARY
In recent months, a number of leading AI labs have released advanced AI systems. While some models remain highly restricted, limiting who can access the model and its components, others provide fully open access to their model weights and architecture. To date, there is no clear method for understanding the risks that can arise as access to these models increases. Our latest work seeks to address this gap.
Over the past six months, IST has convened a series of closed-door working group meetings with AI developers, researchers, and practitioners; surveyed representatives of leading AI labs, think tanks, and academic institutions; led expert interviews; and conducted its own staff research. We focused on a key question:
How does access to foundation models and their components impact the risk they pose to individuals, groups, and society?
The report, How Does Access Impact Risk?, is the result of this collective effort. It develops a matrix to map categories of risk against a gradient of access to AI foundation models.
Conclusions
Based on the results of this novel analytical approach, this study draws a number of preliminary conclusions about the relationship between risk and access to AI foundation models:
- Uninhibited access to powerful AI models and their components significantly increases the risk these models pose across a range of categories, as well as the ability of malicious actors to abuse AI capabilities and cause harm.
- Specifically, as access increases, the risk of malicious use (such as fraud and other crimes, the undermining of social cohesion and democratic processes, and/or the disruption of critical infrastructure), compliance failure, taking the human out of the loop, and capability overhang (model capabilities and aptitudes not envisioned by their developers) all increase.
- At the highest levels of access, the risk of a “race to the bottom” (a situation in which conditions in an increasingly crowded field of cutting-edge AI models might incentivize developers and leading labs to cut corners in model development) increases, assuming a “winner takes all” dynamic.
- As access increases, the risk of reinforcing bias (the potential for AI to inadvertently further entrench existing societal biases and economic inequality as a result of biased training data or algorithmic design) fluctuates.
Acknowledgments: This work is inherently collaborative. As researchers, conveners, and facilitators, IST is immensely grateful to the members of the AI Foundation Model Access Working Group. We also appreciate the generous support of the Patrick J. McGovern Foundation, whose funding allowed us to continue this project through the lens of IST’s Applied Trust and Safety program. Although not every working group member or external reviewer necessarily endorses everything written in this report, we extend our gratitude to the following contributors and editors in particular: Anthony Aguirre, Markus Anderljung, Chloe Autio, Chris Byrd, Gaia Dempsey, David Evan Harris, Vaishnavi J., Landon Klein, Sébastien Krier, Jeffrey Ladish, Nathan Lambert, Aviv Ovadya, Elizabeth Seger, and Deger Turan. Additionally, not everyone in the working group was able to be named openly as a contributor. We are just as grateful to them, including those individuals we have worked closely with and who helped to inspire the initial effort.