AI Policy Primer (May 2024)
Issue #10: Seoul Summit, global values, and systemic safety
In this month’s AI Policy Primer, we look at the Seoul Summit, recent research centering global values in large language models, and the UK AI Safety Institute’s new work on systemic safety. We also published an overview of the AI policy landscape earlier this week, which introduces a 4-box model to organise the topics that we think AI policy practitioners may need to understand. As always, let us know if you have any feedback at aipolicyperspectives@google.com.
Policymakers taking action
Seoul Summit strengthens AI safety coordination
What happened: The Republic of Korea and the UK co-hosted the AI Seoul Summit. The follow-up to last year’s Summit at Bletchley Park, the event convened representatives from 28 governments (including the US and China), industry, academia and civil society to discuss ‘three critical priorities on AI: safety, innovation and inclusivity’.
What’s interesting: The Summit produced several outputs that pushed forward international cooperation on frontier AI safety.
Frontier AI Safety Commitments: commitments from 16 leading AI companies to publish safety frameworks (if they have not done so already) setting out how they will measure the risks of frontier models, by the next Summit in France in February 2025. Our recent Frontier Safety Framework outlines Google DeepMind’s approach, and comes as responsible capability scaling continues to gain traction.
International Network of AI Safety Institutes: a new agreement, backed by 10 countries (including the US and UK) as well as the EU, to build “complementarity and interoperability” between their technical work and approaches to safety, with the aim of promoting the safe, secure and trustworthy development of AI.
Interim International Scientific Report on the Safety of Advanced AI: a new report, loosely inspired by the Intergovernmental Panel on Climate Change, that aims to provide an independent and inclusive ‘state of the science’ report on the capabilities and risks of frontier AI.
Looking ahead: The new safety frameworks published by AI labs ahead of the France Summit are likely to establish responsible capability scaling as an industry norm. This could kickstart a process in which industry best practices are agreed and adopted by a critical mass of labs within the next two years, and may spur a wave of empirical research into scaling, safety and capability evaluations. We also published a blogpost with ideas about how the Summits in Seoul, France and beyond can galvanise international cooperation on frontier AI safety.
Study watch
Researchers eye global values
What happened: Researchers from the University of Oxford, New York University, Meta, Cohere, and elsewhere released a study looking at how preferences for language models differ across the world. The group compiled a dataset, PRISM, the product of a large-scale experiment in which 1,500 participants from 75 countries provided details of their background, their familiarity with LLMs and their stated preferences for fine-grained behaviours (i.e. specific information about how they want an LLM to behave).
What’s interesting:
The work explored how different constituencies are likely to use language models. It found, for example, that older people (55+) are more likely to talk about elections and seek travel recommendations, while younger people (18-24) are more likely to discuss managing relationships or job searches.
The study is the latest in a long line of work exploring which values language models ought to be aligned with. Earlier this month, OpenAI released its ‘model spec’ to explain how it makes decisions about shaping model behaviour (e.g. how ChatGPT responds to NSFW requests). While developers often seek to empower users to change certain aspects of model behaviour through functions like user instructions and custom safety filters, they are increasingly considering a “personalisation within bounds” model that sets overall guardrails for model behaviour while allowing for some flexibility within those bounds.
Looking ahead: In the future, we anticipate that labs will introduce personalisation tools into consumer platforms to allow users to shape model behaviour on sensitive queries. If adopted, this approach would represent a significant shift away from platform policies in which the platform makes all content decisions.
What we’re hearing
UK launches systemic AI safety programme
What happened: The UK government announced an £8.5 million grants programme to fund research into systemic AI safety. The programme will be led by the UK AI Safety Institute (AISI) in partnership with UK Research and Innovation (UKRI) and the Alan Turing Institute.
What's interesting:
The grants aim to broaden the AISI's remit to include 'systemic AI safety', which seeks to manage societal-level impacts of AI and help existing institutions, systems and infrastructure adapt to the diffusion of AI. As the group explains, “addressing AI's risks to people and society requires looking beyond AI models' capabilities.”
Potential research areas include curbing the spread of AI-generated misinformation, understanding how to adapt infrastructure and systems for a world with widespread AI usage, and generating ideas for safely deploying AI in society. The programme aims to attract proposals from researchers in academia, industry and the public sector. Those that are particularly promising may receive further funding to support their development into fuller, longer-term projects.
The move comes as governments increasingly recognise the need to proactively use AI to mitigate risks. The grants build on the "defensive AI acceleration" (d/acc) concept advocated by Vitalik Buterin and further amplified by Matt Clifford, which argues that we need to build defensive technologies to protect against AI threats (see our recent post on the same topic).
Looking ahead: As we continue to see capabilities—and the number of AI applications—grow, we expect new government initiatives seeking to use AI in prosocial ways to bolster societal infrastructure and enhance societal defences. We anticipate that the first major initiative in this vein will have links to existing government priorities such as climate change or cybersecurity.