This post is written by Conor Griffin, Policy Research lead at Google DeepMind. It is the first in our new AI Policy Note series on AI Policy Perspectives. With these Policy Notes, we aim to provide readers with mental models for how to think about different AI policy topics, as well as a synthesis of the evidence on them. Thanks to Harry Law, Sébastien Krier, Jennifer Beroshi, and Nicklas Lundblad for comments.
3 Key Points
In this article, we introduce the AI Policy Atlas, a 4-box model to organise the topics that we think AI policy practitioners may need to understand.
The goal is to encourage policy teams to deliberate on which topics are most and least relevant to them, and to show why it makes sense to widen participation in AI policy discussions, to include everybody from venture capitalists to computational biologists.
We also (briefly) analyse a subset of topics from the Atlas, from the evolving AI supply chain to how AI may affect information quality. In the coming year, we hope to write short notes on select topics from the Atlas. If you'd like to contribute to an AI Policy Note on a certain topic, please get in touch at aipolicyperspectives@google.com.
The landscape of AI policy is vast and intricate, encompassing everything from election interference to safety issues to worries that governments are not sufficiently benefiting from AI. This complexity also leads to a more fundamental question: what exactly is "AI policy"?
A glance at current priorities reveals a range of activities. Policymakers in the UK, US, Japan, and Canada are building new AI Safety Institutes, with more countries likely to follow. The EU is shifting its attention from passing, to implementing, the EU AI Act. Singapore is developing a sandbox to boost small businesses’ access to generative AI tools. Policy teams in AI companies will look to feed into these efforts, while also informing their own company’s AI policies - on everything from open sourcing to terms of use. To navigate this web, AI policy practitioners must develop opinions on numerous interconnected topics, from how much to invest in technical mitigations, such as provenance and watermarking, to the extent to which schools should enable or restrain access to AI tools.
Introducing the AI Policy Atlas
In this article, we map this sprawling territory of AI policy topics. There is no widely agreed-upon definition of ‘policy’, so we use the term loosely to refer to both AI public policy and AI companies’ internal policies. By public policies, we mean the range of decisions and actions that policymakers can take on AI, including strategy and agenda setting; the design, adoption, or removal of specific policy instruments, such as regulations and taxes; as well as other levers at governments’ disposal, from R&D funding to procurement. By AI companies' internal policies, we mean the decisions, actions and protocols that make up companies' attempts to develop AI responsibly - i.e. in line with society’s values, needs and expectations. Of course, in both cases, not taking a decision, or an action, can also be considered a policy.
In our Atlas, we map these policies in category 4. Before that, we map the topics that practitioners need to understand to design and implement these policies effectively, across three categories: (1) the evolving AI ecosystem; (2) the state of AI progress; and (3) the impact of AI on society.
Our Atlas has many limitations, which we describe at the end. But we hope that it can help AI policy teams on several fronts:
Reassurance: If you find yourself working on standards for data enrichment workers one day, before pirouetting into discussions about the potential of AI in biology the next, you may feel like the field of AI policy is infinite and disjointed. With the Atlas, we hope to provide reassurance - including to ourselves - that there is a finite, albeit evolving, body of topics that will agglomerate productively over time.
Topic prioritisation: It's hard for AI policy teams to deliberately prioritise some policy topics over others, and such prioritisation often happens passively by responding to requests or by focussing on what is most visible or tractable. Mapping AI policy topics forces policy teams to consider whether they are over or under-prioritising certain topics.
Access to expertise: Mapping AI policy topics quickly highlights that it is impossible for AI policy teams to have deep expertise in all, or even most, of these topics. As such, the Atlas is also a call to widen the diversity of participants in AI policy discussions, to include everybody from researchers working on mechanistic interpretability and sociotechnical evaluations to lawyers and security specialists working on topics such as fiduciary duties and secure research environments. By creating their own AI Policy Atlas, policy teams can also decide where they are best placed to contribute, where they need to build up external expertise, and where they need to actively avoid getting distracted.
Below, we introduce a brief sample of the topics from the Atlas, explain why we think they are important, and highlight information gaps that policy practitioners could address. In the coming months, we hope to write short notes on select topics from the Atlas, such as public attitudes to AI or how AI may affect climate change. Inspired by the UK Parliamentary Office of Science and Technology's POSTnotes, we will try to provide mental models for how to think about the topics, and synthesise the available evidence on them. If you’d like to contribute to an AI Policy Note on one of these topics, please get in touch at aipolicyperspectives@google.com.
Category 1 - The AI Ecosystem: Shifting actors, evolving trends
In category one, we want to identify the evolving set of actors that make up the AI ecosystem. A starting question is to predict who the most consequential AI developers will be over the coming 5-10 years. As outlined in the recent 2024 MAD report, the AI ecosystem is in flux. Over the past five years, a wave of AI startups has emerged. Some, like Anthropic, Mistral, or Cohere, are developing large AI models from the ground up. Others are developing narrow applications in GenAI categories, such as image generation or code assistants, or in high-priority science areas, such as protein design. Companies like Groq or Scale provide AI developers with compute and data, or ‘picks and shovels’, while established companies like Reddit or Epic also increasingly act as data providers. Other startups, like Together, function more as market intermediaries, providing developers with chips to train, tweak and deploy open-source models. This evolving AI ecosystem challenges more simplistic framings of AI as being split between ‘developers’ and ‘users’ of the technology and will have policy implications, for example, when it comes to assigning responsibility for carrying out ethics and safety evaluations, or determining where liability falls.
We also need to understand how these AI organisations meaningfully differ. Consider business models. Many AI policy discussions focus on business-to-consumer (B2C) AI products, like chatbots or image generation tools, and the companies that develop them. This is natural as these tools tend to be (somewhat) free, easy-to-use, and very memeable. We also have readily available market size metrics on web visits, app downloads and subscriber numbers that clearly demonstrate their uptake. In contrast, many B2B or B2G AI use cases take place behind closed doors, often in relatively staid areas, such as document retrieval. These use cases typically involve subsuming AI into internal workflows, or existing products, without a visible price tag attached. This means that we have a partial picture of how AI is diffusing across society, which in turn skews the resulting policy discussions away from important topics, such as the use of AI in drug discovery or cybersecurity.
Beyond organisations, we need to track the flow and make-up of AI practitioners, and the accompanying policy questions this raises, such as: to what extent should we be worried about ‘AI brain drain’ from academia to the private sector, versus being reassured that industry wants to hire PhD graduates? To what extent should we be worried, or excited, about the flow of practitioners from more foundational AI research to sector-specific applications? How worried should we be about more traditional AI brain drain - i.e. practitioners leaving their home countries to work in a small number of AI hotspots? There are plausible scenarios where such brain drain could be positive for the sender country, if those departing provide remittances, technology transfers and later return to help build up local ecosystems. However, historically, the geographical concentration of scientific research and spin-outs has likely had detrimental effects on inequality, while no doubt also accelerating innovation. This raises a question about how AI companies, and policymakers, can better support, without co-opting, AI ecosystems outside the West, where more is already occurring than is typically credited. As AI shifts to real-world deployment, a further question is whether governments and AI companies should be funding and attracting different kinds of AI expertise, perhaps shifting away from PhDs to more targeted engineering or data curation qualifications, or to different types of PhDs, such as in computational biology.
As AI deployment grows, the public will also become a more important actor in AI policy debates. This will require tackling difficult questions about what exactly we mean by public attitudes to AI, how best to measure them, and what AI companies and policymakers should do in response. For example, in the UK, a majority of the public claims to support applications of AI that many AI ethicists and practitioners take issue with, such as predictive policing and AI-enabled border control - although this support drops among people who are more likely to be negatively affected by these applications. Does this mean that AI companies should expand AI literacy programmes to better explain the risks posed by these applications? Or should they more deeply question what applications they oppose? Or both? Similarly, in the US, Democrats are more likely to support AI, and AI regulation, than Republicans are. What does that mean for how large language models should respond to politically sensitive queries? To what extent should companies develop technical, or participatory, methods that strive for outputs that meet a perceived middle ground, versus taking a strong organisational stance, versus allowing users to tune model outputs to their own persuasion?
Such questions highlight that we can't just study AI actors in isolation. We also need to identify and understand macro trends that will shape how these actors think and act on AI. In recent years, the growing tensions between the US and China have had clear knock-on effects on AI policy, via export controls on chips and discussions about how far to extend such measures. Similarly, the return of industrial policy to the West has encouraged governments to more actively support local AI champions to help promote local productivity, language and cultural resilience. What macro trends - war? climate change? demographic shifts? - will most shape the next five years of AI development and deployment?
Category 2 - The state of AI progress: Approaches, capabilities, uses
In category two, we want to get more concrete and understand the types of AI that are most likely to enter the world. There are inevitable limits to what can be done here, given the deluge of new papers, datasets, products, and APIs. But, for policy practitioners, there is also clear value to staying abreast of key points of inflection and uncertainty. We can start by identifying emerging and diverging approaches to developing AI systems that will have downstream policy effects. For example, the EU’s 2020 White Paper on AI, which laid the groundwork for the subsequent EU AI Act, was published before OpenAI’s GPT-3 paper, and the subsequent emergence of large language models (LLMs) as a dominant paradigm in AI research. Early iterations of the Act were thus modelled on product safety regulation and classified AI capabilities and applications into discrete buckets, based on their relative perceived risk, with corresponding obligations for developers and deployers/users. The emergence of LLMs - general-purpose AI systems that could be deployed across many applications, with complex supply chains (see above) - challenged the Act’s approach, prompting a rethink.
Paradigmatic shifts of the LLM variety are rare, but today’s practitioners are exploring various approaches to AI that could have policy implications. For example, to what extent should policymakers or AI labs expect research in mechanistic interpretability, machine unlearning, synthetic data, or fine-tuning to help mitigate risks posed by AI to safety, privacy, or fairness? Similarly, the recent wave of excitement about large language models has focussed on digital tasks that can be carried out entirely on a computer, rather than much of the day-to-day work in, say, construction sites, warehouses, or traditional retail outlets. To what extent will growing efforts to merge large AI models with robotics succeed? And how might this change the jobs that are most likely to be affected by AI, as well as broader public perceptions of AI, and the policy response required?
Beyond tracking emerging approaches to AI, we also need to be able to monitor, and to some degree, predict the most consequential capabilities that AI systems will have. As we describe below, there are gaps in how we evaluate AI systems, which make it difficult to determine, or describe, AI capabilities. Despite these limitations, some AI capabilities jump out as particularly relevant. These include those where near-term step jumps in performance are most plausible, such as the ability of agentic AI systems to execute workflows on behalf of users; the ability of AI systems to do novel things, such as predicting diseases from retinal images, rather than just speeding up, or improving, tasks that humans already perform (the dividing line here is fuzzy); AI capabilities that are likely to be misused by humans, such as facial recognition; AI capabilities that may be intrinsically dangerous, irrespective of how humans use them, such as deception; and AI capabilities where the underlying concepts, and science, are most heavily disputed, such as emotion analysis.
Moving beyond generic AI capabilities, the policy response to AI will largely be determined by how people use the technology. Today, this debate often starts by noting progress in a specific AI capability - say music generation - and extrapolating out to grand declarations about the expected impact on the music industry, and potential policy responses to this. Instead, we can start by trying to understand the end domain in question. For example, if we think about using AI in education, we could start by clarifying the function(s) that we think education should serve, such as preparing people for the labour market, helping to integrate people into societal structures, and/or boosting people’s autonomy. We could clarify which potential users of AI we are focussed on - e.g. students, educators, or administrators. If students, we could specify what age cohorts, or locations, we are talking about. We could clarify whether we want to help students learn specific knowledge, skills or values, and what specific obstacles we think AI could help to address. This line of reasoning may lead us to specific AI use cases, like AI tutors or personalised learning materials. At this point, we can ground ourselves by learning from past efforts to apply technology to education, which, as the scholar Mary Burns recently noted, have generated “…such excitement and idealism…and such disappointment and cynicism”. We could then assess to what extent recent research breakthroughs, and growing student/teacher excitement about GenAI tools, could enable practitioners to overcome the limitations of past AI applications, which often took a behaviourist approach to learning, focussed on one-to-one repetition and memorisation. This analysis, in turn, can inform the most useful policy response.
Looking across domains, whether in finance, biology or arts & culture, the use cases for AI will look very different, and whether they get deployed, or not, will depend on an array of factors, from sector-specific regulation to the synchronicity with other sectoral technologies. The common thread is that the most viable and consequential AI use cases - e.g. weather forecasting - are unlikely to be those that are most prominent in current AI policy debates.
Category 3 - AI’s impact on society: Risks, benefits, and determining what we want
In category three, we want to understand the potential benefits and risks that AI poses to society. There are many ways to classify these impacts. Risk-focussed discussions often distinguish between discrete AI incidents, both intentional and accidental, and slow-burning societal impacts that may only become clear over time. Other AI risk taxonomies marry up AI harms that are already occurring, or are very likely to occur soon, with longer-term risks. Conversely, despite rampant AI boosterism, there is very little substantive policy discussion of the benefits that AI may bring to society.
There are also fundamental challenges with the risks vs benefits framing. The most obvious is AI’s inherent dual-use nature. In areas like cybersecurity, we can quickly point to things that we do want to happen - organisations having better AI-based vulnerability discovery - and things that we don’t - threat actors having better AI-based vulnerability discovery. Even in areas like fairness, privacy, or explainability, where AI is primarily spoken about as a risk, there are clear potential benefits, which we discuss below. A deeper challenge is that we often don’t know what outcome we want to occur. For example, when it comes to biosecurity, do we want large language models to be able to provide accurate answers to detailed, technical, virology questions?
In other areas, people fundamentally disagree about AI’s likely impact, and we lack the measurement tools to bring these disparate views together. For example, when it comes to AI’s potential impact on climate change, most discussions focus on the potential increase in emissions from training ever-larger language models. However, running inference on AI models already accounts for more emissions than training models, and even these inference emissions will likely be swamped by the indirect positive or negative effects on emissions that arise when people use AI applications - such as grid optimisation. If we take a longer-term view, AI could significantly change the structure of the economy, energy consumption, as well as industry and consumer behaviour - with major effects on emissions. However, as noted above, we do not have reliable data on how AI is being used across society, and so we have no way to know, or even reliably intuit, how much AI is helping or hurting net zero efforts.
Given these challenges, it makes sense to go back to first principles and align on what outcomes we want from AI, chart a set of scenario(s) that may occur, think about how to measure progress, and then consider the best policy intervention points. For example, when we consider ‘information quality’, there is rightly concern in 2024 about AI-based disinformation and election interference. However, examples of AI disinformation are rare, albeit increasing, and there is reasonable scepticism about how much any content, AI-generated or otherwise, actually changes most voters’ minds. As outlined by Erik Hoel, a bigger challenge may be the rise in low-quality AI-generated content, even if such content is not necessarily false (misinformation), or intended to mislead (disinformation). Others have also argued that widespread reliance on AI-generated content could lead to a "knowledge collapse", if models start to converge towards more generic, homogenised outputs that humans increasingly rely on, rather than actively seeking out more diverse ideas - although others have cited mitigations for this, including optimising AI for novelty.
On the other hand, AI may also help to improve the quality of information, or help people to access higher-quality information, acting as a sort of spam filter + content curator 2.0. The World Bank and others have devised techniques to quantify how society is maintaining traditional public goods, like clean air, fresh water, and forests, as well as how accessible these resources are to the public at large. If we could devise similar techniques to measure how well society is maintaining and giving access to high-quality digital information, or to the digital commons, do we think AI would be net beneficial to it?
Similarly, AI is often framed as a risk to explainability, rather than as a benefit. If we think about who needs explanations in society, and why, we can make some early distinctions. Developers of AI systems may need some mechanistic understanding of how AI models work, in order to make these models safer or fairer. Similarly, scientists looking to deploy AI in their discipline may need to understand whether AI models are learning the same concepts that they rely on, or something else. For both groups, there are reasons for optimism. There are inspiring examples of mechanistic interpretability research and lists of open research problems to tackle. AI models are also inherently more accessible to study than human brains. There is early evidence that by studying how AI models learn we can expand human knowledge, for example, by discovering novel chess moves from AI systems.
There are also reasons for pessimism. Mechanistic interpretability research tends to be slow, focussed on earlier, smaller AI models, and under-incentivised - a key policy gap. Consumers of AI products will also want something very different from an explanation, such as knowing the ins and outs of how to cancel a subscription, rather than a mechanistic explanation of how the underlying model works. As social scientists have outlined, consumers also typically view explanations, or at least good explanations, as an interactive process, with opportunities to ask questions and clarify things, rather than a dump of complex information. However, this again highlights how AI could help explainability. For example, multimodal chatbots could explain how they themselves work, as well as how other things work, in ways that are not possible today. Taken together, this raises a question: will people feel they have access to more, or fewer, good explanations, in an AI-enabled world?
Category 4 - The policy response
In category four, we turn to the AI policy response. We focus first on AI companies, and then policymakers, but naturally many of these policies may be led by, or delivered with, partners from across civil society, academia, and elsewhere.
[Graphic: What topics do AI policy practitioners need to understand?]
When we speak about AI companies' internal policies, we refer to the key decisions, actions and protocols that make up companies' attempts to develop and deploy AI responsibly - i.e. in line with society’s values, needs and expectations. There can be healthy scepticism about such efforts, but the concept of responsible AI is also more concrete than is often assumed. It draws on decades of work from, among others, the fields of responsible research and innovation (RRI) and technology assessment, where practitioners sought to anticipate, influence, and control the development of emerging technologies, to ensure that they were informed by the values, needs, and expectations of society. Increasingly, we can point to a clear set of activities that organisations that develop consequential AI systems may be expected to carry out - from robust data governance and security, to foresight exercises, to provenance and watermarking (see graphic, above).
One challenge for AI companies is working out how to prioritise their time and resources, both between, and within, these efforts. For example, the AI community has traditionally evaluated safety and ethics risks from AI systems using (semi-) automated, dataset-driven techniques. As outlined in a paper by Weidinger et al., this means that other ways to evaluate AI systems have been neglected, such as post-deployment evaluations that study how flagship AI applications have actually affected society. When it comes to improving their evaluation efforts, AI companies face a range of questions, including: What impact(s) to focus on? How much to focus on developing new evaluations vs building the infrastructure to execute and use existing evaluations? How to best work with external partners and specialists, including in government? How to determine what results should constitute ‘safe enough’?
Crucially, the concept of responsible AI, like its antecedents, does not focus only on identifying and mitigating risks, but also aims to identify, accelerate, and better distribute benefits from AI. A key focus is identifying potential market failures, where beneficial AI solutions may not get developed, or may not reach those who need them. In sectors like pharmaceuticals, we see many examples of such failures - e.g. neglected tropical diseases - as well as public-private partnerships to address them, such as the Drugs for Neglected Diseases initiative (DNDi). These efforts are supported by a mixture of ‘push’ (e.g. grant funding) and ‘pull’ schemes (e.g. advance market commitments) from the public and private sectors. What does the equivalent look like for AI market failures?
When we turn to AI public policy, many discussions revolve around whether or not to regulate AI, whether via new legislation, enabling acts, and/or voluntary guidelines. Of course, much of AI is already regulated, via existing privacy, consumer protection or industry-specific regulations, and the bigger challenge can be working out how to interpret or implement these regulations, and whether there are gaps to address. There is also more novel AI regulation being passed than is sometimes realised, for example at the state level in the US, where a growing number of laws now ban certain deepfakes, particularly those designed to interfere with elections, or that are used for sexual harassment and abuse. In other cases, as with liability law in the EU, policymakers are both adapting existing regulations to account for AI and scoping novel regulations that are specific to AI.
What is arguably missing from this hive of activity is a debate about whether we may need to fundamentally rethink some foundational types of regulation for the AI era. For example, privacy regulations may need to expand beyond restricting personal data collection to cover AI systems' ability to (claim to) infer sensitive individual attributes. Given that AI is a general-purpose technology, with a complex supply chain, that can be deployed across sectors, we also need deeper reflection on how different types of regulation should work in concert. For example, with agentic AI systems, assigning responsibility becomes complex as it's unclear whether users have an obligation to oversee an agent’s actions, and how this intersects with the responsibilities of the system's deployers or developers, or other actors in the supply chain. Sector-specific regulations add further complexity, as interpretations may vary on how to apply these rules to AI agents operating across different domains.
Beyond regulation, policymakers have a broad set of levers at their disposal to shape how AI is developed, or who develops it. One approach is to target the key inputs to modern AI systems - compute, data, and talent. Policymakers have already launched programmes, such as the National Artificial Intelligence Research Resource (NAIRR) pilot to widen academic access to compute, to try to incentivise more socially useful AI applications. In the other direction, policymakers have also used compute to try to monitor, restrain or control AI activity, as exemplified by the US-led restrictions on the export of chips to China, and more recently, by the 2023 US executive order and EU AI Act, which added requirements on developers of AI models above a certain compute threshold. More speculative ideas also exist, such as designing new types of chips with in-built controls. Beyond compute, policymakers could also focus on other levers, such as data, for example by trying to create more public interest datasets, like the Protein Data Bank (PDB), to unlock more transformative AI applications, like AlphaFold. The PDB’s history highlights the complex mix of ingredients and incentives that are typically needed to create and maintain such high-quality datasets, but there are promising efforts underway, such as the UK’s OpenSAFELY initiative and the Our Future Health genomic data programme.
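To make the compute-threshold lever above a little more concrete, the minimal sketch below checks a hypothetical training run against the reporting threshold cited in the 2023 US executive order (10^26 FLOP) and the systemic-risk presumption in the EU AI Act (10^25 FLOP). It assumes the widely used 6 × parameters × training-tokens approximation for training compute; the example model's parameter and token counts are illustrative, not drawn from any real system.

```python
# Hedged sketch: does a (hypothetical) training run cross common policy
# compute thresholds? Training compute is approximated with the rough
# 6 * parameters * training-tokens heuristic; real figures depend on
# architecture, hardware utilisation, and how regulators define the metric.

THRESHOLDS_FLOP = {
    "US Executive Order 14110 (2023) reporting threshold": 1e26,
    "EU AI Act systemic-risk presumption": 1e25,
}

def estimated_training_flop(n_params: float, n_tokens: float) -> float:
    """Approximate training compute as 6 * N * D (forward + backward passes)."""
    return 6.0 * n_params * n_tokens

def check_thresholds(n_params: float, n_tokens: float) -> None:
    flop = estimated_training_flop(n_params, n_tokens)
    print(f"Estimated training compute: {flop:.2e} FLOP")
    for name, threshold in THRESHOLDS_FLOP.items():
        status = "exceeds" if flop >= threshold else "is below"
        print(f"  {status} the {name} of {threshold:.0e} FLOP")

if __name__ == "__main__":
    # Hypothetical model: 400B parameters trained on 10T tokens (~2.4e25 FLOP),
    # which would trip the EU presumption but sit below the US reporting bar.
    check_thresholds(n_params=4e11, n_tokens=1e13)
```

The 6·N·D figure is only a heuristic, and how to measure and verify training compute is itself a live policy question - one reason such thresholds remain contested.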
The AI Policy Atlas - caveats and limitations
We don’t cover every topic: As indicated by the empty ‘...’ boxes in our graphics, our Atlas does not cover every AI policy topic. It could extend out much further, both horizontally - there are many domains where you could apply AI - and vertically - topics such as fairness, bias and discrimination have many sub-topics.
We cover too many topics: Our goal is not to suggest that AI policy teams, or individual practitioners, need to understand each of the topics in our Atlas. Rather, policy teams can create their own AI Policy Atlas to indicate where they are best placed to contribute, where they need external expertise, and where they need to avoid getting distracted.
The topics are not discrete: There are not individual AI policy rooms where people discuss each of these topics. Rather, many of the important policy questions come at the intersection of these topics, such as: what types of evaluations should we design to understand the future impact of AI agents on employment outcomes in industry X?
We use the term ‘policy’ very loosely: Our conceptualisation of ‘policy’ overlaps with other concepts, particularly ‘governance’, but also, in places, with AI ‘ethics’, ‘safety’, ‘responsibility’, and ‘compliance’. The idea is to bring together different practitioners working on the same topics to reduce duplication.
In the coming year, we hope to write short notes on select topics from the Atlas, such as public attitudes to AI. If you’d like to contribute to an AI Policy Note on one of these topics, please get in touch at aipolicyperspectives@google.com. You can also see a full, interactive version of the Atlas here.
A question for readers: What AI policy topic do you think is most neglected or under-appreciated in 2024?