AI Policy Primer (July 2024)
Issue #12: Weather forecasting, AI agents, and synthetic data
In July’s edition of the AI Policy Primer, we take a look at weather forecasting models, the governance of AI agents, and recent debates surrounding synthetic data. As always, leave a comment or let us know if you have any feedback at aipolicyperspectives@google.com.
What we’re reading
Taking the temperature of weather models
What happened: Google DeepMind’s GraphCast, a state-of-the-art AI weather prediction model, won the MacRobert Award, presented by the Royal Academy of Engineering. GraphCast can predict hundreds of weather variables up to ten days in advance, and is faster and more accurate than traditional weather models. The system, which Google DeepMind open-sourced, was joined in 2024 by WeatherMesh, a model developed by weather forecasting start-up WindBorne. Google Research also recently released NeuralGCM, a model that can simulate Earth’s atmosphere.
What’s interesting:
GraphCast goes beyond standard weather prediction by offering earlier warnings of extreme weather events. It can predict the tracks of cyclones with greater accuracy further into the future, characterise atmospheric rivers associated with flood risk, and predict the onset of extreme temperatures. These abilities have the potential to save lives through greater preparedness and faster emergency response, and to help address environmental challenges.
Weather is a domain where the state typically takes on prediction tasks, for example via the National Weather Service (NWS) under the National Oceanic and Atmospheric Administration (NOAA) in the United States, and the Met Office, part of the Department for Science, Innovation and Technology (DSIT), in the UK. Before GraphCast, the High-Resolution Forecast (HRES) developed by the European Centre for Medium-Range Weather Forecasts (ECMWF), an independent intergovernmental organisation, was the state-of-the-art model.
GraphCast, trained on ECMWF’s ERA5 dataset, is now being used by ECMWF, marking a move towards new modes of public-private partnership in weather prediction. As AI companies increasingly contribute to public goods, we should prepare for the emergence of new types of collaboration between model makers in the private sector and model deployers in the public sector. To that end, the Royal Academy of Engineering notes the potential for GraphCast to support critical decision-making across industries and optimise resource allocation.
Looking ahead: GraphCast is part of wider research to understand the broader patterns of our climate. Alongside other Google DeepMind models, such as AlphaFold 3 and GNoME, it demonstrates AI's potential to accelerate scientific discovery and address some of our greatest challenges. To learn more, see the MacRobert Award page, Google DeepMind's blog about GraphCast, the accompanying paper, and the code shared on GitHub.
Sector spotlight
Governing AI agents
What happened: The development and deployment of AI agents continues to spark commentary. Such systems aim to autonomously plan and execute complex tasks with limited human involvement, unlike AI assistants such as Gemini or Claude, which provide task-specific help and respond to user queries without taking independent initiative or making decisions on a user’s behalf.
What’s interesting:
While developers have yet to deploy powerful agents, they have, along with researchers from academia and civil society, released work identifying and assessing the governance mechanisms needed for the safe deployment of such systems. Google DeepMind, for example, published a collection of papers considering issues such as value alignment, safety and misuse, economic and environmental impact, epistemic security, and access in the context of agentic AI systems.
The University of Toronto’s Noam Kolt looked at governance challenges connected to discretionary authority (ensuring that an agent does not use the authority delegated to it to act unreasonably), loyalty (determining how best to keep an agent acting in the user’s best interests), delegation (how to manage the creation of sub-agents), and information asymmetry (managing situations in which the agent knows more than the person, or ‘principal’, employing it). Kolt also examines visibility, the subject of a paper by authors at Mila and GovAI that proposes measures including agent identifiers, real-time monitoring, and activity logging.
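To make the visibility proposals slightly more concrete, below is a minimal, hypothetical sketch of what an agent identifier plus an append-only activity log could look like in practice. The class and field names are our own illustration and are not drawn from the cited papers.

```python
import json
import time
import uuid
from dataclasses import dataclass, field

@dataclass
class AgentActivityLog:
    """Hypothetical sketch of 'visibility' measures: a persistent agent
    identifier plus an append-only activity log that a deployer (or an
    oversight body) could monitor in near real time."""
    agent_id: str = field(default_factory=lambda: f"agent-{uuid.uuid4()}")
    entries: list = field(default_factory=list)

    def record(self, action: str, detail: dict) -> dict:
        entry = {
            "agent_id": self.agent_id,   # stable identifier for attribution
            "timestamp": time.time(),    # when the action happened
            "action": action,            # e.g. "web_request", "spawn_subagent"
            "detail": detail,            # action-specific metadata
        }
        self.entries.append(entry)
        return entry

# Example: log the creation of a sub-agent, then export the log for review.
log = AgentActivityLog()
log.record("spawn_subagent", {"task": "summarise public filings"})
print(json.dumps(log.entries, indent=2))
```

Even a sketch this simple surfaces the governance questions above: who holds the log, and whether the identifier follows a sub-agent once a task is delegated.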
Looking ahead: Developing successful governance structures also requires understanding how agents might actually be used in practice. One example in this vein is Seth Lazar’s work on the cultural and epistemic impact of AI agents, which outlines the different forms that agents may take: as ‘companions’ offering comfort and support, as ‘attention guardians’ that could help people decide where to focus, and as ‘universal intermediaries’ that mediate our interactions with the digital world.
Sector spotlight
Climbing the data wall
What happened: Debates continue about the availability of training data, the amount of data likely to be used as models scale, and potential bottlenecks and solutions. Last month, Epoch AI estimated, with an 80% confidence interval, that the existing stock of high-quality training data will be fully depleted at some point between 2026 and 2032, bringing new energy to discussions about the “data wall” and potential remedies.
What’s interesting:
Data availability is crucial for AI development. As a rule of thumb, researchers generally accept that more, and higher-quality, data tends to lead to better model performance (with the caveat that models of a given size have also been getting better over time). Some research suggests we may soon exhaust the available training data, which could in turn stymie the development of frontier models.
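As a rough illustration of that rule of thumb (our own addition, not part of the cited work), the empirical “Chinchilla-style” scaling laws reported by Hoffmann et al. (2022) model pretraining loss L as a function of parameter count N and training tokens D:

$$ L(N, D) \approx E + \frac{A}{N^{\alpha}} + \frac{B}{D^{\beta}} $$

The data term shrinks only as D grows (the fitted exponents are roughly α ≈ 0.34 and β ≈ 0.28), so, holding model size fixed, additional high-quality tokens keep lowering loss, which is why a hard ceiling on available data would bind.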
One proposed remedy is synthetic data. Google researchers showed that, when fine-tuning the Imagen text-to-image model, increasing the size of the synthetic dataset monotonically improved the model's accuracy; synthetic data was also used to train Anthropic’s Claude 3, as outlined in its technical report; and comments from Mark Zuckerberg and Dario Amodei highlight the importance of synthetic data for scaling.
An opposing view holds that synthetic data may not be enough to overcome the data wall. Studies from Oxford and Rice University both suggest that training on synthetic data degrades model quality over time, with other research showing that compounding errors from training on synthetic text online may result in a phenomenon known as ‘model collapse’. More recent work, however, suggests that model collapse tends to occur only when synthetic data is substituted for real data, rather than added to it.
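As a toy illustration of that last distinction (our own sketch in Python, not code from the cited studies), repeatedly fitting a simple Gaussian model to its own samples shows the spread of the learned distribution collapsing when synthetic data replaces the original data, but staying stable when synthetic data is merely added to it:

```python
import numpy as np

rng = np.random.default_rng(0)

def final_spread(replace: bool, steps: int = 200, n: int = 50) -> float:
    """Toy 'model collapse' demo: fit a Gaussian, sample synthetic data from
    the fit, and repeat across generations. replace=True trains each
    generation only on the latest synthetic data; replace=False keeps
    accumulating synthetic data alongside everything seen so far."""
    pool = rng.normal(0.0, 1.0, size=n)            # generation 0: 'real' data
    sigma = pool.std()
    for _ in range(steps):
        mu, sigma = pool.mean(), pool.std()        # 'train' the model
        synthetic = rng.normal(mu, sigma, size=n)  # sample from the model
        pool = synthetic if replace else np.concatenate([pool, synthetic])
    return sigma                                   # spread the final model learned

print("substitute for real data -> std ~", round(final_spread(replace=True), 2))
print("add to real data         -> std ~", round(final_spread(replace=False), 2))
```

The collapse in the first case comes from finite-sample estimation errors compounding across generations, a much-simplified analogue of the mechanism the studies above describe for large generative models.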
Looking ahead: Developers will likely face other challenges too, such as ensuring the factuality and fidelity of synthetic data and managing the potential for synthetic data to amplify or introduce biases. AI itself may be part of the solution, as it can help annotate and curate data, making it more accessible and useful to AI labs and other researchers.