Today’s essay is the third entry in a seven-part series focused on questions of responsibility, accountability, and control in a pre-AGI world. This piece, and those that follow, are written in a personal capacity by Seb Krier, who works on Policy Development & Strategy at Google DeepMind.
Following the second essay, which focused on the relationship between risk and progress, I'll continue publishing one piece every week or two that attempts to make sense of where we are now and where we are going. An important caveat: these are my current high-level thoughts, but they're certainly not set in stone. The nature of AI means my views evolve often, so I highly encourage comments and counterpoints. Over the coming weeks, this series will include the following essays:
How do we use models to bolster societal defenses?
Are there ways of reconciling competing visions of fairness in AI?
What does a world with many agentic models look like? How should we start thinking about cohabitation?
How do you deal with faster disruption and technological displacement in the labor market, both nationally and internationally?
How can we balance proliferation and democratization with appropriate oversight when developing and deploying powerful AI systems?
Huge thanks to Kory Matthewson, Nick Whittaker, Nick Swanson, Lewis Ho, Adam Hunt, Benjamin Hayum, Gustavs Zilgalvis, Harry Law, Nicklas Lundblad, Seliem El-Sayed, and Jacques Thibodeau for their very helpful comments! Views are my own etc.
How do we use models to bolster societal defenses?
I think the sequencing of releases is actually quite important. If suddenly everyone has access to a particularly powerful AGI, then it’s unclear how the balance between misuse/offense and patching/defense is affected. Assuming frontier models do actually present highly sophisticated capabilities with misuse potential, maybe the first thing we should do is identify any domain where the offense-defense balance favors offense, and then use these models to improve societal defenses and antibodies. While not a panacea, such measures would undeniably enhance our resilience. For example:
Addressing technical vulnerabilities: bolstering cybersecurity protections and red-teaming in national infrastructure, hospitals, etc., to build higher barriers to entry for malicious actors and reduce the success of cyberattacks;
Accelerating research on creating effective defensive AI systems, such as oracle-like scientists with sufficient epistemic uncertainty. For example, this paper shows how, by modeling diverse reasoning chains and full posterior distributions, language models can be made less prone to overconfidence or shortcut learning (a rough sketch of this idea follows the list);
Scrutinizing laws and regulations for inefficiencies, contradictions, obsolete standards, etc., to make them simpler, less gameable, and more consistent. Also using simulations and modeling to foresee unexpected effects of proposed legislation;
Ensuring that the assistants we rely on help us build better cognitive security and filter information in epistemically desirable ways - for example by highlighting confirmation bias, dealing with information overload, encouraging critical reflection, and reinforcing good epistemic practices;
In national security and defense organizations, using models to continuously analyze global digital activity, improve missile defense, and identify potential threats, acts of warfare, misuse of AI systems, and so on;
Investing in stronger defenses against biosecurity risks, like monitoring and oversight of DNA synthesis (see the screening sketch below), securing cloud labs, better vaccine production, better pandemic preparedness, and accelerating the development of medical countermeasures;
Stress-testing and redesigning key components of the internet and digital infrastructure to be more resilient against AGI-driven attacks and reduce the effectiveness of foreign hybrid warfare;
Radically rethinking social security, since impacts on jobs will likely be faster and more chaotic than previous technological breakthroughs;
Focusing on digital identity and authentication: the rise of deepfakes hasn’t so far been as problematic as some anticipated a while back, but I think it’s too early to judge. Impersonation is already growing; why not invest more in robust digital identity and authentication systems? (A minimal signing sketch follows this list.)
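To make the epistemic-uncertainty idea more concrete, here is a minimal sketch (not the linked paper's method) of one way to surface a model's uncertainty: sample several independent reasoning chains, treat the normalized answer frequencies as a rough posterior over final answers, and abstain when no answer is sufficiently probable. `query_model` is a hypothetical stand-in for a real LLM API.

```python
import random
from collections import Counter

CONFIDENCE_THRESHOLD = 0.6  # below this, abstain rather than guess


def query_model(prompt: str, temperature: float = 0.8) -> str:
    # Hypothetical stand-in: in practice this wraps a real LLM API call that
    # samples a chain of thought and extracts the short final answer.
    return random.choice(["A", "A", "A", "B"])  # toy answer distribution


def answer_posterior(question: str, n_samples: int = 20) -> dict[str, float]:
    """Sample independent reasoning chains; normalized answer frequencies
    serve as a crude posterior over final answers."""
    answers = [query_model(question) for _ in range(n_samples)]
    return {a: c / n_samples for a, c in Counter(answers).items()}


def calibrated_answer(question: str) -> str | None:
    """Return the modal answer only if it clears the confidence threshold."""
    posterior = answer_posterior(question)
    best, prob = max(posterior.items(), key=lambda kv: kv[1])
    return best if prob >= CONFIDENCE_THRESHOLD else None  # None = abstain


print(calibrated_answer("toy question"))  # "A" if it clears 0.6, else None
```

The interesting part is the interface: downstream systems receive a distribution over answers and an explicit abstention option, rather than a single overconfident completion.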
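On the biosecurity item, one concrete building block is sequence screening at DNA synthesis providers. The sketch below shows the general shape of such a check - flagging orders that share long exact subsequences with listed sequences of concern. Real screening systems rely on far more sophisticated homology search, and `HAZARD_SEQUENCES` is a hypothetical stand-in for a curated database.

```python
KMER_LENGTH = 50  # flag any 50-base exact overlap with a listed sequence

# Hypothetical stand-in for a curated database of sequences of concern.
HAZARD_SEQUENCES: list[str] = ["ATGC" * 20]  # toy entry for illustration


def kmers(sequence: str, k: int) -> set[str]:
    """All length-k substrings of a DNA sequence (uppercased)."""
    sequence = sequence.upper()
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}


def build_hazard_index(hazards: list[str], k: int = KMER_LENGTH) -> set[str]:
    """Precompute the k-mers of every listed sequence for fast lookup."""
    index: set[str] = set()
    for seq in hazards:
        index |= kmers(seq, k)
    return index


def screen_order(order: str, index: set[str], k: int = KMER_LENGTH) -> bool:
    """True if the order shares any k-mer with a listed sequence and should
    be escalated for human review rather than auto-approved."""
    return not kmers(order, k).isdisjoint(index)


index = build_hazard_index(HAZARD_SEQUENCES)
print(screen_order("ATGC" * 30, index))     # True: overlaps the toy entry
print(screen_order("GATTACA" * 20, index))  # False: no 50-base overlap
```

Even a crude filter arguably changes the economics of misuse: it forces adversaries to fragment or obfuscate orders, which itself creates detectable signals.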
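And on digital identity: the cryptographic primitives for robust authentication largely exist already; the hard parts are deployment, key distribution, and revocation. As a minimal sketch, assuming the widely used Python `cryptography` library, here is how a publisher or device could sign content with an Ed25519 key so that anyone holding the public key can detect forgery or tampering.

```python
# pip install cryptography
from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric.ed25519 import (
    Ed25519PrivateKey,
    Ed25519PublicKey,
)


def sign_content(private_key: Ed25519PrivateKey, content: bytes) -> bytes:
    """Produce a signature binding the content to the holder of the key."""
    return private_key.sign(content)


def verify_content(public_key: Ed25519PublicKey, content: bytes,
                   signature: bytes) -> bool:
    """Check that the content was signed by the matching private key."""
    try:
        public_key.verify(signature, content)
        return True
    except InvalidSignature:
        return False


# Usage: a publisher signs a media file; anyone can verify it later.
key = Ed25519PrivateKey.generate()
media = b"raw bytes of an image or video"
sig = sign_content(key, media)
assert verify_content(key.public_key(), media, sig)
assert not verify_content(key.public_key(), media + b" tampered", sig)
```

Provenance schemes such as C2PA's content credentials build on essentially this pattern, adding certificate chains and standardized metadata.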
A key question concerns the "vulnerable world hypothesis", which argues that technological progress has made it much easier to cause mass harm (offense) than to prevent such harm (defense). Certain technologies like nuclear weapons and engineered pathogens enable small groups or individuals to wreak catastrophic damage. Defensive technologies may not fully counteract these offensive threats. On the other hand, while nuclear weapons pose catastrophic risks, the geopolitical system of deterrence and nonproliferation has so far prevented their use in war since 1945. Dangerous natural pandemics have emerged, but global public health infrastructure has largely contained them.
Gustavs Zilgalvis mentioned an interesting historical analogy in a discussion about this: the use of city walls. The Theodosian Walls of the Byzantine Empire, vital to its grand strategy against the Huns, represented a strategic equilibrium: where the Empire wasn’t able to fortify all its territories, it resorted to paying the Huns 2,100 pounds of gold per year to spare the unfortified regions. This arrangement persisted until advancements in offensive technology, most decisively Mehmed II's cannons, rendered the walls ineffective - ultimately leading to the fall of Constantinople. Another nice example Gustavs shared, from International Politics: Enduring Concepts and Contemporary Issues, is the following:
In many ways, it's a lot easier to split the atom than it is to build walls large enough, and fast enough, to defend against nuclear weapons; but if we learn the right lessons from history there may be some hope. Even where the offense-defense balance favors offense, as in nuclear security and biosecurity, defensive systems are not entirely ineffective. This requires thinking not only about how or whether to restrict models, but about how to use them. Ideas along these lines seem notably absent from the literature: while academia has been effective at scrutinizing risks and harms, wider remedies and prescriptions are harder to find. This proactive approach should be encouraged more, as it not only minimizes risks but also amplifies the benefits of AGI models in reinforcing societal structures. As Michael Nielsen notes, “many aspects of safety and security are (approximately) public goods or collective action problems, and the market as currently constructed undersupplies them. There are many ways one might address this. I am a fan of ongoing efforts – including those by Buterin – to develop new financial instruments that help address such problems.”
Question for researchers: how can we proactively harness the power of advanced AI to strengthen societal resilience and address vulnerabilities, rather than focusing solely on restricting access?