Thoughts on: The Handover by David Runciman
Thomas Hobbes, AI, and 370-year-old alignment problems
This review is written by Nick Swanson from Google DeepMind’s public policy team. Subscribe for more essays, policy notes, and reviews, and leave a comment below or get in touch with us at aipolicyperspectives@google.com to tell us what you think.
David Runciman’s The Handover (published September 2023) applies the thinking of the 17th-century political philosopher Thomas Hobbes to the age of AI. In Leviathan - written against the backdrop of the English Civil War - Hobbes set out the philosophical grounding for the creation of states, which he believed were the solution to the ‘state of nature’, a war of all-against-all. By submitting to a state - which in his view does not need to be just, democratic or liberal; it just needs to function - we end perpetual violence of the kind he lived through.
Hobbes’ thinking - and its manifestation in nation states, their law and their coercive power - is the source code of the world we live in, and, like any operating system, it gets very little thought. We tend to focus on the surface-level applications built on top of it. But an operating system defines which apps you can run, and how they perform. This is an analogy Hobbes might have appreciated, given that he regarded reasoning as a form of computation.
Runciman argues that the rise of artificial, machine-like systems is not remotely novel - we are already surrounded by leviathans. The state itself, he argues, is an artificial (if not intelligent) ‘being’ whose goals are not those of the humans who make up its constituent parts: the state’s reward function is its own survival and power, not making decisions that are optimal or correct from our point of view. States can - and obviously do - provide benefits to their populations, but survival remains their ultimate goal.
The book references Homo Deus, in which Yuval Noah Harari argues that human beings are uniquely able to (co)operate on the basis of collective fictions or stories. Runciman, however, goes a layer deeper - looking at the mechanisms, incentives and strategies employed by the state and large corporations. For him, the pivotal moment in human history was not Harari’s ‘cognitive revolution’, but the moment we operationalised and mechanised our ability to think and act over the long term, and to benefit from the cumulative growth of scientific discovery - something enabled by the creation of the nation state and the pooling of our collective decision-making. A related and more granular approach was developed by Richard Danzig, who argues that markets, bureaucracies and machines are all information-processing systems that reduce reality to narrower inputs, such as bits, prices and completed driving licence applications. Just as AI can seem alien and new to us, so can the behaviours of states and the outcomes of markets.
Though Runciman is somewhat sceptical that artificial general intelligence will be reached, the book benefits from at least taking the possibility at face value and thinking through some of its implications - for instance, the concept of legal personhood for AI. The conversation around legal personhood and artificial intelligence is often dismissed as an emotional response to a statistical model, or as anthropomorphisation (and perhaps it often is). But legal personhood is a means of giving entities additional duties and clarifying their relationship to the law. We grant it to companies and trusts precisely to clarify that they are not humans. Some thoughts on the challenges of doing this for AI can be read here.
In the modern world, states look to technology as a means of creating and projecting (economic) power, and they are well suited to throwing money and resources at unlocking fundamental breakthroughs. Some, however, might take issue with Runciman’s proposition that the relative lack of a tech sector in Europe “is not for want of trying”. Whilst the United States clearly has advantages in size, scale, resources, knowledge and control of the world’s reserve currency, building something like a tech sector (an app running on OS Leviathan) requires trade-offs and choices. Many of the decisions a state could take to support fundamental innovation - increased military spending, for instance - are well within the grasp of European leviathans, and have been taken in the recent past, as evidenced by Europe’s early lead in mobile telephony.
Whilst a simplified chatbot-scaling-to-superintelligence pipeline dominates much of the current discourse, and though it is beyond the scope of this book, it would be great to read Runciman’s thoughts on what a world of rapid (and potentially exponential) growth in scientific discovery, driven by that scaling, would mean for the leviathan. What would it mean, for instance, for AI to help develop a transformative cancer treatment? And what would it mean for the West if a Chinese company did this? What would it mean for the leviathan if increasing proportions of economic activity took the form of inference compute - potentially imported? And what is the right scale or organising principle for decision-making when the production of power is less rooted in territory?
Another area not covered in the book, but interesting to consider, is the form factor of AI and a user’s relationship to it. When discussed in the context of the omnipotent state leviathan, AI comes to sound like a similarly lumbering and unresponsive entity. But personal assistants and AI systems rooted in, and aligned to, the interests of individuals could empower them in the face of the leviathans that currently surround us, helping to navigate the bureaucracy which characterises much of our interaction with the state (there is nothing more ‘human’ about filling out forms ourselves than about delegating them to agents we trust). They could act as cognitive shields, protecting both our stated and revealed preferences from external influence. They could provide the context, information and cognitive support to help us make better decisions as individuals.
The section on how close we came to destruction during the Cold War as a result of automated state systems and nuclear response plans (before ultimately being saved by human intuition) is, in effect, a discussion of an alignment problem. Systems like AlphaZero are described as ruthless optimisers, unable to consider whether they are even ‘playing the right game’ or engaging in a worthwhile conflict. But not every instance of collective decision-making is zero-sum, or carries the stakes of mutually assured destruction. In the future, more capable agentic models may prove better at helping us manage collective action problems and tragedies of the commons. Arbitration is one example: parties could set acceptable thresholds for negotiation, and systems could help find better - more creative - solutions than normally emerge in disputes. Done well, this could also help address power imbalances between parties; done badly, it could simply shift those imbalances from access to well-paid lawyers to the capabilities of one’s personal AI.
Towards the end of the book (and as part of a critical view of the rationalism and optimism in the methods of the effective altruism movement), the author makes a strong case that politics is not simply a series of problems to be solved. Runciman proposes that we put our energies into strategically reforming the state itself, to bring it under our greater control. This is - in essence - an alignment problem that humans have faced since the 17th century, and the mechanism by which we work on it is constitutional and legislative reform - something he urges ‘long-termists’ to direct their energies towards.
The book argues that the state is the only social structure we have with a wide and long enough view of history to ensure AGI remains moored to our long-term interests (albeit one which comes with its own non-aligned aims and survival goals). So shouldn’t working on AI safety be a top priority for the state and society? Some modern leviathans seem to think so. In May this year, and for the second time in seven months, global leaders came together at an AI safety summit where alignment, loss of control and existential risk were taken seriously and at face value. The UK’s AI Safety Institute is effectively an exercise in developing the state’s capacity to understand and evaluate large AI models - what would Hobbes think?