Authored by Google DeepMind’s Nicklas Lundblad (Senior Director of Policy and Strategic Advisor) and Dorothy Chou (Director of Policy & Public Engagement), the essay below is the first in our Science in the Spotlight series, exploring AI, science and society.
Here is a simple model of progress: problems of increasing complexity are solved through science and technology. Doing so has created more economic opportunity, lifted billions of people out of poverty and increased social welfare in many different ways. This progress, however, has come at the price of greater complexity and new classes of problems, which have in turn required further scientific and technological progress to solve.
It’s easy to see where this is going, right? This is a spiral - and the big question society faces at some point is whether it will continue to be a positive spiral of growth, or become a negative spiral of complexity.
The answer to this question depends on developing the ability to keep using science and technology to address ever-increasing levels of complexity. In a sense this is a version of the “Red Queen problem” – named after the Red Queen in Lewis Carroll’s Through the Looking-Glass, who notes that it takes all the running one can do to remain in the very same place.
In this simplified model of progress, stagnation is not just levelling out at current levels of welfare, quality of life and happiness – it inevitably invites decline. Society must keep running faster – and even speed up – to avoid succumbing to the complexity already created, and collapsing (as suggested by Joseph Tainter in his magisterial work on the collapse of complex societies).
The Red Queen model is far from the only viable model to consider as we think about how to tackle some of today's major societal challenges, but it provides a useful framework for understanding the role of science and technology, and what the rate of scientific progress tells us about the state of social progress more broadly.
Measuring the pace of scientific progress
In the Red Queen model, one core metric to work with is scientific productivity: is society making the scientific breakthroughs it needs to keep building new technologies? Another is the interplay between technology and science – how technology inspires and enhances scientific productivity in different ways. In other words, good ideas must become easier to find – yet today, they are getting harder and harder to find.
How do we measure the pace of scientific progress? It is easy to do so in a way that is too simplistic and most likely unhelpful – e.g. picking a parameter that seems somewhat objective and then projecting it with some level of (often fake) precision. Examples include:
Measuring the number of patents in a field, or patents per country or year. Patents are a very noisy measure, especially since they have acquired a purely defensive role in much of the world today - both in trade and in domestic disputes.
Measuring the number of published papers in a field. This measure has become even more problematic now that there are a growing number of papers produced in part or wholly by LLMs in different ways.
Measuring the funding of a specific area. Funding can fuel progress, so this is not wrong, but it is less exact than one would wish.
Even noisier measures include productivity growth, GDP growth or general economic metrics that can be used to assess overall progress in some narrow sense. Such metrics do not typically capture scientific progress well.
So what is an effective way to measure the pace of scientific progress?
The first task is tackling the question of what scientific progress actually is. It may seem self-evident, but looking more closely shows it is really hard to pinpoint exactly what constitutes scientific progress. This is easiest to illustrate across fields. Is physics progressing faster or slower than chemistry? What about biology and mathematics? Economics and literary theory? Which of these is faster, and which slower?
One way to think about scientific progress is as completing a borderless puzzle: expanding understanding of the world and how it works. The pace at which the puzzle grows – new pieces found, and existing pieces placed in a way that slowly reveals the larger motif – is what should be measured.
But this is hard. What, exactly, is a piece? And when is it placed in the right place?
The work of Tsao and Narayanamurti offers insight into these questions. In their exploration of technoscientific revolutions they propose the network of questions and answers as a good metaphor for science. What matters is the pace at which new questions and answers are found – and ideally both should progress apace. A world that merely answers existing questions – where more questions are answered than new ones asked – is probably slipping into normal science. A world where only new questions are added is one where no one does the actual empirical work, and is equally unhelpful.
What is important to measure, then, is the relative speed of questions and answers added to this network – and that is doable, if hard.
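To make the idea concrete, here is a minimal sketch (not from the essay – the function name, categories and thresholds are illustrative assumptions) of how one might crudely classify a period of scientific activity by the relative pace at which questions and answers are added to such a network:

```python
def progress_profile(new_questions: int, new_answers: int) -> str:
    """Crudely classify a period of activity in a question-answer network.

    This is a toy illustration: real measurement would need to define
    what counts as a question or an answer, and weight them by importance.
    """
    if new_questions == 0 and new_answers == 0:
        return "stagnant"
    if new_answers > new_questions:
        # Answers outpacing questions: the field may be settling into
        # "normal science", filling in an existing paradigm.
        return "consolidating (risk of slipping into normal science)"
    if new_answers == 0:
        # Questions without empirical work to answer them.
        return "speculative (questions without empirical work)"
    # Both growing, with questions keeping pace or ahead.
    return "expanding (questions and answers growing together)"
```

The interesting signal is not either count alone but their ratio over time: a healthy field, in this toy framing, keeps both growing.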
Other challenges in measuring scientific progress
Scientific progress in fundamental areas poses an interesting challenge when it comes to measuring return on investment or overall impact on a field. Often impact cannot be ascertained without hindsight, which is why the biggest recognitions of achievement, like Nobel Prizes, arrive with a delay: they wait to see which breakthroughs stand the test of time. What appears to be a significant breakthrough in the moment may be eclipsed by another; the full impact of Google's transformer architecture was only recognized by laypeople when ChatGPT (the T stands for transformer) launched.
This makes investing in fundamental science difficult—where should bets be placed and when should investments be cut off or doubled down on? When is it obvious something is a dead end rather than a stop on the way to something much greater?
The contrast with other areas that also require patient capital due to longer timelines, like drug discovery, is interesting. With a drug that makes it to market, it is easy to calculate impact, efficacy and ROI based on outcomes: for example, increased survival rates from a cancer treatment, or lives saved in a widespread pandemic. Even those results can only be calculated retrospectively, but at the very least the impact is indisputable (if trials are run effectively). Desalination of water is similarly straightforward to test – these are applied sciences and solutions with a clear outcome in sight.
Where can scientists build consensus around what they are trying to achieve in a given area, and around how a given achievement or breakthrough stacks up relative to others? That is what establishing benchmarks can help with, in lieu of the clear tests available in applied fields.
The role of benchmarks to measure scientific progress
The importance of benchmarking progress is exactly why competitions like CASP – which diagnose and define the problems scientists work toward, and set benchmarks for progress – are so crucial to establish. Agreeing on what 'good' looks like helps everyone mark scientific progress in a way that crowds out noise and helps leapfrog the test of time to which so many fundamental fields are subjected (though it does not eliminate that test's utility). Benchmarks are even more crucial as artificial intelligence is increasingly integrated into scientific practice. In the age of AI, what do good science and impactful progress even mean?
It might be tempting to create benchmarks in fundamental science that reward making sense of what feels like chaos in a scientific field. But in her book 'Unthinking Mastery', Julietta Singh raises the idea that Western cultures are built on domination and exploitation in the name of mastering and civilizing not only other lands and physical bodies, but also ideas that perhaps extend far beyond our ability to bend them to our expertise. She argues instead for benchmarks of progress that are postcolonial and that deconstruct notions of what it means to gain knowledge. After all, mastery is often just fracturing: PhDs are granted in increasingly siloed areas, with minimal transferable or generalisable learnings.
If mastery of a discipline is no longer the goal, perhaps it becomes possible to reframe society's relationship with science as one where 'directionality becomes infinite and failure a process we might begin to meet with pedagogical delight.' Similarly, physicist Sabine Hossenfelder has warned that a predisposition to beauty – to symmetry and simplicity – has produced great math but bad science, and posits that this bias has held back physics for more than four decades. It is crucial to rethink how science is measured and done in order to make progress.
Perhaps with AI, now is the time for fields to disentangle understanding and knowledge from comprehension – to develop greater curiosity and value it more than mastery. What would a benchmark for curiosity, for seeing reality as it is, look like? How would a field incentivize this shift in approach? One proposal might be to require the publication of negative results and provide them open access. If fields were to value and reward what is not known and understood as much as they do mastery, the known unknowns might give way to greater context and knowledge – to more generalised understanding and broader solutions. If the solutions to complex problems like climate change are to be found in the crossovers of multiple silos – the combination of political science, environmental science, materials science and more – then perhaps what does not work is as crucial as what does. 'In failure we participate in new emergences... a refusal of mastery.' Might it be productive to stop being the active subjects of reading texts that confirm our biases, and instead reconstruct what science is and means altogether? Perhaps that is precisely the shift necessary for the world to embark on the next set of breakthroughs.
Science policy as a foundation for progress
Of course, it is easy to see where this view can be challenged: one could argue that society, instead of running ever faster, could reverse progress to a point where it strikes a sustainable equilibrium between technology, science and complexity, rather than rushing into the future with mounting complexity debt. One could also argue that the progress gained in this model is illusory – instead of eliminating complexity, the model reallocates it into black boxes and reduces society's level of control even as technology's power increases – a dangerous equation if true.
But even if the Red Queen model is imperfect, any equilibrium in de-growth scenarios will still depend on the ability to deal with complexity. Taking science and the rate of progress seriously will therefore remain essential, as will ensuring science policy is fit for purpose. Within the model, science is not one policy area among many, but the one that underpins all others – the continued expansion of organised human curiosity is the premise on which progress relies. If society is to meet the challenges it faces head on, science policy – now confined to discussions of funding, major initiatives and effects on the startup market – will have to become something much more grounded and central to our political discourse.