What we’re watching
UK announces Foundation Model Taskforce
What happened: The UK government recently announced plans to establish a Foundation Model Taskforce made up of government and industry representatives. Its mandate is to develop the safe and reliable use of foundation models across the economy and ensure that the UK remains globally competitive. Backed by £100m in initial funding, the Taskforce will also help direct the £900m the Government recently allocated to boost UK compute capacity. ARIA Chair Matt Clifford will lead the Taskforce on an interim basis.
What’s interesting: Some proponents have called for the Taskforce to recommend a state-backed, proprietary foundation model - coined ‘BritGPT’ by the press. They argue that this would be easier to deploy in sensitive use cases - e.g. national security - compared to industry models, and would give the government a more direct stake in, and influence over, frontier AI model development.
However, developing a competitive BritGPT would likely be challenging. A possible route forward could see the UK instead prioritise securing access to frontier industry models, and investing in the skills, compute, and datasets to fine-tune and deploy these models in high-benefit use-cases, for example in energy transmission. In doing so, the UK could become an exemplar for responsible public sector deployment, by prioritising bespoke safety and ethics evaluations for such applications.
US public evaluation of large models
What happened: In May, the White House announced a new initiative to publically evaluate a set of large language models at the upcoming DEFCON 31 hacker convention in August. Eight AI firms will submit their models to a red-teaming exercise that will aim to ‘break’ the models or find flaws in them. Data provider Scale AI will develop the underlying evaluation platform.
What’s interesting: The evaluation initiative is part of a new set of US government actions to promote responsible AI. Traditionally, AI ‘evaluations’ have referred mainly to efforts to assess the performance of models on specific tasks, pre-launch, on single-moment-in-time quantitative benchmarks.
Today, AI practitioners are exploring new ways to evaluate how safe, or ethical model outputs are. For example, testing for toxicity or distributional bias and helping inform content policies and deployment decisions. Interpreted broadly, AI ‘evaluations’ also include internal or external ‘audits’ and longer-term assessments of the effects that AI applications are having on users and society.
Such evaluations are widely recognised as a critical component of the responsible AI toolkit. There are calls to standardise them to, for example, help enforce the upcoming EU AI Act. However, there is little consensus on which exact evaluations are needed and how mandatory they should be.
Across the evaluation toolkit, a lot of work is needed, including: developing new ethics, safety and sociotechnical evaluations; improving existing evaluations, e.g. by boosting representation; using evaluations as an early warning system for more extreme risks; red-teaming and adversarial testing efforts; and designing social impact evaluations to understand how AI is affecting users and society.
What we’re reading
FTC releases guidance for persuasive AI
What happened: In May the US Federal Trade Commission (FTC) published a blog post warning organisations against using generative AI to manipulate or trick customers – or to persuade them to change their behaviour in ways that causes harm - because such actions would fall under FTC jurisdiction. The post followed similar FTC warnings against using AI to deceive customers, whether by using AI for fraud or by making unsubstantiated claims for AI products.
What is interesting: Large AI models can produce compelling text, image, audio and video that studies suggest could persuade people to change their opinions or behaviour. This may be beneficial (e.g. encouraging vaccination) or harmful (e.g. the early use of voice cloning for scams). In some cases, it may be hard to judge: Amnesty International’s use of AI-generated images in its reporting on anti-government demonstrations to protect protestors’ identities.
The FTC guidance is part of a growing policymaker focus on AI persuasion. In May, the EU updated the language in its proposed AI Act to prohibit AI applications that engage in ‘purposeful’ manipulation, despite concerns that such intentionality may be difficult to prove. EU standards organisation Cen-Cenelec is also working on a voluntary standard for how to define, measure, and use ‘AI-enhanced nudging’, arguing that the subtlety of nudging can make it difficult to address via hard laws.
Ultimately, regulators will need to strike the right balance by exploring potentially positive forms of AI persuasion – while not overstating its benefits and avoiding excessive paternalism – and protecting those most at risk such as children. AI labs also ought to consider how to evaluate potential risks and benefits from persuasion when evaluating their models. and applications.
What we’re doing
AI & education RSA roundtable
What happened: Earlier in May, the Royal Society of the Arts hosted a roundtable co-organised by Google DeepMind on the role of AI in the future of education. The roundtable focused on the UK and brought together experts from across education - including teachers, students, programme leaders, edtech CEOs and academics.
What’s interesting: AI’s potential role in education has become a priority topic for educators and policymakers. Recent advances in large models have brought the opportunities and potential challenges for the sector into sharp focus. In the US, the Department of Education’s Office of Educational Technology recently released a new report summarising the risks and opportunities related to AI in teaching, learning, research, and assessment. Meanwhile, last month saw the launch by Code.org of TeachAI, an initiative to produce guidelines for policymakers, administrators, educators, and companies in best practices for safely using AI in education.
Optimistic proposals have suggested that personalised learning represents a future where students can access unique learning opportunities throughout their career. Conversely, some in the room debated how AI might exacerbate existing issues within the education system such as inequality and regional variations in teaching standards.
Other observers spoke more broadly about AI as a forcing function to re-examine the broader role of assessment and examination in education, the digital divide and digital illiteracy (44% of Europeans aged 16-74 do not have basic digital skills), and the protection of children’s data. As well as the need for AI to assist teachers and provide solutions for groups with specific and complex needs, including migrants and those facing exclusion.
Looking ahead: A deeper analysis of the roundtable discussion will be shared publicly at a later date. We also continue to work on our ‘Experience AI’ programme in partnership with Raspberry Pi Foundation. The programme gives teachers co-designed, adaptable lesson plans to promote AI literacy amongst secondary school students. Find out more about it here.