[Note that this blog post was updated on 8 February 2024 to reflect the publication, on 6 February 2024, of the UK Government’s response to the ‘A pro-innovation approach to AI regulation’ consultation (Government Response).]
As readers of our blog will be aware (see our previous post here on: Training AI models: Content, copyright and the EU and UK TDM exceptions), in March last year the UK Government u-turned on its controversial proposal to broaden the text and data mining exception to allow text and data mining for any purpose without permission or a licence. In doing so, the UK Government acknowledged that there nevertheless remained a need to clarify how AI developers can use copyright works and data to train AI models. To this end, the UK Intellectual Property Office (IPO) began developing a (now scrapped, see more below) voluntary Code of Practice on copyright and AI (Code of Practice), with the aim of making commercial licences for data mining more readily available while ensuring protections for copyright holders.
On 2 February 2024 the House of Lords’ Communications and Digital Committee published its own paper on ‘Large language models and generative AI’ (HL Report). The HL Report documents the findings of the HL inquiry, which considered evidence from over 40 expert witnesses and examined 900 pages of written evidence on the likely trajectories for large language models (LLMs) over the next three years. On the basis of those findings, the HL Report proposes actions which it considers necessary to ensure that the UK responds effectively to the opportunities and risks posed by LLMs.
The HL Report dedicates an entire chapter to copyright, in which it welcomes the proposal to implement a Code of Practice but also warns that it has ‘deeper concerns about the Government’s commitment to fair play around copyright’ and makes a series of recommendations to the UK Government to address those concerns.
In summary, the HL Report says that the UK Government should:
- Prioritise fairness and responsible innovation. As to this, the HL Report says that: “LLMs may offer immense value to society. But that does not warrant the violation of copyright law or its underpinning principles. We do not believe it is fair for tech firms to use rightsholder data for commercial purposes without permission or compensation, and to gain vast financial rewards in the process. There is compelling evidence that the UK benefits economically, politically and societally from upholding a globally respected copyright regime.” The HL Report also says that the current legal framework is failing to deliver the objectives of copyright: to reward creators for their efforts, prevent third parties from using their copyright works without permission and incentivise innovation.
- Resolve disputes definitively. The HL Report states that the Government has a duty to act and that ‘it cannot sit on its hands for the next decade until sufficient case law has emerged’. In relation to the Code of Practice, the HL Report also warns that ‘debate cannot continue indefinitely’ and that, if the process remains unresolved by Spring 2024, the UK Government should respond to the HL Report by publishing its view on whether the existing copyright framework provides sufficient protections for rightsholders in light of recent advances in LLMs. The HL Report also recommends that, if the UK Government concludes that there are shortfalls in the existing copyright legislation, it should set out its proposals (presumably in a white paper) for updating that legislation to ‘future proof’ copyright protection.
- Empower rights holders to check if their data has been used without permission. The HL Report recommended that the (now scrapped) Code of Practice should include a mechanism for checking training data, so that rights holders and other interested parties can assess compliance with copyright law.
- Invest in quality training data sets. The HL Report recommends that the UK Government should work with licensing agencies and copyright repositories to ensure that there is investment in large, high-quality training data sets, to encourage technology firms to use licensed material to train their LLMs.
Only four days after the HL Report was published, the UK Government published the Government Response, in which it confirmed its intention to scrap the Code of Practice, explaining that ‘it is now clear that the working group will not be able to agree an effective voluntary code’. Its approach will now be to work closely with rights holders and AI developers to explore mechanisms that can provide greater transparency, so that rights holders can better understand whether their content is being used to train AI models.
The UK Government appears to concede that some legally binding measures (rather than a non-binding code of practice) may be required and that co-operation with international counterparts will be needed. At this stage, however, the Government Response continues to emphasise the exploratory work being carried out, and the UK Government does not appear to be in any immediate rush to legislate. Further details of its proposals for moving forward in the absence of the Code of Practice are expected shortly.