YUFAN & CO.
Industry

UK scraps opt-out plan for AI training on copyrighted data

Yufan Zheng
Founder · ex-ByteDance · MSc Peking University
1 min read

The UK government this week scrapped its controversial plan to let AI companies train models on copyrighted work under an opt-out system. For UK SMEs building or buying AI tools, the legal position has hardened against unlicensed scraping, pushing the market toward strict licensing. The decision, published in a statutory report on 18 March, formally abandons the text and data mining exception that tech firms heavily lobbied for.

The opt-out plan is dead

On 18 March 2026, the government published its final report on AI and copyright, confirming it won't introduce a new text and data mining exception. Officials had previously favoured a model where AI developers could scrape data freely unless rightsholders explicitly opted out. Following a critical House of Lords report and sustained pushback from the creative sector, that plan is gone.

The government admitted that current UK copyright law likely prohibits unlicensed general-purpose AI training. Instead of new legislation, ministers will monitor the existing licensing market and gather more evidence. This leaves AI developers relying on human-generated data with a clear mandate to pay for it. The UK creative sector is now converging on a licensing-first approach to AI training data, mirroring the multi-million-pound deals we already see between OpenAI and major news publishers.

Why this changes your data strategy

If you run a 50-person agency or a mid-sized software firm, you might think copyright fights only matter to Microsoft and Getty Images. You'd be wrong. The death of the opt-out exception means any proprietary data you scrape, store, or use to fine-tune local models carries immediate legal risk.

The House of Lords made it clear that copying data to train an AI is a reproduction under existing law, not a protected form of learning. I think this is a necessary correction, but it creates a brutal environment for smaller AI builders. If you build bespoke AI tools for clients, you can no longer assume public web data is fair game. You must secure explicit licences or run your systems on synthetically generated datasets.

For buyers of off-the-shelf AI, the cost of these tools will likely rise. Vendors will inevitably pass on the expense of their newly mandated data licensing agreements to end users.

Three things to check

  1. Audit your fine-tuning pipelines. If your technical team scrapes UK websites to train local models, stop. You need explicit permission or a commercial licence to use that data safely.
  2. Ask your vendors about provenance. Email your primary AI software providers and ask how they handle UK copyright compliance. If they can't explain their licensing strategy, they're a compliance risk to your business.
  3. Watch the Creative Content Exchange. The government plans to pilot this data-licensing hub in summer 2026. Keep an eye on it as a potential source for clean, legally sound training data for your own projects.
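For step 1, an audit can start as something very simple: a script that walks your training-data manifest and flags any document whose source domain isn't covered by a licence you actually hold. This is a minimal hypothetical sketch, not legal advice; the `LICENSED_DOMAINS` set, the manifest format, and the function name are all assumptions about how your pipeline records provenance.

```python
from urllib.parse import urlparse

# Hypothetical allowlist: domains you hold an explicit licence for.
# Assumption: your pipeline records the source URL of every document.
LICENSED_DOMAINS = {"example-licensed-publisher.co.uk"}

def unlicensed_sources(manifest: list[str]) -> list[str]:
    """Return URLs in a training-data manifest whose domain is not
    covered by an explicit licence, so they can be reviewed or removed."""
    flagged = []
    for url in manifest:
        domain = urlparse(url).netloc.lower()
        if domain not in LICENSED_DOMAINS:
            flagged.append(url)
    return flagged

manifest = [
    "https://example-licensed-publisher.co.uk/article-1",
    "https://some-scraped-site.uk/blog/post",
]
print(unlicensed_sources(manifest))
```

Even a crude check like this surfaces the scale of your exposure quickly; anything it flags needs either a licence, a synthetic replacement, or deletion.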

Get our UK AI insights.

Practical reads on AI for UK businesses — teardowns, how-to guides, regulatory news. Unsubscribe anytime.
