Accurate, Focused Research on Law, Technology and Knowledge Discovery Since 2002

How to Stop Your Data From Being Used to Train AI

Wired: “If you’ve ever posted something to the internet—a pithy tweet, a 2009 blog post, a scornful review, or a selfie on Instagram—it has most likely been slurped up and used to help train the current wave of generative AI. Large language models, like ChatGPT, and image creators are powered by vast reams of our data. And even if it’s not powering a chatbot, the data can be used for other machine-learning features. Tech companies have scraped vast swathes of the web to gather the data they claim is needed to create generative AI—with little regard for content creators, copyright laws, or privacy. On top of this, increasingly, firms with reams of people’s posts are looking to get in on the AI gold rush by selling or licensing that information. Looking at you, Reddit. However, as the lawsuits and investigations around generative AI and its opaque data practices pile up, there have been small moves to give people more control over what happens to what they post online. Some companies now let individuals and business customers opt out of having their content used in AI training or being sold for training purposes. Here’s what you can—and can’t—do…”

Sorry, comments are closed for this post.