Examining the AI Industry’s Mass Ingestion of Copyrighted Works for AI Training

Too Big to Prosecute?: Examining the AI Industry’s Mass Ingestion of Copyrighted Works for AI Training, Senate Judiciary Committee, July 16, 2025. Includes video and testimony transcripts.

See also ranking member Sen. Dick Durbin’s opening statement – Key Quotes:

  • “While AI can be an incredible tool that unlocks further creativity, writers, artists, musicians, and others are rightfully concerned about what the technology will mean for their livelihoods. Should AI companies be able to use their materials freely, as ‘fair use?’ Or should they receive compensation when their works are used to train AI models?… How can creators compete with AI products that generate content at the push of a button—especially when that content might mimic or even reproduce their own works? These are just a few of the questions that we are going to consider in this hearing.”
  • “As we try to find the right balance between promoting technological innovation, protecting the work of our nation’s creators, and continuing to incentivize creativity in the years to come, we must recognize that AI innovation and protection of intellectual property rights are not mutually exclusive. That is why it is troubling to hear stories about the steps Big Tech companies are taking to train their AI models on copyrighted materials without compensating the creators [of] those works.”
  • “For example, rather than license authors’ works, companies like Meta and Anthropic have obtained copyrighted materials from sites that host pirated copies of the authors’ books and writings… As the judge in the Meta case recently put it: ‘companies have been unable to resist the temptation to feed copyright-protected materials into their models—without getting permission from the copyright holders or paying them for the right to use their works for this purpose.’ This hearing is going to be interesting.”

See also: Judges Don’t Know What AI’s Book Piracy Means