Accurate, Focused Research on Law, Technology and Knowledge Discovery Since 2002

Daily Archives: February 20, 2024

Why The New York Times might win its copyright lawsuit against OpenAI

Ars Technica: “The day after The New York Times sued OpenAI for copyright infringement, the author and systems architect Daniel Jeffries wrote an essay-length tweet arguing that the Times ‘has a near zero probability of winning’ its lawsuit. As we write this, it has been retweeted 288 times and received 885,000 views. ‘Trying to get everyone to license training data is not going to work because that’s not what copyright is about,’ Jeffries wrote. ‘Copyright law is about preventing people from producing exact copies or near exact copies of content and posting it for commercial gain. Period. Anyone who tells you otherwise is lying or simply does not understand how copyright works.’ This article is written by two authors. One of us is a journalist who has been on the copyright beat for nearly 20 years. The other is a law professor who has taught dozens of courses on IP and Internet law. We’re pretty sure we understand how copyright works. And we’re here to warn the AI community that it needs to take these lawsuits seriously. In its blog post responding to the Times lawsuit, OpenAI wrote that ‘training AI models using publicly available Internet materials is fair use, as supported by long-standing and widely accepted precedents.’ The most important of these precedents is a 2015 decision that allowed Google to scan millions of copyrighted books to create a search engine. We expect OpenAI to argue that the Google ruling allows OpenAI to use copyrighted documents to train its generative models. Stability AI and Anthropic will undoubtedly make similar arguments as they face copyright lawsuits of their own. These defendants could win in court—but they could lose, too. As we’ll see, AI companies are on shakier legal ground than Google was in its book search case. And the courts don’t always side with technology companies in cases where companies make copies to build their systems.
The story of MP3.com illustrates the kind of legal peril AI companies could face in the coming years…”

Politics makes bastards of us all: Why moral judgment is politically situational

Kyle Hull, Clarisse Warren, Kevin Smith. Politics makes bastards of us all: Why moral judgment is politically situational [full text free to read]. Political Psychology, 2024; DOI: 10.1111/pops.12954 – “Moral judgment is politically situational—people are more forgiving of transgressive copartisans and more likely to behave punitively and unethically toward political opponents. Such differences are widely observed,… Continue Reading

Is Google Getting Worse?

Is Google Getting Worse? A Longitudinal Investigation of SEO Spam in Search Engines. Janek Bevendorff, Matti Wiegmann, Martin Potthast, and Benno Stein. “Many users of web search engines have been complaining in recent years about the supposedly decreasing quality of search results. This is often attributed to an increasing amount of search-engine-optimized but low-quality content. Evidence… Continue Reading

US Census Bureau purposely fudges location data in census to protect people’s privacy

Via Kottke – The U.S. Census Is Wrong on Purpose: “…Full census data is only made available 72 years after the census takes place, in accordance with the creatively-named “72 year rule.” Until then, it is only available as aggregated data with individual identifiers removed. Still, if the population of a town is small enough,… Continue Reading

Does anyone even want an AI search engine?

Fast Company: “You’ve probably already noticed your search engines are starting to evolve. Google and Bing have already added both AI-generated results and conversational chatbots to their respective search engines. The Browser Company, a startup that made a big early splash thanks to its mission statement of building a better internet browser, has launched an… Continue Reading

The National Wetlands Inventory

Data is Plural: “The National Wetlands Inventory, maintained by the US Fish and Wildlife Service, provides interactive maps and bulk data containing “geospatially referenced information on the status, extent, characteristics and functions of wetland, riparian, deepwater, and related aquatic habitats.” With contributions from 160+ organizations, coordinated through a dedicated national standard, the inventory represents “more… Continue Reading

The Dignity Index is designed to prevent violence, ease divisions, and solve problems

“The Dignity Index scores distinct phrases along an eight-point scale from contempt to dignity. Lower scores (1-4) reflect divisive language while higher scores (5-8) reflect language grounded in dignity. In its pilot season, a trained group of students supported by the University of Utah’s Kem C. Gardner Policy Institute and the Hinckley Institute of Politics… Continue Reading