
A.I.’s un-learning problem

Fortune – Researchers say it’s virtually impossible to make an A.I. model ‘forget’ the things it learns from private user data: “…‘If a machine learning-based system has been trained on data, the only way to retroactively remove a portion of that data is by re-training the algorithms from scratch,’ Anasse Bari, an A.I. expert and computer science professor at New York University, told Fortune. The problem goes beyond private data. If an A.I. model is discovered to have gleaned biased or toxic data, say from racist social media posts, weeding out the bad data will be tricky.

Training or retraining an A.I. model is expensive. This is particularly true for the ultra-large ‘foundation models’ that are currently powering the boom in generative A.I. Sam Altman, the CEO of OpenAI, has reportedly said that GPT-4, the large language model that powers its premium version of ChatGPT, cost in excess of $100 million to train.

That’s why a powerful tool the U.S. Federal Trade Commission can use to punish companies it finds have violated U.S. trade laws is scary to companies developing A.I. models. The tool is called ‘algorithmic disgorgement.’ It’s a legal process that penalizes the law-breaking company by forcing it to delete an offending A.I. model in its entirety. The FTC has used that power only a handful of times, typically against companies that have misused data. One well-known case in which the FTC used this power involved a company called Everalbum, which trained a facial recognition system using people’s biometric data without their permission…”
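Bari’s point can be made concrete in a few lines of code. The sketch below is a toy illustration, not anything from the Fortune article: the dataset, the scikit-learn classifier, and the record indices marked for deletion are all hypothetical. It shows why deletion requests are structurally hard: a fitted model exposes no operation for subtracting individual training records from its learned weights, so the guaranteed remedy is to filter the dataset and repeat the entire training run.

```python
# Toy sketch of "exact unlearning": drop the requested records and retrain
# from scratch, because a trained model's weights offer no way to subtract
# them after the fact. Dataset, model, and indices are all hypothetical.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(seed=0)
X = rng.normal(size=(1000, 8))                               # toy features
y = (X[:, 0] + 0.5 * rng.normal(size=1000) > 0).astype(int)  # toy labels

original = LogisticRegression().fit(X, y)  # model trained on all 1,000 rows

to_forget = {3, 42, 777}                   # hypothetical deletion requests
keep = np.array([i for i in range(len(X)) if i not in to_forget])

# There is no API for removing rows 3, 42, and 777 from `original`'s learned
# coefficients; the only guaranteed fix is a full retrain on what remains.
retrained = LogisticRegression().fit(X[keep], y[keep])
```

For this toy classifier the retrain takes milliseconds; the article’s point is that the same procedure applied to a foundation model means repeating a training run that reportedly cost over $100 million.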
