Washington Post and free via MSN: “OpenAI’s video generation tool, Sora, can create high-definition clips of just about anything you could ask for — a breakthrough in artificial intelligence expected to transform the entertainment industry. But whose data OpenAI used to create its groundbreaking system is a mystery. With ChatGPT, OpenAI helped popularize the now-standard industry practice of building more capable AI tools by scraping vast quantities of text from the web without consent. With Sora, launched in December, OpenAI staff said they built a pioneering video generator by taking a similar approach. They developed ways to feed the system more online video — in more varied formats — including vertical videos and longer, higher-resolution clips. “You want to use all the data in its native format that exists,” Tim Brooks, the project’s then co-lead, said at an AI hackathon in April 2024. But OpenAI has not specified which videos it grabbed to make Sora, saying only that it combined “publicly available and licensed data.”