AI’s New Training Data: Your Old Work Slacks And Emails

Forbes: “When Shanna Johnson was winding down cielo24, the transcription and captioning company she ran as CEO, she discovered an unexpected asset: its operational exhaust, the digital leftovers that pile up across years of work and collaboration. To close the company out, she worked with SimpleClosure, a startup that specializes in helping companies wind down. SimpleClosure helped her through the usual shutdown paperwork: closing out payroll and taxes, getting investor consents in order, and filing paperwork with the IRS. Then came the part nobody puts in the founder playbook: selling off cielo24’s 13-year digital footprint (every Slack joke, every Jira ticket, emails documenting internal victories or frustrations sitting in employees’ multi-terabyte Google Drives) as training data for the next generation of AI. For that, cielo24 received “hundreds of thousands of dollars,” which Johnson said helped her go from “I don’t know how we are going to pay our bills” to “we can tie this up neatly with a bow and be able to walk away.” “I’m still a bit emotional about shutting the company down,” she told Forbes. “But it’s cool to think that our data could be useful, live on and help other people.” It’s a clean ending for a messy reality: the company didn’t survive, but its work trail did. And in 2026, that trail can be worth real money. Johnson’s data sale isn’t an isolated exit strategy; it is a new frontier in the AI arms race.

AI labs started off by training their models on the public internet: Reddit threads, Wikipedia entries, digitized books. But they exhausted that, all of it, by late 2024, according to former OpenAI chief scientist Ilya Sutskever. And what’s more, it’s not super helpful for building “agentic” AI: models that can actually do work. But the hand-crafted work that was done during the daily operations of defunct companies like cielo24? That’s a sort of fossil fuel for AI agents. Turns out that if you’re shooting for AI competence in the workplace, you need examples of what doing the work actually looks like, and a lot of them. “Model companies are realizing the noise in the real-world environments is required to accurately test models,” said Ali Ansari, whose company micro1 sells a product to AI labs called “Roots,” a mock holding company where AI agents can practice their skills in tasks like financial services and managing complex calendars.

A Gold Rush On Old Paperwork

Demand for workplace data has been a boon for SimpleClosure, whose CEO Dori Yona said that the level of inbound interest in it from AI companies has been “insane.” “There’s a feeling of a gold rush from these companies trying to get their hands on real-world data,” he said. To meet demand, SimpleClosure is launching Asset Hub, where companies shutting down can sell off their inventory of code, Slack archives, emails and whatnot. Parts of Asset Hub are still in beta, Yona said, because SimpleClosure removes all personally identifiable information from the internal company data, a sensitive and technically difficult process that they want to make sure is “rock solid” before rolling it out more widely…”
